Being able to reload code modules is one of the many nice features of
Python. This allows developers to modify parts of a Python application
while the interpreter is running. In general, all that needs to be done is to
pass a module object to the imp.reload() function (importlib.reload() in
Python 3.4 and later, or just reload() in Python 2.x), and the module will be
reloaded from its source file.
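
As a minimal sketch of that basic workflow, the snippet below writes a throwaway module to a temporary directory, imports it, edits the source, and reloads it. It assumes Python 3.4 or later and therefore uses importlib.reload(); the module name reload_demo_basic is made up for illustration.

```python
import importlib
import pathlib
import sys
import tempfile

# Skip .pyc caching so a quick edit in this short demo is never masked
# by stale bytecode.
sys.dont_write_bytecode = True

# Create a throwaway module; "reload_demo_basic" is a hypothetical name.
module_dir = pathlib.Path(tempfile.mkdtemp())
source = module_dir / "reload_demo_basic.py"
source.write_text("INTERVAL = 5\n")
sys.path.insert(0, str(module_dir))

import reload_demo_basic

before = reload_demo_basic.INTERVAL

# Edit the source while the interpreter keeps running, then reload.
source.write_text("INTERVAL = 10\n")
reloaded = importlib.reload(reload_demo_basic)
after = reload_demo_basic.INTERVAL

print(before, after)
```

Note that reload() re-executes the module's code in the existing module object, so the name reload_demo_basic continues to refer to the same object afterwards.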
There are a few potential complications, however.
If any other code references symbols exported by the reloaded module, those
references may still be bound to the original objects. For example, imagine
that module A contains the constant INTERVAL = 5, and module B imports that
constant into its namespace (from A import INTERVAL). If we change the
constant to INTERVAL = 10 and reload only module A, module B’s copy of
INTERVAL, and any values in module B that were derived from it, won’t be
updated to reflect the new value.
The solution to this problem is to also reload module B. But it’s important
to reload module B only after module A has been reloaded. Otherwise, it
won’t pick up the updated symbols.
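
The stale-binding problem and the ordering requirement can be reproduced with two small modules. This is only an illustrative sketch: the module names stale_demo_a and stale_demo_b and the derived TIMEOUT value are hypothetical stand-ins for modules A and B, and importlib.reload() is assumed (Python 3.4 or later).

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo free of stale .pyc files

# Hypothetical stand-ins for modules A and B from the text.
module_dir = pathlib.Path(tempfile.mkdtemp())
(module_dir / "stale_demo_a.py").write_text("INTERVAL = 5\n")
(module_dir / "stale_demo_b.py").write_text(
    "from stale_demo_a import INTERVAL\n"
    "TIMEOUT = INTERVAL * 3\n"
)
sys.path.insert(0, str(module_dir))

import stale_demo_a
import stale_demo_b

# Change the constant and reload only A.
(module_dir / "stale_demo_a.py").write_text("INTERVAL = 10\n")
importlib.reload(stale_demo_a)

# B still holds the old binding and the value derived from it.
stale_interval = stale_demo_b.INTERVAL
stale_timeout = stale_demo_b.TIMEOUT

# Reloading B after A re-runs its import and picks up the new value.
importlib.reload(stale_demo_b)
fresh_interval = stale_demo_b.INTERVAL
fresh_timeout = stale_demo_b.TIMEOUT
```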
PyUnit deals with a variation of this problem by introducing a rollback
importer. That approach “rolls back” the set of imported modules
to some previous state by overriding Python’s global __import__ hook.
PyUnit’s solution is effective at restoring the interpreter’s state to
pre-test conditions, but it’s not a general solution for live code reloading
because the unloaded modules aren’t automatically reloaded.
The following describes a general module reloading solution which aims to make
the process automatic, transparent, and reliable.
Recording Module Dependencies
It is important to understand the dependencies between loaded modules so that
they can be reloaded in the correct order. The ideal solution is to build a
dependency graph as the modules are loaded. This can be accomplished by
installing a custom import hook that is called as part of the regular module
import process. The custom hook chains to the built-in __import__ function
(stored in _baseimport). It
also tracks the current “parent” module, which is the module that is
performing the import operation. Top-level modules won’t have a parent.
After a module has been successfully imported, it is added to its parent’s
dependency list. Note that this code is only interested in file-based
modules; built-in extensions are ignored because they can’t be reloaded.
This results in a complete set of per-module dependencies for all modules that
are imported after this custom import hook has been installed. These
dependencies can be easily queried at runtime:
The next step is to build a dependency-aware reload() routine.
The reload() implementation uses recursion to reload all of the requested
module’s dependencies in reverse order before reloading the module itself. It
uses the visited set to avoid infinite recursion should individual modules’
dependencies cross-reference one another. It also rebuilds the modules’
dependency lists from scratch to ensure that they accurately reflect the
updated state of the modules.
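
The recursive strategy described above can be sketched as follows. To keep the example self-contained, the dependency map is passed in explicitly as a plain dictionary (an assumption for illustration) instead of being read from an installed import hook, and the step that rebuilds the dependency lists is omitted because it relies on that hook. The module names dep_demo_a and dep_demo_b are hypothetical.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo free of stale .pyc files


def _reload(m, dependencies, visited):
    """Reload m's dependencies (in reverse order), then m itself."""
    # Mark the module first so that cross-referencing dependency
    # lists cannot send us into infinite recursion.
    visited.add(m.__name__)
    for dep in reversed(dependencies.get(m.__name__, [])):
        if dep.__name__ not in visited:
            _reload(dep, dependencies, visited)
    importlib.reload(m)


def reload(m, dependencies):
    """Reload module m along with all of its known dependencies."""
    _reload(m, dependencies, set())


# Demo: dep_demo_b derives a value from dep_demo_a.
module_dir = pathlib.Path(tempfile.mkdtemp())
(module_dir / "dep_demo_a.py").write_text("INTERVAL = 5\n")
(module_dir / "dep_demo_b.py").write_text(
    "from dep_demo_a import INTERVAL\n"
    "TIMEOUT = INTERVAL * 3\n"
)
sys.path.insert(0, str(module_dir))

import dep_demo_a
import dep_demo_b

# Hand-written dependency map standing in for the hook's records.
dependencies = {"dep_demo_b": [dep_demo_a]}

(module_dir / "dep_demo_a.py").write_text("INTERVAL = 10\n")
reload(dep_demo_b, dependencies)
timeout_after = dep_demo_b.TIMEOUT
```

A single call on the dependent module is enough here: dep_demo_a is reloaded first, so dep_demo_b sees the new INTERVAL when its own code runs again.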
Custom Reloading Behavior
A module that is being reloaded may also wish to implement some custom
reloading logic. For example, it may be useful to reapply some pre-reload state to the
reloaded module. To support this, the reloader looks for a module-level
function named __reload__(). If present, this function is called after a
successful reload with a copy of the module’s previous (pre-reload) dictionary.
Instead of simply calling imp.reload(), the code expands to:
The _deepcopy_module_dict() helper routine exists to avoid deepcopy()-ing
unsupported or unnecessary data.
Monitoring Module Changes
A nice feature of a reloading system is automatic detection of module changes.
There are many ways to monitor the file system for source file changes. The
approach implemented here uses a background thread and the stat()
system call to watch each file’s last modification time. When an updated
source file is detected, its filename is added to a thread-safe queue.
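
A minimal sketch of such a monitor is shown below. The class name FileMonitor, its constructor arguments, and the choice to watch an explicit list of filenames are assumptions made to keep the example short and self-contained. The demo calls scan() directly so that its behavior is deterministic; in normal use, start() would run the same scan periodically in the background thread.

```python
import os
import pathlib
import queue
import tempfile
import threading


class FileMonitor(threading.Thread):
    """Poll files with stat() and queue the names of those that changed."""

    def __init__(self, filenames, interval=1.0):
        super().__init__(daemon=True)
        self.filenames = list(filenames)
        self.interval = interval
        self.mtimes = {}              # filename -> last seen mtime
        self.queue = queue.Queue()    # thread-safe hand-off to the consumer
        self._stop_event = threading.Event()

    def run(self):
        # Scan, then sleep until the interval elapses or stop() is called.
        while not self._stop_event.is_set():
            self.scan()
            self._stop_event.wait(self.interval)

    def stop(self):
        self._stop_event.set()

    def scan(self):
        for filename in self.filenames:
            try:
                mtime = os.stat(filename).st_mtime
            except OSError:
                continue  # missing or unreadable file: skip it
            previous = self.mtimes.get(filename)
            # Files seen for the first time only establish a baseline.
            if previous is not None and mtime != previous:
                self.queue.put(filename)
            self.mtimes[filename] = mtime


# Demo: drive scan() by hand instead of starting the thread.
path = pathlib.Path(tempfile.mkdtemp()) / "watched.py"
path.write_text("INTERVAL = 5\n")

monitor = FileMonitor([str(path)])
monitor.scan()                        # first pass records the baseline

path.write_text("INTERVAL = 10\n")
st = path.stat()
# Push the mtime forward explicitly so the change is visible even on
# file systems with coarse timestamp resolution.
os.utime(path, (st.st_atime, st.st_mtime + 10))

monitor.scan()
changed = monitor.queue.get_nowait()
```

A consumer, for example the main thread, can then drain the queue and reload the modules that correspond to the queued filenames.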