Service to handle serialization and deserialization of python objects.
Object serialization is useful for long term storage, interlanguage communication and network transmission. In all cases, the process involves an initial encode() followed by a later decode().
The following properties are desirable for serialization/deserialization:
Python's builtin serialization, pickle/cPickle, cannot meet these needs. It is python specific, and it is not friendly to human readers or to readers from other environments, such as IDL, which may want to load or receive data from a python program. Pickling inf/nan does not work on Windows, yet some of our models may use inf data and some of our results may be nan. Pickle has minimal support for versioning: users can write a __setstate__ method which accepts the saved state dictionary and adjusts it accordingly. Beware, though, that the version must be an instance variable rather than a class variable, since class variables are not seen by pickle. If the class is renamed, pickle can do nothing to recover it.
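The __setstate__ approach can be sketched as follows. This is a minimal illustration written for Python 3, not code from this package; the Model class, its fields and the integer version scheme are all hypothetical.

```python
import pickle


class Model(object):
    # Class variables are not stored in the pickle, so a version kept only
    # here would not travel with the saved instance.
    CURRENT_VERSION = 2

    def __init__(self, value):
        self.version = Model.CURRENT_VERSION  # instance variable: pickled
        self.value = value
        self.uncertainty = 0.0  # hypothetical field added in version 2

    def __setstate__(self, state):
        # pickle passes the saved instance dictionary; upgrade it in place.
        if state.get('version', 1) < 2:
            state['uncertainty'] = 0.0
            state['version'] = 2
        self.__dict__.update(state)


# Simulate a version 1 instance that was saved before `uncertainty` existed.
old = Model(3.5)
old.__dict__ = {'version': 1, 'value': 3.5}

restored = pickle.loads(pickle.dumps(old))
```

The upgrade only works because the version number was written into the instance dictionary. Nothing in this scheme helps if Model itself is renamed or moved, since pickle looks the class up by its stored name.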
Instead of pickle, we break the problem into two parts: structure and encoding. A pair of functions, deconstruct and reconstruct, work directly with the structure. Deconstruct extracts the state of the python object, expressed using a limited set of python primitives. Reconstruct takes an extracted state and rebuilds the complete python object. See the documentation on the individual functions for details.
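To illustrate the split between structure and encoding, the sketch below takes a hand-written tree of primitives, laid out the way this page describes the output of deconstruct for class instances, and encodes it as text. JSON is an assumption made for the illustration; the page does not say which encodings the package provides. The state fields are hypothetical.

```python
import json

# Structure: a tree of primitives, hand-written here in the dict layout
# documented for deconstructed class instances.
tree = {
    '.class': 'mylib.core.data.Data',
    '.version': '1.2',
    '.state': {'values': (1.0, 2.0, 3.0), 'label': 'sample'},
}

# Encoding: any format able to carry the primitives will do; JSON is one choice.
text = json.dumps(tree, sort_keys=True)
decoded = json.loads(text)

# The round trip keeps the values but not every type: the tuple is now a list.
values_after = decoded['.state']['values']
```

Because the encoder only ever sees primitives, it can be swapped for another format without touching the classes being stored.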
Object persistence for long term storage places particular burdens on the serialization protocol. In particular, the class may have changed since the instance was serialized. To aid the process of maintaining classes over the long term, the class definition can contain the following magic names:
We need to coexist with third party libraries which may or may not use our object serialization technology. Even if they do, we may want to replace the dependence on one third party library with our own or another implementation. To allow this, we keep a class registry, and the function refactor adds entries to it. Rather than blindly restoring the class, we first check whether the class and version are in the registry and, if so, call the registered function to transform the data. For example:
    import danse.util.serial as serial
    serial.refactor('Numeric.array', 'numeric_converter', '0.0')
This says that items stored as the class Numeric.array will be loaded by calling numeric_converter, which returns a
The following example shows how to use reconstruct and factory to get maximum flexibility when restoring an object.
mylib.__init__.py:

    def data():
        from mylib.core.data import Data
        return Data()

mylib.core.data.py:

    from danse.util.serial import isnewer, reconstruct, setstate

    class Data(object):
        __version__ = '1.2'
        __factory__ = 'mylib.data'

        def __reconstruct__(self, instance):
            '''
            Reconstruct the state from
            '''
            if isnewer('1.0', instance['version']):
                raise RuntimeError('pre-1.0 data objects no longer supported')
            if isnewer('1.1', instance['version']):
                # Version 1.1 added uncertainty; default it to zero
                instance['state']['uncertainty'] = 0
            setstate(self, reconstruct(instance['state']))

TODO: reconstruct needs to raise a specialized error, a subclass of RuntimeError, that indicates the name of the package that is out of date so that the application can trigger its package manager.
TODO: what if class registry depends on which package is asking for the redirect? Do we want to allow a chained registry, where we can override the application default? Probably not, but the full answer depends on the solution to the Deployment Problem.
TODO: cannot handle self-referential data structures
Function Details
deconstruct: convert an object hierarchy into python primitives. The primitives used are int, float, str, unicode, bool, None, list, tuple, and dict. Classes are encoded as a dict with keys '.class', '.version', and '.state'. The version is copied from the attribute __version__ if it exists. Functions are encoded as a dict with key '.function'.

Raises RuntimeError if the object cannot be deconstructed. For example, calling deconstruct on an already deconstructed object will cause problems, since '.class' will be in the dictionary of a deconstructed object.
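The behaviour described above can be approximated with a short recursive function. This is a sketch for Python 3 under stated assumptions, not the package's implementation: deconstruct_sketch and Point are hypothetical names, a missing __version__ is recorded as None, and functions are recorded by dotted name.

```python
import types


def deconstruct_sketch(obj):
    """Hypothetical stand-in for deconstruct; not the danse.util.serial code."""
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if isinstance(obj, (list, tuple)):
        return type(obj)(deconstruct_sketch(item) for item in obj)
    if isinstance(obj, dict):
        if '.class' in obj:
            # Reserved key: the dict would be mistaken for an encoded class.
            raise RuntimeError("cannot deconstruct a dict containing '.class'")
        return dict((key, deconstruct_sketch(value)) for key, value in obj.items())
    if isinstance(obj, types.FunctionType):
        return {'.function': obj.__module__ + '.' + obj.__name__}
    if hasattr(obj, '__dict__') and not isinstance(obj, type):
        cls = type(obj)
        return {
            '.class': cls.__module__ + '.' + cls.__name__,
            # Assumption: None when the class defines no __version__.
            '.version': getattr(cls, '__version__', None),
            '.state': deconstruct_sketch(dict(vars(obj))),
        }
    raise RuntimeError('cannot deconstruct %r' % (obj,))


class Point(object):  # hypothetical class used only for this demonstration
    __version__ = '1.0'

    def __init__(self, x, y):
        self.x, self.y = x, y


tree = deconstruct_sketch({'origin': Point(0, 0), 'tags': ['a', 'b']})
```

Passing tree['origin'] back into deconstruct_sketch raises RuntimeError, which mirrors the "deconstruct on deconstruct" problem noted above.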
reconstruct: reconstruct an object hierarchy from a tree of primitives. The tree is generated by deconstruct from python primitives (list, dict, string, number, boolean, None), with classes encoded as a particular kind of dict.

Unlike pickle, we do not make an exact copy of the original object. In particular, the serialization format may not distinguish between list and tuple, or between str and unicode. We also have no support for self-referential structures.

Raises RuntimeError if the object could not be reconstructed.
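A rough sketch of the reverse direction follows. It is an illustration only: reconstruct_sketch, Point and the KNOWN_CLASSES lookup table are hypothetical, and the table stands in for whatever mechanism the package uses to locate a class from its stored name.

```python
class Point(object):  # hypothetical class used only for this demonstration
    __version__ = '1.0'


# Assumption: a fixed lookup table instead of importing classes by name.
KNOWN_CLASSES = {'demo.Point': Point}


def reconstruct_sketch(tree):
    """Hypothetical stand-in for reconstruct; not the danse.util.serial code."""
    if isinstance(tree, list):
        return [reconstruct_sketch(item) for item in tree]
    if isinstance(tree, tuple):
        return tuple(reconstruct_sketch(item) for item in tree)
    if isinstance(tree, dict):
        if '.class' in tree:
            try:
                cls = KNOWN_CLASSES[tree['.class']]
            except KeyError:
                raise RuntimeError('cannot reconstruct %r' % (tree['.class'],))
            # Create the instance without running __init__, then fill in state.
            instance = cls.__new__(cls)
            instance.__dict__.update(reconstruct_sketch(tree['.state']))
            return instance
        return dict((key, reconstruct_sketch(value)) for key, value in tree.items())
    return tree  # int, float, str, bool and None pass through unchanged


point = reconstruct_sketch(
    {'.class': 'demo.Point', '.version': '1.0', '.state': {'x': 1, 'y': [2, 3]}}
)
```

The sketch ignores '.version' entirely; deciding what to do with older versions is the job of the version checks and registry described elsewhere on this page.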
isnewer: version comparison function. Returns true if version is at least as new as the target version.

A version number consists of two or three dot-separated numeric components, with an optional "pre-release" tag on the end. The pre-release tag consists of the letter 'a' or 'b' followed by a number. If the numeric components of two version numbers are equal, then one with a pre-release tag will always be deemed earlier (lesser) than one without.

The following will be true for version numbers:

    8.2 < 8.19a1 < 8.19 == 8.19.0

You should follow the rule of incrementing the minor version number if you add attributes to your models, and the major version number if you remove attributes. Then, assuming you are working with, e.g., version 2.2, your model loading code will look like:

    if isnewer(xml.version, Model.__version__):
        raise IOError('software is older than model')
    elif isnewer(xml.version, '2.0'):
        instantiate current model from xml
    elif isnewer(xml.version, '1.0'):
        instantiate old model from xml
        copy old model format to new model format
    else:
        raise IOError('pre-1.0 models not supported')

Based on distutils.version.StrictVersion.
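The ordering rules above can be reproduced without distutils, which is no longer shipped with Python 3.12 and later. The following is a minimal sketch under that assumption; parse_version and isnewer_sketch are hypothetical names, and the argument order (version first, target second) is inferred from the description rather than taken from the package source.

```python
import re

# Two or three numeric components, then an optional 'a' or 'b' pre-release tag.
_VERSION_RE = re.compile(r'^(\d+)\.(\d+)(?:\.(\d+))?(?:([ab])(\d+))?$')


def parse_version(text):
    """Return a sortable key for versions such as '8.19', '8.19.0' or '8.19a1'."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError('invalid version number %r' % (text,))
    major, minor, patch, tag, tag_number = match.groups()
    numbers = (int(major), int(minor), int(patch or 0))
    if tag is None:
        # A final release sorts after every pre-release with the same numbers.
        prerelease = (1, '', 0)
    else:
        prerelease = (0, tag, int(tag_number))
    return numbers + prerelease


def isnewer_sketch(version, target):
    """True if version is at least as new as target."""
    return parse_version(version) >= parse_version(target)


# The ordering quoted in the text: 8.2 < 8.19a1 < 8.19 == 8.19.0
ordering_holds = (
    parse_version('8.2') < parse_version('8.19a1')
    < parse_version('8.19') == parse_version('8.19.0')
)
```

Comparing tuples rather than strings is what makes 8.2 sort before 8.19, since the components are compared as integers.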
refactor: register the renaming of a class.

As code is developed and maintained over time, it is sometimes beneficial to restructure the source to support new features. However, the structure and location of particular objects is encoded in the saved file format. When you move a class that may be stored in a model, be sure to put an entry into the registry saying where the class was moved, or None if it is no longer supported.

A reconstructor can be registered as a function to build a python object from a particular class/version, presumably older than the current version. This is necessary, e.g., to set default values for new fields or to modify components of the model which are now represented differently. The reconstructor function takes the stored structure (class, version and state) as its argument and returns a python instance. You are free to restructure the state and version fields as needed to bring the object in line with the next version, then call setstate(tree) to build the return object. Indeed, this technique will chain, and you can morph an ancient version of your models into the latest version.
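The chaining idea can be sketched with a small registry keyed on class name and version. Everything here is hypothetical: refactor_sketch takes a callable where the example elsewhere on this page passes a converter name as a string, the converters return an updated tree instead of a finished instance, and the oldlib, newlib and ancient names are invented for the demonstration.

```python
# Hypothetical registry: (class name, version) -> converter, or None.
_registry = {}


def refactor_sketch(class_name, reconstructor, version):
    """Register a converter for one stored class/version; None marks it unsupported."""
    _registry[(class_name, version)] = reconstructor


def upgrade(tree):
    """Apply registered converters until no entry matches the tree."""
    while (tree['.class'], tree['.version']) in _registry:
        converter = _registry[(tree['.class'], tree['.version'])]
        if converter is None:
            raise RuntimeError(
                '%s %s is no longer supported' % (tree['.class'], tree['.version'])
            )
        tree = converter(tree)
    return tree


def add_uncertainty(tree):
    # Version 1.1 added a field; give old data a default value.
    state = dict(tree['.state'])
    state['uncertainty'] = 0
    return {'.class': 'oldlib.Data', '.version': '1.1', '.state': state}


def move_to_newlib(tree):
    # The class was later moved to another package.
    return {'.class': 'newlib.Data', '.version': '2.0', '.state': tree['.state']}


refactor_sketch('oldlib.Data', add_uncertainty, '1.0')
refactor_sketch('oldlib.Data', move_to_newlib, '1.1')
refactor_sketch('ancient.Data', None, '0.1')

latest = upgrade({'.class': 'oldlib.Data', '.version': '1.0', '.state': {'value': 3}})
```

Each converter only needs to know how to reach the next version, so supporting a new release means adding one entry rather than rewriting the older ones.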
| Generated by Epydoc 3.0.1 on Mon Mar 16 15:03:12 2009 | http://epydoc.sourceforge.net |