Getting Started with dip.io

The dip.io module serves two purposes. First, it gives applications the support they need to read and write models. Second, it supports the implementation of new formats and forms of storage so that existing applications can use them without being changed. The basis of all this support is an i/o manager, which is an object that implements, or can be adapted to, the IIoManager interface.

Concepts

dip considers storage to be either streaming storage or structured storage. Streaming storage stores data as a byte stream. A filesystem is the most common example of streaming storage. Structured storage stores data in a storage-specific structure. An SQL database is an example of structured storage.

A model is stored according to a data format. When a model is written it is encoded according to the format. When it is read it is decoded from the format. A model will normally have a native format which is used when reading and writing the model. It may be possible to export a model to other formats and to import a model from other formats. Each format has a unique string identifier.

A codec is an object that implements an encoder and/or a decoder for a particular format. Streaming storage has no implicit format and so it is used in conjunction with an explicit codec. The codec effectively imposes the structure of the data on top of the byte stream. Structured storage has, by definition, an implicit format and, therefore, an implicit codec.

A codec specifies encoder and decoder interfaces that a model must implement, or be able to be adapted to, in order to be encoded and decoded.
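
As an illustration of how a codec imposes structure on a byte stream, the following is a minimal sketch in plain Python. It does not use dip's actual ICodec interface: the KeyValueCodec class, its encode() and decode() methods and the format identifier are all hypothetical names chosen for this example.

```python
import io


class KeyValueCodec:
    """A hypothetical codec (not part of dip) for a simple text format."""

    # Each format has a unique string identifier. This one is made up.
    format = 'example.formats.keyvalue'

    def encode(self, model, stream):
        """Write the model (a dict of strings here) to a byte stream."""
        for name in sorted(model):
            line = '{0}={1}\n'.format(name, model[name])
            stream.write(line.encode('utf-8'))

    def decode(self, stream):
        """Rebuild the model from a byte stream."""
        model = {}
        for line in stream.read().decode('utf-8').splitlines():
            name, _, value = line.partition('=')
            model[name] = value
        return model


codec = KeyValueCodec()
stream = io.BytesIO()  # stands in for streaming storage such as a file
codec.encode({'name': 'Joe', 'age': '30'}, stream)

stream.seek(0)
decoded = codec.decode(stream)
```

The byte stream itself knows nothing about names and values. Only the codec gives the bytes that meaning, which is why streaming storage is always paired with an explicit codec.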

A model is stored at a storage location that is unique within a particular piece of storage. For example, the location of a model stored in a filesystem is the absolute path name of the file containing the encoded model. A storage location may be implicit. This means that where a model is stored is determined by the value of the model and is not specified by the user.

Reading and Writing a Model

In Getting Started with dip.shell we briefly described the steps to be taken to ensure that a model could be written to and read from storage. Here we will reiterate those steps.

  • Unless one already exists, create a codec, i.e. an object that implements the ICodec interface, for the model’s native format. A codec will specify decoder and encoder interfaces that a model must implement or be able to be adapted to.
  • Ensure that the model implements the decoder and encoder interfaces. This is normally done by creating appropriate adapters. Whether or not a single adapter is used for both interfaces is a matter of personal programming style.
  • Create an instance of the codec and add it to the i/o manager’s list of codecs.
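
The three steps above can be sketched in plain Python as follows. This is only an outline of the pattern under stated assumptions, not dip's API: ICsvEncoder, CsvCodec, Contacts, ContactsCsvEncoderAdapter, registered_codecs and write_model() are hypothetical stand-ins, and the adapter is applied by hand rather than by any adaptation machinery.

```python
from abc import ABC, abstractmethod


class ICsvEncoder(ABC):
    """Hypothetical encoder interface that the codec requires of a model."""

    @abstractmethod
    def rows(self):
        """Return the rows to be written, each a sequence of strings."""


class CsvCodec:
    """Step 1: a hypothetical codec for the model's native format."""

    format = 'example.formats.csv'  # made-up format identifier
    encoder_interface = ICsvEncoder

    def encode(self, encoder):
        lines = [','.join(row) for row in encoder.rows()]
        return ('\n'.join(lines) + '\n').encode('utf-8')


class Contacts:
    """The application's model. It knows nothing about CSV."""

    def __init__(self, people):
        self.people = people  # list of (name, email) tuples


class ContactsCsvEncoderAdapter(ICsvEncoder):
    """Step 2: adapt the model to the codec's encoder interface."""

    def __init__(self, adaptee):
        self.adaptee = adaptee

    def rows(self):
        return [(name, email) for name, email in self.adaptee.people]


# Step 3: create an instance of the codec and add it to a list of codecs.
# This plain list stands in for the i/o manager's list.
registered_codecs = []
registered_codecs.append(CsvCodec())


def write_model(model, adapter_type, format_id):
    """Find the codec for the format and encode the adapted model."""
    for codec in registered_codecs:
        if codec.format == format_id:
            return codec.encode(adapter_type(model))
    raise LookupError('no codec for format ' + format_id)


contacts = Contacts([('Ann', 'ann@example.com'), ('Bob', 'bob@example.com')])
data = write_model(contacts, ContactsCsvEncoderAdapter, 'example.formats.csv')
```

Keeping the encoding logic in an adapter means the model class stays free of any format-specific code, so supporting a second format only requires another adapter.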

Note that some codecs may only support decoding or encoding but not both. Such codecs would not be used as a model’s native format but they would still be useful when importing from or exporting to non-native formats.

dip provides two codecs that between them cover many common cases.

  • The UnicodeCodec codec, with its IUnicodeDecoder decoder interface and IUnicodeEncoder encoder interface, stores a model as a Unicode byte stream. By default it uses the UTF-8 encoding.
  • The XmlCodec codec, with its IXmlDecoder decoder interface and IXmlEncoder encoder interface, stores a Model instance as XML.

Implementing a New Type of Storage

The need to implement a new type of storage arises less often than the need to implement a new codec. When the need does arise it is typically because a new technology or service has become available that many applications can use, rather than because of something application-specific. For example, an organisation may subscribe to a cloud-based file service. A new type of storage would then be implemented to provide access to it. All existing applications could then use it without being changed.

The other situation that would require a new type of storage to be implemented is when a database is being used as structured storage.

In this section we describe the high-level steps taken to implement a new type of storage, including the interfaces and classes that need to be written.

The IStreamingStorageFactory interface must be implemented by a streaming storage factory, and its __call__() method must return an implementation of the IStorage interface.

The IStructuredStorageFactory interface must be implemented by a structured storage factory, and its __call__() method must also return an implementation of the IStorage interface.

The IStorage interface defines read() and write() methods that read an object from, and write an object to, a specific storage location.

IStorage also defines the ui attribute which is an implementation of the IStorageUi interface. This interface defines methods that create the necessary user interfaces that the user will use to select a storage location. For example, the filesystem storage type included with dip provides access (assuming the default PyQt5 toolkit) to QFileDialog using this mechanism. A storage type that handled a database may implement a database browser.
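
The relationship between a storage factory, the storage it creates and the storage's ui attribute can be sketched as below. This is a plain Python outline and not an implementation of dip's interfaces: the class names, the arguments taken by __call__(), read() and write(), and the available_locations() method are all assumptions made for the example.

```python
class InMemoryStorageUi:
    """Stand-in for a storage user interface, without any GUI."""

    def __init__(self, storage):
        self._storage = storage

    def available_locations(self):
        # A real implementation would create a dialog or a browser that
        # lets the user select a location. Here we just list them.
        return sorted(self._storage.locations())


class InMemoryStorage:
    """Stand-in for a storage implementation that keeps bytes in a dict."""

    def __init__(self):
        self._blobs = {}
        self.ui = InMemoryStorageUi(self)

    def locations(self):
        return self._blobs.keys()

    def read(self, location):
        """Return the bytes stored at a location."""
        return self._blobs[location]

    def write(self, data, location):
        """Store bytes at a location, replacing any previous content."""
        self._blobs[location] = bytes(data)


class InMemoryStorageFactory:
    """Stand-in for a storage factory: calling it returns a storage."""

    def __call__(self):
        return InMemoryStorage()


factory = InMemoryStorageFactory()
storage = factory()
storage.write(b'name=Joe\n', '/models/joe')
restored = storage.read('/models/joe')
```

Because applications only see the read(), write() and ui members, a storage written this way could be swapped for one backed by a filesystem or a remote service without the application changing.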

Defining a Storage Policy

Sometimes a model can be read from or written to a particular storage type, but you want to restrict that access and the options presented to the user. For example, a certain type of user may only be able to read from the storage, or access to the storage may be limited to certain times of the day, or you may simply wish to prevent a certain type of model from ever being written to a certain type of storage.

The i/o manager will consult a list of storage policies to determine if a model using a particular format should actually be allowed to be read from or written to a particular storage instance. A policy is a callable that is passed the format and the storage instance and should return True if the access is permitted. If any policy returns False then the access is denied.
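
The rule described above, where each policy is called with the format and the storage instance and any False result denies access, can be sketched as follows. The storage classes, the format identifiers, the two policies and the access_permitted() helper are hypothetical and are written in plain Python. How policies are actually registered with the i/o manager is not shown here.

```python
class FilesystemStorage:
    """Stand-in for a storage instance; not dip's own class."""


class CloudStorage:
    """Stand-in for a second type of storage."""


def no_secrets_in_cloud(format_id, storage):
    """Deny models in the (made-up) secret format on cloud storage."""
    is_secret = format_id == 'example.formats.secret'
    return not (is_secret and isinstance(storage, CloudStorage))


def known_formats_only(format_id, storage):
    """Permit only formats in this example's own namespace."""
    return format_id.startswith('example.formats.')


# A plain list standing in for the i/o manager's list of policies.
policies = [no_secrets_in_cloud, known_formats_only]


def access_permitted(format_id, storage):
    """Access is permitted only if every policy returns True."""
    return all(policy(format_id, storage) for policy in policies)


secret_on_disk = access_permitted('example.formats.secret', FilesystemStorage())
secret_in_cloud = access_permitted('example.formats.secret', CloudStorage())
other_format = access_permitted('other.formats.text', FilesystemStorage())
```

In this sketch an empty list of policies permits everything, because all() of an empty sequence is True. Whether the i/o manager behaves the same way with no policies is an assumption.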