Difference between revisions of "Content Manager"

From Gcube Wiki
Revision as of 01:23, 2 September 2010

The Content Manager service provides its clients with uniform access to content hosted or served by a variety of back-ends, both inside and outside the system. It is the central component of the gCube subsystem that deals with organisation of content and related data.


Service Design

The Content Manager is designed as an OCMA service. In OCMA terms, it classifies as a multi-type, 1-N adapter service:

  • it is a multi-type service because it supports two front types for, respectively, reading and writing content modelled as labelled trees.
Collectively, the front types and the tree content model form the gDoc access type of the service.
  • it is an adapter service because it adapts the gDoc access type to multiple back types, where each back type corresponds to the access type of a whole class of remote repositories.
For this, the service employs an open architecture of type-specific plugins to which it delegates the creation and operation of its collection managers.
Plugins are dynamically deployed within single instances of the service, and different instances may host different plugins. In addition, some plugins may support both service front types, i.e. grant read and write access to the corresponding repository. Others may instead support read-only access or, less commonly, write-only access.

The figure below gives an overview of the design and use of the service in the context of one of its running instances. The instance exposes three stateful port-types:

  • the ReadManager serves as the interface of collection managers that offer read-only operations over the content of the bound collection.
The interface defines the gDocRead front type of the service.
The front type and the identifier of the bound collection are published as Resource Properties of the manager, in accordance with OCMA patterns for publication and discovery of service state. A third Resource Property is the name of the bound plugin, i.e. the plugin to which the manager delegates the resolution of its requests.
  • the WriteManager serves as the interface of collection managers that offer write-only operations over the content of the bound collection.
The interface defines the gDocWrite front type of the service.
Again, the type, the identifier of the bound collection, and the name of the bound plugin are published as Resource Properties of the manager.
  • the Factory serves as the front-end of a single WS Resource that creates ReadManager and WriteManager resources.
The resource is created at the activation of the service instance in the gCube Hosting Node.
During its lifetime, it publishes creation requests as activation records. Conversely, it subscribes for the activation records that are published by other instances of the service, in line with OCMA patterns for replication of service state.
The resource also publishes as a Resource Property the list of summary descriptions of the plugins that are hosted at the service instance.


Service plugins logically extend factory and collection manager resources with corresponding resource delegates. In particular:

  • the factory delegate extends the Factory resource at plugin deployment time in order to handle requests that are specifically addressed to the plugin;
  • at each such request, the factory delegate processes plugin-specific parameters to create one or more read delegates and/or write delegates, which the service instance uses to create and extend corresponding collection managers;
  • future requests to the managers are then handled by their delegates, which translate the requests against the back-end repository that exposes the collection bound to the managers.
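
The delegation chain described above can be illustrated with a small, self-contained sketch. It is not the service's actual API: every class, method, and parameter name below (`Factory`, `InMemoryFactoryDelegate`, `create_delegates`, the `collection` and `writable` parameters) is hypothetical, and a plain dictionary stands in for the remote repository.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CollectionManager:
    """A manager bound to one collection; it forwards requests to its delegate."""
    front_type: str                  # "gDocRead" or "gDocWrite"
    collection_id: str
    plugin_name: str                 # name of the bound plugin
    delegate: Callable[[str], str]   # hypothetical: resolves a request against the back-end

    def handle(self, request: str) -> str:
        return self.delegate(request)


class InMemoryFactoryDelegate:
    """Hypothetical plugin whose 'repository' is a dict of collections."""
    name = "in-memory"

    def __init__(self, store):
        self.store = store

    def create_delegates(self, params):
        # Process plugin-specific parameters and build read and/or write delegates.
        collection_id = params["collection"]
        docs = self.store.setdefault(collection_id, {})
        delegates = {"gDocRead": lambda doc_id: docs[doc_id]}
        if params.get("writable"):
            def write(request):
                # Assumed toy request format: "<document id>=<content>".
                doc_id, _, body = request.partition("=")
                docs[doc_id] = body
                return doc_id
            delegates["gDocWrite"] = write
        return collection_id, delegates


class Factory:
    """Creates collection managers by delegating to the addressed plugin."""

    def __init__(self):
        self.plugins = {}

    def deploy(self, factory_delegate):
        # Plugin deployment extends the factory with the plugin's factory delegate.
        self.plugins[factory_delegate.name] = factory_delegate

    def create(self, plugin_name, params):
        collection_id, delegates = self.plugins[plugin_name].create_delegates(params)
        return [CollectionManager(front_type, collection_id, plugin_name, delegate)
                for front_type, delegate in delegates.items()]


factory = Factory()
factory.deploy(InMemoryFactoryDelegate({"c1": {"d1": "hello"}}))
managers = {m.front_type: m
            for m in factory.create("in-memory", {"collection": "c1", "writable": True})}
managers["gDocWrite"].handle("d2=world")
```

In this sketch a plugin that omits the `writable` parameter yields a read manager only, which mirrors the read-only plugins mentioned earlier.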


Finally, note that factory and collection managers are persistent resources and may thus be re-activated across restarts of the gCube Hosting Node:

  • the factory persists the history of its activations, i.e. the activation records that it published and/or processed.
  • the collection managers persist the name and state of their delegates.

Collection Manager Design Overview

Content Model

Architectural considerations aside, the most distinctive element in the design of the Content Manager is its content model. Instead of settling for a fixed set of document structures, the service chooses a single generic structure that can act as a 'carrier' for an arbitrary number of concrete document models. In particular, the Content Manager models documents as edge-labelled and node-attributed trees, the gDoc trees.

The expectation is that producers (service plugins) and consumers (service clients) will converge on more concrete document models by exchanging gDoc trees with an agreed shape. The agreement may be bilateral or involve any number of parties, and it may apply to the entire document or to distinguished parts of it (e.g. the metadata, the annotations, the representation of raw content, etc.). For maximum decoupling between consumers and producers, the agreement may be captured by system-wide conventions and result in canonical tree forms.

The gDoc Tree Model

A gDoc tree has the following properties:

  • nodes may have an identifier;
The identifier is arbitrary text.
  • nodes have zero or more attributes.
Attributes are uniquely named and names may be qualified with a namespace.
Attributes are text-valued.
  • nodes have a state.
The state may be either NEW, UPDATED, or DELETED.
The state marks how the node deviates from its persistent representation in a repository. It is used in the write operations of the service.
  • inner nodes have zero or more edges.
Edges are named and names may be qualified with a namespace.
  • leaf nodes have a value.
The value is arbitrary text.
  • the root may be marked with the identifier of the collection that contains the document represented by the tree.
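
As an illustration only, the properties listed above could be captured by a data structure along the following lines. The class and field names are hypothetical and do not reflect the gDoc API. The sketch also makes three assumptions that the list does not state: qualified names are written as `{namespace}local` strings, several edges of a node may share a name, and a state of `None` stands for a node that has not changed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class State(Enum):
    NEW = "NEW"
    UPDATED = "UPDATED"
    DELETED = "DELETED"


@dataclass
class Node:
    identifier: Optional[str] = None                       # arbitrary text
    attributes: Dict[str, str] = field(default_factory=dict)  # uniquely named, text-valued
    state: Optional[State] = None                          # None: unchanged (assumption)
    edges: List[Tuple[str, "Node"]] = field(default_factory=list)  # (edge name, child)
    value: Optional[str] = None                            # only meaningful on leaves

    def add(self, label: str, child: "Node") -> "Node":
        """Attach a child under a named edge and return the child."""
        if self.value is not None:
            raise ValueError("a node that carries a value is a leaf and cannot have edges")
        self.edges.append((label, child))
        return child

    def children(self, label: str) -> List["Node"]:
        """Return the children reached through edges with the given name."""
        return [child for name, child in self.edges if name == label]

    def is_leaf(self) -> bool:
        return not self.edges


@dataclass
class Root(Node):
    collection_id: Optional[str] = None  # identifier of the containing collection


doc = Root(identifier="doc-1", collection_id="coll-7", state=State.NEW)
metadata = doc.add("metadata", Node(attributes={"lang": "en"}))
metadata.add("title", Node(value="A sample document"))
```

Keeping edges as an ordered list of pairs, rather than a dictionary keyed by name, is a design choice of the sketch that preserves child order and tolerates repeated edge names.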

The figure below uses a graphical representation to show an example of a gDoc tree.


A sample gDoc tree

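The model has a natural representation in XML:

  • nodes map onto elements named as the incoming edge. The root element is arbitrarily named;
  • attributes map onto XML attributes;
  • identifiers map onto values of attributes called http://gcube-system.org/namespaces/contentmanagement/gdoc:id;
  • collection identifiers map onto values of attributes called http://gcube-system.org/namespaces/contentmanagement/gdoc:collID;
  • states map onto values of attributes called http://gcube-system.org/namespaces/contentmanagement/gdoc:state.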
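
A minimal sketch of such an XML serialisation follows, using Python's standard `xml.etree.ElementTree`. It rests on assumptions: identifiers, collection identifiers, and states are written as attributes named `id`, `collID`, and `state` in the namespace `http://gcube-system.org/namespaces/contentmanagement/gdoc`; the root element name `root` and the `gdoc` prefix are arbitrary choices; and nodes are represented as plain dictionaries whose keys (`attributes`, `id`, `collID`, `state`, `edges`, `value`) are hypothetical.

```python
import xml.etree.ElementTree as ET

GDOC_NS = "http://gcube-system.org/namespaces/contentmanagement/gdoc"
ET.register_namespace("gdoc", GDOC_NS)  # prefix used when serialising (arbitrary choice)


def qname(local):
    """Qualified attribute name in ElementTree's {namespace}local notation."""
    return f"{{{GDOC_NS}}}{local}"


def to_xml(node, name="root"):
    """Map a node onto an element named after its incoming edge."""
    elem = ET.Element(name)
    # Node attributes map onto XML attributes.
    for attr_name, attr_value in node.get("attributes", {}).items():
        elem.set(attr_name, attr_value)
    # Identifier, collection identifier, and state map onto namespaced attributes.
    for key in ("id", "collID", "state"):
        if node.get(key) is not None:
            elem.set(qname(key), node[key])
    edges = node.get("edges", [])
    if edges:
        # Inner node: each edge becomes a child element named after the edge.
        for label, child in edges:
            elem.append(to_xml(child, label))
    elif node.get("value") is not None:
        # Leaf node: the value becomes the element text.
        elem.text = node["value"]
    return elem


tree = {
    "id": "doc-1",
    "collID": "coll-7",
    "state": "NEW",
    "edges": [
        ("metadata", {
            "attributes": {"lang": "en"},
            "edges": [("title", {"value": "A sample document"})],
        }),
    ],
}
xml_text = ET.tostring(to_xml(tree), encoding="unicode")
```

Parsing `xml_text` back with `ET.fromstring` recovers the same attributes and element nesting, which is a convenient way to check the mapping.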

The gDoc API

Tree Predicates

Service Interface

Service Plugins

Client Libraries

Stub Distribution

Content Management Library