Difference between revisions of "Content Manager"

Revision as of 17:09, 6 September 2010

The Content Manager service provides its clients with uniform access to content served by a variety of back-ends, both inside and outside the system. It is the central component of the gCube subsystem that deals with the organisation of content and related data.


Service Design

The Content Manager is designed as an OCMA service. In OCMA terms, it classifies as a multi-type, 1-N adapter service:

  • it is a multi-type service because it supports two front types for, respectively, reading and writing content modelled as labelled trees.
Collectively, the front types and the tree content model form the gDoc access type of the service.
  • it is an adapter service because it adapts the gDoc access type to multiple back types, where each back type corresponds to the access type of a whole class of remote repositories.
For this, the service employs an open architecture of type-specific plugins to which it delegates the creation and operation of its collection managers.
Plugins are dynamically deployed within single instances of the services, and different instances may host different plugins. In addition, some plugins may support both service front types, i.e. grant read and write access to the corresponding repository. Others may instead support read-only access or, less commonly, write-only access.

The figure below gives an overview of the design and use of the service in the context of one of its running instances. The instance exposes three stateful port-types:

  • the ReadManager serves as the interface of collection managers that offer read-only operations over the content of the bound collection.
The interface defines the gDocRead front type of the service.
The front type and the identifier of the bound collection are published as Resource Properties of the manager, in accordance with OCMA patterns for publication and discovery of service state. A third Resource Property is the name of the bound plugin, i.e. the plugin to which the manager delegates the resolution of its requests.
  • the WriteManager serves as the interface of collection managers that offer write-only operations over the content of the bound collection.
The interface defines the gDocWrite front type of the service.
Again, the type, the identifier of the bound collection, and the name of the bound plugin are published as Resource Properties of the manager.
  • the Factory serves as the front-end of a single WS Resource that creates ReadManager and WriteManager resources.
The resource is created at the activation of the service instance in the gCube Hosting Node.
During its lifetime, it publishes creation requests as activation records. Conversely, it subscribes for the activation records that are published by other instances of the service, in line with OCMA patterns for replication of service state.
The resource also publishes as a Resource Property the list of summary descriptions of the plugins that are hosted at the service instance.


Service plugins logically extend factory and collection manager resources with corresponding resource delegates. In particular:

  • the factory delegate extends the Factory resource at plugin deployment time in order to handle requests that are specifically addressed to the plugin;
  • at each such request, the factory delegate processes plugin-specific parameters to create one or more read delegates and/or write delegates, which the service instance uses to create and extend corresponding collection managers;
  • future requests to the managers are then handled by their delegates, which translate the requests against the back-end repository that exposes the collection bound to the managers.


Finally, note that factory and collection managers are persistent resources and may thus be re-activated across restarts of the gCube Hosting Node:

  • the factory persists the history of its activations, i.e. the activation records that it published and/or processed.
  • the collection managers persist the name and state of their delegates.

Collection Manager Design Overview

Content Model

Architectural considerations aside, the most distinctive element in the design of the Content Manager is its content model. Rather than settle for a fixed set of document structures, the service adopts a generic structure that can act as a 'carrier' for an arbitrary number of concrete document models. In particular, the service deals with edge-labelled and node-attributed trees, the gDoc trees.

The expectation here is that producers (service plugins) and consumers (service clients) will agree on concrete document models and exchange gDoc trees with an agreed shape. The agreement may be bilateral or involve any number of parties, and it may apply to the entire document or to distinguished parts of it (e.g. document metadata, annotations, raw content packaging, etc.). For maximum decoupling between consumers and producers, the agreement may reflect system-wide conventions and result in canonical tree forms.

gDoc Trees

A gDoc tree has the following properties:

  • its nodes may have an identifier and a number of uniquely named attributes;
  • its edges have a label;
  • its leaf nodes may have a value;
  • its root may identify the collection of the corresponding document.

In particular:

  • identifiers, attributes, and leaves have text values;
  • attribute names and labels may be qualified with a namespace.

Finally:

  • nodes may have a state of NEW, MODIFIED, or DELETED.
States denote changes with respect to persistent representations of documents. They are used in the write operations of the service.

The figure below uses a graphical representation to show an example of a gDoc tree.


A sample gDoc tree


gDoc trees serialise to XML documents for exchange over the network.

For example, the gDoc tree above may serialise as:

<g:gdoc xmlns:g="http://gcube-system.org/namespaces/contentmanagement/gdoc"
	g:id="1" g:state="MODIFIED" x="..." y="..." g:collID="...">
	<a g:id="2" g:state="MODIFIED">
		<b g:id="5" g:state="MODIFIED" />
	</a>
	<a g:id="3" g:state="MODIFIED">
		<c g:state="NEW">
			<d g:state="NEW">...</d>
			<d g:state="NEW" w="..">...</d>
		</c>
	</a>
	<b g:id="4" g:state="MODIFIED" w="..." />
</g:gdoc>

Note that gDoc trees inherit constraints from their XML serialisation. In particular, the names of edges, the names of attributes, the values of attributes, and the values of leaves are regulated by the definition of the format.
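
Because the serialisation is ordinary namespace-qualified XML, it can be inspected with the XML APIs that ship with the JDK, without any service-specific library. The following is a minimal sketch under stated assumptions: the class name GDocXPathSketch, the helper countNodesInState, and the trimmed-down sample document (with placeholder leaf values) are illustrative and not part of the service API. It counts the elements that carry a given g:state value.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: queries a gDoc serialisation with plain JDK XPath.
class GDocXPathSketch {

    static final String GDOC_NS = "http://gcube-system.org/namespaces/contentmanagement/gdoc";

    // A shortened variant of the sample serialisation; leaf values are placeholders.
    static final String SAMPLE =
        "<g:gdoc xmlns:g=\"" + GDOC_NS + "\" g:id=\"1\" g:state=\"MODIFIED\">"
      + "<a g:id=\"2\" g:state=\"MODIFIED\"><b g:id=\"5\" g:state=\"MODIFIED\"/></a>"
      + "<a g:id=\"3\" g:state=\"MODIFIED\"><c g:state=\"NEW\">"
      + "<d g:state=\"NEW\">v1</d><d g:state=\"NEW\" w=\"..\">v2</d></c></a>"
      + "</g:gdoc>";

    // Counts the elements whose g:state attribute equals the given state name.
    static int countNodesInState(String xml, String state) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to resolve the g: prefix
            Document dom = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "g".equals(prefix) ? GDOC_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return GDOC_NS.equals(uri) ? "g" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.singletonList("g").iterator();
                }
            });

            // State names are simple tokens, so inlining them in the expression is safe here.
            NodeList hits = (NodeList) xpath.evaluate(
                "//*[@g:state='" + state + "']", dom, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not query the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("NEW nodes: " + countNodesInState(SAMPLE, "NEW"));
        System.out.println("MODIFIED nodes: " + countNodesInState(SAMPLE, "MODIFIED"));
    }
}
```

The same query could be phrased in XSLT or XQuery; XPath over DOM is used here only because it needs nothing beyond the JDK.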

gDoc API

The XML serialisation of gDoc trees is 'natural', in that it does not employ dedicated element structures for the representation of nodes, edges, attributes, etc. This streamlines its manipulation with standard XML technologies (e.g. XPath, XSLT, XQuery, DOM, SAX, etc.) and does not inhibit object binding technologies (e.g. JAXB, XStream, etc.). As a native option, however, the service defines a bespoke object model and API for gDoc trees which offer:

  • dedicated support for tree processing requirements associated with the use of the service;
  • transparencies and optimisations for tree storage, construction, deconstruction, and input/output.

While the model is available to service clients, it also forms the basis of the interface between the service and its plugins. For this reason, its main features are overviewed here while its client-oriented features are discussed later on.

As the figure below illustrates, the model is defined in org.gcube.contentmanagement.contentmanager.stubs.model.trees in terms of the following components:

  • Node: an abstract base for nodes with an identifier, a state, and a map of QName-ed attributes.
  • State: an inner enumeration of Node for node states.
  • Edge: A QName-ed edge to a target Node.
  • InnerNode: a Node with a list of outgoing Edges.
  • Leaf: a Node with a value.
  • GDoc: an InnerNode with a collection identifier.
  • Nodes: a collection of static utilities to generate Nodes and Edges.
  • Bindings: a collection of static utilities to serialise and deserialise Nodes to and from DOM trees and/or character streams.
  • NodeView: a base class for JAXB bindings to Nodes.
  • GDocView: a NodeView for JAXB bindings to GDoc nodes.


the ...model.trees package
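
To make the relationships between the components listed above concrete, the following is a deliberately simplified sketch of how such a model could be laid out. It is an assumption-laden stand-in, not the actual classes of the package: all names inside TreeModelSketch (including Doc, sample, and countLeaves) are hypothetical, and the real classes offer far richer constructors, accessors, and equality semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;

// Hypothetical stand-ins for the model components; NOT the gCube classes.
class TreeModelSketch {

    // Node states, relative to the persistent representation of a document.
    enum State { NEW, MODIFIED, DELETED }

    // Base of all nodes: optional identifier, optional state, QName-keyed attributes.
    static abstract class Node {
        final String id;   // may be null
        State state;       // may be null
        final Map<QName, String> attributes = new LinkedHashMap<>();
        Node(String id) { this.id = id; }
    }

    // A labelled edge to a target node.
    static final class Edge {
        final QName label;
        final Node target;
        Edge(QName label, Node target) { this.label = label; this.target = target; }
    }

    // A node with a list of outgoing edges.
    static class InnerNode extends Node {
        final List<Edge> edges = new ArrayList<>();
        InnerNode(String id, Edge... edges) {
            super(id);
            this.edges.addAll(Arrays.asList(edges));
        }
    }

    // A node with a text value.
    static final class Leaf extends Node {
        final String value;
        Leaf(String id, String value) { super(id); this.value = value; }
    }

    // The root: an inner node that may also identify its collection.
    static final class Doc extends InnerNode {
        String collectionID;
        Doc(String id, Edge... edges) { super(id, edges); }
    }

    // Recursively counts the leaves reachable from the given node.
    static int countLeaves(Node node) {
        if (node instanceof Leaf) return 1;
        int total = 0;
        for (Edge edge : ((InnerNode) node).edges) total += countLeaves(edge.target);
        return total;
    }

    // Builds a small tree: root -a-> leaf, root -b-> inner -c-> leaf.
    static Doc sample() {
        Doc doc = new Doc("1",
            new Edge(new QName("a"), new Leaf("2", "5")),
            new Edge(new QName("someNS", "b"),
                new InnerNode("3", new Edge(new QName("c"), new Leaf(null, "4")))));
        doc.attributes.put(new QName("x"), "0");
        doc.state = State.NEW;
        return doc;
    }

    public static void main(String[] args) {
        System.out.println("leaves: " + countLeaves(sample()));
    }
}
```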

The model API is illustrated by example in the rest of this Section. The full list of methods and their signatures can be found in the code documentation.

Creating Trees

The first and obvious way to create gDoc trees is with the constructors of the concrete node classes (GDoc, InnerNode, Leaf). As a first example, the following code illustrates the creation of a tree with an attributed root and two leaf nodes:

GDoc doc = new GDoc("someid");
doc.setAttribute(new QName("x"), "1");
doc.setAttribute(new QName("someNS","y"), new Date().toString());
doc.collectionID("...");
 
Leaf leaf1 = new Leaf(null,"2"); //no identifier
Leaf leaf2 = new Leaf(null,"true");
 
Edge e1 = new Edge(new QName("a"),leaf1);
Edge e2 = new Edge(new QName("someNS","b"),leaf2);
 
doc.add(e1,e2);

While already more convenient than cross-language and format-oriented tree APIs (e.g. DOM), step-by-step construction is verbose, even in the case of small trees. For a first degree of improvement, the node classes offer rich suites of constructors and setter overloads that allow for more 'in-lined' tree constructions and absorb the creation of QNames:

GDoc doc2 = new GDoc("someid",
		new Edge("a", new Leaf(null,"2")),
		new Edge("someNS","b", new Leaf(null,"true")));
 
doc2.setAttribute("x", "1");
doc2.setAttribute("someNS","y", new Date().toString());
doc2.collectionID("somecollID");

For additional convenience, the Nodes class defines a large number of generators, i.e. factory methods that can be statically imported and then composed into a pseudo literal syntax for gDoc trees:

import static org.gcube.contentmanagement.contentmanager.stubs.model.trees.Nodes.*;
...
 
GDoc doc3 = attr(
		gdoc("somecollID","someid",
			e("a",2), 
			e("someNS","b",true)),
            a("x",1),a("someNS","y",new Date()));

Here, gdoc, attr, e, a are examples of node, attribute, and edge generators. Besides allowing fully in-lined tree expressions without the use of the new operator, the generators offer QName creation transparencies and object-to-string conversion transparencies (cf. the int, boolean, and Date example above). The transparency of date conversions is particularly important here, as it ensures adherence to XML serialisation standards that are not natively adopted in Java (e.g. in the implementation of toString). See the code documentation for the full list of available generators, as well as for the additional examples that follow:

doc = gdoc();
doc = gdoc("someid");
doc = gdoc("collectionid","someid");
 
doc = gdoc("1",
			e("a",
				n("2",//n() => inner node generator
					e("b",
						l("3",0)))));
 
doc = gdoc(//no identifier here
		e("a",
			attr(
				n("2", 
					e("b",
						l("3",0)), //l()= explicit leaf generator for identity assignment
					e("a",
						 l("4",0))),
			a("foo","0"))));
 
 
doc = attr(gdoc("1",
		e("a",l("2",5)),
		e("b",attr(
				n("3",e("c",4)),
			  a("foo",0))),
		e("c",5)),
      a("x",0));
 
 
doc = attr(gdoc("1",
		e("a",
			n("2",
			  e("b",
				   n("$2"))
			)),
		e("a",
			n("a1",
				e("c",n(
						e("d","..."),
						e("d",attr( //l()= explicit leaf generator for attribute assignments
								l("<xml>..</xml>"),
							  a("w",".."))))
				)
			)),
		e("b",
			attr(
				n("1:/2"),
			a("w","..."))
			)),
      a("x","http://org.acme:8080"),a("y","<a>...</a>"));
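
The remark above on date conversions can be illustrated with the JDK alone. The sketch below contrasts the default Date.toString() rendering, which is locale-style and timezone-dependent, with an ISO 8601 rendering that is compatible with the xs:dateTime type of XML Schema. Which lexical form the generators actually emit is not stated here, so the UTC form is an assumption, as are the names XmlDateSketch and toXmlDateTime.

```java
import java.util.Date;

// Illustrative only: shows why Date.toString() is unsuitable for XML date values.
class XmlDateSketch {

    // Renders a Date as an ISO 8601 instant in UTC, a valid xs:dateTime lexical form.
    static String toXmlDateTime(Date date) {
        return date.toInstant().toString();
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        // Default rendering: human-oriented, depends on the default time zone.
        System.out.println("toString():    " + epoch);
        // ISO 8601 rendering: prints 1970-01-01T00:00:00Z
        System.out.println("xs:dateTime:   " + toXmlDateTime(epoch));
    }
}
```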

The literal construction of trees is particularly convenient during testing, though it composes well with programmatic construction in the development of production code:

Edge edge = ....
InnerNode node = ....;
 
attr(
   node.add(e("before","..."), edge, e("after","...")),
   a("newattr","..."));

note: the node classes override equals for equivalence-based comparisons, hashCode for correct use as keys within hash-based data structures, and toString for convenience of debugging.

Consuming Trees

Serialising and Deserialising Trees

Binding Trees

The Bindings class offers static facilities to transform native models of gDoc trees into XML-based models and back. Two representations are supported natively, from which other XML-based representations can be produced using standard platform facilities (e.g. TRAX):

  • Bindings.toElement(GDoc) converts native models of gDoc trees into equivalent DOM models.
  • Bindings.fromElement(Element) converts DOM models of gDoc trees into equivalent native models.
  • Bindings.toXML(GDoc, Writer, boolean?) converts the native model into XML document streams, optionally excluding document declarations.
  • Bindings.fromXML(Reader) converts XML document streams into gDoc trees.

note: DOM conversions of native models are implemented directly, as they are most commonly required for interactions with the Content Manager service. Stream conversions are instead derived from DOM conversions via TRAX, at an additional processing cost.
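
How a stream conversion can be layered on top of a DOM conversion with TRAX is sketched below using only JDK classes. This is not the implementation of Bindings, whose internals are not shown here; the class name TraxStreamSketch and its write and read helpers are hypothetical. An identity Transformer copies a DOM tree to a Writer, optionally omitting the XML declaration, and copies a character stream back into a DOM tree.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative only: DOM <-> character stream conversions via identity transforms.
class TraxStreamSketch {

    // Writes a DOM element to a character stream, optionally without the XML declaration.
    static void write(Element root, Writer out, boolean omitDeclaration) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,
                                       omitDeclaration ? "yes" : "no");
            identity.transform(new DOMSource(root), new StreamResult(out));
        } catch (Exception e) {
            throw new IllegalStateException("could not serialise the element", e);
        }
    }

    // Reads a character stream into a DOM tree and returns its document element.
    static Element read(Reader in) {
        try {
            DOMResult result = new DOMResult(); // the transformer creates the Document
            TransformerFactory.newInstance().newTransformer()
                              .transform(new StreamSource(in), result);
            return ((Document) result.getNode()).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the stream", e);
        }
    }

    public static void main(String[] args) {
        Element original = read(new StringReader("<a x=\"1\"><b>t</b></a>"));
        StringWriter w = new StringWriter();
        write(original, w, true);
        Element copy = read(new StringReader(w.toString()));
        System.out.println("round trip preserved the tree: " + original.isEqualNode(copy));
    }
}
```

The extra transformation step is where the additional processing cost mentioned in the note comes from: the tree is first materialised as DOM and only then copied to or from the stream.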

note: conversions from native models to XML-based models assign the conventional name http://gcube-system.org/namespaces/contentmanagement/gdoc:gdoc (cf. the Bindings.GDOC_NS and Bindings.GDOC_NAME constants) to the document element. Conversely, conversions from XML-based representations to native models discard the name of the document element.

Here is a usage example, which shows that equivalence of native models is preserved under round-trip conversions.

import static org.gcube.contentmanagement.contentmanager.stubs.model.trees.Bindings.*;
import java.io.StringReader;
import java.io.StringWriter;
...
GDoc doc = ....
 
//DOM conversion
GDoc doc2 = fromElement(toElement(doc));
assert doc.equals(doc2); //true!
 
//stream conversion
StringWriter w = new StringWriter();
toXML(doc,w);
GDoc doc3 = fromXML(new StringReader(w.toString())); //fromXML expects a Reader
assert doc.equals(doc3);   //true!

note: due to the treatment of root element names, equivalence of XML-based representations is not necessarily preserved after round-trip conversion. It is preserved only if the XML-based representations have been previously produced with the conversion routines.

note: in all the conversions above, null values in attribute and leaf values are serialised using a special constant (exposed programmatically as Node.NULL).

note: the conversions are also available at arbitrary inner nodes, not only roots (cf. Bindings.nodeToElement(Node, QName?), Bindings.nodeFromElement(Element), Bindings.nodeToXML(Node, Writer, QName), and Bindings.nodeFromXML(Reader)).

Tree Predicates

Service Interface

Service Plugins

Client Libraries

Stub Distribution

Content Management Library