Content Manager: Stub Distribution


The stub distribution of the Content Manager offers abstractions over the content model and interface of the service. In this sense, the distribution acts both as a client library and as a service-side library for plugin development. In particular, it is a dependency of the service as well as of its clients.

The distribution includes:

  • the API for gDoc trees;
  • the API for gDoc tree predicates;
  • the stubs of the service automatically generated from the WSDL definition of its port-types;
  • the high-level calls, a set of abstractions over the service stubs;
  • a Java protocol handler and associated facilities for deriving and resolving content URIs, i.e. resolvable URIs to arbitrary nodes of gDoc trees.


[Figure: The Stub Distribution]


We have previously presented most of the APIs for gDoc trees and tree predicates. We concentrate here on high-level calls and content URIs, completing the presentation of the tree and tree predicate APIs in the process.

High-Level Calls

High-level calls are Java objects that model single-step or multi-step interactions with the Content Management service. The objects encapsulate stub-based interactions behind local object-oriented interfaces that offer transparencies over the remote interfaces of the service port-types.

The local interfaces are based on language features that are not found in the service stubs, including high-level models of inputs and outputs, method overloading, parametric types, and asynchronous callbacks.

Behind these abstractions, the call objects engage in optimised and best-effort interactions with the WS-Resources of the services; in particular, they can hide from clients the complexity of resource discovery while keeping visible the remote nature of the interactions and the possibility of their failure.

High-level calls are defined in the package org.gcube.contentmanagement.contentmanager.stubs.calls and in the package org.gcube.contentmanagement.contentmanager.stubs.calls.iterators. The main components are depicted below:


[Figure: High-Level Calls]


  • BaseCall: the base class for all high-level calls.
  • FactoryCall: a BaseCall that represents calls to the Factory resource of the service.
  • FactoryParameters: used in FactoryCall to model the input of operations of the Factory resource of the service.
  • FactoryConsumer: used in FactoryCall to call back invokers of the asynchronous operation of the Factory resource of the service.
  • ManagerCall: an abstract extension of BaseCall for calls to the Collection Managers of the service.
  • ReadManagerCall: a ManagerCall that represents calls to ReadManager resources of the service.
  • WriteManagerCall: a ManagerCall that represents calls to WriteManager resources of the service.
  • MappingRegistry: a central registry of type mappings for I/O.
  • Constants: a collection of service-specific constants.
  • Utils: a collection of utilities for I/O conversions.
  • BaseRSIterator<T>: the base class for all iterators backed by a ResultSet of records that can be parsed by a ResultParser<T>.
  • ResultParser<T>: a parser of ResultSet records into objects of type T.
  • GDocParser: a ResultParser of gDoc trees that uses the gDoc native API.
  • AddOutcomeParser: a ResultParser of AddOutcomes.
  • UpdateFailureParser: a ResultParser of UpdateFailureOutcomes.
  • RSIterator<T>: a BaseRSIterator that delivers parsing failures synchronously.
  • GDocRSIterator: an RSIterator that uses a GDocParser.
  • AsyncRSIterator<T>: a BaseRSIterator that tolerates parsing failures and delivers them asynchronously to a FaultListener.
  • FaultListener: a processor of parsing failures during ResultSet iterations.
  • RSCollection<T>: a lazy collection that can be iterated over by an AsyncRSIterator<T>.
  • GDocRSCollection: an RSCollection that uses an AsyncRSIterator<GDoc>.

In what follows, we exemplify the use of FactoryCalls, ReadManagerCalls, and WriteManagerCalls.

Factory Calls

A FactoryCall is created in a scope:

//some scope
GCUBEScope scope = .....
 
FactoryCall call = new FactoryCall(scope);

In a secure infrastructure, the call may also be created with a security manager:

//some scope
GCUBEScope scope = .....
 
//some security manager
GCUBESecurityManager manager = ....
 
FactoryCall call = new FactoryCall(scope,manager);

The call may then be issued, i.e. used to create CollectionManagers. In line with the operations of the remote port-type, this can be done synchronously or asynchronously. The synchronous invocation requires the preparation of FactoryParameters:

FactoryParameters params = new FactoryParameters() ;
params.setPlugin("..somepluginname...");
params.setBroadcast(false);
 
//the DOM serialisation of plugin-specific creation parameters
org.w3c.dom.Element payload = ...
 
params.setPayload(payload);
 
//issue the call
List<EprPair> eprs =  call.create(params); 
//process the response
for (EprPair pair : eprs)
   .... pair.getPorttype() ... pair.getEpr() ...

note: typically, plugins will offer object bindings for the payloads that they support. The payload input to the create() method will then be obtained by serialising the bound objects.

The asynchronous invocation requires the additional preparation of a FactoryConsumer:

//prepare as above
FactoryParameters params = .....
 
//creates consumer
FactoryConsumer consumer = new FactoryConsumer() {
 
     protected void onCompletion(List<EprPair> eprs) {
             .... process pairs as above
     }
 
     protected void onFailure(Exception e) {
             ... handle failure
     }
};
 
//issue the call
call.createASync(params,consumer);

In both interactions above, the FactoryCall will attempt to discover Factory WS-Resources that host the plugin named in the parameters. It will then try to interact with each resource in turn, until one responds successfully or else indicates that continuing will be to no avail (by returning a GCUBEUnretrievableFault).

note: clients can obtain and customise the query that underlies the strategy (cf. getQuery()) and, if needed, reset it to its default (resetQuery()).

note: while call objects are often created anew for individual calls to the remote port-type, clients can use the same object for multiple calls (though this is unlikely for FactoryCalls). When this is the case, the calls occur in the same, initially configured scope and the second call 'sticks' to the resource used by the first. The best-effort strategy is intentionally limited to the first invocation only.
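
The best-effort strategy and the 'sticking' behaviour described above can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not the implementation of BaseCall: the class BestEffortCall, the exception UnretrievableException (a stand-in for GCUBEUnretrievableFault) and the list of already discovered endpoints are all hypothetical names introduced here.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a fault meaning "no other endpoint will do better". */
class UnretrievableException extends RuntimeException {
    UnretrievableException(String message) { super(message); }
}

/** Toy best-effort caller; an assumption-laden sketch, not the actual BaseCall. */
class BestEffortCall<E, R> {
    private E sticky; // endpoint that served the previous successful call, if any

    R invoke(List<E> discovered, Function<E, R> interaction) {
        if (sticky != null) {
            return interaction.apply(sticky); // later calls stick to the same endpoint
        }
        RuntimeException last = null;
        for (E endpoint : discovered) {
            try {
                R result = interaction.apply(endpoint);
                sticky = endpoint; // remember the endpoint that worked
                return result;
            } catch (UnretrievableException e) {
                throw e; // trying further endpoints would be to no avail
            } catch (RuntimeException e) {
                last = e; // this endpoint failed, move on to the next one
            }
        }
        throw new IllegalStateException("no endpoint could serve the call", last);
    }

    E stickyEndpoint() { return sticky; }
}
```

In this sketch the first invocation walks the discovered endpoints in order, while any later invocation on the same object goes straight to the endpoint that answered first, mirroring the note above.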

Clients that know and wish to target a specific Factory resource can disable the best-effort strategy by configuring the call with a reference to its endpoint:

//a reference to the endpoint of a Content Manager RI
EndpointReferenceType epr = ...
 
call.setEndpointReferenceType(epr); 
//alternatively:
call.setEndpoint("... somehostname ...",".. someport ..");

ReadManager Calls

A ReadManagerCall gives high-level read access to the content of a given collection, as allowed by a ReadManager resource bound to that collection. It follows the same patterns already illustrated for FactoryCalls. In particular, it is created in a scope and, optionally, with a security manager:

//some scope
GCUBEScope scope = .....
 
ReadManagerCall call = new ReadManagerCall(scope);

//some security manager
GCUBESecurityManager manager = ....
 
ReadManagerCall secureCall = new ReadManagerCall(scope,manager);

As a further option, it may be created with the identifier of the target collection:

//some scope
GCUBEScope scope = .....
 
ReadManagerCall call = new ReadManagerCall("... some collection identifier ...",scope);

//some security manager
GCUBESecurityManager manager = ....
 
ReadManagerCall secureCall = new ReadManagerCall("... some collection identifier ...",scope,manager);

note: the collection identifier may also be set after call construction (cf. setCollectionID(String)).

Like a FactoryCall, the call object may be configured with a reference to a resource endpoint for targeted interactions (cf. setEndpointReference(EndpointReference)), or else rely on implicit discovery and the best-effort strategy. In the latter case, the query that underlies the strategy can be customised and reset (cf. getQuery(), resetQuery()).

The call object may then be used to retrieve gDoc trees from the target collection. To this end, its operations may be classified in two groups: those that return single trees and those that return multiple trees. The first group includes lookup operations, while the second group includes both lookup and query operations based on tree predicates. Multi-valued operations are executed asynchronously at the service, based on the ResultSet mechanism.

The following example illustrates the use of single-valued lookups:

//synchronous: return one gDoc tree
GDoc doc1 = call.get("... tree root identifier ..."); 
//some tree predicate to use for pruning
Predicate projection = ....
 
//synchronous: prune and return one gDoc tree
GDoc doc2 = call.get("... tree root identifier ...",projection);

Here, get(String) and get(String,Predicate) bind the output tree to the object model of the gDoc tree API. There are semantically equivalent operations that return DOM bindings, so as to incur no further parsing costs if a binding other than to the gDoc tree API is required upstream (cf. getAsElement(String) and getAsElement(String,Predicate)).


Multi-valued lookups may be exemplified as follows:

//a locator to a ResultSet of tree root identifiers, produced using standard ResultSet production idioms
RSLocator identifiers = ....
 
//asynchronous: returns a locator to a remote ResultSet of gDoc trees with given identifiers
RSLocator locator1 = call.get(identifiers); 
//asynchronous: returns a locator to a remote ResultSet of gDoc trees with given identifiers, pruned by a tree predicate
RSLocator locator2 = call.get(identifiers,predicate);

The ResultSets returned by the lookups contain XML representations of gDoc trees. Standard ResultSet consumption idioms may then be used to extract the XML representations and bind them to object models of choice. The distribution supports more transparent idioms, however:

//a locator to a ResultSet of gDoc trees.
RSLocator locator = ....
 
GDocRSCollection docs = new GDocRSCollection(locator); 
//use standard iteration idioms
for (GDoc doc : docs)  ...process document...

Here, GDocRSCollection is a collection of gDoc trees which is backed by the ResultSet identified by the locator. The collection is 'lazily' assembled, in that it does not allow direct access to its elements, but can only be iterated over with standard language idioms, as shown. The iteration subsumes XML bindings to the native object model and hides binding failures in the process. Clients that wish to process failures can do so asynchronously with respect to the iteration, by previously registering a FaultListener with the collection (e.g. at construction time):

//a locator to a ResultSet of gDoc trees.
RSLocator locator = ....
 
FaultListener listener = new FaultListener() {
  @Override
  public void onFault(String unparsedResult, Throwable failure) {...process failure...}
};

GDocRSCollection docs = new GDocRSCollection(locator, listener);
for (GDoc doc : docs)
  ...process document...

note: a GDocRSCollection may be iterated over an arbitrary number of times, as expected. Due to its lazy nature, however, the iteration may unconventionally fail at the start. To reduce this chance, the collection eagerly creates a first iterator at construction time. If this succeeds, the first iteration is guaranteed to start correctly. Later iterations may still fail, however, as further iterators are created on demand.
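
To make the idea of a lazily assembled collection with asynchronous fault delivery more concrete, the following is a minimal, self-contained sketch. It is not the code of RSCollection<T>: an in-memory list of strings stands in for the remote ResultSet, and the names LazyParsedCollection and ParseFaultListener are hypothetical. The sketch also assumes that the parser never returns null.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical listener, modelled on the FaultListener used above. */
interface ParseFaultListener {
    void onFault(String unparsedResult, Throwable failure);
}

/** Toy lazy collection over raw records; a list stands in for the remote ResultSet. */
class LazyParsedCollection<T> implements Iterable<T> {
    private final List<String> records;
    private final Function<String, T> parser; // assumed never to return null
    private final ParseFaultListener listener;

    LazyParsedCollection(List<String> records, Function<String, T> parser, ParseFaultListener listener) {
        this.records = records;
        this.parser = parser;
        this.listener = listener;
    }

    @Override
    public Iterator<T> iterator() {
        final Iterator<String> raw = records.iterator();
        return new Iterator<T>() {
            private T next; // next successfully parsed element, or null if none is buffered

            @Override
            public boolean hasNext() {
                while (next == null && raw.hasNext()) {
                    String record = raw.next();
                    try {
                        next = parser.apply(record); // records are parsed only on demand
                    } catch (RuntimeException failure) {
                        listener.onFault(record, failure); // report and skip the bad record
                    }
                }
                return next != null;
            }

            @Override
            public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                T result = next;
                next = null;
                return result;
            }
        };
    }
}
```

The loop body of a client never sees a parsing failure here: bad records are handed to the listener and the iteration simply moves on, which is the trade-off that the synchronous iterator discussed next avoids.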

Clients that are not well served by the asynchronous delivery of failures can instead opt for a GDocRSIterator, which again offers binding transparencies but delivers failures synchronously:

//a locator to a ResultSet of gDoc trees.
RSLocator locator = ....
 
GDocRSIterator it = new GDocRSIterator(locator);
while (it.hasNext()) {
  ...
  try {
    GDoc doc = it.next();
    ....process document...
  }
  catch (Throwable failure) {
    ...handle failure...
  }
  ...
}

note: GDocRSCollection and GDocRSIterator are specialisations of more generic facilities, RSCollection<T> and RSIterator<T> respectively, where T can be specialised to bindings other than GDoc (e.g. JAXB bindings), and in fact to results other than gDoc trees. For this, clients must define a suitable implementation of ResultParser<T>. For more details, see the code documentation.
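
As an illustration of how a custom parser plugs into a synchronous iterator, consider the sketch below. The actual signature of ResultParser<T> is not shown in this page, so the interface RecordParser<T>, the iterator SyncParsedIterator<T> and the example PairParser are hypothetical stand-ins, and a list of strings again replaces the ResultSet.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical parser contract; the real ResultParser<T> may differ. */
interface RecordParser<T> {
    T parse(String record) throws Exception;
}

/** Toy iterator that surfaces parsing failures at the call to next(). */
class SyncParsedIterator<T> implements Iterator<T> {
    private final Iterator<String> raw;
    private final RecordParser<T> parser;

    SyncParsedIterator(List<String> records, RecordParser<T> parser) {
        this.raw = records.iterator();
        this.parser = parser;
    }

    @Override
    public boolean hasNext() { return raw.hasNext(); }

    @Override
    public T next() {
        String record = raw.next();
        try {
            return parser.parse(record);
        } catch (Exception failure) {
            // the record is consumed, so the caller may catch this and keep iterating
            throw new IllegalStateException("cannot parse record: " + record, failure);
        }
    }
}

/** Example parser for records of the form "key=value". */
class PairParser implements RecordParser<String[]> {
    @Override
    public String[] parse(String record) throws Exception {
        String[] parts = record.split("=", 2);
        if (parts.length != 2) throw new Exception("missing '=' in " + record);
        return parts;
    }
}
```

Because a failed record is consumed before the exception is thrown, a client can wrap next() in a try/catch block inside the loop, as in the GDocRSIterator example above, and continue with the remaining records.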


Finally, we give an example of queries for gDoc trees:

//asynchronous: return a locator to a remote ResultSet of many gDoc trees pruned by a tree predicate
RSLocator locator3 = call.get(projection); 
//some tree predicate to use for filtering
Predicate filter = ....
 
//asynchronous: return a locator to a remote ResultSet of many pruned gDoc trees that satisfy a given filter
RSLocator locator4= call.get(projection,filter); 
//asynchronous: return a locator to a remote ResultSet of all the gDoc trees in the collection
RSLocator locator5 = call.get();

Again, the ResultSets returned by the queries can be consumed with standard ResultSet consumption idioms.

WriteManager Calls

A WriteManagerCall gives high-level write access to the content of a given collection, as allowed by a WriteManager resource bound to that collection. It follows the same patterns already seen for FactoryCalls. In particular, it is created in a scope and, optionally, with a security manager:

//some scope
GCUBEScope scope = .....
 
WriteManagerCall call = new WriteManagerCall(scope);

//some security manager
GCUBESecurityManager manager = ....
 
WriteManagerCall secureCall = new WriteManagerCall(scope,manager);

As a further option, it may be created with the identifier of the target collection:

//some scope
GCUBEScope scope = .....
 
WriteManagerCall call = new WriteManagerCall("... some collection identifier ...",scope);

//some security manager
GCUBESecurityManager manager = ....
 
WriteManagerCall secureCall = new WriteManagerCall("... some collection identifier ...",scope,manager);

note: the collection identifier may also be set after call construction (cf. setCollectionID(String)).

Like a FactoryCall, the call may be configured with a reference to a resource endpoint for targeted interactions (cf. setEndpointReference(EndpointReference)), or else rely on implicit discovery and the best-effort strategy. In the latter case, the query that underlies the strategy can be customised and reset (cf. getQuery(), resetQuery()).

The call may then be used to add gDoc trees to the target collection or to update trees already in it. Additions and updates can be applied to individual trees as well as in bulk. In the latter case, the changes are applied by the service asynchronously, based on the ResultSet mechanism.

Addition operations may be illustrated as follows:

//A gDoc tree without identifiers
GDoc doc = ....
 
//synchronous: adds a gDoc tree and receives the identifier assigned to its root
String rootID = call.add(doc); 
//a locator to a ResultSet of gDoc trees without identifiers.
RSLocator locator1 = ....
 
//asynchronous: adds many gDoc trees and receives a locator to a remote ResultSet of AddOutcome objects (see WSDL)
RSLocator locator2 = call.add(locator1);

Here, add(GDoc) requires a binding of the input tree to the object model of the gDoc tree API. There is a semantically equivalent operation that takes DOM bindings, so as to incur no further serialisation costs if a binding other than to the gDoc tree API is already used upstream (cf. getAsElement(String) and add(Element)).

The ResultSet returned by the second method may be consumed with standard ResultSet idioms but, as already shown for ReadManager calls, RSCollection<T> and RSIterator<T> are more convenient options. In particular, the stub distribution offers an AddOutcomeParser to use in conjunction with RSCollection<AddOutcome> or RSIterator<AddOutcome>:

//A locator to ResultSet of AddOutcomes
RSLocator locator = ...
 
//may also set a FaultListener, if required
RSCollection<AddOutcome> outcomes = new RSCollection<AddOutcome>(locator, new AddOutcomeParser()); 
for (AddOutcome outcome : outcomes)
   ...process outcomes...
 
//or, alternatively..
 
RSIterator<AddOutcome> it = new RSIterator<AddOutcome>(locator, new AddOutcomeParser()); 
while (it.hasNext()) {
   ...
   try {
      AddOutcome outcome = it.next();
      ... process outcome...
   }
   catch(Throwable failure) {
     ...handle failure...
  }
}

note: since the gDoc trees in input are still to be assigned identifiers, they can only be associated to successes or failures in output based on their order within the ResultSet.
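
Since inputs and outcomes can only be matched by position, a client that needs the association has to keep the inputs in submission order and pair them up while consuming the outcomes. The sketch below shows one way to do this in plain Java. The helper OutcomeMatcher is hypothetical, the element types are placeholders for gDoc trees and AddOutcomes, and the sketch assumes that exactly one outcome is produced per input.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy pairing of inputs and outcomes by position; both types are placeholders. */
class OutcomeMatcher {

    /** Pairs the i-th input with the i-th outcome, failing if the counts differ. */
    static <I, O> List<Map.Entry<I, O>> pairByOrder(Iterable<I> inputs, Iterable<O> outcomes) {
        List<Map.Entry<I, O>> pairs = new ArrayList<Map.Entry<I, O>>();
        Iterator<I> in = inputs.iterator();
        Iterator<O> out = outcomes.iterator();
        while (in.hasNext() && out.hasNext()) {
            pairs.add(new SimpleImmutableEntry<I, O>(in.next(), out.next()));
        }
        if (in.hasNext() || out.hasNext()) {
            // assumption of this sketch: one outcome per input
            throw new IllegalStateException("inputs and outcomes differ in number");
        }
        return pairs;
    }
}
```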

The next example illustrates update operations, which rely on the notion of delta document:

//A delta tree
GDoc delta = ....
 
//synchronous: updates the document with a delta tree
call.update(delta); 
//a locator to a ResultSet of DOM representations of delta trees.
RSLocator deltas = ...
 
//asynchronous: updates zero or more documents with corresponding delta trees and receives a locator to a ResultSet of UpdateFailure objects (see WSDL)
RSLocator locator3 = call.update(deltas);

As for the add operations, consuming the ResultSet can conveniently rely on appropriate type specialisations of RSCollection<T> and RSIterator<T>. In particular, the stub distribution offers an UpdateFailureParser to use in conjunction with RSCollection<UpdateFailure> or RSIterator<UpdateFailure>:

//A locator to ResultSet of UpdateFailures
RSLocator locator = ...
 
//may also set a FaultListener, if required
RSCollection<UpdateFailure> failures = new RSCollection<UpdateFailure>(locator, new UpdateFailureParser()); 
for (UpdateFailure failure : failures)
   ...process failure...
 
//or, alternatively..
 
RSIterator<UpdateFailure> it = new RSIterator<UpdateFailure>(locator, new UpdateFailureParser()); 
while (it.hasNext()) {
   ...
   try {
      UpdateFailure failure = it.next();
      ... process failure...
   }
   catch(Throwable f) {
     ...handle iteration failure...
  }
}


A difficulty in issuing updates is producing the corresponding delta documents. To this end, the gDoc tree API offers a method delta(GDoc) on its GDoc class for root nodes. The method takes the root of a second gDoc tree and computes the delta document between the two trees, on the assumption that the second tree represents the evolution of the first under update (i.e. its future version).

Accordingly, the API supports the following 'clone-change-compare' model of update at the client-side:

  • the client clones trees;
  • the client updates the clones;
  • the client computes and uses delta documents between the original trees and the evolved clones.

The following example illustrates the model:

//original tree, as obtained from the service, directly or indirectly.			     
GDoc doc = .....
 
//use copy-constructor to clone the tree
GDoc clone = new GDoc(doc);  
... update the clone ...
 
//compute delta document
GDoc delta = doc.delta(clone); 
... use delta in update operations ...

note: delta() leaves the original tree and its evolved clone untouched.

note: the model does not require that the clone is updated using the gDoc tree API. The clone may be updated under any object model, including a JAXB-annotated class with mutator methods. When the delta document is needed, the clone may be unbound from such a model and bound to the object model of the gDoc tree API, so as to pass it to the delta() method and, from there, pass the resulting delta to the update operation of the call object.
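
The 'clone-change-compare' model can be tried out on a deliberately simplified structure. In the sketch below a flat map from node identifiers to values stands in for a gDoc tree, and the delta is a map in which new or changed nodes carry their new value and removed nodes map to null. This encoding is an assumption made for illustration only: the class ToyDoc is hypothetical, and the real delta documents are gDoc trees whose format is not described here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy flat "tree": node identifier -> value. Not the gDoc model. */
class ToyDoc {
    final Map<String, String> nodes;

    ToyDoc(Map<String, String> nodes) { this.nodes = new HashMap<String, String>(nodes); }

    /** Copy constructor, used to clone a document before changing it. */
    ToyDoc(ToyDoc original) { this(original.nodes); }

    /**
     * Computes the changes that turn this document into its evolved version:
     * new or changed nodes carry their new value, removed nodes map to null.
     * Neither document is modified.
     */
    Map<String, String> delta(ToyDoc evolved) {
        Map<String, String> changes = new HashMap<String, String>();
        for (Map.Entry<String, String> e : evolved.nodes.entrySet()) {
            if (!Objects.equals(nodes.get(e.getKey()), e.getValue())) {
                changes.put(e.getKey(), e.getValue());
            }
        }
        for (String id : nodes.keySet()) {
            if (!evolved.nodes.containsKey(id)) changes.put(id, null);
        }
        return changes;
    }
}
```

As in the model above, the original is cloned with the copy constructor, the clone is changed freely, and only the computed difference would travel to the service.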

Content URIs

ReadManagerCalls allow clients to easily look up gDoc trees if they know the collections that contain them. Lookups thus need context, and this context complicates the dissemination of content within the system. In particular, we cannot exchange root identifiers as references to the trees, as these alone are sufficient for neither identification nor lookup. The problem worsens if we wish to exchange references to an arbitrary tree node, as in this case the context grows to include not only the collection that contains the tree but also the whole path that connects the root of the tree to the node. To solve this problem, the stub distribution defines a scheme for URIs that point to arbitrary nodes of gDoc trees, encapsulating all the information required to resolve the trees and access the nodes.

A content URI is a URI of the following form:


[Figure: cms URI form]


where:

  • cms is the scheme of the URI;
  • id(0) is the identifier of a collection;
  • id(1) is the identifier of the root of a gDoc tree in the collection;
  • id(i) is the identifier of a child of the node identified by id(i-1), where i>1.

Overall, a content URI identifies a node of a gDoc tree in some collection, and service clients can disseminate it as a compact reference to it.

Of course, our URIs should be resolvable, i.e. should have the semantics of URLs. The stub distribution provides two means to resolve content URIs. The first is via the static method get(URI,GCUBEScope) of ReadManagerCall:

//A content URI
URI uri = ....
 
//Scope of resolution
GCUBEScope scope = ...
 
//resolve
Node node = ReadManagerCall.get(uri,scope);

Essentially, the method extracts the collection identifier from the URI in input and uses it to create a ReadManagerCall in the input scope. It then extracts the document identifier from the URI and passes it to the method get(String) of the call object to resolve the corresponding gDoc tree. It then navigates the tree along the node identifiers in the URI and returns the node with the last identifier.
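
The last step of this resolution, navigating a tree along a sequence of node identifiers, can be sketched as follows. The class ToyNode is a hypothetical stand-in for the Node class of the gDoc tree API, whose navigation methods are not shown in this page. The retrieval of the tree from the service is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy tree node with identified children; a stand-in for the gDoc Node class. */
class ToyNode {
    final String id;
    final Map<String, ToyNode> children = new LinkedHashMap<String, ToyNode>();

    ToyNode(String id) { this.id = id; }

    /** Adds a child and returns this node, so that trees can be built inline. */
    ToyNode add(ToyNode child) {
        children.put(child.id, child);
        return this;
    }

    /** Follows the identifiers from this node downwards and returns the last node reached. */
    ToyNode resolve(List<String> path) {
        ToyNode current = this;
        for (String childId : path) {
            ToyNode child = current.children.get(childId);
            if (child == null) {
                throw new IllegalArgumentException("no child " + childId + " under " + current.id);
            }
            current = child;
        }
        return current;
    }
}
```

With an empty path the root itself is returned, which corresponds to a content URI that carries only a collection identifier and a root identifier.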

The second method of resolution is stream-based and relies on standard Java APIs for URL resolution:

import static org.gcube.contentmanagement.contentmanager.stubs.model.protocol.URIs.*;
import static org.gcube.contentmanagement.contentmanager.stubs.model.trees.Bindings.*;
...
 
//A content URI
URI uri = ....
 
//Scope of resolution
GCUBEScope scope = ...
 
//resolve
URLConnection connection = connection(uri,scope);
Node node = fromXML(connection.getInputStream());

Here, we work with a standard URLConnection, though we obtain it from a utility of the URIs class. The utility converts the URI to a URL, obtains from it a URLConnection and configures it with the target scope (cf. URIs.connection(URI,scope)). We then obtain a stream from the connection and use the binding facilities in Bindings to parse the stream into a node (other bindings are of course possible).

note: URL-based resolution requires the registration of a protocol handler for the cms scheme. The handler is part of the stub distribution and is registered automatically when the URIs class is first loaded, e.g. when invoking the method connection() above. Manual registrations are also possible (cf. org.gcube.contentmanagement.contentmanager.stubs.model.protocol.cms.Handler.activateProtocol()).
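
For readers unfamiliar with Java protocol handlers, the sketch below shows the general mechanism with standard java.net classes only. It is not the cms handler of the distribution: the scheme name toy, the class ToyHandler and the canned XML body are made up for illustration, and the handler is bound to a single URL through the URL constructor rather than registered for the whole JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

/** Toy handler for a made-up "toy" scheme; it serves a canned XML string instead of calling a service. */
class ToyHandler extends URLStreamHandler {

    @Override
    protected URLConnection openConnection(URL url) {
        return new URLConnection(url) {
            @Override
            public void connect() { connected = true; } // nothing to open in this sketch

            @Override
            public InputStream getInputStream() throws IOException {
                connect();
                // a real handler would contact the service here
                String body = "<node path='" + getURL().getPath() + "'/>";
                return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
            }
        };
    }

    /** Builds a URL bound to this handler, with no JVM-wide registration. */
    static URL url(String spec) throws IOException {
        return new URL(null, spec, new ToyHandler());
    }
}
```

A client would then call ToyHandler.url("toy://coll1/doc1/n1").openConnection().getInputStream() and parse the stream, much like the stream-based resolution shown above.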


As for the generation of content URIs for given nodes of gDoc trees, clients may invoke the method uri() of the Node class of the gDoc tree API:

//a node of a gDoc tree
Node node = ...
 
//Its content URI
URI uri = node.uri();

note: uri() fails if the root of the tree has no collection identifier, or if any of the nodes on the path that connects it to the root have no identifiers.


Besides the method connection() discussed above, the class URIs offers a number of facilities to create and manipulate content URIs, including:

  • make(String, String*): creates a content URI from a collection identifier and a number of object identifiers. The method is invoked by the method uri() just discussed, but clients may also invoke it directly if they obtain the components of the URI through other means;
  • collectionID(URI): returns the identifier of the collection in a content URI;
  • documentID(URI): returns the identifier of the document in a content URI;
  • nodeID(URI): returns the identifier of the node referred to by the content URI;
  • nodeIDs(URI): returns all the node identifiers in a content URI;
  • parentURI(URI): returns the content URI of the parent of the node referred to by the content URI;
  • documentURI(URI): returns the content URI of the document in the content URI;
  • predicate(URI): returns a gDoc tree predicate for the existence of the path comprised of the identifiers in a content URI.
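
To give a feel for what some of these facilities compute, the following sketch re-creates a few of them over java.net.URI. It is not the URIs class: the class ToyContentUris and its method names are hypothetical, and the sketch assumes that the identifiers follow the scheme as slash-separated segments (cms://id0/id1/.../idn) and contain no characters that need escaping.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy helpers over URIs assumed to look like cms://id0/id1/.../idn; not the URIs class itself. */
class ToyContentUris {

    /** Builds a URI from a collection identifier followed by node identifiers. */
    static URI make(String collectionId, String... nodeIds) {
        StringBuilder spec = new StringBuilder("cms://").append(collectionId);
        for (String id : nodeIds) spec.append('/').append(id);
        return URI.create(spec.toString());
    }

    /** Returns all identifiers in the URI, the collection identifier first. */
    static List<String> ids(URI uri) {
        String part = uri.getSchemeSpecificPart().replaceAll("^/+", "");
        return new ArrayList<String>(Arrays.asList(part.split("/")));
    }

    static String collectionId(URI uri) { return ids(uri).get(0); }

    static String documentId(URI uri) { return ids(uri).get(1); }

    /** Returns the identifier of the node the URI refers to, i.e. the last one. */
    static String nodeId(URI uri) {
        List<String> ids = ids(uri);
        return ids.get(ids.size() - 1);
    }

    /** Drops the last identifier; fails if the URI already refers to a root. */
    static URI parentUri(URI uri) {
        List<String> ids = ids(uri);
        if (ids.size() < 3) throw new IllegalArgumentException("no parent for " + uri);
        List<String> nodes = ids.subList(1, ids.size() - 1);
        return make(ids.get(0), nodes.toArray(new String[0]));
    }
}
```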