Difference between revisions of "OCMA: The Open Content Management Architecture"

From Gcube Wiki
 
(33 intermediate revisions by 2 users not shown)
Line 1: Line 1:
''OCMA'', the Open Content Management Architecture, defines a number of design patterns for content management services, primarily the gCube [[GCube_Information_Organisation_Services_(NEW)|Information Services]].
+
[[Category:TO BE REMOVED]]
  
Underlying the OCMA patterns are a set of requirements and assumptions. In particular, OCMA acknowledges that gCube is concerned with content that may:
+
''OCMA'', the Open Content Management Architecture, is an abstract architecture for systems of content management services. The architecture is abstract in that it does not specify concrete services or interactions between services. Rather, it defines a number of design patterns for ''classes'' of services that rely on the same mechanisms for scalability, publication, and discovery. The patterns arise from requirements for content management in gCube and are adopted by the gCube [[GCube_Information_Organisation_Services_(NEW)|Information Services]] in the context of a concrete service-oriented architecture.
  
* be hosted inside or outside a gCube infrastructure;
+
= Assumptions and Requirements =
* be described with a variety of models, for different media, and with different degrees of structure;
+
* be accessed with a variety of protocols;
+
  
Accordingly, OCMA makes only the following assumptions about content:
+
OCMA makes minimal assumptions about the content managed by the system:
  
 
* content is created, accessed, and distributed in units called ''documents'';
 
* content is created, accessed, and distributed in units called ''documents'';
Line 13: Line 11:
 
* collections are managed in ''repositories''.   
 
* collections are managed in ''repositories''.   
  
Finally, OCMA acknowledges that content management in gCube needs to:
+
OCMA acknowledges that content may otherwise be:
  
# ''embrace heterogeneity'', i.e. support simultaneously multiple locations, protocols, and models;  
+
* hosted inside or outside the control of the system;
# ''hide heterogeneity'', i.e. abstract over differences in location, protocol, and model;
+
* represented in a variety of models, for different media, and with different degrees of structure;  
# ''scale'', i.e. retain good throughput under heavy load;
+
* accessed through a variety of interfaces;
  
== Collection Managers ==
+
In particular, OCMA acknowledges that the system may be required to:
  
Let X be an OCMA service.
+
* ''embrace heterogeneity'', i.e. allow for multiple locations, protocols, and models;
 +
* ''hide heterogeneity'', i.e. abstract over differences in location, protocol, and model;
 +
* ''scale'', i.e. retain good throughput under heavy load and high availability in the face of partial failures;
  
* X is a WSRF-service designed to manage WS-Resources called '''Collection Managers'''. A Collection Manager CM grants typed access to a content collection C stored in some repository R.
+
Vice versa, OCMA defines a single prerequisite for the system, namely the provision of scalable and highly available mechanisms for:
  
: '''Note''': We say that combination of a content model and an access protocol defines an '''access type'''. <br/>
+
* publication and discovery of service state.  
: '''Note''':  We say that CM defines a ''T-interface'' over C for some access type T or, equivalently, that CM is a ''T-Manager'' for C. We also say that X ''supports'' T.  
+
* subscription and notification of service state publication.
  
* CM has two standard WSRF Resource Properties (RPs) <code>TypeID</code> and <code>CollectionID</code> that  identify, respectively, the access type and the collection bound to CM.  
+
In what follows, we identify the system with gCube and the nodes of its topology with gCube Hosting Nodes (gHN). gCube satisfies OCMA prerequisites via the services in its [[gCore Based Information System]].
  
:'''Note''': Standard RPs may allow type-based discovery of CM and of any other T-Manager for the same collection. This supports ''generic clients'' of OCMA services.<br>
+
= Collection Managers =
:'''Note''': Standardisation is partly achieved through port-type definitions in an OCMA namespace. It is also achieved with an OCMA ''registry'' of type identifiers.
+
+
:'''TBD''': The namespace for OCMA should be defined bearing in mind that what is about ''content'' today will be extended to ''data'' in a second stage.<br/>
+
:'''TBD''': Should the registry be modeled as a resource or just as out-of-band information? If it is a resource, what kind of resource is it and how is it managed?
+
  
[[Image:OCMACollectionManagers.jpg|550px]]
+
An OCMA service S is a stateful gCube service designed in compliance with the Web Service Resource Framework (WSRF). In particular:
  
== Service Roles ==
+
* S manages WS-Resources called ''Collection Managers''. A Collection Manager M grants typed access to a content collection C stored in some repository R.
  
Different classes of OCMA services may be identified by contextualisation of the previous notions. Each class defines a role that its instances may play in a gCube infrastructure.
+
: '''note''': we think of an ''access type'' as the combination of a content model and content access interface. We then say that M defines a ''T-interface'' over C for some access type T. Equivalently, we say that M is a ''T-Manager'' for C. More generically, we say that S ''supports'' T.
 +
 
 +
* M has two standard WSRF Resource Properties (RPs) in the namespace <code>http://gcube-system.org/namespaces/common</code>:
 +
** <code>TypeID</code>, which identifies the access type of M;
 +
** <code>CollectionID</code>, which identifies the collection bound to M.
 +
 
 +
:'''note''': standard RPs allow type-based discovery of M and of any other T-Manager for the same collection. This supports ''generic clients'', i.e. clients that abstract over the actual service that implements the Collection Managers.<br/>
 +
 +
[[Image:OCMACollectionManagers.jpg|Collection Managers]]
 +
 
 +
= Service Classes =
 +
 
 +
We contextualise the previous notions to identify classes of OCMA services, where services in different classes play different roles in the system.
  
 
* '''Hosting Services and Access services'''.<br/>
 
* '''Hosting Services and Access services'''.<br/>
:R may offer no public interface to C. This may be the case, for example, when R is a local content management system or else a plain file system. In this case X is a content '''hosting service''' and can act as a repository for other OCMA services. Otherwise X is an '''access service'''.
+
:R may offer no public interface to C. This may be the case, for example, when R is a local content management system or else a plain file system. <br>
 +
:In this case S is a content ''hosting service'' and can act as a repository for other OCMA services. <br>
 +
:Otherwise S is an ''access service''.
 +
 
 +
 
 +
[[Image:HostingAndAccess.jpg|Hosting Services and Access Services]]
 +
 
  
 
* '''Wrapper Services'''<br/>
 
* '''Wrapper Services'''<br/>
:R may already offer a native T-interface to C. In this case CM acts as a wrapper of R. We then say that X is a '''wrapper service''' for T-repositories or, equivalently, that X ''wraps'' T or that it is a ''T-wrapper''. This is a common case when R is external to gCube. It is the simplest way for external content to enter gCube.  
+
:R may already offer a native T-interface to C. In this case M acts as a wrapper for R.  
 +
:We then say that S is a ''wrapper service'' for T-repositories or, equivalently, that S ''wraps'' T or that it is a ''T-wrapper''.  
 +
:This is a common case when R is external to the system.  
 +
 
 +
 
 +
[[Image:Wrapper.jpg|Wrapper Service]]
 +
 
 +
 
 +
* '''Adapter Services'''<br/>
 +
:R may offer a different, T'-interface to C. In this case M acts as an adapter for R.
 +
:We then say that S is an ''adapter service'' over T'-repositories or, equivalently, that S ''adapts'' T' or that S is a ''T'-adapter''.
 +
:We then say that T is a ''front type'' for S and T' is a ''back type'' for it.
 +
:Adapters are the main route to interoperability over heterogeneous content.
 +
 
 +
 
 +
[[Image:Adapter.jpg|Adapter Service]]
  
* '''Adapter and Bridge Services'''<br/>
 
:R may offer a different, T'-interface to C. In this case CM acts as an adapter for R.  We then say that X is an '''adapter service''' over T'-repositories or, equivalently, that X ''adapts'' T' or that it is a ''T'-adapter''. T is then a ''front type'' for X and T' is a ''back type'' for it. If R is external to gCube, then we say more specifically of X that it is a '''bridge service''' to T'-repositories. Adapters and bridges are the main route to interoperability over heterogeneous content.
 
  
 
* '''Single-Type, Multi-Type, and Hybrid Services'''<br/>  
 
* '''Single-Type, Multi-Type, and Hybrid Services'''<br/>  
:If X supports multiple access types we say that it is a '''multi-type service''', otherwise we say that it is a '''single-type service'''. If X is multi-type then it may be a T-wrapper and T'-adapter for two types T and T' among those that X supports. We then say that X is a '''hybrid service'''.
+
:If S supports multiple access types we say that it is a ''multi-type service'', otherwise we say that it is a ''single-type service''.
 +
: If S is multi-type then it may be a T-wrapper and T'-adapter for two types T and T' among those that are supported by S. We then say that S is a ''hybrid service''.
 +
 
 +
 
 +
[[Image:MultiAndHybrid.jpg|Multi-type and Adapter Services]]
 +
 
  
 
* '''1-to-1 and 1-to-N Adapter Services'''<br/>
 
* '''1-to-1 and 1-to-N Adapter Services'''<br/>
:If X is a service adapter then X may adapt multiple back types. This may occur because X is multi-type, and different front types adapt different back types. In this case, we say that X is a '''1-to-1''' adapter. Less obviously, it may also occur if X adapts a front type T to multiple back types (regardless of whether X is a single-type or multi-type).  In this case, we say that X is a '''1-to-N''' service adapter.
+
:If S is a service adapter then S may adapt multiple back types.  
 +
:This may occur because S is multi-type, and different front types adapt different back types. In this case, we say that S is a ''1-to-1'' adapter.
 +
:Less obviously, it may also occur when S adapts a front type T to multiple back types (regardless of whether S is a single-type or multi-type).  In this case, we say that S is a ''1-to-N'' service adapter.
 +
 
 +
= Collection Manager Factories =
 +
 
 +
M may be created in response to requests made to specific port-types of S. In particular:
  
== Collection Manager Factories ==
+
* S is designed to manage singleton WS-Resources called ''collection manager factories''.
 +
: A collection manager factory MF creates T-Managers for some access type T (supported by S) and for some collection C.
  
CM may be created in response to requests made to specific port-types of X. In particular:
+
: '''note''': if S is single-type, then S has a single factory. If S is multi-type, then S ''may'' have a separate factory for each type it supports. <br/>
 +
: '''note''': running instances of S are responsible for creating their factories at startup.
  
* X is designed to manage singleton WS-Resources called '''Collection Manager Factories'''. A Collection Manager Factory CMF creates T-Managers for some access type T supported by X and for some collection C.  
+
* MF exposes an operation that can be invoked to create collection managers. The name of the operation is defined by S, but its input type extends the following type <code>CreateParams</code>, again in the namespace http://gcube-system.org/namespaces/common:
  
: '''Note''': If X is single-type, then X has a single factory. If X is multi-type, then X may have a separate factory for each type it supports.<br/>
+
<source lang="xml">
: '''Note''': The running instance of X is responsible for creating CMF at startup.
+
<xsd:complexType name="CreateParameters">
 +
<xsd:sequence>
 +
<xsd:element name="broadcast" nillable="false" type="xsd:boolean" minOccurs="1" maxOccurs="1" default="false" />
 +
</xsd:sequence>
 +
</xsd:complexType>
 +
</source>
  
* CMF has a standard operation <code>create</code> that can be invoked to create Collection Managers. The declared input type of create is a standard <code>CreateParams</code> type, but X extends that type for each of the supported types.  
+
The element <code>broadcast</code> is discussed below.
  
: '''Note''': The extensions of <code>CreateParams</code> are part of the documentation of X and may be part of its stub distribution. They are ''not'' standard in OCMA.
+
: '''note''': in the following, we assume that S calls the operation above <code>create</code>.
: '''Note''': OCMA does not require a one-to-one correspondence between <code>create</code> requests and Collection Managers. Depending on X, CMF ''may'' create more than one Collection Manager in response to a single <code>create</code> request.
+
: '''note''': the extensions of <code>CreateParams</code> are part of the documentation of S and may be part of its client-side distributions. They are not standard in OCMA.
 +
: '''note''': OCMA does not require a one-to-one correspondence between <code>create</code> requests and collection managers. Depending on S, MF ''may'' create more than one collection manager in response to a single <code>create</code> request.
  
* CMF publishes a gCube resource for the collection bound to CM, if one does not exist already.
+
= Activation Records =
  
: '''Note''': To ensure synchronisation across running instances of X, CMF deterministically synthesises a collection identifier from <code>create</code> requests and uses it to search and create collection profiles.
+
A collection manager M may be created in response to the existence of specific resources in the system. In particular:
  
== Activation Records ==
+
* MF ''may'' subscribe with the system for the creation of ''activation records'', i.e. resources that record invocations of the <code>create</code> operation at some other instance of S. Upon notification, MF may then self-invoke its <code>create</code> operation with the payload of the activation record that is included in the notification.
 +
:Effectively, S may be designed to self-stage in order to replicate the state of its running instances.
  
CM may be created in response to the existence of specific gCube resources. In particular:
+
: '''note''' : activation records are expected to persist beyond all running instances of S. This guarantees replication in the face of service failure.
 +
: '''note''': activation records are not easily reconciled with content hosting services, as the very assumption of local state discourages replication.
 +
: '''note''': in gCube, activation records are defined as a type of [[Reference_Model#Resource_Domain|application-specific resource]].
 +
: '''note''': MF must subscribe for activation records whenever its running instance starts.
 +
: '''note''': MF must discard the activation records that it has previously published, either by refining its subscription or by not processing notifications of its own past behaviour. This requires that MF records and persists the history of its own activation records.
  
* CMF ''may'' subscribe with the Information System for the creation of '''Activation Records''', a type of Generic Resources characterised by a payload that extends <code>CreateParams</code>. Upon notification, CMF self-invokes its <code>create</code> operation with the payload of the Activation Record that is included in the notification.
+
:'''requirement''': OCMA requires the system to support ''fine-grained'' and ''updatable'' subscriptions for activation records of interest (e.g. among other records).
 +
:'''requirement''': OCMA requires the system to monitor load on the running instances of S and to answer queries for collection managers by returning them in inverse order of load of their running instances.  
 +
:'''note''': both requirements are satisfied by gCube via its [[GCube_Infrastructure_Enabling_Services|Enabling Services]]. The Enabling Services can also autonomically spawn new instances of S upon detection of excessive load on existing ones.
  
: '''Note''': Effectively, X may be designed to self-stage in order to replicate the state of its running instances.
+
The publication of activation records may take place in a variety of scenarios, from ad-hoc batch clients to interactive portlets. <br>
: '''Note''': Activation Records persist in the Information System beyond all running instances of X. This guarantees replication in the face of failure.<br/>
+
They may also be published by the MF in response to explicit <code>create</code> requests. In particular:
: '''Note''': Activation Records do not reconcile easily with content hosting services, as the very assumption of local state discourages replication.
+
  
:'''Requirement''': OCMA needs ''fine-grained'' and ''updatable'' subscriptions to reach activation records of interest (among other records and other generic resources). The current gCube Information System does not support updatable subscriptions. Fine-grained subscriptions are planned as an extension of the IS-Notifier service version 1.03.01.<br/>
+
*  <code>CreateParams</code> defines a standard boolean flag <code>broadcast</code> which instructs MF to publish activation records corresponding to <code>create</code> requests that complete successfully.  
:'''Requirement''': OCMA expects Enabling Services to spawn new running instances of X when they detect load on existing ones. It also expects them to answer WS-Resource queries for Collection Managers by returning them in inverse order of load of their running instances. It is therefore required that the Enabling Services link WS-Resources to the load on their running instances.
+
  
Activation Records may be published in the Information System in a variety of scenarios, from ad-hoc batch clients to interactive portlets. Noticeably, they may also be published by the CMF in response to explicit <code>create</code> requests. In particular:
+
: '''note''': MF honours these requests if S supports activation records, and rejects them otherwise. <br/>
 +
: '''note''': this effectively broadcasts (asynchronously) the request to any other running instance of S which may be deployed in the same scope as S.<br/>
 +
: '''note''': the default value of <code>broadcast</code> is <code>false</code> and implies that the collection managers created by MF are 'private' resources of its running instance.
  
*  <code>CreateParams</code> defines a standard boolean flag <code>broadcast</code> that instructs CMF to publish an Activation Record corresponding to <code>create</code> requests that complete successfully.
 
  
: '''Note''': CMF honours these requests if X  supports Activation Records and rejects them otherwise. <br/>
+
[[Image:ActivationRecords.jpg|Activation Records]]
: '''Note''': This effectively broadcasts (asynchronously) the request to any other running instance of X that may be deployed in the same scope as X.<br/>
+
: '''Note''': The default value of <code>broadcast</code> is <code>false</code> and implies that the Collection Managers created by CMF are 'private' resources of its running instance.
+

Latest revision as of 14:08, 19 October 2016


OCMA, the Open Content Management Architecture, is an abstract architecture for systems of content management services. The architecture is abstract in that it does not specify concrete services or interactions between services. Rather, it defines a number of design patterns for classes of services that rely on the same mechanisms for scalability, publication, and discovery. The patterns arise from requirements for content management in gCube and are adopted by the gCube Information Services in the context of a concrete service-oriented architecture.

Assumptions and Requirements

OCMA makes minimal assumptions about the content managed by the system:

  • content is created, accessed, and distributed in units called documents;
  • documents are grouped in collections;
  • collections are managed in repositories.

OCMA acknowledges that content may otherwise be:

  • hosted inside or outside the control of the system;
  • represented in a variety of models, for different media, and with different degrees of structure;
  • accessed through a variety of interfaces.

In particular, OCMA acknowledges that the system may be required to:

  • embrace heterogeneity, i.e. allow for multiple locations, protocols, and models;
  • hide heterogeneity, i.e. abstract over differences in location, protocol, and model;
  • scale, i.e. retain good throughput under heavy load and high availability in the face of partial failures.

Conversely, OCMA defines a single prerequisite for the system, namely the provision of scalable and highly available mechanisms for:

  • publication and discovery of service state;
  • subscription and notification of service state publication.

In what follows, we identify the system with gCube and the nodes of its topology with gCube Hosting Nodes (gHN). gCube satisfies OCMA prerequisites via the services in its gCore Based Information System.

Collection Managers

An OCMA service S is a stateful gCube service designed in compliance with the Web Service Resource Framework (WSRF). In particular:

  • S manages WS-Resources called Collection Managers. A Collection Manager M grants typed access to a content collection C stored in some repository R.
note: we think of an access type as the combination of a content model and a content access interface. We then say that M defines a T-interface over C for some access type T. Equivalently, we say that M is a T-Manager for C. More generally, we say that S supports T.
  • M has two standard WSRF Resource Properties (RPs) in the namespace http://gcube-system.org/namespaces/common:
    • TypeID, which identifies the access type of M;
    • CollectionID, which identifies the collection bound to M.
note: standard RPs allow type-based discovery of M and of any other T-Manager for the same collection. This supports generic clients, i.e. clients that abstract over the actual service that implements the Collection Managers.
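As an illustration of type-based discovery, the following is a minimal sketch in Python rather than an actual WSRF query. It models the two standard RPs as plain fields and shows how a generic client could select every T-Manager for a collection without knowing which service implements it. The class name, the service names, and the identifier values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionManager:
    """Stand-in for a Collection Manager M and its two standard RPs."""
    service: str        # hypothetical name of the service S that manages M
    type_id: str        # models the TypeID RP
    collection_id: str  # models the CollectionID RP


def discover(managers, type_id, collection_id):
    """Return every T-Manager for the collection, whatever service implements it."""
    return [m for m in managers
            if m.type_id == type_id and m.collection_id == collection_id]


# Hypothetical published state: two services expose the same type over one collection.
published = [
    CollectionManager("service-a", "type-1", "collection-x"),
    CollectionManager("service-b", "type-1", "collection-x"),
    CollectionManager("service-a", "type-2", "collection-x"),
    CollectionManager("service-b", "type-1", "collection-y"),
]

matches = discover(published, "type-1", "collection-x")
print([m.service for m in matches])
```

The client depends only on the pair of identifiers, which is what allows it to abstract over the implementing service.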

Figure: Collection Managers

Service Classes

We contextualise the previous notions to identify classes of OCMA services, where services in different classes play different roles in the system.

  • Hosting Services and Access services.
R may offer no public interface to C. This may be the case, for example, when R is a local content management system or else a plain file system.
In this case S is a content hosting service and can act as a repository for other OCMA services.
Otherwise S is an access service.


Hosting Services and Access Services


  • Wrapper Services
R may already offer a native T-interface to C. In this case M acts as a wrapper for R.
We then say that S is a wrapper service for T-repositories or, equivalently, that S wraps T or that it is a T-wrapper.
This is a common case when R is external to the system.


Wrapper Service


  • Adapter Services
R may offer a different, T'-interface to C. In this case M acts as an adapter for R.
We then say that S is an adapter service over T'-repositories or, equivalently, that S adapts T' or that S is a T'-adapter.
We then say that T is a front type for S and T' is a back type for it.
Adapters are the main route to interoperability over heterogeneous content.


Adapter Service


  • Single-Type, Multi-Type, and Hybrid Services
If S supports multiple access types we say that it is a multi-type service, otherwise we say that it is a single-type service.
If S is multi-type then it may be a T-wrapper and T'-adapter for two types T and T' among those that are supported by S. We then say that S is a hybrid service.


Multi-type and Adapter Services


  • 1-to-1 and 1-to-N Adapter Services
If S is a service adapter then S may adapt multiple back types.
This may occur because S is multi-type, and different front types adapt different back types. In this case, we say that S is a 1-to-1 adapter.
Less obviously, it may also occur when S adapts a front type T to multiple back types (regardless of whether S is single-type or multi-type). In this case, we say that S is a 1-to-N service adapter.
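The service classes above can be summarised with a small classification sketch. It is an illustration under stated assumptions, not part of OCMA. A service is described by a hypothetical mapping from each front type to the set of access types natively offered by the repositories behind it, with an empty set standing for a repository that has no public interface. We also assume that a hybrid service wraps one front type and adapts for a different one, and that an adapter is labelled 1-to-1 whenever no front type has more than one back type.

```python
def classify(fronts):
    """Label a service from a mapping {front type: set of native repository types}."""
    labels = {"multi-type" if len(fronts) > 1 else "single-type"}

    # Front types whose repositories already offer the same type natively.
    wraps = {t for t, backs in fronts.items() if t in backs}
    # Front types that sit over at least one different (back) type.
    adapts = {t: backs - {t} for t, backs in fronts.items() if backs - {t}}

    if any(not backs for backs in fronts.values()):
        labels.add("hosting")      # some repository offers no public interface
    if any(backs for backs in fronts.values()):
        labels.add("access")
    if wraps:
        labels.add("wrapper")
    if adapts:
        labels.add("adapter")
        one_to_n = any(len(backs) > 1 for backs in adapts.values())
        labels.add("1-to-N" if one_to_n else "1-to-1")
    if any(w != a for w in wraps for a in adapts):
        labels.add("hybrid")       # assumption: wraps one type, adapts for another
    return labels


print(sorted(classify({"T1": {"T1"}, "T2": {"A"}})))
```

For instance, `classify({"T": {"A", "B"}})` describes a single-type service whose one front type is adapted to two back types, so it is labelled as a 1-to-N adapter.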

Collection Manager Factories

M may be created in response to requests made to specific port-types of S. In particular:

  • S is designed to manage singleton WS-Resources called collection manager factories.
A collection manager factory MF creates T-Managers for some access type T (supported by S) and for some collection C.
note: if S is single-type, then S has a single factory. If S is multi-type, then S may have a separate factory for each type it supports.
note: running instances of S are responsible for creating their factories at startup.
  • MF exposes an operation that can be invoked to create collection managers. The name of the operation is defined by S, but its input type extends the following type CreateParams, again in the namespace http://gcube-system.org/namespaces/common:
	<xsd:complexType name="CreateParameters">
		<xsd:sequence>
			<xsd:element name="broadcast" nillable="false" type="xsd:boolean" minOccurs="1" maxOccurs="1" default="false" />
		</xsd:sequence>
	</xsd:complexType>

The element broadcast is discussed below.

note: in the following, we assume that S calls the operation above create.
note: the extensions of CreateParams are part of the documentation of S and may be part of its client-side distributions. They are not standard in OCMA.
note: OCMA does not require a one-to-one correspondence between create requests and collection managers. Depending on S, MF may create more than one collection manager in response to a single create request.
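The factory contract can be sketched as follows, in Python and without any WSRF machinery. Only the broadcast flag of CreateParams comes from the text above. The extension FileCreateParams, its paths field, and the way collection identifiers are derived from a request are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CreateParams:
    """The standard input type: only the broadcast flag is defined by OCMA."""
    broadcast: bool = False


@dataclass
class FileCreateParams(CreateParams):
    """Hypothetical service-specific extension, documented by S and not by OCMA."""
    paths: tuple = ()


class CollectionManagerFactory:
    """Stand-in for a factory MF that creates T-Managers for one access type."""

    def __init__(self, type_id):
        self.type_id = type_id
        self.managers = {}  # (TypeID, CollectionID) -> manager record

    def create(self, params):
        """Create one manager per path: a single request may yield several managers."""
        created = []
        for path in params.paths:
            collection_id = f"collection:{path}"  # assumed derivation of the identifier
            key = (self.type_id, collection_id)
            manager = self.managers.setdefault(
                key, {"TypeID": self.type_id, "CollectionID": collection_id})
            created.append(manager)
        return created


factory = CollectionManagerFactory("type-1")
first = factory.create(FileCreateParams(paths=("a", "b")))
second = factory.create(FileCreateParams(paths=("b",)))
print(len(first), len(factory.managers))
```

Keying managers on the pair of identifiers means that a repeated request returns the existing manager instead of creating a duplicate.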

Activation Records

A collection manager M may be created in response to the existence of specific resources in the system. In particular:

  • MF may subscribe with the system for the creation of activation records, i.e. resources that record invocations of the create operation at some other instance of S. Upon notification, MF may then self-invoke its create operation with the payload of the activation record that is included in the notification.
Effectively, S may be designed to self-stage in order to replicate the state of its running instances.
note: activation records are expected to persist beyond all running instances of S. This guarantees replication in the face of service failure.
note: activation records are not easily reconciled with content hosting services, as the very assumption of local state discourages replication.
note: in gCube, activation records are defined as a type of application-specific resource.
note: MF must subscribe for activation records whenever its running instance starts.
note: MF must discard the activation records that it has previously published, either by refining its subscription or by not processing notifications of its own past behaviour. This requires that MF records and persists the history of its own activation records.
requirement: OCMA requires the system to support fine-grained and updatable subscriptions for activation records of interest (e.g. among other records).
requirement: OCMA requires the system to monitor load on the running instances of S and to answer queries for collection managers by returning them in inverse order of load of their running instances.
note: both requirements are satisfied by gCube via its Enabling Services. The Enabling Services can also autonomically spawn new instances of S upon detection of excessive load on existing ones.
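The replication mechanism described above can be sketched with an in-memory stand-in for the system's publication and notification mechanisms. Everything in this sketch is an assumption made for illustration: the class names, the shape of a record, and the synchronous delivery of notifications. A real MF would also persist the history of its own records, which the sketch keeps only in memory.

```python
import uuid


class InformationSystem:
    """In-memory stand-in for publication, subscription and notification."""

    def __init__(self):
        self.records = []      # records outlive any single factory in this sketch
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, record):
        self.records.append(record)
        for notify in list(self.subscribers):
            notify(record)


class Factory:
    """Stand-in for MF on one running instance of S."""

    def __init__(self, name, system):
        self.name = name
        self.system = system
        self.own_record_ids = set()  # history of the records this MF published
        self.managers = set()
        system.subscribe(self.on_activation_record)  # subscribe at startup

    def create(self, collection_id, publish=False):
        self.managers.add(collection_id)
        if publish:
            record = {"id": str(uuid.uuid4()), "payload": collection_id}
            # Remember the record before publishing: delivery is synchronous here.
            self.own_record_ids.add(record["id"])
            self.system.publish(record)

    def on_activation_record(self, record):
        if record["id"] in self.own_record_ids:
            return  # discard notifications of our own past behaviour
        self.create(record["payload"])  # self-invoke create, without republishing


system = InformationSystem()
a = Factory("instance-a", system)
b = Factory("instance-b", system)
a.create("collection-x", publish=True)
print(b.managers, len(system.records))
```

Because the replayed create does not publish a further record, the two instances converge on the same set of managers without notifying each other indefinitely.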

The publication of activation records may take place in a variety of scenarios, from ad-hoc batch clients to interactive portlets.
They may also be published by the MF in response to explicit create requests. In particular:

  • CreateParams defines a standard boolean flag broadcast which instructs MF to publish activation records corresponding to create requests that complete successfully.
note: MF honours these requests if S supports activation records, and rejects them otherwise.
note: this effectively broadcasts (asynchronously) the request to any other running instance of S that may be deployed in the same scope as S.
note: the default value of broadcast is false and implies that the collection managers created by MF are 'private' resources of its running instance.
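The handling of the broadcast flag can be sketched as follows. The exception type, the supports_activation_records switch, and the list that stands in for the system are hypothetical. The sketch only illustrates the three notes above: rejection when activation records are unsupported, publication after a create request completes successfully, and private managers by default.

```python
from dataclasses import dataclass


@dataclass
class CreateParams:
    broadcast: bool = False  # default: managers stay private to the instance


class BroadcastNotSupported(Exception):
    """Hypothetical error raised when S does not support activation records."""


class Factory:
    def __init__(self, supports_activation_records, published):
        self.supports_activation_records = supports_activation_records
        self.published = published  # shared list standing in for the system
        self.managers = []

    def create(self, collection_id, params=None):
        params = params or CreateParams()
        if params.broadcast and not self.supports_activation_records:
            raise BroadcastNotSupported("this service does not publish activation records")
        self.managers.append(collection_id)  # the create request itself
        if params.broadcast:
            # Publish only once the request has completed successfully.
            self.published.append({"payload": collection_id})


records = []
replicating = Factory(True, records)
replicating.create("collection-x")                          # private manager
replicating.create("collection-y", CreateParams(broadcast=True))
print(records)
```

A factory built with `Factory(False, records)` would raise BroadcastNotSupported for the second call and leave both its managers and the published records unchanged.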


Figure: Activation Records