OCMA: The Open Content Management Architecture

OCMA, the Open Content Management Architecture, is an abstract architecture for systems of content management services. The architecture is abstract in that it does not specify concrete services or interactions between services. Rather, it defines a number of design patterns for classes of services that rely on the same mechanisms for scalability, publication, and discovery. The patterns arise from requirements for content management in gCube and are adopted by the gCube Information Services in the context of a concrete service-oriented architecture.

Assumptions and Requirements

OCMA makes minimal assumptions about the content managed by the system:

  • content is created, accessed, and distributed in units called documents;
  • documents are grouped in collections;
  • collections are managed in repositories.

OCMA acknowledges that content may otherwise be:

  • hosted inside or outside the control of the system;
  • represented in a variety of models, for different media, and with different degrees of structure;
  • accessed through a variety of interfaces.

In particular, OCMA acknowledges that the system may be required to:

  • embrace heterogeneity, i.e. allow for multiple locations, protocols, and models;
  • hide heterogeneity, i.e. abstract over differences in location, protocol, and model;
  • scale, i.e. retain good throughput under heavy load and high availability in the face of partial failures.

Conversely, OCMA defines a single prerequisite for the system, namely the provision of scalable and highly available mechanisms for:

  • publication and discovery of service state;
  • subscription and notification of service state publication.

In what follows, we identify the system with gCube and the nodes of its topology with gCube Hosting Nodes (gHN). gCube satisfies OCMA prerequisites via the services in its gCore Based Information System.

Collection Managers

An OCMA service S is a stateful gCube service designed in compliance with the Web Service Resource Framework (WSRF). In particular:

  • S manages WS-Resources called Collection Managers. A Collection Manager M grants typed access to a content collection C stored in some repository R.
note: we think of an access type as the combination of a content model and a content access interface. We then say that M defines a T-interface over C for some access type T. Equivalently, we say that M is a T-Manager for C. More generically, we say that S supports T.
  • M has two standard WSRF Resource Properties (RPs) in the namespace http://gcube-system.org/namespaces/common:
    • TypeID, which identifies the access type of M;
    • CollectionID, which identifies the collection bound to M.
note: standard RPs allow type-based discovery of M and of any other T-Manager for the same collection. This supports generic clients, i.e. clients that abstract over the actual service that implements the Collection Managers.
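
The type-based discovery enabled by the two standard RPs can be sketched as follows. This is a minimal in-memory illustration in Python, not gCube code: the CollectionManager record, its service field, the discover function, and the sample type and collection identifiers are all hypothetical. Only the RP names TypeID and CollectionID and their namespace come from the text above.

```python
from dataclasses import dataclass

# Namespace of the standard RPs, as given in the text.
COMMON_NS = "http://gcube-system.org/namespaces/common"


@dataclass(frozen=True)
class CollectionManager:
    """Hypothetical stand-in for a published Collection Manager."""
    service: str        # service S that owns the manager (illustrative only)
    type_id: str        # value of the TypeID RP
    collection_id: str  # value of the CollectionID RP

    def resource_properties(self) -> dict:
        """Return the two standard RPs keyed by qualified name."""
        return {
            f"{{{COMMON_NS}}}TypeID": self.type_id,
            f"{{{COMMON_NS}}}CollectionID": self.collection_id,
        }


def discover(published, type_id, collection_id=None):
    """Return the managers of one access type, optionally for one collection.

    The query never mentions the owning service, which is what lets a
    generic client abstract over the service that implements the managers.
    """
    return [
        m for m in published
        if m.type_id == type_id
        and (collection_id is None or m.collection_id == collection_id)
    ]


# Hypothetical published state: two services offer the same access type
# over the same collection.
published = [
    CollectionManager("ServiceA", "tree", "c1"),
    CollectionManager("ServiceB", "tree", "c1"),
    CollectionManager("ServiceB", "blob", "c2"),
]

tree_managers_for_c1 = discover(published, "tree", "c1")
```

A client that only knows the access type and the collection obtains both managers here, regardless of which service created them.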

Figure: Collection Managers

Service Classes

We contextualise the previous notions to identify classes of OCMA services, where services in different classes play different roles in the system.

  • Hosting Services and Access Services
R may offer no public interface to C. This may be the case, for example, when R is a local content management system or a plain file system.
In this case, S is a content hosting service and can act as a repository for other OCMA services.
Otherwise S is an access service.

Figure: Hosting Services and Access Services

  • Wrapper Services
R may already offer a native T-interface to C. In this case M acts as a wrapper for R.
We then say that S is a wrapper service for T-repositories or, equivalently, that S wraps T or that it is a T-wrapper.
This is a common case when R is external to the system.

Figure: Wrapper Service

  • Adapter Services
R may offer a different interface to C, say a T'-interface. In this case M acts as an adapter for R.
We then say that S is an adapter service over T'-repositories or, equivalently, that S adapts T' or that S is a T'-adapter.
We also say that T is a front type for S and T' is a back type for it.
Adapters are the main route to interoperability over heterogeneous content.

Figure: Adapter Service

  • Single-Type, Multi-Type, and Hybrid Services
If S supports multiple access types we say that it is a multi-type service, otherwise we say that it is a single-type service.
If S is multi-type then it may be a T-wrapper and T'-adapter for two types T and T' among those that are supported by S. We then say that S is a hybrid service.

Figure: Multi-type and Adapter Services

  • 1-to-1 and 1-to-N Adapter Services
If S is an adapter service then S may adapt multiple back types.
This may occur because S is multi-type, and different front types adapt different back types. In this case, we say that S is a 1-to-1 adapter.
Less obviously, it may also occur when S adapts a single front type T to multiple back types (regardless of whether S is single-type or multi-type). In this case, we say that S is a 1-to-N adapter.
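
The difference between wrapping and adapting, and the 1-to-N case in particular, can be illustrated with a small Python sketch. Everything in it is hypothetical and chosen only for illustration: the three repository classes, their method names (get, fetch_row, read), the manager classes, and the TRANSLATIONS table are not part of OCMA or gCube. The front type is assumed to be a single get(doc_id) operation that returns text.

```python
class NativeRepository:
    """Hypothetical repository that already offers the front interface."""

    def __init__(self, docs):
        self._docs = dict(docs)

    def get(self, doc_id):
        return self._docs[doc_id]


class RowRepository:
    """Hypothetical repository with a first back interface: fetch_row(key)."""

    def __init__(self, rows):
        self._rows = dict(rows)

    def fetch_row(self, key):
        return (key, self._rows[key])


class FileRepository:
    """Hypothetical repository with a second back interface: read(path)."""

    def __init__(self, files):
        self._files = dict(files)

    def read(self, path):
        return self._files[path]  # raw bytes


class WrapperManager:
    """Manager over a repository that natively speaks the front type."""

    def __init__(self, repo):
        self._repo = repo

    def get(self, doc_id):
        return self._repo.get(doc_id)  # plain delegation, no translation


class AdapterManager:
    """Manager that translates front-type calls into back-type calls."""

    def __init__(self, repo, translate):
        self._repo = repo
        self._translate = translate

    def get(self, doc_id):
        return self._translate(self._repo, doc_id)


# One front type mapped onto two back types: the 1-to-N case.
TRANSLATIONS = {
    RowRepository: lambda repo, doc_id: repo.fetch_row(doc_id)[1],
    FileRepository: lambda repo, doc_id: repo.read(doc_id).decode("utf-8"),
}


def adapt(repo):
    """Create an adapter manager for a repository of a known back type."""
    return AdapterManager(repo, TRANSLATIONS[type(repo)])


managers = [
    WrapperManager(NativeRepository({"d1": "hello"})),
    adapt(RowRepository({"d1": "hello"})),
    adapt(FileRepository({"d1": b"hello"})),
]
results = [m.get("d1") for m in managers]
```

All three managers expose the same get operation, so a client written against the front type cannot tell whether it is talking to a wrapper or to an adapter.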

Collection Manager Factories

M may be created in response to requests made to specific port-types of S. In particular:

  • S is designed to manage singleton WS-Resources called collection manager factories.
A collection manager factory MF creates T-Managers for some access type T (supported by S) and for some collection C.
note: if S is single-type, then S has a single factory. If S is multi-type, then S may have a separate factory for each type it supports.
note: running instances of S are responsible for creating their factories at startup.
  • MF exposes an operation that can be invoked to create collection managers. The name of the operation is defined by S, but its input type extends the following type CreateParams, again in the namespace http://gcube-system.org/namespaces/common:
	<xsd:complexType name="CreateParameters">
		<xsd:sequence>
			<xsd:element name="broadcast" nillable="false" type="xsd:boolean" minOccurs="1" maxOccurs="1" default="false" />
		</xsd:sequence>
	</xsd:complexType>

The element broadcast is discussed below.

note: in the following, we assume that S calls the operation above create.
note: the extensions of CreateParams are part of the documentation of S and may be part of its client-side distributions. They are not standard in OCMA.
note: OCMA does not require a one-to-one correspondence between create requests and collection managers. Depending on S, MF may create more than one Collection Manager in response to a single create request.
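
The factory pattern described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the gCube implementation: the TreeCreateParams extension, its collection_ids field, the Service class, and the dictionary representation of a manager are all hypothetical. Only the broadcast flag and its default value come from the standard type.

```python
from dataclasses import dataclass, field


@dataclass
class CreateParams:
    """Mirrors the standard type: OCMA defines only the broadcast flag."""
    broadcast: bool = False


@dataclass
class TreeCreateParams(CreateParams):
    """Hypothetical service-specific extension of the standard type."""
    collection_ids: list = field(default_factory=list)


class CollectionManagerFactory:
    """Factory that creates managers of a single access type."""

    def __init__(self, type_id):
        self.type_id = type_id
        self.managers = []

    def create(self, params):
        # A single request may yield several managers, one per collection.
        created = [
            {"TypeID": self.type_id, "CollectionID": cid}
            for cid in params.collection_ids
        ]
        self.managers.extend(created)
        return created


class Service:
    """Running instance that creates its factories at startup."""

    def __init__(self, supported_types):
        # One factory per supported type (a single one if single-type).
        self.factories = {t: CollectionManagerFactory(t) for t in supported_types}


service = Service(["tree"])
created = service.factories["tree"].create(
    TreeCreateParams(collection_ids=["c1", "c2"])
)
```

Here one create request produces two managers, which is the one-to-many case that OCMA explicitly allows.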

Activation Records

A collection manager M may be created in response to the existence of specific resources in the system. In particular:

  • MF may subscribe with the system for the creation of activation records, i.e. resources that record invocations of the create operation at some other instance of S. Upon notification, MF may then self-invoke its create operation with the payload of the activation record that is included in the notification.
Effectively, S may be designed to self-stage in order to replicate the state of its running instances.
note: activation records are expected to persist beyond all running instances of S. This guarantees replication in the face of service failure.
note: activation records are not easily reconciled with content hosting services, as the very assumption of local state discourages replication.
note: in gCube, activation records are defined as a type of application-specific resource.
note: MF must subscribe for activation records whenever its running instance starts.
note: MF must discard the activation records that it has previously published, either by refining its subscription or by not processing notifications of its own past behaviour. This requires that MF records and persists the history of its own activation records.
requirement: OCMA requires the system to support fine-grained and updatable subscriptions, so that MF can select the activation records of interest among other records.
requirement: OCMA requires the system to monitor load on the running instances of S and to answer queries for collection managers by returning them in increasing order of the load on their running instances, i.e. least loaded first.
note: both requirements are satisfied by gCube via its Enabling Services. The Enabling Services can also autonomically spawn new instances of S upon detection of excessive load on existing ones.
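
The self-staging behaviour described in this section can be sketched in Python as follows. The sketch makes several assumptions that are not part of OCMA: the InformationSystem class is a hypothetical in-memory stand-in for the publication and notification mechanisms of the system, the ActivationRecord fields and the create_and_publish method are illustrative names, and the history of published records is kept in memory whereas a real factory would persist it.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivationRecord:
    """Hypothetical record of one create invocation."""
    record_id: str
    payload: dict  # parameters of the recorded create request


class InformationSystem:
    """Hypothetical in-memory stand-in for publish and notify."""

    def __init__(self):
        self.records = []      # records outlive any service instance
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, record):
        self.records.append(record)
        for callback in list(self.subscribers):
            callback(record)


class Factory:
    """Factory that replicates state through activation records."""

    def __init__(self, system):
        self.system = system
        self.managers = []
        self.own_record_ids = set()        # persisted in a real service
        system.subscribe(self.on_record)   # subscribe at instance startup

    def create(self, payload):
        self.managers.append(dict(payload))

    def create_and_publish(self, payload):
        self.create(payload)
        record = ActivationRecord(str(uuid.uuid4()), dict(payload))
        # Remember the record before publishing so that the resulting
        # notification can be recognised as our own and discarded.
        self.own_record_ids.add(record.record_id)
        self.system.publish(record)

    def on_record(self, record):
        if record.record_id in self.own_record_ids:
            return  # discard notifications of our own past behaviour
        self.create(record.payload)  # self-invoke create with the payload


system = InformationSystem()
first, second = Factory(system), Factory(system)
first.create_and_publish({"CollectionID": "c1"})
```

After the call, both factories hold a manager for c1 and the originating factory holds exactly one, because it ignored the notification of its own record.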

The publication of activation records may take place in a variety of scenarios, from ad-hoc batch clients to interactive portlets.
Activation records may also be published by MF in response to explicit create requests. In particular:

  • CreateParams defines a standard boolean flag broadcast which instructs MF to publish activation records corresponding to create requests that complete successfully.
note: MF honours these requests if S supports activation records, and rejects them otherwise.
note: this effectively broadcasts (asynchronously) the request to any other running instance of S which may be deployed in the same scope as S.
note: the default value of broadcast is false and implies that the collection managers created by MF are 'private' resources of its running instance.
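
The handling of the broadcast flag can be summarised with a short Python sketch. It is an illustration only: the Factory class, its supports_activation_records switch, the BroadcastNotSupported exception, and the list used as a publication target are hypothetical, and the CreateParams class models just the standard flag plus one illustrative collection_id field.

```python
from dataclasses import dataclass


class BroadcastNotSupported(Exception):
    """Raised when broadcast is requested but S has no activation records."""


@dataclass
class CreateParams:
    collection_id: str       # illustrative service-specific parameter
    broadcast: bool = False  # standard flag, false by default


class Factory:
    def __init__(self, supports_activation_records, published_records):
        self.supports_activation_records = supports_activation_records
        self.published_records = published_records  # stand-in for the system
        self.managers = []

    def create(self, params):
        # Reject the request up front if broadcast cannot be honoured.
        if params.broadcast and not self.supports_activation_records:
            raise BroadcastNotSupported("service has no activation records")
        manager = {"CollectionID": params.collection_id}
        self.managers.append(manager)
        # Publish only once the request has completed successfully.
        if params.broadcast:
            self.published_records.append({"collection_id": params.collection_id})
        return manager


records = []
factory = Factory(supports_activation_records=True, published_records=records)
factory.create(CreateParams("c1"))                  # private to this instance
factory.create(CreateParams("c2", broadcast=True))  # broadcast to other instances

plain = Factory(supports_activation_records=False, published_records=records)
try:
    plain.create(CreateParams("c3", broadcast=True))
    rejected = False
except BroadcastNotSupported:
    rejected = True
```

With the default value, the manager for c1 stays private to its running instance. Only the request for c2 leaves an activation record for other instances to pick up.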

Figure: Activation Records