OCMA: The Open Content Management Architecture

From Gcube Wiki
Revision as of 20:47, 28 August 2010 by Fabio.simeoni (Talk | contribs) (Service Classes)


OCMA, the Open Content Management Architecture, is an architecture for content management services. The architecture is abstract, in that it does not specify concrete services or interactions between services. Rather, it defines a number of design patterns for classes of services that rely on the same mechanisms for scalability, publication, and discovery. The patterns arise from requirements for content management in gCube and they are adopted by the gCube Information Services.

Assumptions and Requirements

OCMA makes minimal assumptions about content:

  • content is created, accessed, and distributed in units called documents;
  • documents are grouped in collections;
  • collections are managed in repositories.

OCMA acknowledges that content may otherwise be:

  • hosted inside or outside the system's boundaries;
  • described with a variety of models, for different media, and with different degrees of structure;
  • served through a variety of interfaces.

In particular, OCMA assumes that the system may be required to:

  1. embrace heterogeneity, i.e. allow for multiple locations, protocols, and models;
  2. hide heterogeneity, i.e. abstract over differences in location, protocol, and model;
  3. scale, i.e. retain good throughput under heavy load.

Collection Managers

An OCMA service is a stateful Web Service compliant with the Web Service Resource Framework (WSRF). Let S be one such service.

  • S manages WS-Resources called Collection Managers. A Collection Manager M grants typed access to a content collection C stored in some repository R.
note: we say that the combination of a content model and an access protocol defines an access type.
note: we say that M defines a T-interface over C for some access type T. Equivalently, we say that M is a T-Manager for C. More generically, we say that S supports T.
  • M has two standard WSRF Resource Properties (RPs) in the namespace http://gcube-system.org/namespaces/common:
    • TypeID, which identifies the access type of M;
    • CollectionID, which identifies the collection bound to M.
note: standard RPs allow type-based discovery of M and of any other T-Manager for the same collection. This supports generic clients, i.e. clients that abstract over the actual service that implements the Collection Managers.
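
The role of the two standard RPs in type-based discovery can be illustrated with a minimal Python sketch. It is not gCube code: the CollectionManager class, the discover function, and the sample service names and identifiers are all hypothetical, and an in-memory list stands in for the WS-Resources published by running services.

```python
from dataclasses import dataclass

# Namespace of the two standard RPs, as given above.
COMMON_NS = "http://gcube-system.org/namespaces/common"


@dataclass(frozen=True)
class CollectionManager:
    """Hypothetical stand-in for a Collection Manager WS-Resource."""
    service: str        # name of the service S that manages this resource
    type_id: str        # value of the TypeID RP (access type T)
    collection_id: str  # value of the CollectionID RP (collection C)

    def resource_properties(self):
        """Return the standard RPs keyed by namespace-qualified name."""
        return {
            f"{{{COMMON_NS}}}TypeID": self.type_id,
            f"{{{COMMON_NS}}}CollectionID": self.collection_id,
        }


def discover(managers, type_id, collection_id=None):
    """Return the T-Managers for a type, optionally for one collection.

    The lookup reads only the standard RPs, never the service name, so a
    generic client works against any service that supports the type.
    """
    return [
        m for m in managers
        if m.type_id == type_id
        and (collection_id is None or m.collection_id == collection_id)
    ]


# Illustrative registry: two services expose the same type over one collection.
registry = [
    CollectionManager("ServiceA", "tree", "c1"),
    CollectionManager("ServiceB", "tree", "c1"),
    CollectionManager("ServiceB", "oai", "c2"),
]
tree_managers_for_c1 = discover(registry, "tree", "c1")
```

In this sketch a client that asks for "tree" access to collection "c1" obtains both managers without knowing which service implements them.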

OCMACollectionManagers.jpg

Service Classes

Contextualising the previous notions yields different classes of OCMA services, where services in different classes play different roles in the system.

  • Hosting Services and Access services.
R may offer no public interface to C. This may be the case, for example, when R is a local content management system or else a plain file system. In this case S is a content hosting service and can act as a repository for other OCMA services. Otherwise S is an access service.

HostingAndAccess.jpg

  • Wrapper Services
R may already offer a native T-interface to C. In this case M acts as a wrapper of R. We then say that S is a wrapper service for T-repositories or, equivalently, that S wraps T or that it is a T-wrapper. This is a common case when R is external to the system.
  • Adapter and Bridge Services
R may offer a different, T'-interface to C. In this case M acts as an adapter for R. We then say that S is an adapter service over T'-repositories or, equivalently, that S adapts T' or that it is a T'-adapter. T is then a front type for S and T' is a back type for it. If R is external to the system, then we say more specifically of S that it is a bridge service to T'-repositories. Adapters and bridges are the main route to interoperability over heterogeneous content.
  • Single-Type, Multi-Type, and Hybrid Services
If S supports multiple access types we say that it is a multi-type service, otherwise we say that it is a single-type service. If S is multi-type then it may be a T-wrapper and a T'-adapter for two types T and T' among those supported by S. We then say that S is a hybrid service.
  • 1-to-1 and 1-to-N Adapter Services
If S is an adapter service then S may adapt multiple back types. This may occur because S is multi-type, and different front types adapt different back types. In this case, we say that S is a 1-to-1 adapter service. Less obviously, it may also occur when S adapts a front type T to multiple back types (regardless of whether S is single-type or multi-type). In this case, we say that S is a 1-to-N adapter service.
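
The service classes above can be read as a classification over the (front type, back type) pairs that a service exposes. The following Python sketch makes that reading explicit. The Binding class, the two functions, and the label strings are hypothetical and not part of OCMA. As a simplifying assumption, any adapter service in which no front type has more than one back type is labelled 1-to-1.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    """One way a service S exposes a repository R (hypothetical model)."""
    front_type: str           # access type T offered by the Collection Manager
    back_type: Optional[str]  # native type of R, or None if R has no public interface
    external: bool = False    # True if R lies outside the system


def binding_role(b):
    """Classify a single binding according to the definitions above."""
    if b.back_type is None:
        return "hosting"
    if b.back_type == b.front_type:
        return "wrapper"
    return "bridge" if b.external else "adapter"


def service_classes(bindings):
    """Return the set of class labels that apply to a service."""
    roles = {binding_role(b) for b in bindings}
    labels = set(roles)

    front_types = {b.front_type for b in bindings}
    labels.add("multi-type" if len(front_types) > 1 else "single-type")

    adapts = bool(roles & {"adapter", "bridge"})
    if "wrapper" in roles and adapts:
        labels.add("hybrid")

    if adapts:
        # Group the adapted back types by front type.
        back_types = defaultdict(set)
        for b in bindings:
            if binding_role(b) in ("adapter", "bridge"):
                back_types[b.front_type].add(b.back_type)
        one_to_n = any(len(types) > 1 for types in back_types.values())
        labels.add("1-to-N" if one_to_n else "1-to-1")
    return labels


# A service that wraps T and adapts V to the front type U.
hybrid = service_classes([Binding("T", "T"), Binding("U", "V")])

# A single-type service that adapts front type T to two back types.
fan_out = service_classes([Binding("T", "A"), Binding("T", "B", external=True)])
```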

Collection Manager Factories

A Collection Manager CM of an OCMA service X may be created in response to requests made to specific port-types of X. In particular:

  • X is designed to manage singleton WS-Resources called Collection Manager Factories. A Collection Manager Factory CMF creates T-Managers for some access type T supported by X and for some collection C.
Note: If X is single-type, then X has a single factory. If X is multi-type, then X may have a separate factory for each type it supports.
Note: The running instance of X is responsible for creating CMF at startup.
  • CMF has a standard operation create that can be invoked to create Collection Managers. The declared input type of create is a standard CreateParams type, but X extends that type for each of the supported types.
Note: The extensions of CreateParams are part of the documentation of X and may be part of its stub distribution. They are not standard in OCMA.
Note: OCMA does not require a one-to-one correspondence between create requests and Collection Managers. Depending on X, CMF may create more than one Collection Manager in response to a single create request.
  • CMF publishes a gCube resource for the collection bound to CM, if one does not exist already.
Note: To ensure synchronisation across running instances of X, CMF deterministically synthesises a collection identifier from create requests and uses it to search for and create collection profiles.
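
The factory behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the gCube implementation: OCMA does not say how the collection identifier is synthesised, so the SHA-1 hash over the request fields is an assumption, and FeedCreateParams, its source_url field, and the shared profiles dictionary (standing in for published collection resources) are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CreateParams:
    """Stand-in for the standard CreateParams type (its fields are not modelled)."""


@dataclass(frozen=True)
class FeedCreateParams(CreateParams):
    """Hypothetical service-specific extension of CreateParams."""
    source_url: str


def synthesise_collection_id(params):
    """Derive an identifier that depends only on the content of the request."""
    canonical = json.dumps(
        {"kind": type(params).__name__, "fields": asdict(params)},
        sort_keys=True,
    )
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


class CollectionManagerFactory:
    """Creates T-Managers for a single access type T."""

    def __init__(self, type_id, profiles):
        self.type_id = type_id
        self.profiles = profiles  # shared store of published collection profiles
        self.managers = []

    def create(self, params):
        collection_id = synthesise_collection_id(params)
        # Publish a collection profile only if one does not exist already.
        self.profiles.setdefault(collection_id, {"id": collection_id})
        manager = (self.type_id, collection_id)
        self.managers.append(manager)
        # A list is returned because a request may yield several managers.
        return [manager]


# Two running instances of the same service share one profile store.
profiles = {}
factory_1 = CollectionManagerFactory("feed", profiles)
factory_2 = CollectionManagerFactory("feed", profiles)
request = FeedCreateParams(source_url="http://example.org/feed")
created_1 = factory_1.create(request)
created_2 = factory_2.create(request)
```

Because the identifier is a function of the request alone, both instances in the sketch resolve the same request to the same collection and only one profile is published.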

Activation Records

CM may be created in response to the existence of specific gCube resources. In particular:

  • CMF may subscribe with the Information System for the creation of Activation Records, a type of Generic Resources characterised by a payload that extends CreateParams. Upon notification, CMF self-invokes its create operation with the payload of the Activation Record that is included in the notification.
Note: Effectively, X may be designed to self-stage in order to replicate the state of its running instances.
Note: Activation Records persist in the Information System beyond all running instances of X. This guarantees replication in the face of failure.
Note: Activation Records do not reconcile easily with content hosting services, as the very assumption of local state discourages replication.
Requirement: OCMA needs fine-grained and updatable subscriptions to reach activation records of interest (among other records and other generic resources). The current gCube Information System does not support updatable subscriptions. Fine-grained subscriptions are planned as an extension of the IS-Notifier service version 1.03.01.
Requirement: OCMA expects Enabling Services to spawn new running instances of X when they detect load on existing ones. It also expects them to answer WS-Resource queries for Collection Managers by returning them in inverse order of load of their running instances. It is therefore required that the Enabling Services link WS-Resources to the load on their running instances.
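
The second requirement can be illustrated with a small Python sketch. It assumes that "inverse order of load" means that Collection Managers on the least-loaded running instances are returned first. The ManagerRef class, the instance names, and the load figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManagerRef:
    """Hypothetical reference to a Collection Manager WS-Resource."""
    collection_id: str
    instance: str  # running instance that hosts the Collection Manager


def order_by_load(refs, load_of):
    """Sort references so that managers on the least-loaded instances come first."""
    return sorted(refs, key=lambda ref: load_of[ref.instance])


# Illustrative load figures per running instance (arbitrary units).
load_of = {"node-a": 0.9, "node-b": 0.2, "node-c": 0.5}
refs = [
    ManagerRef("c1", "node-a"),
    ManagerRef("c1", "node-b"),
    ManagerRef("c1", "node-c"),
]
ordered = order_by_load(refs, load_of)
```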

Activation Records may be published in the Information System in a variety of scenarios, from ad-hoc batch clients to interactive portlets. Notably, they may also be published by the CMF in response to explicit create requests. In particular:

  • CreateParams defines a standard boolean flag broadcast that instructs CMF to publish an Activation Record corresponding to create requests that complete successfully.
Note: CMF honours these requests if X supports Activation Records and rejects them otherwise.
Note: This effectively broadcasts (asynchronously) the request to any other running instance of X that may be deployed in the same scope as X.
Note: The default value of broadcast is false and implies that the Collection Managers created by CMF are 'private' resources of its running instance.
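
The interplay between the broadcast flag and Activation Records can be sketched in Python as follows. This is a simplified model under stated assumptions, not gCube code. The InformationSystem and Factory classes and the collection field are hypothetical, and notifications are delivered synchronously here, whereas the text describes them as asynchronous. Publishing the record with broadcast switched off, so that receiving instances do not publish it again, is a design choice of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateParams:
    collection: str          # hypothetical payload field
    broadcast: bool = False  # standard flag; the default is false


class InformationSystem:
    """In-memory stand-in for the gCube Information System."""

    def __init__(self):
        self.records = []      # published Activation Records
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, record, publisher=None):
        self.records.append(record)
        for callback in self.subscribers:
            callback(record, publisher)


class Factory:
    """Collection Manager Factory of one running instance."""

    def __init__(self, name, info_system, supports_activation=True):
        self.name = name
        self.info_system = info_system
        self.supports_activation = supports_activation
        self.managers = set()
        if supports_activation:
            info_system.subscribe(self._on_activation_record)

    def create(self, params):
        if params.broadcast and not self.supports_activation:
            raise ValueError("Activation Records are not supported")
        self.managers.add(params.collection)
        if params.broadcast:
            # Assumption: the record carries broadcast=False to avoid re-publication.
            record = CreateParams(params.collection, broadcast=False)
            self.info_system.publish(record, publisher=self)

    def _on_activation_record(self, record, publisher):
        # Self-invoke create with the payload, unless this instance published it.
        if publisher is not self:
            self.create(record)


info = InformationSystem()
instance_a = Factory("a", info)
instance_b = Factory("b", info)
instance_a.create(CreateParams("c1", broadcast=True))  # replicated on instance b
instance_a.create(CreateParams("c2"))                  # private to instance a
```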