GCore Based Information System

(Luca.frosini moved page Information System to GCore Based Information System: Creating new Page for smartgear based IS)

Latest revision as of 13:14, 19 October 2016

The gCube Information System (IS for short) delivers functionality for publishing, discovering, and monitoring the resources that form the infrastructure. It acts as the registry of the infrastructure: all resources are registered in the IS, and every service taking part in the infrastructure must refer to it to dynamically discover the other infrastructure constituents. This approach also supports the dynamic deployment capabilities of gCube.

In this context, a resource can be:

  • a gCube resource, supporting the deployment and operation of a gCube infrastructure;
  • an instance state, characterizing the operational state of an instance of a gCube service;
  • a generic resource, any well-formed XML document (a text that follows all the syntactic rules labelled as well-formedness rules in the XML specification).
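
To make the notion of a generic resource concrete, the following minimal sketch checks whether a text is a well-formed XML document using the standard JAXP parser shipped with the JDK. The class `WellFormedCheck` and its method `isWellFormed` are hypothetical names introduced for illustration; they are not part of any gCube library, and the IS may perform this check differently.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration; it is not part of any gCube library.
class WellFormedCheck {

    // Returns true if the text parses as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // well-formedness only, no DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Silent handler: report problems through exceptions, not stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignore warnings */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Any violation of the well-formedness rules ends up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Unexpected parser failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<Resource><Name>demo</Name></Resource>")); // true
        System.out.println(isWellFormed("<Resource><Name>demo</Resource>"));        // false
    }
}
```

Under this definition a document needs no schema or DTD to qualify as a generic resource: matching tags and a single root element are enough.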

Because of its central role, the key quality-of-service requirements for this subsystem are performance, scalability, freshness and availability. Moreover, facilities supporting the interaction with this subsystem have been included in the gCore Framework.

Reference Architecture

Architecturally, the IS is composed of a group of services and libraries that simplify the work of potential clients. The central role is played by the InformationCollector (IC) service, which is in charge of collecting and storing information about the infrastructure (or a subset of it) and of answering discovery requests. There are two ways to feed the IC, depending on the nature of the information published. If the information is a gCube Resource profile, a request for publication must be sent to the Registry service. This service validates and filters profiles in order to decide whether a resource is accepted as part of the infrastructure (other gCube services are in charge of regulating access to the accepted resources). If, on the other hand, the information to publish is an instance state or a generic resource, it does not need to pass through the Registry service's acceptance procedure and can be sent directly to the IC.
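
The two publication paths described above can be sketched as a small in-memory model. Everything in it is an assumption made for illustration: the names `PublicationRouting`, `Item`, `Collector` and `Registry` are hypothetical, the acceptance rule is a placeholder, and the sketch assumes that the Registry itself forwards accepted profiles to the IC. It does not reproduce the actual gCube service interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative model only: all class and method names are hypothetical.
class PublicationRouting {

    enum Kind { RESOURCE_PROFILE, INSTANCE_STATE, GENERIC_RESOURCE }

    static final class Item {
        final Kind kind;
        final String id;
        Item(Kind kind, String id) { this.kind = kind; this.id = id; }
    }

    // Stand-in for the InformationCollector: stores whatever reaches it.
    static final class Collector {
        private final List<String> storedIds = new ArrayList<>();
        void store(Item item) { storedIds.add(item.id); }
        List<String> storedIds() { return storedIds; }
    }

    // Stand-in for the Registry: applies an acceptance rule to profiles and
    // (assumption) forwards the accepted ones to the collector.
    static final class Registry {
        private final Predicate<Item> acceptanceRule;
        private final Collector collector;
        Registry(Predicate<Item> acceptanceRule, Collector collector) {
            this.acceptanceRule = acceptanceRule;
            this.collector = collector;
        }
        boolean register(Item profile) {
            if (!acceptanceRule.test(profile)) {
                return false; // rejected: never reaches the collector
            }
            collector.store(profile);
            return true;
        }
    }

    // Routes a publication request according to the kind of information.
    static boolean publish(Item item, Registry registry, Collector collector) {
        if (item.kind == Kind.RESOURCE_PROFILE) {
            return registry.register(item); // must pass the acceptance procedure
        }
        collector.store(item); // instance states and generic resources go straight in
        return true;
    }

    public static void main(String[] args) {
        Collector ic = new Collector();
        Registry registry = new Registry(p -> !p.id.isEmpty(), ic);
        publish(new Item(Kind.RESOURCE_PROFILE, "node-profile-1"), registry, ic);
        publish(new Item(Kind.RESOURCE_PROFILE, ""), registry, ic); // rejected
        publish(new Item(Kind.GENERIC_RESOURCE, "config-doc"), registry, ic);
        System.out.println(ic.storedIds()); // [node-profile-1, config-doc]
    }
}
```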

The third service belonging to the IS is the Notifier, which offers a subscription/notification mechanism for events related to the lifetime of gCube Resources. Relying on the WS-Notification specification and cooperating with the Registry service, it sends notifications to subscribed consumers about events happening in the Registry service (such as the registration of a new resource).
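
The subscription/notification pattern behind the Notifier can be illustrated with a minimal in-memory sketch. It is not WS-Notification and not the Notifier's interface: the class `NotifierSketch`, its methods and the topic name used in `main` are hypothetical, and consumers are modelled as plain callbacks in the same JVM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory model of topic-based subscription and notification.
class NotifierSketch {

    // Topic name -> consumers currently subscribed to it.
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String topic, Consumer<String> consumer) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(consumer);
    }

    void unsubscribe(String topic, Consumer<String> consumer) {
        List<Consumer<String>> consumers = subscribers.get(topic);
        if (consumers != null) {
            consumers.remove(consumer);
        }
    }

    // Delivers the message to every consumer of the topic and
    // returns how many consumers were notified.
    int notifyTopic(String topic, String message) {
        List<Consumer<String>> consumers =
                new ArrayList<>(subscribers.getOrDefault(topic, new ArrayList<>()));
        for (Consumer<String> consumer : consumers) {
            consumer.accept(message);
        }
        return consumers.size();
    }

    public static void main(String[] args) {
        NotifierSketch notifier = new NotifierSketch();
        notifier.subscribe("resource-registered",
                message -> System.out.println("received: " + message));
        // The producer only knows the topic, never the consumers.
        notifier.notifyTopic("resource-registered", "node-profile-1");
    }
}
```

The producer addresses a topic rather than a consumer, which is the decoupling that lets either side be relocated without the other noticing.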

Each of the three services has a related client library that abstracts over the details of the service's interface:

  • IS-Client: for interacting with the IC service for discovery;
  • IS-Publisher: for interacting with the IC and Registry services for publication;
  • IS-Notification: for becoming a consumer of the gCube notification events sent by the Notifier.

Finally, the Information System is equipped with an optional service named gLiteBridge. Its role is to foster interoperability with gLite-based infrastructures by publishing in the IS the computing elements, storage elements and sites harvested from their information systems (mainly BDII).

Figure 1 presents the components of the Information System and their main interactions:

Figure 1. Information System Architecture and Main Interactions

Together, these components deliver the following functionalities with respect to the information handled:

  • production and publication
  • collection and storage
  • discovery and consumption

The Information System supports two deployment scenarios: Standard Configuration and Advanced Configuration.

Standard Configuration

This configuration supports the new Featherweight Client Stack, designed to make it easier for clients to interact with web services. It does not yet provide support for subscription and notification.

Server Side

  • IS-InformationCollector – gCube Web Service: collects, stores, and makes available information related to the actual state of a gCube infrastructure and/or of an assigned subset of it;
  • IS-Registry – gCube Web Service: supports the publishing/un-publishing of profiles describing gCube resources;
  • IS-gLiteBridge – Optional – gCube Web Service: supports the publishing/un-publishing of resources gathered from a gLite-based infrastructure that gCube services may access.

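Client Side

  • ic-client – new gCube Featherweight Client Stack library: builds on the API of discovery-client to support resource discovery over the Information Collector service;
  • registry-publisher – new gCube Featherweight Client Stack library: API for publishing resources with the Registry service.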

Advanced Configuration

This configuration provides support for subscription and notification. However, it imposes constraints on the client side.

Server Side

  • IS-InformationCollector – gCube Web Service: collects, stores, and makes available information related to the actual state of a gCube infrastructure and/or of an assigned subset of it;
  • IS-Registry – gCube Web Service: supports the publishing/un-publishing of profiles describing gCube resources;
  • IS-gLiteBridge – Optional – gCube Web Service: supports the publishing/un-publishing of resources gathered from a gLite-based infrastructure that gCube services may access;
  • IS-Notifier – gCube Web Service: supports other services in subscribing/unsubscribing to topics produced by the various services; this service decouples the actual producer of a topic from its actual consumers, allowing producers to be relocated.

Client Side

  • IS-Publisher – gCube Library: supports services in publishing/un-publishing information in the Information Collector service. It is the gateway for any information going to the IS;
  • IS-Client – gCube Library: supports services in discovering information published in the IS;
  • IS-Notification – gCube Library: provides a publication/subscription/notification mechanism for Topics produced and consumed by services;
  • IS-Cache – gCube Library: provides caching functionality for the information published in the IS.

Design Notes

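The IS has been conceived to rely on standards, most notably:

  • WS-Notifications (http://www.ibm.com/developerworks/library/specification/ws-notification/)
  • WS-ServiceGroup 1.2 (http://docs.oasis-open.org/wsrf/wsrf-ws_service_group-1.2-spec-pr-01.pdf)
  • WS-ResourceProperty 1.2 (http://docs.oasis-open.org/wsrf/wsrf-ws_resource_properties-1.2-spec-os.pdf)
  • Web Services Data Access and Integration – The XML Realization (WS-DAIX) Specification, Version 1.0 (http://www.ogf.org/documents/GFD.75.pdf)
  • XQuery 1.0 (http://www.w3.org/TR/xpath-functions/)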

Early versions mostly exploited the WS-ServiceGroup and WS-ResourceProperty specifications. Starting from version 2.0 (released in Feb 2011), the IS is designed around the WS-DAIX specification for publishing. WS-Notifications is at the heart of the functionalities delivered by the IS-Notifier service. Finally, the queries accepted by the IS have to be compliant with the XQuery language.

It is worth mentioning that the following principle was widely adopted during the design of the IS: program to an interface, not an implementation. This means that IS consumers and producers have been kept as decoupled as possible from the IS implementation. More concretely, a gCube service only has to know the IS-Client, IS-Notifier and IS-Publisher interfaces. It does not need to care about their implementation (mechanisms to dynamically load the IS-Client, IS-Notifier and IS-Publisher at runtime have been put in place) nor about the actual IS deployment scenario (completely abstracted by the IS client libraries).
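
As a rough illustration of this principle, the sketch below lets a caller depend only on an interface while the implementation is loaded by class name at runtime through standard Java reflection. The names `Discovery`, `InMemoryDiscovery` and `load` are hypothetical and do not correspond to the IS-Client API; the actual loading mechanism used by gCore is not described here and may differ.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of "program to an interface, not an implementation".
class InterfaceLoading {

    // The only type a caller needs to know about.
    interface Discovery {
        List<String> query(String expression);
    }

    // One possible implementation; callers never reference it directly.
    static final class InMemoryDiscovery implements Discovery {
        public InMemoryDiscovery() { }
        @Override
        public List<String> query(String expression) {
            return Collections.singletonList("match for " + expression);
        }
    }

    // Loads an implementation by class name and exposes it through the interface.
    static Discovery load(String className) {
        try {
            Class<?> type = Class.forName(className);
            return type.asSubclass(Discovery.class)
                       .getDeclaredConstructor()
                       .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real deployment the class name would come from configuration.
        String implementationName = InMemoryDiscovery.class.getName();
        Discovery discovery = load(implementationName);
        System.out.println(discovery.query("//Resource")); // [match for //Resource]
    }
}
```

Swapping the implementation then only requires changing the configured class name; the calling code stays untouched.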

QoS

All the design aspects of the IS have been tackled taking into account the fact that if the IS does not work, works slowly, or offers a poor service, the whole infrastructure suffers accordingly.

The chain of operations involved in the discovery phase is carefully designed and implemented to reduce the waiting time of callers. In this phase the IC service works in a stateless manner, only executing the query against the underlying XML indexing system. In addition, the SOAP messages sent and received are kept as simple as possible in order to reduce the marshalling and unmarshalling time.

To avoid interfering with the discovery phase, publications are performed in bulk, which reduces the number of incoming calls to the IC and keeps them from competing with query invocations. The IS-Publisher collects and queues requests for publication and sends them to the Registry and then to the IC, cutting the number of competing calls as much as possible.

From the deployment point of view, IS services can be distributed and partially replicated in a gCube infrastructure to manage subsets of resources (usually belonging to different scopes). Different scenarios can be set up to meet the performance and scalability requirements according to the extent of the infrastructure itself (e.g. how many resources have to be managed, how many nodes are available, and so on).
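
The bulk publication strategy can be sketched as a publisher that queues requests and hands them over in batches. This is a minimal single-threaded model under stated assumptions: the class `BatchingPublisher`, the batch size and the `sink` callback standing in for the remote call are all hypothetical, and the real IS-Publisher may batch by time, by size, or by other criteria.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of queued, bulk publication (single-threaded).
class BatchingPublisher {

    private final int batchSize;
    private final Consumer<List<String>> sink; // stand-in for one remote call
    private final List<String> pending = new ArrayList<>();

    BatchingPublisher(int batchSize, Consumer<List<String>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Queues a document and sends a batch once enough have accumulated.
    void publish(String document) {
        pending.add(document);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is queued as one call; does nothing if the queue is empty.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    public static void main(String[] args) {
        BatchingPublisher publisher = new BatchingPublisher(2,
                batch -> System.out.println("one call carrying " + batch));
        for (int i = 1; i <= 5; i++) {
            publisher.publish("doc-" + i);
        }
        publisher.flush(); // send the remaining document
    }
}
```

With a batch size of 2, five publication requests result in three calls instead of five, which is the kind of reduction in competing calls the paragraph above describes.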