Difference between revisions of "GCore Based Information System"

From Gcube Wiki
m (Luca.frosini moved page Information System to GCore Based Information System: Creating new Page for smartgear based IS)
 
Latest revision as of 13:14, 19 October 2016

The gCube Information System (IS) delivers functionality for publishing, discovering, and monitoring the resources that form the infrastructure. It acts as the registry of the infrastructure: all resources are registered in the IS, and every service taking part in the infrastructure must refer to it to dynamically discover the other infrastructure constituents. Moreover, this registry-based approach supports the dynamic deployment capabilities of gCube.

In this context, a resource can be:

  • a gCube resource, supporting the deployment and operation of a gCube infrastructure;
  • an instance state, characterizing the operational state of an instance of a gCube service;
  • a generic resource, i.e. any well-formed XML document (a text that follows all the syntactic rules labelled as well-formedness rules in the XML specification).
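As an illustration of the well-formedness requirement for generic resources, the following minimal sketch checks a document with the standard Java XML parser (JAXP). The class and method names (`WellFormedCheck`, `isWellFormed`) are hypothetical and are not part of any gCube library; the sketch only assumes that a non-validating parse succeeds exactly when the text is well-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not a gCube class.
class WellFormedCheck {

    // Returns true when the text parses as XML with a non-validating parser.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<Resource><Name>x</Name></Resource>"));
        System.out.println(isWellFormed("<Resource><Name>x</Resource>"));
    }
}
```

A check of this kind could be run on the client side before a generic resource is submitted, so that malformed documents are rejected early.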

Because of its central role, the key quality-of-service requirements for this subsystem are performance, scalability, freshness and availability. Moreover, facilities supporting interaction with the subsystem are included in the gCore Framework.

Reference Architecture

Architecturally, the IS is composed of a group of services and of libraries that simplify their use by clients. The central role is played by the InformationCollector (IC) service, which collects and stores information about the infrastructure (or a subset of it) and answers discovery queries. There are two ways to feed the IC, depending on the nature of the information published. If the information is a gCube Resource profile, a publication request must be sent to the Registry service. This service validates and filters profiles in order to decide whether a resource is accepted as part of the infrastructure (other gCube services are in charge of regulating access to the accepted resources). If, on the other hand, the information to publish is an instance state or a generic resource, it does not need to pass through the Registry service's acceptance procedure and can be sent directly to the IC.
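The two publication paths described above can be summarized in a small sketch. All names here (`PublicationRouting`, `Kind`, `Target`, `route`) are hypothetical and only model the routing rule; they are not the IS-Publisher API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the routing rule, not a gCube class.
class PublicationRouting {

    enum Kind { GCUBE_RESOURCE_PROFILE, INSTANCE_STATE, GENERIC_RESOURCE }

    enum Target { REGISTRY, INFORMATION_COLLECTOR }

    // Services a publication request passes through, in order.
    static List<Target> route(Kind kind) {
        if (kind == Kind.GCUBE_RESOURCE_PROFILE) {
            // Profiles must first be validated and accepted by the Registry.
            return Arrays.asList(Target.REGISTRY, Target.INFORMATION_COLLECTOR);
        }
        // Instance states and generic resources skip the acceptance procedure.
        return Collections.singletonList(Target.INFORMATION_COLLECTOR);
    }

    public static void main(String[] args) {
        for (Kind kind : Kind.values()) {
            System.out.println(kind + " -> " + route(kind));
        }
    }
}
```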

The third service belonging to the IS is the Notifier, which offers a subscription/notification mechanism for events related to the lifetime of gCube Resources. Relying on WS-Notification and in cooperation with the Registry service, it sends notifications to subscribed consumers about events happening in the Registry service (such as the registration of a new resource).
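The subscription/notification idea can be illustrated with a purely in-memory sketch. It does not use WS-Notification or any gCube API; the class `NotifierSketch`, its methods and the topic name are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a topic-based notifier.
class NotifierSketch {

    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // Registers a consumer for the events of a given topic.
    void subscribe(String topic, Consumer<String> consumer) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(consumer);
    }

    // Delivers an event to every consumer of the topic; returns how many were notified.
    int publish(String topic, String event) {
        List<Consumer<String>> consumers = subscribers.getOrDefault(topic, new ArrayList<>());
        for (Consumer<String> consumer : consumers) {
            consumer.accept(event);
        }
        return consumers.size();
    }

    public static void main(String[] args) {
        NotifierSketch notifier = new NotifierSketch();
        notifier.subscribe("resource-registered", event -> System.out.println("got: " + event));
        notifier.publish("resource-registered", "new resource profile accepted");
    }
}
```

The producer only knows the topic, never the consumers, which is the decoupling the Notifier provides between the Registry and the subscribed services.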

Each of the three services has a related client library that abstracts over the details of the service's interface:

  • IS-Client: for interacting with the IC service for discovery;
  • IS-Publisher: for interacting with the IC and Registry services for publication;
  • IS-Notification: for becoming a consumer of gCube's notification events sent by the Notifier.

Finally, the Information System is equipped with an optional service named gLiteBridge. Its role is to foster interoperability with gLite-based infrastructures by publishing in the IS the computing elements, storage elements and sites harvested from their information systems (mainly BDII).

Figure 1 presents the components of the Information System and their main interactions:

Figure 1. Information System Architecture and Main Interactions

Together, these components deliver the following functionalities with respect to the information handled:

  • production and publication
  • collection and storage
  • discovery and consumption

The Information System supports two deployment scenarios: Standard Configuration and Advanced Configuration.

Standard Configuration

The Standard Configuration supports the new Featherweight Client Stack, designed to make it easier for clients to interact with web services. It does not yet support subscription and notification.

Server Side

  • IS-InformationCollector – gCube Web Service: collects, stores, and makes available information related to the actual state of a gCube infrastructure and/or of an assigned subset of it;
  • IS-Registry – gCube Web Service: supports the publishing/un-publishing of profiles describing gCube resources;
  • IS-gLiteBridge – Optional gCube Web Service: supports the publishing/un-publishing of resources gathered from a gLite-based infrastructure that gCube services may access.

Client Side

  • ic-client – NEW gCube Featherweight Client Stack Library: builds on the API of discovery-client to support resource discovery over the Information Collector service;
  • registry-publisher – NEW gCube Featherweight Client Stack Library: API to publish resources with the Registry service.

Advanced Configuration

The Advanced Configuration supports subscription and notification. However, it imposes constraints on the client side.

Server Side

  • IS-InformationCollector – gCube Web Service: collects, stores, and makes available information related to the actual state of a gCube infrastructure and/or of an assigned subset of it;
  • IS-Registry – gCube Web Service: supports the publishing/un-publishing of profiles describing gCube resources;
  • IS-gLiteBridge – Optional gCube Web Service: supports the publishing/un-publishing of resources gathered from a gLite-based infrastructure that gCube services may access;
  • IS-Notifier – gCube Web Service: supports other services in subscribing/unsubscribing to topics produced by the various services; this service decouples the actual producer of a topic from its actual consumer, allowing producers to be relocated.

Client Side

  • IS-Publisher – gCube Library: supports services in publishing/un-publishing information in the Information Collector service. It is the gateway for any information going to the IS;
  • IS-Client – gCube Library: supports services in discovering information published in the IS;
  • IS-Notification – gCube Library: provides a publication/subscription/notification mechanism for topics produced and consumed by services;
  • IS-Cache – gCube Library: provides caching functionality for the information published in the IS.

Design Notes

The IS has been conceived to rely on standards. Early versions mostly exploited the WS-ServiceGroup and WS-ResourceProperty specifications. Starting from version 2.0 (released in Feb 2011), the IS is designed around the WS-DAIX specification for publishing. WS-Notification is at the heart of the functionalities delivered by the IS-Notifier service. Finally, the queries accepted by the IS have to be compliant with the XQuery language.

It is worth mentioning that the following principle was widely adopted during the design of the IS: program to an interface, not an implementation. This means that IS consumers and producers are kept as decoupled as possible from the IS implementation. More concretely, a gCube service only needs to know the IS-Client, IS-Notifier and IS-Publisher interfaces. It does not need to care about their implementation (mechanisms to dynamically load the IS-Client, IS-Notifier and IS-Publisher at runtime have been put in place) nor about the actual IS deployment scenario (completely abstracted by the IS client libraries).
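The principle can be sketched as follows. The interface `DiscoveryClient`, its implementation and the reflective loader are hypothetical names chosen for the example; the text above only states that dynamic loading mechanisms exist, so loading by class name through reflection is an assumption, not a description of the actual gCore mechanism.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical interface: the only type a consumer is compiled against.
interface DiscoveryClient {
    List<String> query(String expression);
}

// Hypothetical implementation that could be swapped without touching consumers.
class InMemoryDiscoveryClient implements DiscoveryClient {
    @Override
    public List<String> query(String expression) {
        return Collections.singletonList("result for " + expression);
    }
}

// Hypothetical loader: resolves the implementation from a class name at runtime.
class ClientLoader {
    static DiscoveryClient load(String implementationClassName) throws ReflectiveOperationException {
        Class<?> type = Class.forName(implementationClassName);
        // asSubclass fails fast if the class does not implement the interface.
        return type.asSubclass(DiscoveryClient.class).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // In a real deployment the class name would come from configuration.
        DiscoveryClient client = load(InMemoryDiscoveryClient.class.getName());
        System.out.println(client.query("all running instances"));
    }
}
```

Because the consumer holds only a `DiscoveryClient` reference, a different implementation or deployment scenario can be selected by changing the configured class name alone.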

QoS

All the design aspects of the IS have been tackled taking into account the fact that if the IS does not work, works slowly, or offers a poor service, the whole infrastructure suffers accordingly.

The chain of operations involved in the discovery phase is carefully designed and implemented to reduce the waiting time of callers. In this phase the IC service works in a stateless manner, only executing the query against the underlying XML indexing system. In addition, the SOAP messages sent and received are as simple as possible in order to reduce the marshalling and unmarshalling time.

To avoid interfering with the discovery phase, publications are performed in bulk, which reduces the incoming calls to the IC so that they do not compete with query invocations. The IS-Publisher collects and queues publication requests and sends them to the Registry and then to the IC, cutting the number of competing calls as much as possible.

From the deployment point of view, IS services can be distributed and partially replicated in a gCube infrastructure to manage subsets of resources (usually belonging to different scopes). Different scenarios can be set up to meet the performance and scalability requirements according to the extent of the infrastructure itself (e.g. how many resources have to be managed, how many nodes are available, and so on).
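The bulk publication strategy can be illustrated with a minimal sketch of a queue that forwards documents in batches. The class `BulkPublisherSketch`, the batch size and the `sink` callback are assumptions made for the example; the real IS-Publisher may batch by time, by size, or by other criteria not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batching queue, not the IS-Publisher implementation.
class BulkPublisherSketch {

    private final List<String> pending = new ArrayList<>();
    private final int batchSize;
    private final Consumer<List<String>> sink; // stands in for one remote call

    BulkPublisherSketch(int batchSize, Consumer<List<String>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Queues a document and sends a batch once enough documents are waiting.
    synchronized void publish(String document) {
        pending.add(document);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is queued as a single call; does nothing when the queue is empty.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        sink.accept(batch);
    }

    public static void main(String[] args) {
        BulkPublisherSketch publisher =
                new BulkPublisherSketch(3, batch -> System.out.println("sending " + batch.size() + " documents"));
        for (int i = 0; i < 7; i++) {
            publisher.publish("<doc id=\"" + i + "\"/>");
        }
        publisher.flush(); // send the remainder
    }
}
```

With a batch size of 3, seven publications result in three outgoing calls instead of seven, which is the kind of reduction in competing calls the paragraph above describes.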