Difference between revisions of "GCore Based Information System Specification"

From Gcube Wiki

Latest revision as of 13:22, 19 October 2016

Overview

The Information System (IS) is the core subsystem connecting producers and consumers of resources. It acts as the registry of the infrastructure, offering global and partial views of its resources and their current status, together with notification instruments.

The approach taken by the IS strongly supports the dynamic deployment capabilities and the interoperability solutions offered by the Resource Management facilities.

Key features

  • Resource Publication, Access and Discovery: the IS is the connecting point among the resources of the e-Infrastructure
  • Consistency with the new Resource Model: the IS grants publication of and access to resources compliant with the Resource Model (2nd generation)
  • Production level QoS - Responsiveness: each query is served in milliseconds, and thousands of queries are served each hour
  • Production level QoS - Scalability: infrastructures with more than 10K resources have been successfully powered
  • Production level QoS - Permanent and Uninterrupted Functioning: IS instances have been continuously up for more than one year without human intervention
  • Support to Standards - WS-DAIX Specification v1.0: full implementation of WS-DAIX v1.0, a widely accepted standard defining a set of data access interfaces for XML data resources
  • Support to Standards - XQuery 1.0: resource discovery can be performed through expressions compliant with XQuery 1.0
  • Support to Standards - WS-Notifications: consumers of resources can subscribe to the IS to receive WS-Notifications about any change that occurred in the resources they are interested in
  • Flexible deployment scenarios: IS components can be deployed in several ways, to best fit the needs of an infrastructure or a specific VO

Design

Philosophy

The IS has been designed and implemented to:

  • rely on standards
  • support maximum distribution, and replication wherever possible
  • abstract clients from the deployment scenario

A central role of the Information System is also to publicly expose resources and connect them to their consumers. A consistent Resource Model was created at the beginning of the gCube development and served for many years as a solid basis for the gCube core facilities. With the increasing openness of the system, a second generation of the model has been shaped and is being integrated into the IS.

Architecture

To deliver the required quality of service and performance and to handle growing amounts of information (scalability), the Information System is composed of a set of Web Services and client libraries.

Figure: Information System Architecture

Together, the Web Services and client libraries deliver the following functionalities with respect to the information handled:

  • production and publication
  • collection, indexing and storage
  • discovery and consumption

The components belonging to the production and publication phase are:

  • IS-Registry: this service exposes an API for publishing/un-publishing profiles of resources compliant with the Resource Model of both first and second generation;
  • IS-Notifier: this service builds on top of WS-Notification to deliver notifications about changes occurring in the resources registered in the IS-Registry; it also supports other services in subscribing/unsubscribing to topics produced by the various Services; by decoupling the actual producer of a topic from its actual consumers, it allows producers to be relocated;
  • IS-gLiteBridge: this service publishes and unpublishes resources gathered from a gLite based infrastructure that gCube services may access;
  • IS-Publisher: a library available to gCube services for publishing/un-publishing information in the IS
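
The decoupling between topic producers and consumers described for the IS-Notifier can be illustrated with a minimal in-memory sketch. It is not the IS-Notifier API and does not implement WS-Notification; the class `TopicBroker`, its methods, the topic name and the message are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class TopicBroker:
    """Hypothetical in-memory broker (assumption): the real IS-Notifier is a
    Web Service built on WS-Notification, which this sketch does not model."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def unsubscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        if callback in self._subscribers[topic]:
            self._subscribers[topic].remove(callback)

    def notify(self, topic: str, message: str) -> int:
        """Deliver the message to every current subscriber of the topic.

        Returns the number of subscribers that were notified."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


received: List[str] = []


def on_change(message: str) -> None:
    # Consumer side: it only knows the topic, not where the producer runs.
    received.append(message)


broker = TopicBroker()
broker.subscribe("resource-changed", on_change)
# Producer side: it only knows the topic name, not who is listening.
delivered = broker.notify("resource-changed", "node-42 updated")
```

Because both sides refer only to the topic, a producer could move to another host without its consumers having to re-subscribe, which is the relocation property mentioned above.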

The component supporting the collection, indexing and storage phase is:

  • IS-InformationCollector: a service that collects and makes available information related to the actual state of the gCube infrastructure and/or of an assigned subset of it; it exposes APIs compliant with WS-DAIX for feeding and then accessing indexed resources

The components supporting the discovery and consumption phase are:

  • IS-Client: a library available to gCube services for discovering information published in the IS
  • IS-Notification: a library available to gCube services that provides a publication/subscription/notification mechanism, compliant with WS-Notification, for Topics produced and consumed by any actor of the infrastructure

Deployment

As far as the client libraries are concerned, the deployment scheme is trivial: they reside on each node of the infrastructure equipped with the gCore platform, so that hosted services can interact with the IS services. Their main role is in fact to hide the actual deployment scenario of the services. The services themselves can be distributed and replicated across many hosts in various ways.

The distribution criterion is the scope, meaning that each of the IS services can manage a single scope or aggregate multiple scopes. Each service can also adopt a different distribution policy: for instance, an IS-InformationCollector may work in scopes A and B, while two different IS-Registry instances may independently manage A and B. Hence, the number of possible distribution schemes grows with the complexity of the infrastructure.
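
The scope-based distribution in the example above can be written down as data. The following is a minimal sketch, not part of gCube: the `ServiceInstance` type, the `instances_for` helper and the host names are assumptions made for illustration; only the service names and the scopes A and B come from the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ServiceInstance:
    service: str             # IS service name, e.g. "IS-Registry"
    host: str                # hypothetical host name (assumption)
    scopes: FrozenSet[str]   # scopes managed by this instance


def instances_for(instances: List[ServiceInstance], service: str, scope: str) -> List[ServiceInstance]:
    """Return the instances of the given service that manage the given scope."""
    return [i for i in instances if i.service == service and scope in i.scopes]


# One collector aggregating scopes A and B, two registries managing one scope each.
deployment = [
    ServiceInstance("IS-InformationCollector", "ic.example.org", frozenset({"A", "B"})),
    ServiceInstance("IS-Registry", "reg-a.example.org", frozenset({"A"})),
    ServiceInstance("IS-Registry", "reg-b.example.org", frozenset({"B"})),
]

registry_hosts_for_a = [i.host for i in instances_for(deployment, "IS-Registry", "A")]
collector_hosts_for_b = [i.host for i in instances_for(deployment, "IS-InformationCollector", "B")]
```

A lookup of this kind is what the client libraries hide from hosted services: callers name a scope and never a host.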

The replication criteria are the type of the resources handled and, again, the scope. Some IS services may be configured to accept only certain resources (such as nodes) and others configured for different resources. What matters is that, across the whole scheme, every resource type is covered by the available services. Note that replication is supported only by IS-InformationCollector and IS-Registry; the IS-gLiteBridge and the IS-Notifier do not support replication.
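
The requirement that every resource type be covered by some replica can be checked mechanically. The sketch below assumes a replication scheme is described as a mapping from instance name to the resource types it accepts; the function name, the instance names and every type name other than "nodes" are hypothetical.

```python
from typing import Dict, Iterable, Set


def uncovered_types(required: Iterable[str], accepted_by_instance: Dict[str, Set[str]]) -> Set[str]:
    """Return the required resource types that no instance is configured to accept."""
    covered: Set[str] = set().union(*accepted_by_instance.values())
    return set(required) - covered


# Hypothetical resource type names, apart from "nodes", which the text mentions.
required_types = ["nodes", "services", "collections"]

# Two hypothetical IS-Registry replicas, each accepting a subset of the types.
scheme = {
    "registry-1": {"nodes"},
    "registry-2": {"services"},
}

missing = uncovered_types(required_types, scheme)
```

In this example `missing` contains "collections", signalling that the scheme still needs an instance configured for that type.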

Regarding the deployment order, IS-InformationCollector has to be deployed first, then IS-Registry, and finally the IS-Notifier and IS-gLiteBridge (these two in no particular order).
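
The ordering constraint above can be expressed as a small dependency graph and resolved with a topological sort. This is an illustrative sketch using Python's standard `graphlib` module (Python 3.9 or later); encoding the constraint as "depends on" edges is an assumption of the sketch, not something gCube prescribes.

```python
from graphlib import TopologicalSorter

# Each service maps to the services that must already be deployed before it.
deploy_after = {
    "IS-InformationCollector": set(),
    "IS-Registry": {"IS-InformationCollector"},
    "IS-Notifier": {"IS-Registry"},
    "IS-gLiteBridge": {"IS-Registry"},
}

# static_order() yields a valid deployment sequence; the relative order of
# IS-Notifier and IS-gLiteBridge is left unspecified, as in the text.
deployment_order = list(TopologicalSorter(deploy_after).static_order())
```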

As a final remark, all the IS services must be hosted on dedicated nodes, i.e. no other service (apart from the Resource Management services working at node level) may be co-deployed with them.

Large deployment

To obtain a balanced trade-off between scalability and resource consumption, a scheme with IS instances distributed at VO level and at infrastructure level gives the best results in most cases. VREs are usually handled adequately by the instances at VO level.

Regarding co-deployments of IS services, IS-InformationCollector and IS-Registry are highly contacted services and compete for the container's threads serving incoming calls; therefore, they perform best when they are deployed on different hosts. IS-Notifier is a less stressed service that might be co-deployed with the IS-Registry to reduce the number of exclusively dedicated nodes.
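
The co-deployment guidance can be turned into a simple check over a deployment plan. The sketch below assumes a plan is a list of (host, service) pairs; the function `check_plan` and the host names are hypothetical. Since the text describes separate hosts as the best-performing option rather than a hard rule, the check emits warnings instead of errors.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

# Services that compete for the container's threads, per the text above.
HEAVILY_CONTACTED = {"IS-InformationCollector", "IS-Registry"}


def check_plan(plan: List[Tuple[str, str]]) -> List[str]:
    """Return one warning per host where both heavily contacted services are co-deployed."""
    services_by_host: DefaultDict[str, Set[str]] = defaultdict(set)
    for host, service in plan:
        services_by_host[host].add(service)

    warnings = []
    for host in sorted(services_by_host):
        if HEAVILY_CONTACTED <= services_by_host[host]:
            warnings.append(f"{host}: IS-InformationCollector and IS-Registry share a host")
    return warnings


# Hypothetical host names. IS-Notifier shares a node with IS-Registry, which is allowed.
recommended_plan = [
    ("host-1", "IS-InformationCollector"),
    ("host-2", "IS-Registry"),
    ("host-2", "IS-Notifier"),
]
single_node_plan = [
    ("host-1", "IS-InformationCollector"),
    ("host-1", "IS-Registry"),
    ("host-1", "IS-Notifier"),
]

recommended_warnings = check_plan(recommended_plan)
single_node_warnings = check_plan(single_node_plan)
```

The single-node plan is flagged only as a performance concern: it remains a legitimate choice when resource consumption matters more than responsiveness.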

Figure: Typical Deployment Scenario

Small deployment

To stay conservative in terms of resource consumption, a single instance of all the IS services may be deployed. How such a scheme affects the responsiveness of the system depends on how many resources compose the infrastructure.

Figure: Single Instance Deployment Scenario


Alternative deployment schemes may aggregate several VOs in the same IS instances.

Use Cases

The subsystem has been conceived to support a number of use cases; moreover, it will be used to serve a number of scenarios. This area collects these "success stories".

Well suited Use Cases

The Information System has long been used to serve the e-infrastructure. Producers and consumers of resources belonging to VOs and VREs with thousands of resources have been successfully connected through it over the years.

Because of the adoption of widely recognized standards, the IS is today an open system that can be exploited even by non-gCube native components.

The flexibility of its deployment solution offers great opportunities for (re-)configuration towards the optimal scheme.

Less well suited Use Cases

Resource discovery is still only partially compatible with WS-DAIX: it is close, but not 100% compatible. Therefore, clients that want to query the IS-InformationCollector on their own (without using the abstraction facilities offered by the IS-Client) must comply with its interface (still XQuery based).