Difference between revisions of "Tabular Data Flow Manager"

From Gcube Wiki
Line 12:
  The goal of this service is to offer a facilities for tabular data workflow creation, management and monitoring.
  The workflow can involve a number of data manipulation steps each performed by potentially different services to produce the desired output.
- Uer defined workflow can be scheduled for deferred execution and the user notified about the workflow progress.
+ User defined workflow can be scheduled for deferred execution and the user notified about the workflow progress.

  == Design ==
Line 19:
  Tabular Data Flow Manager offers a service for tabular data workflow creation, management and monitoring.
- The data flow can touch different services in order to produce the desired output.
+ The underlying idea is to decouple the logic needed to represent and execute workflows of tabular data processing from the single steps each taking care of part of the overall manipulation.
- Planned flow can be scheduled for deferred execution and the user notified about the flow progress.
+ This aims at maximizing the exploitation and reuse of components aiming at offering data manipulation facilities.
+ Moreover, this make it possible to 'codify' standard (including domain oriented ones) data manipulation processes and execute them whenever data deserving such a kind of processing manifest.

  === Architecture ===
  The subsystem comprises the following components:
- * '''Tabular Data Flow Service''': the central system for the flow creation, management and monitoring;
+ * '''Tabular Data Flow Service''': the core element of this functional area. It provides for workflow creation, management and monitoring;
- * '''Tabular Data Flow UI''': the user interface of the service where the user can create, execute and monitor the data flow;
+ * '''Tabular Data Flow UI''': the user interface of this functional area. It  the user can create, execute and monitor the workflow(s);
  * '''Tabular Data Agent''': an helper component for the service that want to expose tabular data functionality to the data flow service.
Revision as of 17:08, 18 May 2012

The goal of this facility is to realise an integrated environment supporting the definition and management of workflows of tabular data. Each workflow consists of a number of tabular data processing steps, where each step is realised by an existing service conceptually offered by a gCube-based infrastructure.

The following sections describe the design rationale, key features, high-level architecture, and deployment scenarios.

Overview

The goal of this service is to offer facilities for tabular data workflow creation, management and monitoring. A workflow can involve a number of data manipulation steps, each performed by a potentially different service, to produce the desired output. User-defined workflows can be scheduled for deferred execution, and the user is notified about workflow progress.
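
As an illustration of deferred execution with progress notification, a minimal Python sketch follows. It is not the gCube API: `run_workflow`, `run_deferred`, the `notify` callback and the two sample steps are hypothetical names, tables are modelled as plain lists of row dictionaries, and the standard-library `sched` module stands in for whatever scheduler the service actually uses.

```python
import sched
import time
from typing import Any, Callable, Dict, List

Table = List[Dict[str, Any]]      # assumption: a table is a list of row dicts
Step = Callable[[Table], Table]   # one data manipulation step


def run_workflow(steps: List[Step], table: Table,
                 notify: Callable[[str], None]) -> Table:
    """Run the steps in order, notifying the user after each one."""
    for index, step in enumerate(steps, start=1):
        table = step(table)
        notify(f"step {index}/{len(steps)} done")
    notify("workflow finished")
    return table


def run_deferred(delay_seconds: float, steps: List[Step], table: Table,
                 notify: Callable[[str], None]) -> Table:
    """Schedule the workflow for later execution and wait until it has run."""
    results: List[Table] = []
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    scheduler.enter(delay_seconds, 1,
                    lambda: results.append(run_workflow(steps, table, notify)))
    scheduler.run()  # blocks until the scheduled workflow has executed
    return results[0]


# Hypothetical steps; in the real system each would be performed by a service.
def drop_empty_rows(table: Table) -> Table:
    return [row for row in table if row.get("catch") is not None]


def to_tonnes(table: Table) -> Table:
    return [{**row, "catch": row["catch"] / 1000} for row in table]


messages: List[str] = []
rows = [{"species": "cod", "catch": 2500}, {"species": "hake", "catch": None}]
output = run_deferred(0.01, [drop_empty_rows, to_tonnes], rows, messages.append)
```

Here the notifications are simply collected in a list; a real deployment would presumably deliver them through the infrastructure's own notification mechanism.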

Design

Philosophy

Tabular Data Flow Manager offers a service for tabular data workflow creation, management and monitoring. The underlying idea is to decouple the logic needed to represent and execute workflows of tabular data processing from the individual steps, each of which takes care of part of the overall manipulation. This aims to maximize the exploitation and reuse of components that offer data manipulation facilities. Moreover, it makes it possible to 'codify' standard data manipulation processes (including domain-oriented ones) and execute them whenever data requiring such processing become available.
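
The decoupling described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the registry, the step names (`filter_rows`, `rename_column`) and the `CLEAN_CATCH_DATA` definition are all hypothetical, and local functions stand in for steps that would really be performed by separate services. The point is that the workflow is pure data, so one 'codified' process can be reused on any table that needs it.

```python
from typing import Any, Callable, Dict, List, Tuple

Table = List[Dict[str, Any]]   # assumption: a table is a list of row dicts
StepImpl = Callable[..., Table]

# Registry of step implementations, kept separate from workflow definitions.
STEP_REGISTRY: Dict[str, StepImpl] = {}


def register_step(name: str) -> Callable[[StepImpl], StepImpl]:
    """Decorator that makes a step implementation available under a name."""
    def decorator(func: StepImpl) -> StepImpl:
        STEP_REGISTRY[name] = func
        return func
    return decorator


@register_step("filter_rows")
def filter_rows(table: Table, column: str, minimum: float) -> Table:
    return [row for row in table if row[column] >= minimum]


@register_step("rename_column")
def rename_column(table: Table, old: str, new: str) -> Table:
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in table]


# A workflow definition names steps and parameters but contains no step logic.
WorkflowDefinition = List[Tuple[str, Dict[str, Any]]]

CLEAN_CATCH_DATA: WorkflowDefinition = [
    ("filter_rows", {"column": "qty", "minimum": 10}),
    ("rename_column", {"old": "qty", "new": "quantity"}),
]


def execute(definition: WorkflowDefinition, table: Table) -> Table:
    """Look up each named step in the registry and apply it in order."""
    for step_name, params in definition:
        table = STEP_REGISTRY[step_name](table, **params)
    return table


# The same codified process is applied to two different tables.
result_2011 = execute(CLEAN_CATCH_DATA, [{"area": "A", "qty": 5},
                                         {"area": "B", "qty": 40}])
result_2012 = execute(CLEAN_CATCH_DATA, [{"area": "A", "qty": 12}])
```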

Architecture

The subsystem comprises the following components:

  • Tabular Data Flow Service: the core element of this functional area. It provides workflow creation, management and monitoring;
  • Tabular Data Flow UI: the user interface of this functional area. It lets the user create, execute and monitor workflows;
  • Tabular Data Agent: a helper component for services that want to expose tabular data functionality to the Tabular Data Flow Service.

A diagram of the relationships between these components is reported in the following figure:

Tabular Data Flow Manager, internal Architecture
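
To make the roles of the service and the agent more concrete, here is a minimal Python sketch of how they might interact. The class names mirror the component names, but every method (`register`, `expose`, `run`), the `status` dictionary and the `statistics` host service are assumptions made for illustration; the actual interfaces are not described in this page. The UI component is omitted.

```python
from typing import Any, Callable, Dict, List

Table = List[Dict[str, Any]]          # assumption: a table is a list of row dicts
Operation = Callable[[Table], Table]


class TabularDataFlowService:
    """Core component: knows the exposed operations and runs workflows."""

    def __init__(self) -> None:
        self._operations: Dict[str, Operation] = {}
        self.status: Dict[str, str] = {}  # workflow name -> last known state

    def register(self, operation_name: str, operation: Operation) -> None:
        self._operations[operation_name] = operation

    def available_operations(self) -> List[str]:
        return sorted(self._operations)

    def run(self, workflow_name: str, operation_names: List[str],
            table: Table) -> Table:
        """Apply the named operations in order and track the workflow state."""
        self.status[workflow_name] = "running"
        try:
            for name in operation_names:
                table = self._operations[name](table)
        except Exception:
            self.status[workflow_name] = "failed"
            raise
        self.status[workflow_name] = "completed"
        return table


class TabularDataAgent:
    """Helper living next to a host service; exposes its operations."""

    def __init__(self, host_service_name: str,
                 flow_service: TabularDataFlowService) -> None:
        self.host_service_name = host_service_name
        self.flow_service = flow_service

    def expose(self, operation_name: str, operation: Operation) -> None:
        # Qualify the name so operations from different hosts cannot clash.
        qualified = f"{self.host_service_name}.{operation_name}"
        self.flow_service.register(qualified, operation)


def add_total(table: Table) -> Table:
    """Hypothetical operation offered by the host service."""
    return table + [{"area": "total", "qty": sum(row["qty"] for row in table)}]


flow_service = TabularDataFlowService()
agent = TabularDataAgent("statistics", flow_service)  # host name is hypothetical
agent.expose("add_total", add_total)

result = flow_service.run("yearly-report", ["statistics.add_total"],
                          [{"area": "A", "qty": 2}, {"area": "B", "qty": 3}])
```

In this sketch the flow service never imports the host service's code directly; it only sees what the agent chose to expose, which is one way to read the agent's role as described above.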

Deployment

The Service should be deployed on a single node, while an agent should be deployed alongside each service that wants to offer its functionality to the flow service. The User Interface can be deployed in the infrastructure portal.

Use Cases

Well suited Use Cases

This component is well suited to all cases where it is necessary to manage a flow of tabular data between infrastructure services. An example is the enhancement of catch statistics offered by the Time Series Service, elaborated using both the Statistical Service and the Occurrence Service.