Data Transfer 2

From Gcube Wiki
Revision as of 17:48, 9 September 2016 by Fabio.sinibaldi (Talk | contribs) (Architecture)


Overview

Implementing reliable data transfer mechanisms between the nodes of a gCube-based Hybrid Data Infrastructure is one of the main objectives when dealing with large sets of multi-type datasets distributed across different repositories.

To promote an efficient and optimized consumption of these data resources, a number of components have been designed to meet the data transfer requirements.

This document outlines the design rationale, key features, and high-level architecture of these components, the options for their deployment, and some use cases.

Key features

The components belonging to this class are responsible for:

Point-to-point transfer: direct transfer invocation to a gCube node.
Reliable data transfer between infrastructure data sources and data storages: achieved by exploiting the uniform access interfaces provided by gCube and standard transfer protocols.
Automatic transfer optimization: achieved by exploiting the best available transfer options between the invoker and target nodes.
Advanced and extensible post-transfer processing: a plugin-oriented implementation that serves advanced use cases.
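
The plugin-oriented post-transfer processing mentioned above could be sketched roughly as follows. This is only an illustration of the plugin pattern, not the actual gCube implementation: the registry, the plugin ids (`sha256`, `line_count`) and the function names are all hypothetical, and the transferred data is modelled as an in-memory byte string for brevity.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Hypothetical plugin signature: receives the transferred bytes plus
# free-form parameters and returns a short textual report.
PostTransferPlugin = Callable[[bytes, dict], str]

# Hypothetical registry mapping a plugin id to its implementation.
_PLUGINS: Dict[str, PostTransferPlugin] = {}


def register_plugin(plugin_id: str):
    """Decorator that adds a function to the registry under plugin_id."""
    def decorator(func: PostTransferPlugin) -> PostTransferPlugin:
        _PLUGINS[plugin_id] = func
        return func
    return decorator


@register_plugin("sha256")
def sha256_plugin(data: bytes, params: dict) -> str:
    # Example post-operation: compute a checksum of the received data.
    return hashlib.sha256(data).hexdigest()


@register_plugin("line_count")
def line_count_plugin(data: bytes, params: dict) -> str:
    # Example post-operation: count the lines in the received data.
    return str(len(data.splitlines()))


def run_post_transfer(
    data: bytes, invocations: List[Tuple[str, dict]]
) -> List[Tuple[str, str]]:
    """Run the requested plugins in order and collect their reports."""
    results = []
    for plugin_id, params in invocations:
        plugin = _PLUGINS.get(plugin_id)
        if plugin is None:
            raise KeyError(f"no plugin registered for {plugin_id!r}")
        results.append((plugin_id, plugin(data, params)))
    return results
```

New post-operations are added by registering another function; the dispatch code in `run_post_transfer` does not change, which is the property the plugin-oriented design is meant to provide.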

Design

Philosophy

In a Hybrid Data e-Infrastructure, transferring data between nodes is a crucial feature. Given the heterogeneous nature of datasets, applications, and network capabilities and restrictions, efficient data transfers can be problematic. Negotiating the most efficient available transfer method is essential to achieving global efficiency, while an extensible set of post-transfer operations lets applications serve specific needs alongside common ones. We wanted to provide a lightweight solution that could fit the evolving nature of the infrastructure and the needs of its communities.

Architecture

The architecture of Data Transfer is a simple client-server model. The client asks the server for its capabilities in order to decide the most efficient transfer method between the two sides, submits a transfer request through the selected transfer channel, and monitors its status and outcome. The server side is designed around the plugin design pattern, so that it can be easily extended without introducing hard dependencies between the architecture's components.
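
The client-side flow described above (query capabilities, pick a channel, submit, monitor) might look like the following minimal sketch. Everything here is assumed for illustration: the `ToyServer` class, the channel names (`direct`, `http`, `file_upload`), the status string, and the preference order are hypothetical and do not reflect the real gCube Data Transfer API. The toy server also completes every transfer immediately, whereas a real client would poll the status until the transfer ends.

```python
from typing import Dict, List, Tuple

# Hypothetical channel names, ordered from most to least preferred.
CLIENT_PREFERENCE = ["direct", "http", "file_upload"]


class ToyServer:
    """In-memory stand-in for a transfer server (illustration only)."""

    def __init__(self, channels: List[str]):
        self._channels = list(channels)
        self._tickets: Dict[int, str] = {}

    def capabilities(self) -> List[str]:
        # Step 1: the client asks which transfer channels are supported.
        return list(self._channels)

    def submit(self, source: str, channel: str) -> int:
        # Step 3: accept a request on a supported channel, return a ticket.
        if channel not in self._channels:
            raise ValueError(f"unsupported channel: {channel}")
        ticket = len(self._tickets) + 1
        self._tickets[ticket] = "SUCCESS"  # toy server finishes instantly
        return ticket

    def status(self, ticket: int) -> str:
        # Step 4: the client monitors the transfer through its ticket.
        return self._tickets[ticket]


def choose_channel(preferred: List[str], available: List[str]) -> str:
    """Step 2: pick the first client-preferred channel the server offers."""
    for channel in preferred:
        if channel in available:
            return channel
    raise RuntimeError("no transfer channel in common")


def transfer(server: ToyServer, source: str) -> Tuple[str, str]:
    channel = choose_channel(CLIENT_PREFERENCE, server.capabilities())
    ticket = server.submit(source, channel)
    return channel, server.status(ticket)
```

With a server that offers only `http` and `file_upload`, the client falls back from its first choice to `http`, since that is the best option both sides share.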

The picture below describes the overall architecture.

Figure: Data Transfer Architecture

Deployment

Large Deployment

Small Deployment

Use Cases

Well suited use cases

Less well suited use cases