SDMX Data Source

From Gcube Wiki
Revision as of 12:54, 21 December 2017 by Ciro.formisano (Talk | contribs) (Introduction)


Introduction

The gCube SDMX Data Source Service is a REST web service compliant with versions 2.0 and 2.1 of the SDMX standard. It exports tabular data stored in the D4Science Infrastructure in SDMX format, relying on the Tabular Data Facilities and the Information System to do so.

High level Architecture

The gCube SDMX Data Source Service is a web service deployed on Tomcat and configured as a Smart Gear application, which means that the Smart Gear Security Model is applied to it (i.e. a valid token is needed). The Service relies on a set of gCube services, as shown in the following picture:


SDMX-exporter.png

The Service gets all its references from the Information System. In particular, it gets the following pieces of information:

  • URL of the associated SDMX Registry
  • References of Tabular Resources and Tables
  • References of Time Dimension and Primary Measure columns.

Tabular data are obtained in real time from the Tabular Data Management Service, based on the information obtained from the IS and on the Data Structures obtained from the SDMX Registry. The SDMX Data Source Service then creates an SDMX Document of the requested version and returns the requested data to the client.

Commands/Data flow

An SDMX client, which in this case is a REST client (since the Service supports only SDMX REST requests), asks for some data. The requested data must already have been exported in SDMX format by the Tabular Data Management Service. The request contains a gCube token associated with a certain VRE. If the token is valid, the Service:

1. gets from the Information System the URL of the SDMX Registry associated with that VRE

2. gets from the SDMX Registry the associated Data Structure Definition

3. gets from the Information System the IDs of the Tabular Resource, Table, Time Dimension Column and Primary Measure Column associated with that Data Structure Definition

4. gets the tables from the Tabular Data Management Service

5. creates an SDMX Data Document and sends the response to the client.
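The five steps above can be sketched as one orchestration function. This is only an illustration of the call order, not the Service's actual code: `Backends`, `handle_request` and every callable in it are hypothetical names standing in for Smart Gear, the Information System, the SDMX Registry and the Tabular Data Management Service.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Backends:
    """Hypothetical stand-ins for the gCube services involved in the flow."""
    vre_for_token: Callable[[str], Optional[str]]  # token check (Smart Gear)
    registry_url: Callable[[str], str]             # Information System lookup
    fetch_dsd: Callable[[str, str], dict]          # SDMX Registry
    table_refs: Callable[[str, str], dict]         # Information System lookup
    fetch_table: Callable[[dict], list]            # Tabular Data Management


def handle_request(token: str, dataflow_id: str, b: Backends) -> dict:
    """Run the five steps in order and return a placeholder 'document'."""
    vre = b.vre_for_token(token)
    if vre is None:
        raise PermissionError("invalid gCube token")
    registry = b.registry_url(vre)              # step 1: registry URL for the VRE
    dsd = b.fetch_dsd(registry, dataflow_id)    # step 2: Data Structure Definition
    refs = b.table_refs(vre, dsd["id"])         # step 3: table and column IDs
    rows = b.fetch_table(refs)                  # step 4: tabular data
    # Step 5: a real implementation would serialise an SDMX Data Document here.
    return {"structure": dsd["id"], "observations": rows}
```

Passing the backends in as plain callables keeps the sketch testable with in-memory fakes and makes no claim about the real service interfaces.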


Supported versions, REST URL and examples

Currently the Service supports the following SDMX versions:

  • Structure specific time series version 2.1 (Data type: application/vnd.sdmx.structurespecifictimeseriesdata+xml;version=2.1)
  • Generic time series version 2.1 (Data type: application/vnd.sdmx.generictimeseriesdata+xml;version=2.1)
  • Structure specific time series version 2.0 (Data type: application/vnd.sdmx.structurespecifictimeseriesdata+xml;version=2.0)
  • Generic time series version 2.0 (Data type: application/vnd.sdmx.generictimeseriesdata+xml;version=2.0)
  • Structure specific cross sectional data version 2.0 (Data type: application/vnd.sdmx.structurespecificdata+xml;version=2.0).

The client can ask for a certain version by including one or more data types in the Accept header of the request message. If the Accept header contains no valid data type, the default, Generic time series version 2.1, is used. If more than one valid data type is given, priority follows the order of the list above.
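The selection rule described above can be sketched as follows. This is a minimal illustration, not the Service's implementation: the function names are hypothetical, Accept quality values (`q=`) are deliberately ignored, and the media type used for the structure specific time series 2.0 entry is an assumption made by analogy with the 2.1 entry.

```python
# Supported data types in priority order, following the list above.
SUPPORTED_TYPES = [
    "application/vnd.sdmx.structurespecifictimeseriesdata+xml;version=2.1",
    "application/vnd.sdmx.generictimeseriesdata+xml;version=2.1",
    # Assumption: named by analogy with the 2.1 structure specific entry.
    "application/vnd.sdmx.structurespecifictimeseriesdata+xml;version=2.0",
    "application/vnd.sdmx.generictimeseriesdata+xml;version=2.0",
    "application/vnd.sdmx.structurespecificdata+xml;version=2.0",
]
DEFAULT_TYPE = "application/vnd.sdmx.generictimeseriesdata+xml;version=2.1"


def normalize(media_type: str) -> str:
    """Lowercase and strip whitespace so 'x; version=2.1' equals 'x;version=2.1'."""
    return "".join(media_type.split()).lower()


def choose_data_type(accept_header: str) -> str:
    """Return the highest-priority supported type named in the Accept header."""
    requested = {normalize(part) for part in accept_header.split(",")}
    for candidate in SUPPORTED_TYPES:
        if candidate in requested:
            return candidate
    # No valid data type requested: fall back to the documented default.
    return DEFAULT_TYPE
```

Note that the order in which the client lists the types does not matter in this sketch; only the Service-side priority list decides.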

The REST URL used to get the data is (almost) compliant with the SDMX standard:


<sdmx-service-base-url>/ws/data/<data-flow-agency>,<data-flow-id>,<data-flow-version>/<dimensions-filters>/?<optional-parameters>&gcube-token=<token>

The only non-standard field is the gcube-token parameter, which Smart Gear uses to authenticate the user and to identify the VRE. The other fields comply with the standard. In particular:

  • data-flow-agency,data-flow-id,data-flow-version: only data-flow-id is mandatory, but if it is not enough to identify a data flow unambiguously, an error is returned. If data-flow-agency or data-flow-version is not set, the field is left out together with its comma
  • dimensions-filters: this optional field filters on dimensions (not on attributes). The standard dot-based notation is used and multiple filters are supported; please refer to the standard for more details
  • optional-parameters: the current version supports startPeriod, endPeriod, firstNObservations, lastNObservations, dimensionAtObservation and detail.
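As an illustration of how the fields fit together, the helper below assembles a request URL from the template above. It is a sketch under assumptions: `build_data_url` is a hypothetical name, and dropping the dimensions-filters path segment when no filter is given is a guess, since the page does not show a URL without it.

```python
from urllib.parse import urlencode


def build_data_url(base_url, flow_id, token, agency="", version="", key="", **params):
    """Assemble <base>/ws/data/<flow-ref>/<key>/?<params>&gcube-token=<token>."""
    # Unset agency/version are left out together with their comma.
    flow_ref = ",".join(part for part in (agency, flow_id, version) if part)
    path = f"{base_url.rstrip('/')}/ws/data/{flow_ref}/"
    if key:
        # Assumption: the filter segment is simply omitted when empty.
        path += f"{key}/"
    # gcube-token goes last, after the optional SDMX parameters.
    query = urlencode({**params, "gcube-token": token})
    return f"{path}?{query}"
```

For example, `build_data_url("https://example.org/sdmx", "MY_FLOW", "TOKEN", startPeriod=2005)` yields a URL with only the data flow ID in the flow reference and `startPeriod=2005&gcube-token=TOKEN` as the query string.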


For more information, please refer to specific SDMX documentation.
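To make the dot-based dimension filter more concrete, the sketch below shows how a key in the SDMX REST notation selects series: positions are separated by `.`, alternative values within a position by `+`, and an empty position matches any value. The function names are hypothetical, and the page does not state which of these forms the Service accepts, so treat this as an illustration of the notation only.

```python
def parse_key(key: str) -> list:
    """Split 'A+B..C' into one set per dimension; an empty set means 'any value'."""
    return [set(part.split("+")) if part else set() for part in key.split(".")]


def matches(key: str, dimension_values: list) -> bool:
    """Check whether a series with the given dimension values passes the filter."""
    allowed = parse_key(key)
    if len(allowed) != len(dimension_values):
        raise ValueError("the key needs one position per dimension")
    # An empty set is a wildcard; otherwise the value must be listed.
    return all(not a or v in a for a, v in zip(allowed, dimension_values))
```

With a single dimension, `matches("1", ["1"])` is true and `matches("1", ["2"])` is false; with three dimensions, the key `A+B..C` accepts `A` or `B` in the first position, anything in the second, and only `C` in the third.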

A valid example is the following:


Header:

Accept: application/vnd.sdmx.structurespecifictimeseriesdata+xml;version=2.1


URL:

GET <sdmx-service-base-url>/ws/data/BlueBridge,NEW_DS_DIVISION_dataFlow/1/?startPeriod=2005&endPeriod=2011&gcube-token=<token>


This request asks for data associated with the latest version of the data flow NEW_DS_DIVISION_dataFlow, maintained by the BlueBridge agency. The response should contain only data whose first (and only) dimension, according to the order defined in the SDMX Registry, is 1 and which refer to the period from 2005 to 2011.
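The same request could be prepared from a Python client roughly as follows. This is a sketch only: `make_example_request` is a hypothetical helper, the base URL and token must be supplied by the caller, and the request is built but not sent.

```python
from urllib.request import Request

ACCEPT = "application/vnd.sdmx.structurespecifictimeseriesdata+xml;version=2.1"


def make_example_request(base_url: str, token: str) -> Request:
    """Build (without sending) the GET request from the example above."""
    url = (
        f"{base_url.rstrip('/')}/ws/data/BlueBridge,NEW_DS_DIVISION_dataFlow/1/"
        f"?startPeriod=2005&endPeriod=2011&gcube-token={token}"
    )
    # The Accept header selects the SDMX format of the response.
    return Request(url, headers={"Accept": ACCEPT}, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body is then expected to be the SDMX XML document in the requested format.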