Tabular Data Manager: Template Management



Template

A template is a predefined data structure (characterizing both the data element entities and a set of rules constraining data values) a Tabular Resource should comply with.

The main menu lets the user create, open or delete a template, as well as use it to characterise a Tabular Resource.

Tabular Data Manager, Template tab

New Template

A Template can be defined in three steps:

  1. Definition of metadata;
  2. Definition and Validation of initial structure of the Template;
  3. Definition of Actions to execute.

Definition of metadata

In this step the user has to fill in the following elements:

  • TEMPLATE TYPE the TabularResource type that the template will create (DATASET, CODELIST or GENERIC);
  • NAME the given name of the template;
  • AGENCY the responsible party defining this template;
  • DESCRIPTION a textual description of the template;
  • ON ERROR the behavior of the execution in case of errors selected among the following possible ones:
    • ASK to stop the execution;
    • DISCARD to remove all rows with errors and continue the execution;
    • SAVE to store all rows with errors in a separate file and continue the execution.
  • NUMBER OF COLUMNS the number of columns of the initial structure.
Tabular Data Manager, template type: dataset, codelist, generic
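
The metadata elements above can be pictured as a small record, and the ON ERROR choice as a policy applied to invalid rows. The following is only an illustrative sketch: the class, field and function names are hypothetical and are not part of the Tabular Data Manager, and the checks on NAME and NUMBER OF COLUMNS are assumptions rather than documented behaviour.

```python
from dataclasses import dataclass
from enum import Enum


class TemplateType(Enum):
    DATASET = "DATASET"
    CODELIST = "CODELIST"
    GENERIC = "GENERIC"


class OnError(Enum):
    ASK = "ASK"          # stop the execution
    DISCARD = "DISCARD"  # remove rows with errors and continue
    SAVE = "SAVE"        # keep rows with errors aside and continue


@dataclass(frozen=True)
class TemplateMetadata:
    """Hypothetical container for the metadata step."""
    template_type: TemplateType
    name: str
    agency: str
    description: str
    on_error: OnError
    number_of_columns: int

    def __post_init__(self):
        # Assumed sanity checks, not taken from the documentation.
        if not self.name.strip():
            raise ValueError("NAME must not be empty")
        if self.number_of_columns < 1:
            raise ValueError("NUMBER OF COLUMNS must be at least 1")


def handle_errors(rows, is_valid, policy):
    """Apply an ON ERROR policy; return (kept_rows, saved_error_rows).

    ASK raises RuntimeError on the first invalid row, DISCARD drops
    invalid rows, SAVE collects them in the second returned list.
    """
    kept, saved = [], []
    for row in rows:
        if is_valid(row):
            kept.append(row)
        elif policy is OnError.ASK:
            raise RuntimeError(f"execution stopped on invalid row: {row!r}")
        elif policy is OnError.SAVE:
            saved.append(row)
        # OnError.DISCARD: the invalid row is simply not kept
    return kept, saved
```

In this sketch SAVE returns the rows with errors as a second list instead of writing them to a separate file, which keeps the example self-contained.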

Definition and Validation of initial structure of the Template

In this step the user specifies the typology and associated data type of each column. The user also specifies the validation rules and data flows characterising the behaviour of every TabularResource complying with the template.

For each type of table the template is intended for, the user is provided with detailed guidelines supporting the selection of proper typologies and related data types.

The following picture shows the typologies supported for each column of a Codelist template (Code Name, Code Description, Annotation, Code):

Tabular Data Manager, Structure of a codelist template

The following picture shows the typologies supported for each column of a Dataset template (Attribute, Dimension, Measure, Time Dimension):

Tabular Data Manager, Structure of a dataset template

The following picture shows the typologies supported for each column of a Generic template (Measure, Attribute, Time Dimension):

Tabular Data Manager, Structure of a generic template
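
The typologies listed for each template type can be summarised as a lookup table. The sketch below is illustrative only: the dictionary and the function name are hypothetical, and the typology names are copied from the lists above.

```python
# Typologies allowed per template type, as listed in this section.
ALLOWED_TYPOLOGIES = {
    "CODELIST": {"Code", "Code Name", "Code Description", "Annotation"},
    "DATASET": {"Attribute", "Dimension", "Measure", "Time Dimension"},
    "GENERIC": {"Attribute", "Measure", "Time Dimension"},
}


def invalid_columns(template_type, column_typologies):
    """Return (index, typology) pairs not allowed for the template type."""
    allowed = ALLOWED_TYPOLOGIES[template_type]
    return [
        (index, typology)
        for index, typology in enumerate(column_typologies)
        if typology not in allowed
    ]
```

For instance, a Dimension column is accepted in a DATASET structure but reported as invalid in a GENERIC one.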

Rules

During definition and validation of a template, users can add one or more data validation expressions (Rules). Expressions can be defined on columns whose data typology and data type have already been specified. The rules allowed for a column depend on its characterization (data typology and data type).

Template, Rule definition on Measure Column
Template, example of a template with two rules
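
As a rough illustration of how rules attached to a column might be evaluated, the sketch below models a rule as a named predicate on one column. The Rule class, the failing_rows function and the column name used in the usage note are hypothetical; the actual expression language of the Tabular Data Manager is not reproduced here.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    """Hypothetical validation rule: a predicate on one column."""
    name: str
    column: str
    predicate: Callable[[Any], bool]


def failing_rows(rows, rules):
    """Map each rule name to the indices of the rows that violate it."""
    return {
        rule.name: [
            index
            for index, row in enumerate(rows)
            if not rule.predicate(row[rule.column])
        ]
        for rule in rules
    }
```

A template with two rules on a Measure column could then be checked as follows, assuming a column called "quantity":

```python
rules = [
    Rule("non_negative", "quantity", lambda v: v >= 0),
    Rule("below_limit", "quantity", lambda v: v < 1000),
]
rows = [{"quantity": 10}, {"quantity": -1}, {"quantity": 5000}]
report = failing_rows(rows, rules)
```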

Flow

Flows are a special type of TabularResource that can be created only from a Template. Adding a flow during Template definition means that every entry resulting from the application of this template will be copied into the selected flow. Once created, the flow cannot be modified. It can only be cloned or analysed using the Analyse Tab.

The user has to fill in all the needed metadata (as for the creation of a new TabularResource) and select the behaviour to adopt in case of duplicate entries.

Template, Flow dialog creation

Definition of Actions to execute

Users can add Actions to be applied to the template defined in step 2.

The available operations are:

  • Add Column
  • Remove Column
  • Create Time Dimension
  • Aggregate By Time
  • Normalize
Template, Post Operations

Users can view the history of the applied operations by selecting History Operations.

Add Column

This Action adds a new column to the TabularResource. The new column will be initialized with the expression defined by clicking on the Set Value button.

Template, Add Column Operation
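
Conceptually, the Add Column action evaluates the expression once per row to initialise the new column. The following minimal sketch represents rows as dictionaries and the expression as a plain function; add_column and its parameters are hypothetical names, and rejecting an existing column name is an assumption.

```python
def add_column(rows, name, expression):
    """Return new rows with column `name` set to expression(row).

    The input rows are left unchanged.
    """
    if any(name in row for row in rows):
        raise ValueError(f"column {name!r} already exists")
    return [{**row, name: expression(row)} for row in rows]
```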

Create Time Dimension

This Action creates a new column of type TimeDimension from other columns. Depending on the kind of TimeDimension column to be created, the user has to select the columns for one of the following combinations: YEAR; YEAR and MONTH; YEAR, MONTH and DAY; YEAR and QUARTER.

Template, Create Time Dimension Operation
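
The four supported combinations can be sketched as a function that builds one period label per row. This is only an illustration: the function name and the label formats (for example "2014-Q2") are assumptions, not the representation used by the Tabular Data Manager.

```python
from datetime import date


def time_dimension(year, month=None, day=None, quarter=None):
    """Build a period label from YEAR plus optional MONTH, DAY or QUARTER."""
    if quarter is not None:
        # YEAR and QUARTER
        if month is not None or day is not None:
            raise ValueError("QUARTER cannot be combined with MONTH or DAY")
        if quarter not in (1, 2, 3, 4):
            raise ValueError("QUARTER must be between 1 and 4")
        return f"{year:04d}-Q{quarter}"
    if day is not None:
        # YEAR, MONTH and DAY; date() rejects impossible dates
        if month is None:
            raise ValueError("DAY requires MONTH")
        return date(year, month, day).isoformat()
    if month is not None:
        # YEAR and MONTH
        if not 1 <= month <= 12:
            raise ValueError("MONTH must be between 1 and 12")
        return f"{year:04d}-{month:02d}"
    # YEAR only
    return f"{year:04d}"
```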

Aggregate By Time

This Action aggregates a list of columns by time (see the Aggregate By Time operation).

Template, Aggregate By Time Operation

Normalize

This Action applies the normalization operation (see the Normalize operation).

Template, Normalize Operation

Apply Template

The Apply Template function applies a template to the current TabularResource. The TabularResource structure MUST be compatible with the initial structure defined in the template; otherwise, a TemplateNotCompatible error is thrown.

Tabular Data Manager, Applying the template to a tabular resource
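
The compatibility check can be pictured as a comparison between the template's initial structure and the structure of the TabularResource. The sketch below assumes that "compatible" means the same number of columns with matching typology and data type in the same order; this definition, the function name and the data type names in the usage note are assumptions. Only the TemplateNotCompatible name comes from the text above.

```python
class TemplateNotCompatible(Exception):
    """Raised when a TabularResource does not match the template structure."""


def check_compatible(template_columns, resource_columns):
    """Raise TemplateNotCompatible unless both structures match.

    Each column is described as a (typology, data_type) pair.
    """
    if len(template_columns) != len(resource_columns):
        raise TemplateNotCompatible(
            f"expected {len(template_columns)} columns, "
            f"found {len(resource_columns)}"
        )
    pairs = zip(template_columns, resource_columns)
    for index, (expected, actual) in enumerate(pairs):
        if expected != actual:
            raise TemplateNotCompatible(
                f"column {index}: expected {expected}, found {actual}"
            )
```

A structure of [("Code", "Text"), ("Code Name", "Text")] passes the check against itself, while swapping the second column for ("Measure", "Integer") raises TemplateNotCompatible.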