Tabular Data Manager: Curation

From Gcube Wiki
Revision as of 16:31, 16 December 2015 by Giancarlo.panichi (Talk | contribs) (Codelist)


Curation

Tabular Data Manager, Curation tab

Validation Menu

Show Validations

Show the validations on the current tabular resource using the Show button in the Validation Menu. Users can manage validations using the context menu.
Tabular Data Manager, validations

Delete Validations

Delete the validations on the current tabular resource using the Delete button in the Validation Menu.

Duplicate Detection

Users can detect duplicates in a tabular resource using the Duplicate Detection button in the Validation Menu.
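The idea behind duplicate detection can be sketched outside the tool as follows. This is a minimal illustration, not the Tabular Data Manager implementation: the function name `find_duplicates`, the sample rows, and the assumption that duplicates are rows sharing the same values in a chosen set of columns are all hypothetical.

```python
from collections import defaultdict

def find_duplicates(rows, key_columns):
    """Group row indices by their values in key_columns.

    Returns only the groups that contain more than one row.
    (Hypothetical helper; the tool's own criterion is not documented here.)
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(row[column] for column in key_columns)
        groups[key].append(index)
    return {key: indices for key, indices in groups.items() if len(indices) > 1}

# Invented sample data: rows 0 and 2 carry the same code and name.
rows = [
    {"code": "A1", "name": "Alpha"},
    {"code": "B2", "name": "Beta"},
    {"code": "A1", "name": "Alpha"},
]
duplicates = find_duplicates(rows, ["code", "name"])
```

Here `duplicates` maps each repeated combination of values to the positions of the rows that share it.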

Structure Menu

Table type

This function lets users define the table type of a resource.
Tabular Data Manager, table type


If the structure of the table does not support the selected table type, the operation is still saved and the Validation tab in the left panel reports the errors.
Tabular Data Manager, Validation panel and invalidated operations

Codelist

The following example shows how to create a Codelist, starting from a tabular resource of generic type:
Tabular Data Manager, Codelist creation
Set code column
Tabular Data Manager, Code column
Set code name column
Tabular Data Manager, Code Name column
Change table type to Codelist
Tabular Data Manager, Table Type to Codelist
Set the tabular resource as final and save the properties
Tabular Data Manager, Set Tabular Resource Final

Position Column

Users can change the position of columns.
Tabular Data Manager, Position Column

Labels

Users can change all the column labels of the tabular resource at once using the Labels button.
Tabular Data Manager, Labels

Column type

This is a very important function: it defines the type of each column of the table, which is required to manipulate its data properly. For each column, users can define the column type and its attributes.
Tabular Data Manager, Column type
Tabular Data Manager, Attribute type


Add column

By specifying a label and a column type, users can add one column at a time to their tabular resources.
Tabular Data Manager, Add column

Delete column

The left side panel lets users select the columns to delete, so that they are all removed at once.
Tabular Data Manager, Delete column


Split Column

A column of a tabular resource can be split according to different criteria: 'char_sequence', 'index', and 'regex'.
Below is an example of the CHAR SEQUENCE method in the Column Split function:
Tabular Data Manager, split column and char sequence method
The original table is transformed into:
Tabular Data Manager, split column by char sequence
Below is an example of the INDEX method in the Column Split function:
Tabular Data Manager, split column and index method
The original table is transformed into:
Tabular Data Manager, split column by index
Below is an example of the REGEX method in the Column Split function:
Tabular Data Manager, split column and regex method
The original table is transformed into:
Tabular Data Manager, split column by regex
N.B. Value is a POSIX Regular Expression
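The three split criteria can be approximated in plain Python as shown below. This is a rough sketch under stated assumptions, not the tool's code: the function names and sample values are invented, the separator or matched text is assumed to be dropped from the output, and Python's `re` module is used for illustration even though its syntax differs in places from POSIX regular expressions.

```python
import re

def split_by_char_sequence(value, separator):
    """Split at the first occurrence of a literal separator (separator is dropped)."""
    left, _, right = value.partition(separator)
    return left, right

def split_by_index(value, index):
    """Split at a character position: value[:index] and value[index:]."""
    return value[:index], value[index:]

def split_by_regex(value, pattern):
    """Split at the first match of a regular expression (matched text is dropped)."""
    match = re.search(pattern, value)
    if match is None:
        return value, ""  # no match: keep everything in the first column
    return value[:match.start()], value[match.end():]

# Invented sample values, one per method.
char_result = split_by_char_sequence("Rome - Italy", " - ")
index_result = split_by_index("IT001", 2)
regex_result = split_by_regex("Rome,  Italy", r",\s*")
```

Each call returns the pair of values that would populate the two columns resulting from the split.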

Merge column

Users can merge two columns to create a new one. The original columns can be either deleted or kept. A label for the new column must be specified.
Tabular Data Manager, Merge column
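As a rough illustration of the merge operation, the sketch below joins two columns into a new one and optionally drops the originals. The function `merge_columns`, the `separator` parameter, and the sample data are assumptions made for this example; the page does not say how the tool joins the two values.

```python
def merge_columns(rows, first, second, new_label, separator=" ", delete_originals=False):
    """Return new rows with a column new_label built from columns first and second.

    The separator is an assumption of this sketch, not a documented option.
    """
    merged = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left untouched
        new_row[new_label] = f"{row[first]}{separator}{row[second]}"
        if delete_originals:
            del new_row[first]
            del new_row[second]
        merged.append(new_row)
    return merged

# Invented sample data.
rows = [{"genus": "Thunnus", "species": "albacares"}]
kept = merge_columns(rows, "genus", "species", "scientific_name")
dropped = merge_columns(rows, "genus", "species", "scientific_name", delete_originals=True)
```

`kept` retains the two source columns next to the new one, while `dropped` contains only the merged column.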


Denormalize

A tabular resource can be denormalized using the Denormalize button in the Structure menu of the Curation tab:
Tabular Data Manager, denormalize original table
The original table can be transformed by selecting the Value column and the Attribute column.
Setting 'Quantity' as the value column and 'Year' as the attribute column produces the following table:
Tabular Data Manager, denormalized table
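The transformation can be pictured as a pivot: each distinct value of the attribute column becomes a new column holding the corresponding value. The sketch below is an assumption-laden illustration, not the tool's implementation; the function name, the 'Country' column, and the figures are invented, while 'Quantity' and 'Year' follow the example above.

```python
def denormalize(rows, value_column, attribute_column):
    """Pivot rows so that each attribute value becomes its own column.

    Rows that agree on all the remaining columns are collapsed into one row.
    """
    wide = {}
    for row in rows:
        # The remaining columns identify the output row.
        key = tuple(
            (column, value)
            for column, value in row.items()
            if column not in (value_column, attribute_column)
        )
        target = wide.setdefault(key, dict(key))
        target[str(row[attribute_column])] = row[value_column]
    return list(wide.values())

# Invented sample data in the normalized (long) layout.
long_rows = [
    {"Country": "Italy", "Year": "1998", "Quantity": 10},
    {"Country": "Italy", "Year": "1999", "Quantity": 12},
    {"Country": "Spain", "Year": "1998", "Quantity": 7},
]
wide_rows = denormalize(long_rows, value_column="Quantity", attribute_column="Year")
```

In this sketch a combination with no source row (Spain in 1999) is simply absent from the output; how the tool fills such cells is not covered here.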

Normalize

To normalize the data in your tabular resource, proceed as follows:
Tabular Data Manager, normalize
Based on the structure of the table, name a Normalized column and a Value column. The system creates these two new columns at the end of the normalization.
In the pop-up settings window, the Normalized column is the column that will contain the labels of the normalized columns, whereas the Value column will contain their values (e.g. Normalized column: 'Year', Value column: 'Quantity', and columns to normalize: '1998', '1999' and '2000').
The original table is transformed into:
Tabular Data Manager, normalize result
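Normalization is the inverse of the denormalize step: the selected columns are folded into a pair of columns, one holding the former column label and one holding the value. The following sketch illustrates this with the 'Year' and 'Quantity' names used above; the function name, the 'Country' column, and the figures are hypothetical.

```python
def normalize(rows, columns_to_normalize, normalized_label, value_label):
    """Unpivot the given columns into (normalized_label, value_label) pairs."""
    long_rows = []
    for row in rows:
        # Columns that are not being normalized are repeated on every output row.
        fixed = {c: v for c, v in row.items() if c not in columns_to_normalize}
        for column in columns_to_normalize:
            long_rows.append({**fixed, normalized_label: column, value_label: row[column]})
    return long_rows

# Invented sample data in the wide layout.
wide_rows = [{"Country": "Italy", "1998": 10, "1999": 12, "2000": 15}]
long_rows = normalize(wide_rows, ["1998", "1999", "2000"], "Year", "Quantity")
```

One input row with three year columns yields three output rows, each with a 'Year' and a 'Quantity' column appended after the untouched columns.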

Helper Menu

Extract Codelist

A codelist can be extracted from a tabular resource. Consider the table below:
Tabular Data Manager, extract codelist
Click the Extract codelist button in the Helper menu of the Curation tab:
Tabular Data Manager, extract codelist button
Select the columns, for example 'code' and 'name':
Tabular Data Manager, extract codelist source column
Define the target column, for example a new column:
Tabular Data Manager, extract codelist target column
Define the label and the type of your new column:
Tabular Data Manager, extract codelist target new column
The target column should appear as follows:
Tabular Data Manager, extract codelist target fill column
Choose a name for the new codelist:
Tabular Data Manager, extract codelist detail
The new codelist is now extracted:
Tabular Data Manager, extract codelist result
To connect the tabular resource directly to the extracted codelist, set the attach option in the detail step, as shown here:
Tabular Data Manager, extract codelist attach
The original tabular resource is attached to the new codelist:
Tabular Data Manager, extract codelist result by attach
Note: in this case the extracted codelist is set to final automatically.
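Conceptually, extracting a codelist amounts to collecting the distinct combinations of the selected source columns. The sketch below shows that idea only; it assumes that repeated entries are collapsed, and the function name and sample rows are hypothetical. The 'code' and 'name' columns follow the example above.

```python
def extract_codelist(rows, source_columns):
    """Collect the distinct combinations of source_columns, in first-seen order."""
    seen = set()
    codelist = []
    for row in rows:
        entry = tuple(row[column] for column in source_columns)
        if entry not in seen:
            seen.add(entry)
            codelist.append(dict(zip(source_columns, entry)))
    return codelist

# Invented sample data: the same code/name pair appears on several rows.
rows = [
    {"code": "ITA", "name": "Italy", "quantity": 10},
    {"code": "ESP", "name": "Spain", "quantity": 7},
    {"code": "ITA", "name": "Italy", "quantity": 12},
]
codelist = extract_codelist(rows, ["code", "name"])
```

The result keeps one entry per code/name pair and leaves out the columns that were not selected.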


Map Import