Difference between revisions of "GCube Data Catalogue for GRSF"

From Gcube Wiki

Revision as of 09:23, 18 October 2016

** THIS PAGE IS UNDER CONSTRUCTION **

GCube Data Catalogue: support for GRSF

This page reports the relevant information about the GRSF Data Catalogue, which is available here. It extends the main gCube Data Catalogue guide, which you are advised to read before continuing.

The GRSF Data Catalogue stores, and allows the publication of, products of two types: Stock and Fishery. In addition to the default set of metadata, each product type has its own specific fields. The values of some of these fields automatically become tags of the product. The same applies to group associations: a set of groups is already available, and each product is automatically associated with the matching groups during publication. Fields that trigger tag creation or group association are documented below.

The publication phase is performed by means of a RESTful service whose publish methods accept JSON objects.

Metadata

Common Metadata

The following table shows the core metadata, that is, the fields shared by both the Stock and Fishery types. Some of them are filled in automatically. The values given to some fields are automatically used to tag the product; check the 'Is Tag' column of the table below. Other fields have a controlled vocabulary (that is, they can only take values from a defined set), and the value assigned to such a field automatically determines the group to which the product is assigned; check the 'Is Group' column below.

Name | Api Name (JSON) | Is Tag | Is Group | Example | Guidelines/Comments
Title * | stock_name or fishery_name | No | No | |
Description | description | No | No | This product contains attributes of ... | A brief description of the dataset written in plain language. It should provide a sufficiently comprehensive overview of the resource for anyone to understand its content, origins and any continuing work on it.
License * | license_id | No | No | CC-BY-SA-4.0 | The list of license ids can be retrieved by using the service (see below). By default CC-BY-SA-4.0 will be used.
Author | author | No | No | Bloggs, Joe | This field is filled in automatically using the information of the caller entity.
Author contact | author_contact | No | No | joe.blogg@example.com | This field is filled in automatically using the information of the caller entity.
Maintainer | maintainer | No | No | A person: Bloggs, Joe. An authority: D4Science | Maintainer of the dataset.
Maintainer Contact | maintainer_contact | No | No | joe@example.com | Contact details of the resource maintainer.
Version | version | No | No | 1.0 | Increase manually after editing.

mandatory fields are marked with an asterisk (*)

Besides the above common metadata, there is the following set of attributes that are captured for both Stock and Fishery objects.

Name | Api Name (JSON) | Is Tag | Is Group | Example | Guidelines/Comments
Catches or landings | catches_or_landings | No | No | Catch - 18962 - ton - 2014 | A combination of value, unit and date.
Database Sources * | database_sources | Yes | Yes | [{"name":"FIRMS", "description": "unknown", "url":"http://....."}, ...] | A list of elements of the type {"name": "a name", "description": "a description", "url": "http://...."}. Name and url are mandatory. For the attribute name there is a controlled vocabulary: FIRMS, RAM, FishSource.
Source of Information * | source_of_information | No | No | [{"name":"...", "description": "...", "url":"http://....."},...] | A list of elements of the type {"name": "a name", "description": "a description", "url": "http://...."}. Name and url are mandatory.
Data owner | data_owner | No | No | IATTC |

mandatory fields are marked with an asterisk (*)
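
The rules stated above for database_sources entries (name and url are mandatory, and name comes from the controlled vocabulary FIRMS, RAM, FishSource) can be checked on the client side before calling the service. The following is only a minimal sketch: the class and method names are hypothetical and not part of the GRSF service, and exact, case-sensitive matching of the name is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical client-side helper; it is not part of the GRSF publisher service.
final class DatabaseSourceCheck {

    // Controlled vocabulary for the "name" attribute, as listed in the table above.
    static final Set<String> ALLOWED_NAMES =
            new HashSet<>(Arrays.asList("FIRMS", "RAM", "FishSource"));

    // Returns null when the entry looks valid, otherwise a short reason.
    static String validate(Map<String, String> source) {
        String name = source.get("name");
        String url = source.get("url");
        if (name == null || name.isEmpty()) {
            return "name is mandatory";
        }
        if (url == null || url.isEmpty()) {
            return "url is mandatory";
        }
        // Assumption of this sketch: names are compared exactly (case-sensitive).
        if (!ALLOWED_NAMES.contains(name)) {
            return "name is not in the controlled vocabulary";
        }
        return null;
    }
}
```

The description attribute is not checked because the table marks only name and url as mandatory.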

Stock Metadata

The Stock product type also supports the following list of fields.

Name | API Name | Is Tag | Is Group | Example | Guidelines/Comments
Stock Name * | stock_name | No | No | Skipjack tuna - Eastern Pacific | The title of the product. It is expected to be a unique name.
Stock ID | stock_id | No | No | SKJ - EPO |
Type | type | Yes | Yes | Assessment Unit | Controlled vocabulary: Assessment Unit, Resource
Species Scientific Name * | species_scientific_name | Yes | No | Katsuwonus pelamis (or SKJ) |
Assessment distribution area * | assessment_distribution_area | No | No | East Pacific Ocean |
Exploiting Fishery | exploiting_fishery | No | No | Tunas and billfishes fishery |
Management entity | management_entity | No | No | DFO |
Assessment methods | assessment_methods | No | No | Analytical assessment |
State of marine resources | state_of_marine_resource | No | No | |
Exploitation Rate | exploitation_rate | Yes | Yes | Moderate fishing mortality | Controlled vocabulary: Moderate fishing mortality, High fishing mortality, No or low fishing mortality
Abundance level | abundance_level | Yes | Yes | Intermediate abundance | Controlled vocabulary: Intermediate abundance, Low abundance, Uncertain/Not assessed
Narrative state and trend | narrative_state_and_trend | No | No | Stock size and fishing pressure are considered to be close to their value at MSY. | A textual description
Scientific advice | scientific_advice | No | No | The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained. | A textual description
Reporting entity | reporting_entity | No | No | GRP3 |
Reporting year | reporting_year | No | No | 2005 |
Status * | status | Yes | Yes | Pending | Controlled vocabulary: Pending, Confirmed.

mandatory fields are marked with an asterisk (*)
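
The controlled vocabularies of the Stock fields listed above can be checked before publication. The sketch below is hypothetical client-side code, not part of the service. It compares values case-insensitively because the stock example on this page sends "pending" in lower case; whether the service itself ignores case for every field is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper holding the Stock controlled vocabularies from the table above.
final class StockVocabulary {

    static final Map<String, List<String>> ALLOWED = new LinkedHashMap<>();

    static {
        ALLOWED.put("type", Arrays.asList("Assessment Unit", "Resource"));
        ALLOWED.put("exploitation_rate", Arrays.asList(
                "Moderate fishing mortality", "High fishing mortality", "No or low fishing mortality"));
        ALLOWED.put("abundance_level", Arrays.asList(
                "Intermediate abundance", "Low abundance", "Uncertain/Not assessed"));
        ALLOWED.put("status", Arrays.asList("Pending", "Confirmed"));
    }

    // True when the field has no controlled vocabulary, or the value matches one of its terms.
    static boolean isAllowed(String field, String value) {
        List<String> terms = ALLOWED.get(field);
        if (terms == null) {
            return true;
        }
        for (String term : terms) {
            // Assumption of this sketch: case is ignored when comparing.
            if (term.equalsIgnoreCase(value)) {
                return true;
            }
        }
        return false;
    }
}
```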

Fishery Metadata

The Fishery product type also supports the following list of fields.

Name | API Name | Is Tag | Is Group | Example | Guidelines/Comments
Fishery Name * | fishery_name | No | No | NAFO Flemish Cap groundfish fisheries | This will be the title of the product and a unique name will be generated starting from this.
Fishery ID | fishery_id | No | No | COD - 21.3.M - NAFO - OTB - CAN - Industrial |
Type | type | Yes | Yes | Fishery Activity | Controlled vocabulary: Fishery Activity, Fishing Description
Scientific Name | scientific_name | Yes | No | Caribbean spiny lobster |
Fishing area | fishing_area | No | No | North Atlantic | If missing then Jurisdiction Area cannot be null
Exploited stocks | exploited_stocks | No | No | Capelin - Southern Grand Bank |
Management entity | management_entity | Yes | No | European Union |
Jurisdiction Area | jurisdiction_area | No | No | Senegal | If missing then Fishing Area cannot be null
Production system type | production_system_type | Yes | Yes | Industrial | Controlled vocabulary: Subsistence, Recreational, Commercial, Artisanal, Semi-industrial, Industrial, Exploratory_fishery, Unspecified
Flag state | flag_state | Yes | No | ESP |
Fishing gear | fishing_gear | Yes | No | PUN |
Environment | environment | No | No | |
Status * | status | Yes | Yes | Pending | Controlled vocabulary: Pending, Confirmed.

mandatory fields are marked with an asterisk (*)
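
The constraints above (fields marked with an asterisk, plus the rule that Fishing area and Jurisdiction Area cannot both be missing) can be verified before sending a Fishery record. This is a sketch under assumptions: the class name is hypothetical, the record is represented as a plain map keyed by API name, and license_id is left out of the mandatory list because a default license is applied when it is not specified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical pre-publication check for a Fishery record held in a map keyed by API name.
final class FisheryRecordCheck {

    // Fields marked with an asterisk in the tables above (license_id has a default, so it is skipped).
    static final String[] MANDATORY =
            {"fishery_name", "status", "database_sources", "source_of_information"};

    // Returns the list of problems found; an empty list means the checks passed.
    static List<String> problems(Map<String, Object> fishery) {
        List<String> problems = new ArrayList<>();
        for (String field : MANDATORY) {
            if (fishery.get(field) == null) {
                problems.add("missing " + field);
            }
        }
        // Either area may be missing, but not both.
        if (fishery.get("fishing_area") == null && fishery.get("jurisdiction_area") == null) {
            problems.add("fishing_area and jurisdiction_area cannot both be missing");
        }
        return problems;
    }
}
```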

GRSF Publication Web Service

Products are published by means of a RESTful web service. Almost every call to the service requires the security token of the user for the context in which he or she wishes to publish or use the other functionalities. Please note that publishing a product requires the user to have sufficient privileges. The list of roles and associated privileges for the catalogue users is reported here. The VRE Manager assigns them.

In order to retrieve your security token you can use the token generator portlet.

The address of the service in the GRSF context can be discovered by means of the Information System [1]. You need the following parameters:

Service Name = GRSFPublisher
Service Class = Data-Catalogue
Entry Name = jersey-servlet

For testing purposes, a running instance can be contacted at the following address:

https://next.d4science.org/grsf-publisher-ws/rest/  [GRSF_PUBLISHER_WS_BASE_URL]

The token for testing purposes can be retrieved from the VRE at this URL: https://next.d4science.org/group/nextnext/home (register yourself if needed).

Check Service Availability

To check that the stock/fishery service is up and running, open the URL below in your browser

[GRSF_PUBLISHER_WS_BASE_URL]/fishery/hello

and the response should look like

Hello.. Fishery service is here

or

[GRSF_PUBLISHER_WS_BASE_URL]/stock/hello

and the response should look like

Hello.. Stock service is here
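
The same availability check can be done from code instead of the browser. The sketch below uses the standard HttpURLConnection class and the test instance address given above; the class name, method names and the 5 second timeouts are choices made for this example only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that calls the "hello" method of the stock or fishery service.
final class HelloCheck {

    // Test instance address from this page; replace it with the address discovered for your context.
    static final String BASE_URL = "https://next.d4science.org/grsf-publisher-ws/rest/";

    // productType is "stock" or "fishery".
    static String helloUrl(String productType) {
        return BASE_URL + productType + "/hello";
    }

    // Performs a GET request and returns the first line of the response body.
    static String hello(String productType) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(helloUrl(productType)).toURL().openConnection();
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } finally {
            connection.disconnect();
        }
    }
}
```

Calling HelloCheck.hello("stock") or HelloCheck.hello("fishery") should return the greeting shown above when the service is reachable.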

Retrieve the licenses list

The default license associated with a product, if none is specified, is CC-BY-SA-4.0. However, if it doesn't fit your needs, you can use one of the other available licenses, which can be retrieved by contacting the service(s) this way

[GRSF_PUBLISHER_WS_BASE_URL]/fishery/get-licenses?gcube-token=YOUR_TOKEN

or, for stock

[GRSF_PUBLISHER_WS_BASE_URL]/stock/get-licenses?gcube-token=YOUR_TOKEN

The response is a JSON object containing <license key, license name> pairs, which looks like

{
    "AFL-3.0": "Academic Free License 3.0",
    "RPSL-1.0": "RealNetworks Public Source License 1.0",
    "ODC-BY-1.0": "Open Data Commons Attribution License 1.0",
    "IPL-1.0": "IBM Public License 1.0",
    "ODbL-1.0": "Open Data Commons Open Database License 1.0",
    "PostgreSQL": "PostgreSQL License",
    "W3C": "W3C License",
    ....
}

During the publication phase, the identifier of the license chosen should be provided.
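
To pick a license identifier programmatically, the response of get-licenses can be turned into a map. The following sketch uses a regular expression only to stay dependency-free; it assumes the response is a flat object of string pairs without escaped characters, as in the excerpt above. The class name is hypothetical, and a real JSON library is the safer choice in production code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the flat <license key, license name> object returned by get-licenses.
final class LicenseList {

    // Matches "key": "value" pairs whose strings contain no quotes or backslashes.
    private static final Pattern PAIR =
            Pattern.compile("\"([^\"\\\\]*)\"\\s*:\\s*\"([^\"\\\\]*)\"");

    // Returns the licenses in the order in which they appear in the response.
    static Map<String, String> parse(String json) {
        Map<String, String> licenses = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(json);
        while (matcher.find()) {
            licenses.put(matcher.group(1), matcher.group(2));
        }
        return licenses;
    }
}
```

The keys of the returned map are the values accepted by the license_id field.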

Stock Publication Example

The method to invoke to publish a stock is the following

[GRSF_PUBLISHER_WS_BASE_URL]/stock/publish-product?gcube-token=YOUR_TOKEN

The JSON object you must provide in input has the following structure (of course, not all fields are mandatory)

{
   "description": ...,
   "license_id": ...,
   "version": ...,
   "maintainer": ...,
   "maintainer_contact": ...,
   "catches_or_landings": ...,
   "database_sources": [{...}, ...],
   "source_of_information": [{...}, ...],
   "data_owner": ...,
   "type": ...,
   "stock_name": ...,
   "stock_id": ...,
   "species_scientific_name": ...,
   "assessment_distribution_area": ...,
   "exploiting_fishery": ...,
   "management_entity": ...,
   "assessment_methods": ...,
   "state_of_marine_resource": ...,
   "exploitation_rate": ...,
   "abundance_level": ...,
   "narrative_state_and_trend": ...,
   "scientific_advice": ...,
   "reporting_entity": ...,
   "reporting_year": ...,
   "status": ...
}

The response of the method is a JSON object of this kind

{
   "id": ... , // identifier of the created product
   "dataset_url": ..., // url of the created product
   "error": ... // in case of error, check this field
}

In case of success the HTTP code is 201 (CREATED) and the response contains the url and the unique identifier assigned to the product. In case of errors (BAD_REQUEST, INTERNAL_SERVER_ERROR, FORBIDDEN ... ) the "error" message of the above object reports what was wrong.
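
A caller typically needs to read the three fields of this response and test the status code. The sketch below shows one dependency-free way to do that; the class and method names are hypothetical, and the regular expression assumes string values without escaped characters, so a JSON library is preferable for real use.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reader for the {"id", "dataset_url", "error"} object returned by publish-product.
final class PublishResponse {

    // Returns the string value of the first field with the given name,
    // or null when the field is absent or its value is JSON null.
    static String field(String json, String name) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*(?:null|\"([^\"\\\\]*)\")");
        Matcher matcher = pattern.matcher(json);
        return matcher.find() ? matcher.group(1) : null;
    }

    // The text above states that a successful publication answers with 201 (CREATED).
    static boolean isCreated(int httpStatus) {
        return httpStatus == 201;
    }
}
```

When isCreated returns false, field(body, "error") is the value to log or show to the user.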

Example

A valid JSON, for example, is the following one

{
   "description":"This stock product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information",
   "license_id":"CC-BY-SA-4.0",
   "version":1,
   "maintainer":"Costantino Perciante",
   "maintainer_contact":"costantino.perciante@isti.cnr.it",
   "catches_or_landings":"unknown",
   "database_sources":[
      {
         "name":"FIRMS",
         "url":"test url"
      }
   ],
   "source_of_information":[
      {
         "name":"source of information",
         "url":"http://www.iattc.org/PDFFiles2/FisheryStatusReports/FisheryStatusReport13.pdf"
      }
   ],
   "data_owner":"IATTC",
   "type":"Assessment Unit",
   "stock_name":"Skipjack tuna - Eastern Pacific",
   "stock_id":"SKJ - EPO",
   "species_scientific_name":"SKJ",
   "assessment_distribution_area":"East Pacific Ocean",
   "exploiting_fishery":"Tunas and billfishes fishery",
   "management_entity":"DFO",
   "assessment_methods":"Analytical assessment",
   "state_of_marine_resource":null,
   "exploitation_rate":"Moderate fishing mortality",
   "abundance_level":"Intermediate abundance",
   "narrative_state_and_trend":"Stock size and fishing pressure are considered to be close to their value at MSY.",
   "scientific_advice":"The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained.",
   "reporting_entity":"GRP3",
   "reporting_year":2016,
   "status":"pending"
}

The response obtained from the service is

{
   "id":"6d44b6b2-af80-4aa4-860a-a17db27b40df",
   "dataset_url":"https://next.d4science.org/group/nextnext/data-catalogue/?path=/dataset/skipjack_tuna_-_eastern_pacific",
   "error":null
}

Fishery Publication Example

The method to invoke to publish a fishery product is the following

[GRSF_PUBLISHER_WS_BASE_URL]/fishery/publish-product?gcube-token=YOUR_TOKEN

The JSON object you must provide in input has the following structure (of course, not all fields are mandatory)

{
   "description": ...,
   "license_id": ...,
   "version": ...,
   "maintainer": ...,
   "maintainer_contact": ...,
   "catches_or_landings": ...,
   "database_sources": [{...}, ...],
   "source_of_information": [{...}, ...],
   "data_owner": ...,
   "type": ...,
   "fishery_name": ...,
   "fishery_id": ...,
   "scientific_name": ...,
   "fishing_area": ...,
   "exploited_stocks": ...,
   "management_entity": ...,
   "jurisdiction_area": ...,
   "production_system_type": ...,
   "flag_state": ...,
   "fishing_gear": ...,
   "status": ...,
   "environment": ...
}

The response of the method is a JSON object of this kind

{
   "id": ... , // identifier of the created product
   "dataset_url": ..., // url of the created product
   "error": ... // in case of error, check this field
}

In case of success the HTTP code is 201 (CREATED) and the response contains the url and the unique identifier assigned to the product. In case of errors (BAD_REQUEST, INTERNAL_SERVER_ERROR, FORBIDDEN ... ) the "error" message of the above object reports what was wrong.

Example

A valid JSON, for example, is the following one

{
   "description":"This fishery product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information",
   "license_id":"CC-BY-SA-4.0",
   "version":1,
   "maintainer":"Costantino Perciante",
   "maintainer_contact":"costantino.perciante@isti.cnr.it",
   "catches_or_landings":"unknown",
   "database_sources":[
      {
         "name":"FishSource",
         "url":"url"
      }
   ],
   "source_of_information":[
      {
         "name":"source of information",
         "url":"test url for source of information"
      }
   ],
   "data_owner":"IATTC",
   "type":"Fishery Activity",
   "fishery_name":"NAFO Flemish Cap groundfish fisheries ",
   "fishery_id":"COD - 21.3.M - NAFO - OTB - CAN - Industrial",
   "scientific_name":"Caribbean spiny lobster",
   "fishing_area":"North Atlantic",
   "exploited_stocks":"Capelin - Southern Grand Bank",
   "management_entity":"European Union",
   "jurisdiction_area":"Senegal",
   "production_system_type":"Industrial",
   "flag_state":"ESP",
   "fishing_gear":"PUN",
   "status":"Pending",
   "environment":null
}

The response obtained from the service is

{
   "id":"7b2989fc-cc4f-4b67-b7ff-42cc044881b1",
   "dataset_url":"https://next.d4science.org/group/nextnext/data-catalogue?path=/dataset/nafo_flemish_cap_groundfish_fisheries",
   "error":null
}

Delete a published product

If for some reason you need to delete a published product, you can invoke the following HTTP DELETE methods. For fishery it is

[GRSF_PUBLISHER_WS_BASE_URL]/fishery/delete-product?gcube-token=YOUR_TOKEN

whereas for stock

[GRSF_PUBLISHER_WS_BASE_URL]/stock/delete-product?gcube-token=YOUR_TOKEN

You must provide the identifier of the product (returned at creation time) in a JSON object that looks like

{"id": "identifier-deleted-product"}

In case of success, the response status of the service is 200 (OK).
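
A delete call can be sketched as follows. This example uses the java.net.http client available from Java 11, which allows a DELETE request to carry a body. The class name and method names are hypothetical, the Content-Type header is an assumption taken from the publish call, and the token is assumed to contain only characters that are safe in a URL.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client for the delete-product method.
final class DeleteProduct {

    // Test instance address from this page; replace it with the address discovered for your context.
    static final String BASE_URL = "https://next.d4science.org/grsf-publisher-ws/rest/";

    // Builds the DELETE request; productType is "stock" or "fishery".
    static HttpRequest buildRequest(String productType, String token, String productId) {
        String body = "{\"id\": \"" + productId + "\"}";
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + productType + "/delete-product?gcube-token=" + token))
                // Assumption: the service expects JSON here, as it does for publish-product.
                .header("Content-Type", "application/json")
                .method("DELETE", HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Sends the request and reports whether the service answered 200 (OK).
    static boolean delete(String productType, String token, String productId) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(productType, token, productId), HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200;
    }
}
```

The productId argument is the "id" value returned by publish-product when the product was created.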

How To Publish a GRSF product using Java

Below is a simple Java application that publishes a GRSF product (in this case, a stock).

package [YOUR PACKAGE];

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.ByteArrayRequestEntity;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.log4j.Logger;

/**
 * The Class GRSFPublishMetadata.
 *
 * @author Francesco Mangiacrapa francesco.mangiacrapa@isti.cnr.it
 * Oct 13, 2016
 */
public class GRSFPublishMetadata {

	public static final Logger logger = Logger.getLogger(GRSFPublishMetadata.class);
	private static final String GRSF_PUBLISHER_REST_SERVICE_BASE_URL = "https://next.d4science.org/grsf-publisher-ws/rest/";

	/**
	 * The Enum PRODUCT_TYPE.
	 *
	 * @author Francesco Mangiacrapa francesco.mangiacrapa@isti.cnr.it
	 * Oct 13, 2016
	 */
	private static enum PRODUCT_TYPE{stock, fishery}
	private static final String PUBLISH_PRODUCT_REQUEST = "publish-product";
	private static final String GCUBE_TOKEN_PARAMETER = "gcube-token";
	private static final String GCUBE_TOKEN_VALUE = "YOUR_TOKEN"; //***********SET YOUR TOKEN************
	private static final String CONTENTTYPE = "application/json";
	private HttpClient httpClient = null;
	public static final int TIME_OUT_REQUESTS = 5000; //5 sec

	/**
	 * Instantiates a new GRSF publish metadata.
	 */
	public GRSFPublishMetadata() {
		MultiThreadedHttpConnectionManager connectionManager = new MultiThreadedHttpConnectionManager();
		connectionManager.getParams().setSoTimeout(TIME_OUT_REQUESTS);
		this.httpClient = new HttpClient(connectionManager);

	}

	/**
	 * Publish product.
	 *
	 * @param type the type
	 * @param body the body
	 * @return the string
	 * @throws Exception the exception
	 */
	public String publishProduct(PRODUCT_TYPE type, String body) throws Exception {
		// Build the request URL. The base URL already ends with "/", so no extra separator is added before the type.
		String buildURL = GRSF_PUBLISHER_REST_SERVICE_BASE_URL + type.toString() + "/" + PUBLISH_PRODUCT_REQUEST + "?" + GCUBE_TOKEN_PARAMETER + "=" + GCUBE_TOKEN_VALUE;
		PostMethod method = new PostMethod(buildURL);
		method.setRequestHeader("Content-type", CONTENTTYPE);
		logger.debug("call post to URI .... " + method.getURI());
		logger.debug("	the body is..." + body);
		// Encode the JSON body explicitly instead of relying on the platform default charset.
		method.setRequestEntity(new ByteArrayRequestEntity(body.getBytes("UTF-8")));
		int statusCode;
		String statusLine;
		String responseBody;
		try {
			// Execute the method and read the response before the connection is released.
			statusCode = httpClient.executeMethod(method);
			statusLine = String.valueOf(method.getStatusLine());
			byte[] rawBody = method.getResponseBody();
			responseBody = rawBody == null ? "" : new String(rawBody, "UTF-8");
		} catch (HttpException e) {
			logger.error("Fatal protocol violation: ", e);
			throw new Exception("Fatal protocol violation: " + e.getMessage());
		} catch (Exception e) {
			logger.error("Fatal transport error: ", e);
			throw new Exception("Fatal transport error: " + e.getMessage());
		} finally {
			// Release the connection exactly once, on success and on failure.
			method.releaseConnection();
		}
		// Anything other than 200 (OK) or 201 (CREATED) is reported as a failure.
		if (statusCode != HttpStatus.SC_OK && statusCode != HttpStatus.SC_CREATED) {
			String message = "Method failed: " + statusLine + "; Response body: " + responseBody;
			logger.error(message);
			throw new Exception(message);
		}
		return responseBody;
	}

	/**
	 * The main method.
	 *
	 * @param args the arguments
	 */
	public static void main(String[] args) {
		try {
			GRSFPublishMetadata grsfP = new GRSFPublishMetadata();
			String minimal_json_stock =
			   "{\"description\":\"This stock product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information\"," +
			   "\"license_id\":\"CC-BY-SA-4.0\"," +
			   "\"version\":1," +
			   "\"maintainer\":\"Francesco Mangiacrapa\","+
			   "\"maintainer_contact\":\"francesco.mangiacrapa@isti.cnr.it\","+
			   "\"catches_or_landings\":\"unknown\","+
			   "\"database_sources\":["+
			      "{\"name\":\"RAM\",\"url\":\"test url\"}" +
			      "]" +
			    ",\"source_of_information\":[{\"name\":\"the source of information\",\"url\":\"http://www.google.com\"}" +
			    "]," +
			   "\"data_owner\":\"IATTC\","+
			   "\"type\":\"Assessment Unit\","+
			   "\"stock_name\":\"Skipjack tuna - Western Pacific Ocean 4\","+ //YOU MUST CHANGE THE STOCK NAME FOR TESTING
			   "\"stock_id\":\"SKJ - EPO - TESTING\","+
			   "\"species_scientific_name\":\"SKJ\","+
			   "\"assessment_distribution_area\":\"Western Pacific Ocean 4\","+
			   "\"exploiting_fishery\":\"Tunas and billfishes fishery\","+
			   "\"management_entity\":\"DFO\","+
			   "\"assessment_methods\":\"Analytical assessment\","+
			   "\"state_of_marine_resource\":null,"+
			   "\"exploitation_rate\":\"High fishing mortality\","+
			   "\"abundance_level\":\"Intermediate abundance\","+
			   "\"narrative_state_and_trend\":\"Stock size and fishing pressure are considered to be close to their value at MSY.\","+
			   "\"scientific_advice\":\"The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained.\","+
			   "\"reporting_entity\":\"GRP3\","+
			   "\"reporting_year\":2016,"+
			   "\"status\":\"pending\"}";

			String response = grsfP.publishProduct(PRODUCT_TYPE.stock, minimal_json_stock);
			logger.info("The Response: "+response);
		}catch (Exception e) {
			e.printStackTrace();
		}
	}

}

The resulting log output, which includes the response, is:


DEBUG server.GRSFPublishMetadata [main,publishProduct:62] call post to URI .... https://next.d4science.org/grsf-publisher-ws/rest/stock/publish-product?gcube-token=[YOUR TOKEN]
DEBUG server.GRSFPublishMetadata [main,publishProduct:63] 	the body is...{"description":"This stock product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information","license_id":"CC-BY-SA-4.0","version":1,"maintainer":"Francesco Mangiacrapa","maintainer_contact":"francesco.mangiacrapa@isti.cnr.it","catches_or_landings":"unknown","database_sources":[{"name":"RAM","url":"test url"}],"source_of_information":[{"name":"the source of information","url":"http://www.google.com"}],"data_owner":"IATTC","type":"Assessment Unit","stock_name":"Skipjack tuna - Western Pacific Ocean 4","stock_id":"SKJ - EPO - TESTING","species_scientific_name":"SKJ","assessment_distribution_area":"Western Pacific Ocean 4","exploiting_fishery":"Tunas and billfishes fishery","management_entity":"DFO","assessment_methods":"Analytical assessment","state_of_marine_resource":null,"exploitation_rate":"High fishing mortality","abundance_level":"Intermediate abundance","narrative_state_and_trend":"Stock size and fishing pressure are considered to be close to their value at MSY.","scientific_advice":"The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained.","reporting_entity":"GRP3","reporting_year":2016,"status":"pending"}
INFO  server.GRSFPublishMetadata [main,main:130] The Response: {"id":"8426bce9-15b9-4c4a-a526-e829882b91ec","dataset_url":"https://next.d4science.org/group/nextnext/data-catalogue?path=/dataset/skipjack_tuna_-_western_pacific_ocean_4","error":null}

You must use the following dependencies (if you are using Maven):

<!-- COMMONS HTTP -->
<dependency>
    <groupId>commons-httpclient</groupId>
    <artifactId>commons-httpclient</artifactId>
    <version>3.1</version>
</dependency>
<!-- LOGS -->
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.16</version>
</dependency>

  1. The Information System can be queried via ic-client.