Ckan 2 zenodo library

From Gcube Wiki
Latest revision as of 12:07, 17 October 2022

The org.gcube.data.publishing.ckan2Zenodo gCube component is a Java library that translates CKAN items into Zenodo depositions, which can then be further edited and published.

Overview

Figure: Ckan2 zenodo.png

The library is invoked by the client (in this case the gCube CKAN GUI), which relies on the gCat service for interacting with CKAN. After the CKAN item is loaded, a default translation is applied to the item's core fields, generating a Draft Zenodo Deposition. NB: if the item already has a linked Zenodo DOI, the Draft Deposition overrides its specified information.

An XML Mapping associated with the item's profile is then loaded from the IS (a token must be specified by the client). The XML Mapping extracts data from the CKAN item and applies it to the Draft Deposition, resulting in a Final Deposition.

The Final Deposition is then returned to the client for further editing. The client can then ask for its publication (the new Zenodo DOI is then applied to the CKAN item through gCat).

IS Configuration

  • Zenodo Credentials ServiceEndpoint
..
<Category>Repository</Category>
..
<Platform>
  <Name>Zenodo</Name>
...
  • XML Mapping Generic Resource
NB: one XML Mapping per profile is supported
..
<SecondaryType>Ckan-Zenodo-Mappings</SecondaryType>
 
<Name>[***PROFILE_ID***]</Name>
..

XML Mappings

Each mapping directive selects a list of source values and applies them to a specific deposition field.

Several options and features are supported on both source value and target value directives.

Fields are selected using JSON paths (see the JsonPath library, https://github.com/json-path/JsonPath).

The following simple directive sets a constant value at path $.metadata.upload_type.

<mapping>
  <source>
   <value type="constant">dataset</value>
  </source>
  <targetPath>$.metadata</targetPath>
  <targetElement>upload_type</targetElement>
</mapping>

Directives Options

Source
  • Type : can be either constant or jsonPath
  • Split : [optional] is used to split found values
  • Multiple value directives can be declared. The first one that actually produces values is used.
<mapping>
 <source>
  <value type="jsonPath">$.extras[?(@.key=='Author')].value</value>
  <value type="jsonPath" split=";">$.author</value>
 </source>
 <targetPath>$.metadata.creators[0]</targetPath>
 <targetElement>name</targetElement>
</mapping>
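
The fallback and split behaviour described above can be sketched in plain Java. This is only an illustration of the semantics, not the library's implementation: the class SourceValueSketch and its methods are hypothetical, the JSON path lookups are replaced by hard-coded suppliers, and trimming the split parts is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from the ckan2zenodo library.
class SourceValueSketch {

    // Split a raw value on a literal separator, trimming and dropping empty parts
    // (trimming is an assumption, not documented behaviour).
    static List<String> split(String raw, String separator) {
        List<String> parts = new ArrayList<>();
        for (String part : raw.split(Pattern.quote(separator))) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    // Return the values of the first candidate that yields at least one value.
    static List<String> firstProducing(List<Supplier<List<String>>> candidates) {
        for (Supplier<List<String>> candidate : candidates) {
            List<String> values = candidate.get();
            if (values != null && !values.isEmpty()) {
                return values;
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        // First directive: no 'Author' extra was found, so it yields nothing.
        Supplier<List<String>> fromExtras = () -> new ArrayList<>();
        // Second directive: $.author resolved to a single string, split on ';'.
        Supplier<List<String>> fromAuthor = () -> split("Doe, Jane; Roe, Richard", ";");
        System.out.println(firstProducing(Arrays.asList(fromExtras, fromAuthor)));
    }
}
```

In this sketch the second directive is only evaluated because the first one produced no values, which mirrors the "first one that produces values is used" rule.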
Target
  • Append : [optional, default = false] append the value to the existing one instead of overwriting it.
Regexp
  • The use of the Regexp directive is optional.
  • Multiple Regexp directives can be declared.
  • Extract : Extract only the matching string from the processed values
<mapping>
 <source>
  <value type="jsonPath">$.extras[?(@.key=='Author')].value</value>
 </source>
 <targetPath>$.metadata.contributors[0]</targetPath>
 <targetElement>orcid</targetElement>
 <regexp type="extract">
  <target>(https://)?orcid.org/.*</target>
 </regexp>
</mapping>
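
To see what the extract pattern selects, it can be tried with java.util.regex directly. The sketch below assumes that extraction means "keep the first substring matching the pattern"; the class RegexpExtractSketch and the sample author strings are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: not part of the ckan2zenodo library.
class RegexpExtractSketch {

    // Same pattern as in the mapping above.
    static final Pattern ORCID = Pattern.compile("(https://)?orcid.org/.*");

    // Return the first substring matching the pattern, or empty if there is none.
    static Optional<String> extract(String value) {
        Matcher matcher = ORCID.matcher(value);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Value containing an ORCID URL after the author's name.
        System.out.println(extract("Jane Doe, https://orcid.org/0000-0002-1825-0097"));
        // Value without any ORCID reference.
        System.out.println(extract("Jane Doe"));
    }
}
```

Note that the dot in orcid.org is unescaped in the pattern, so it matches any character; writing orcid\.org would be stricter.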
  • Replace : Replace matching pattern with replacementString
<mapping>
 <source>
  <value type="jsonPath">$.extras[?(@.key=='AccessMode:Accessibility')].value</value>
 </source>
 <targetPath>$.metadata</targetPath>
 <targetElement append="true">access_conditions</targetElement>
 <regexp type="replace">
  <target>^</target>
  <replacement>AccessMode.Accessibility : </replacement>
 </regexp>
 <regexp type="replace">
  <target>$</target>
  <replacement>; </replacement>
 </regexp>
</mapping>
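
The effect of the two replace directives above can be reproduced with String.replaceAll. The sketch assumes the directives are applied in declaration order, each one on the output of the previous one; the class name RegexpReplaceSketch is hypothetical and the append="true" behaviour of the target is not modelled.

```java
// Hypothetical sketch: not part of the ckan2zenodo library.
class RegexpReplaceSketch {

    // Apply the two replace directives in declaration order.
    static String apply(String value) {
        // "^" matches the empty string at the start, so this adds a prefix.
        String prefixed = value.replaceAll("^", "AccessMode.Accessibility : ");
        // "$" matches the empty string at the end, so this adds a suffix.
        return prefixed.replaceAll("$", "; ");
    }

    public static void main(String[] args) {
        // Brackets make the trailing space of the result visible.
        System.out.println("[" + apply("open") + "]");
    }
}
```

Anchoring on ^ and $ is a way to add a prefix and a suffix without touching the original value.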

Integration

Maven coordinates

<groupId>org.gcube.data.publishing</groupId>
<artifactId>ckan2zenodo-library</artifactId>
<version>1.0.2-SNAPSHOT</version>

Source Code

Source code is available at https://code-repo.d4science.org/gCubeSystem/ckan2zenodo-library.git.

Usage

Check Environment

The library exposes reports on the status of the current environment.

Currently implemented checks are :

  • gCat Service presence
  • Zenodo Credentials presence

The following code is a simple test case.

 Ckan2Zenodo client=new Ckan2ZenodoImpl();
 EnvironmentReport report=client.checkEnvironment();
 Assume.assumeTrue(report.isok());

The EnvironmentReport bean exposes :

  • A map String->Status : containing messages and status of performed checks
  • The context checked
  • A simple method boolean : isok() for fast checking.

Sample report:

EnvironmentReport(reportItems={GCat Presence : OK=PASSED, Zenodo : OK=PASSED}, context=/pred4s/preprod/preVRE)

Publication Use Case

The main use of the library is to preview and publish a gCat item on Zenodo. The following code illustrates how to do it.

// Obtain the client
Ckan2Zenodo client=new Ckan2ZenodoImpl();
// Load the gCat item 
CkanItemDescriptor item=client.read(toPublishItemName);
 
//Get a preview of the deposition to be published
ZenodoDeposition preview=client.translate(item);
 
//Filter resources according to VRE policies
List<CkanResource> toFilter=client.filterResources(item);
 
// Optionally update values
preview.getMetadata().setAccess_conditions("Ask me");
 
// Actually publish to Zenodo:
// Step 1 : metadata
preview=client.updatedMetadata(preview);
// Step 2 : publish resources
Future<ZenodoDeposition> future_Dep=client.uploadFiles(Collections.singleton(toFilter.get(0)), preview);
preview=future_Dep.get();
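
Since uploadFiles returns a Future, calling get() without arguments blocks until the upload completes. A caller that prefers a bounded wait could use a helper like the one below. This is a minimal sketch under stated assumptions: UploadWaitSketch and await are hypothetical names, and a CompletableFuture stands in for the future returned by the library.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: not part of the ckan2zenodo library.
class UploadWaitSketch {

    // Wait for an upload-like future, but give up after the given timeout.
    static <T> T await(Future<T> future, long timeout, TimeUnit unit) throws Exception {
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Stop waiting; this only cancels the local future.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Unwrap the failure raised by the task itself.
            Throwable cause = e.getCause();
            throw (cause instanceof Exception) ? (Exception) cause : e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the Future<ZenodoDeposition> returned by uploadFiles.
        Future<String> upload = CompletableFuture.supplyAsync(() -> "deposition-with-files");
        System.out.println(await(upload, 5, TimeUnit.MINUTES));
    }
}
```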

Common Example Mappings

Community

Community is handled just like other metadata mapping instructions. NB: Zenodo expects existing community IDs.

<mapping>
 <source>
   <value type="constant">[{'identifier':'ecfunded'}]</value>
 </source>
 <targetPath>$.metadata</targetPath>
 <targetElement>communities</targetElement>                
</mapping>

Common Resource Filtering

Pass ALL
<resourceFilters>
   <filter>
     <condition>$.resources[?(@.format)]</condition>
   </filter>
</resourceFilters>
Select Multiple Extensions
<resourceFilters>
   <filter>
     <condition>$.resources[?(@.format=='GIF')]</condition>
   </filter>
   <filter>
     <condition>$.resources[?(@.format=='CSV')]</condition>
   </filter>
</resourceFilters>
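
The "Select Multiple Extensions" example suggests that a resource is kept when at least one filter condition matches it. The sketch below illustrates that reading only: the union semantics is an assumption, ResourceFilterSketch is a hypothetical name, and resources are reduced to their format string instead of being evaluated with JSON paths.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: not part of the ckan2zenodo library.
class ResourceFilterSketch {

    // Keep every format accepted by at least one filter (union semantics assumed).
    static List<String> select(List<String> formats, List<Predicate<String>> filters) {
        List<String> kept = new ArrayList<>();
        for (String format : formats) {
            for (Predicate<String> filter : filters) {
                if (filter.test(format)) {
                    kept.add(format);
                    break; // one matching filter is enough
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-ins for the two <condition> elements above.
        Predicate<String> isGif = format -> "GIF".equals(format);
        Predicate<String> isCsv = format -> "CSV".equals(format);
        List<Predicate<String>> filters = Arrays.asList(isGif, isCsv);
        System.out.println(select(Arrays.asList("GIF", "PDF", "CSV"), filters));
    }
}
```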