Difference between revisions of "Occurrence Data Enrichment Service"

From Gcube Wiki

Latest revision as of 16:15, 21 November 2013

A service for enriching species occurrence points with additional information, e.g. environmental parameters characterising the points. The aim is to provide users with an interface for searching the available environmental information that can be attached to the occurrence points under analysis. The rationale for such a service, together with background useful for implementation and for deciding on data priorities, is described at http://wiki.i-marine.eu/index.php/Environmental_Data_Enrichment.

This document outlines the design rationale, key features, and high-level architecture, as well as the deployment options.

Overview

The goal of this service is to offer a single entry point for enriching the information associated with the coordinates of a set of occurrence points. Data can come from the Species Discovery Service or the Occurrence Data Reconciliation Service, or can be uploaded by a user through a web interface.

The service can interface with other infrastructure services in order to extend the range of functions and applications available for the data under analysis.

The environmental information will be supplied by the Environmental Service of d4Science, along with the list of the information available in the infrastructure.
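
As an illustration of what "enrichment" means here, the following minimal sketch attaches named environmental values to occurrence points. Everything in it is an assumption made for the example: the `OccurrencePoint` record, the `enrich` function, and the idea of a provider as a plain callable from coordinates to a value are hypothetical and do not reflect the actual interface of the service or of the Environmental Service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class OccurrencePoint:
    """Hypothetical record for one occurrence point of a species."""
    species: str
    longitude: float
    latitude: float
    # Environmental parameters attached by the enrichment step.
    enrichment: Dict[str, float] = field(default_factory=dict)


# Assumed provider shape: (longitude, latitude) -> parameter value.
Provider = Callable[[float, float], float]


def enrich(points: List[OccurrencePoint],
           providers: Dict[str, Provider]) -> List[OccurrencePoint]:
    """Attach one value per provider to every point, keyed by parameter name."""
    for point in points:
        for name, provider in providers.items():
            point.enrichment[name] = provider(point.longitude, point.latitude)
    return points


def toy_temperature(longitude: float, latitude: float) -> float:
    """Stand-in for a real environmental layer: cooler towards the poles."""
    return 30.0 - 0.4 * abs(latitude)


points = [OccurrencePoint("Gadus morhua", 5.0, 60.0)]
enriched = enrich(points, {"sea_surface_temperature": toy_temperature})
print(enriched[0].enrichment)
```

In a real deployment the provider would presumably query the Environmental Service rather than compute a value locally; the sketch only shows the shape of the result, namely each point carrying a dictionary of named parameters.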

Key features

  • Merge, Subtraction and Intersection operations
  • Points Clustering
  • Anomaly Point Detection
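
The set operations listed above can be illustrated with a small local sketch. Note the assumptions: the page says elsewhere that these operations are invoked through the Statistical Manager, so this code is not how the service computes them. The function names are hypothetical, and treating two points as equal when their coordinates match after rounding to a fixed number of decimals is a simplification chosen for the example.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def _key(point: Point, precision: int) -> Point:
    """Assumed equality rule: points match if rounded coordinates match."""
    return (round(point[0], precision), round(point[1], precision))


def merge(a: Iterable[Point], b: Iterable[Point], precision: int = 4) -> List[Point]:
    """Union of both sets, keeping the first occurrence of each point."""
    seen = set()
    result = []
    for point in list(a) + list(b):
        key = _key(point, precision)
        if key not in seen:
            seen.add(key)
            result.append(point)
    return result


def intersection(a: Iterable[Point], b: Iterable[Point], precision: int = 4) -> List[Point]:
    """Points of `a` that also appear in `b`, without duplicates."""
    keys_b = {_key(point, precision) for point in b}
    seen = set()
    result = []
    for point in a:
        key = _key(point, precision)
        if key in keys_b and key not in seen:
            seen.add(key)
            result.append(point)
    return result


def subtraction(a: Iterable[Point], b: Iterable[Point], precision: int = 4) -> List[Point]:
    """Points of `a` that do not appear in `b`."""
    keys_b = {_key(point, precision) for point in b}
    return [point for point in a if _key(point, precision) not in keys_b]


set_a = [(12.5, 41.9), (2.35, 48.85)]
set_b = [(2.35, 48.85), (-0.12, 51.5)]
print(merge(set_a, set_b))
print(intersection(set_a, set_b))
print(subtraction(set_a, set_b))
```

Real occurrence records usually carry more than coordinates (species name, date, source), so an actual matching rule would likely compare more fields than this sketch does.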

Design

Philosophy

The service is an endpoint for users who want to add environmental information to the coordinates of occurrence points. It is meant as a complement to other services for the analysis of species and occurrence points.

Architecture

The subsystem comprises the following components:

  • Inputs Managers: a set of internal processors which manage the variety of inputs that can come from users or from other services, e.g. from the Occurrence Data Reconciliation Service;
  • Occurrence Points Sets Operations: a set of internal objects which can invoke external systems in order to process data sets. Merge, Subtraction and Intersection operations can be invoked by interfacing with the Statistical Manager;
  • Occurrence Points Enrichment: a connector to the Environmental Service for (i) retrieving discoverable information, (ii) retrieving environmental data already present in d4Science, and (iii) producing data by interpolation or kriging if necessary;
  • Processing Orchestrator: an internal process which manages the interaction between the other components and their usage. It accepts and dispatches requests coming from outside the service.
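
The enrichment component is described as producing data "by interpolation or kriging if necessary". The page does not say which interpolation method is used, so the sketch below substitutes inverse distance weighting (IDW), a simple and common technique, purely to show how a value could be estimated at an occurrence point from nearby samples. The function name `idw`, the sample layout, and the use of planar distance in degrees are assumptions made for the example; kriging is not implemented here.

```python
import math
from typing import List, Tuple

# Assumed sample layout: (longitude, latitude, measured value).
Sample = Tuple[float, float, float]


def idw(longitude: float, latitude: float,
        samples: List[Sample], power: float = 2.0) -> float:
    """Estimate a value at a point by inverse distance weighting.

    Distances are planar and measured in degrees, which is a
    simplification that ignores the curvature of the Earth.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lon, s_lat, value in samples:
        distance = math.hypot(longitude - s_lon, latitude - s_lat)
        if distance == 0.0:
            # The point coincides with a sample: return it directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(1.0, 0.0, samples)  # equidistant from both samples
exact_hit = idw(0.0, 0.0, samples)          # coincides with the first sample
print(midpoint_estimate, exact_hit)
```

A point equidistant from two samples receives their average, and a point that coincides with a sample receives that sample's value. A production implementation would more likely use great-circle distances and restrict the calculation to a neighbourhood of the target point.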

A diagram of the relationships between these components is reported in the following figure:

Occurrence Points Enrichment Service, internal architecture

Deployment

All the components of the service must be deployed together on a single node. The subsystem can be replicated on multiple hosts and scopes, but this does not guarantee a performance improvement, because the service is a management system for a single input dataset.

Small deployment

The deployment follows the schema below, since the service requires other complementary services to be present.

Occurrence Points Enrichment Service, deployment schema

Use Cases

Well suited Use Cases

The subsystem is particularly suited to users who want to investigate the marine properties of the places where species live. This helps in understanding the characteristics of the places they prefer. Having environmental information discovered by an external service (e.g. the Environmental Service) can speed up the investigation of species habitats, which normally takes scholars a large amount of time.

Subsystems

Occurrence Data Enrichment Service depends on the following subsystems, where each specializes along the structure or the semantics of the data: