ServiceManager Guide

From Gcube Wiki
Revision as of 17:19, 28 October 2024 by Francesco.mangiacrapa (Resource Catalogue)


This part of the guide covers the installation and configuration of gCube services that are not described in the Administration guide. It mainly concerns services that are not Enabling services and that Infrastructure/VO Managers can install dynamically. For each component, the list also includes known issues and the specific configuration steps to follow.

Search (DISMISSED)

Search v 2.xx (DISMISSED)


Installing a Search node in gCube requires, in the minimal configuration, two web services:

  • SearchSystemService
  • ExecutionEngineService

This is the minimal installation scenario. Distributed search can also be enabled, which requires installing and configuring several ExecutionEngineServices.

HW requirements

The minimal requirement for a Search node is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the GHN is strongly recommended.

Configuration

The SearchSystemService and the ExecutionEngineService must be deployed, automatically or manually, in a VRE scope. In addition, to have the SearchSystemService use the local ExecutionEngineService to run queries (minimal installation), configure the service jndi file as follows:

  • excludeLocal = false
  • collocationThreshold = 0.3f
  • complexPlanNumNodes = 800000

Search v 3.x.x (DISMISSED)

Version 3.0 has moved to SmartGears and Tomcat.

Co-deployment with the Execution Engine Service is still required, so Execution Engine Service v 2.0.0 has also been ported to SmartGears.

HW requirements

The minimal requirement for a Search node is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the GHN is strongly recommended.

Configuration

To fix a compatibility issue between DataNucleus and Java 7, the Tomcat configuration must be changed as follows:

  • uncomment and modify the following line in the $CATALINA_HOME/bin/catalina.sh file:
JAVA_OPTS="$JAVA_OPTS -noverify -Dorg.apache.catalina.security.SecurityListener.UMASK=`umask`"
  • The configuration file $CATALINA_HOME/conf/infrastructure.properties, containing infrastructure and scope information, must be present:
# a single infrastructure
infrastructure=d4science.research-infrastructures.eu
# multiple scopes must be separated by a comma (e.g. FARM,gCubeApps)
scopes=Ecosystem
clientMode=false


  • The configuration file $CATALINA_HOME/webapps/<search>/WEB-INF/classes/deploy.properties must be filled in with the following information:
hostname = xx
startScopes = xx
port=xx
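
Both files are plain key=value properties files, so their basic shape can be checked before Tomcat is started. The following is a minimal Python sketch, not part of gCube: the function names are hypothetical, and it assumes that the three keys shown above for infrastructure.properties are the required ones.

```python
import os
from pathlib import Path

# Assumption: these are the keys shown in the infrastructure.properties example.
REQUIRED_INFRA_KEYS = ("infrastructure", "scopes", "clientMode")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_infrastructure(props):
    """Return a list of problems found in the parsed properties."""
    problems = [f"missing key: {k}" for k in REQUIRED_INFRA_KEYS if not props.get(k)]
    scopes = [s.strip() for s in props.get("scopes", "").split(",")]
    if props.get("scopes") and any(not s for s in scopes):
        problems.append("empty entry in scopes (check the commas)")
    return problems


if __name__ == "__main__":
    conf = Path(os.environ.get("CATALINA_HOME", ".")) / "conf" / "infrastructure.properties"
    if conf.is_file():
        print(check_infrastructure(parse_properties(conf.read_text())) or "OK")
    else:
        print(f"{conf} not found")
```

The same parse_properties helper can be pointed at deploy.properties to confirm that hostname, startScopes and port have been filled in.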

Known Issues

Execution Engine (DISMISSED)

Version 2.0 has moved to SmartGears and Tomcat.

HW requirements

The minimal requirement for an Execution Engine node is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the GHN is strongly recommended.

Installation

Different packagings of the Execution Engine are available, depending on the service it is co-deployed with and invoked by:

  • DTS : <artifactId>executionengineservice-dts</artifactId>
  • Search: <artifactId>executionengineservice-search</artifactId>

Configuration

To fix a compatibility issue between DataNucleus and Java 7, the Tomcat configuration must be changed as follows:

  • uncomment and modify the following line in the $CATALINA_HOME/bin/catalina.sh file:
JAVA_OPTS="$JAVA_OPTS -noverify -Dorg.apache.catalina.security.SecurityListener.UMASK=`umask`"
  • The configuration file $CATALINA_HOME/conf/infrastructure.properties, containing infrastructure and scope information, must be present:
# a single infrastructure
infrastructure=d4science.research-infrastructures.eu
# multiple scopes must be separated by a comma (e.g. FARM,gCubeApps)
scopes=Ecosystem
clientMode=false


  • The configuration file $CATALINA_HOME/webapps/<execution-engine>/WEB-INF/classes/deploy.properties must be filled in with the following information:
hostname = xx
startScopes = xx
port=xx
pe2ng.port = 4000
  • if the Execution Engine needs to call DTS, add the following to container.xml:
<property name='dts.execution' value='true' />

Executor and GenericWorker (DISMISSED)

HW requirements

The minimal requirement for an Executor node with a Generic Worker plugin is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the GHN is strongly recommended.

Configuration

The following software should be installed on the VM:

  • R version 2.14.1

with the following components:

  • coda
  • R2jags
  • R2WinBUGS
  • rjags
  • bayesmix
  • runjags

Known Issues

  • The GenericWorker is used by the Statistical Manager (SM) service to run distributed computations. Since the SM uses the root scope to discover GenericWorker instances, the plugin must be deployed at root scope level.
  • Since the GenericWorker plugin depends on the Executor Service, dynamically deploying the plugin also deploys the Executor Service.

SmartExecutor

HW requirements

The minimal requirement for an Executor node with a Generic Worker plugin is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the vHN (SmartGears gHN) is strongly recommended.

Configuration

No specific configuration is needed for SmartExecutor.

Known Issues

  • When correctly started, the SmartExecutor publishes a ServiceEndpoint with <Category>VREManagement</Category> and <Name>SmartExecutor</Name>. You can check the availability of a plugin on that resource: there is one <AccessPoint> per plugin.
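
The check described above can be scripted once the ServiceEndpoint profile has been downloaded as XML. The sketch below is illustrative only: the sample profile is a simplified, hypothetical stand-in for the real resource, and reading the plugin identifier from an EntryName attribute of a nested Endpoint element is an assumption, not something this page states.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified ServiceEndpoint profile used only for illustration.
SAMPLE_PROFILE = """
<Resource>
  <Profile>
    <Category>VREManagement</Category>
    <Name>SmartExecutor</Name>
    <AccessPoint>
      <Interface><Endpoint EntryName="plugin-one">n/a</Endpoint></Interface>
    </AccessPoint>
    <AccessPoint>
      <Interface><Endpoint EntryName="plugin-two">n/a</Endpoint></Interface>
    </AccessPoint>
  </Profile>
</Resource>
"""


def is_smart_executor(root):
    """True if the profile carries the Category and Name given in the text."""
    return (root.findtext(".//Category") == "VREManagement"
            and root.findtext(".//Name") == "SmartExecutor")


def plugin_entries(xml_text):
    """Return one entry per <AccessPoint>, i.e. one per published plugin."""
    root = ET.fromstring(xml_text.strip())
    if not is_smart_executor(root):
        return []
    entries = []
    for access_point in root.iter("AccessPoint"):
        endpoint = access_point.find(".//Endpoint")
        # Assumption: the plugin is identified by the EntryName attribute.
        entries.append(endpoint.get("EntryName") if endpoint is not None else None)
    return entries


if __name__ == "__main__":
    print(plugin_entries(SAMPLE_PROFILE))
```

An empty list means either that the resource is not the SmartExecutor endpoint or that no plugin has been published yet.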

SmartGenericWorker

HW requirements

The minimal requirement for an Executor node with a Generic Worker plugin is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the vHN is strongly recommended.

Configuration

The following software should be installed on the VM:

  • R version 2.14.1

with the following components:

  • coda
  • R2jags
  • R2WinBUGS
  • rjags
  • bayesmix
  • runjags

Known Issues

  • The SmartGenericWorker is used by the Statistical Manager (SM) service to run distributed computations. Since the SM uses the root scope to discover SmartGenericWorker instances, the plugin must be deployed at root scope level.
  • To deploy the SmartGenericWorker, copy the SmartGenericWorker jar-with-dependencies into the $CATALINA_HOME/webapps/smart-executor/WEB-INF/lib/ directory. A container restart is needed to load the new plugin.
  • Once the container has been restarted, the plugin availability can be checked by looking at the Service Endpoint published by the SmartExecutor.

This simple script can help the deployment process.


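#!/bin/bash
# Stop Tomcat and remove the previous smart-executor deployment
$CATALINA_HOME/bin/shutdown.sh -force
rm -rf $CATALINA_HOME/webapps/smart-executor*

# Copy the new war and unpack it manually
cp ~/smart-executor.war $CATALINA_HOME/webapps/
mkdir $CATALINA_HOME/webapps/smart-executor
unzip $CATALINA_HOME/webapps/smart-executor.war -d $CATALINA_HOME/webapps/smart-executor

# Add the SmartGenericWorker plugin jar to the unpacked webapp
cp ~/smart-generic-worker-*.jar $CATALINA_HOME/webapps/smart-executor/WEB-INF/lib/

# Wait a few seconds, then restart Tomcat
sleep 5s
$CATALINA_HOME/bin/startup.sh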

DTS (ABANDONWARE / DISMISSED)

DTS v2.x

HW requirements

The minimal requirement for a DTS node is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the GHN is strongly recommended.

Configuration

DTS uses the Execution Engine to run transformations, so at least one Execution Engine must be deployed in the same scope as DTS, and the related GHNLabels.xml file must contain:

<Variable>
      <Key>dts.execution</Key>
      <Value>true</Value>
</Variable>

Known Issues

none

DTS v3.x

HW requirements

The minimal requirement for a DTS node with a Generic Worker plugin is a single-CPU node with 2GB RAM, but at least 3GB RAM dedicated to the GHN is strongly recommended.

Configuration

  • The configuration file $CATALINA_HOME/conf/infrastructure.properties, containing infrastructure and scope information, must be present:
# a single infrastructure
infrastructure=d4science.research-infrastructures.eu
# multiple scopes must be separated by a comma (e.g. FARM,gCubeApps)
scopes=Ecosystem
clientMode=false
  • The configuration file $CATALINA_HOME/webapps/<dts>/WEB-INF/classes/deploy.properties must be filled in with the following information:
hostname = xx
startScopes = xx
port=xx

DTS uses the Execution Engine to run transformations, so at least one Execution Engine must be deployed in the same scope as DTS, and the related SmartGears configuration file (container.xml) must contain this property:

<property name='dts.execution' value='true' /> 

Index (DISMISSED)

Index Service (DISMISSED)

The Index Service is the latest released RESTful service running on SmartGears. It implements both FW and FT index functionalities.

HW requirements

Given the co-deployment with (embedded) ElasticSearch, a VM with at least 4GB RAM and 2 CPUs is recommended.

The open file limit should also be raised to 32000.
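
Whether the limit is actually in effect can be verified from the account that runs the service. The following is a small sketch using Python's standard resource module (Unix only); the function name is hypothetical and the threshold is the 32000 value given above. It inspects the limit of the process running the script, so it should be run as the same user, and from the same kind of session, as the container.

```python
import resource


def open_file_limit_ok(minimum=32000):
    """Return True if the soft open-file limit of this process is at least `minimum`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit", which trivially satisfies the requirement.
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} ok={open_file_limit_ok()}")
```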

Configuration

Details on the Index Service configuration are available at https://gcube.wiki.gcube-system.org/gcube/index.php/Index_Management_Framework#Deployment_Instructions

ForwardIndexNode (DISMISSED)

The ForwardIndexNode service needs to be co-deployed with an instance of the Couchbase service.

HW requirements

Given the co-deployment with Couchbase, a VM with at least 4GB RAM and 2 CPUs is recommended.

Configuration

Couchbase must be installed manually, and the procedure depends on the OS (binary package, rpm, deb).

It is recommended to raise the open file limit on the VM (32000 minimum).

The following FWIndexNode settings (jndi file) should be customized:

  • couchBaseIP = IP of the server hosting Couchbase ( so the same as the GHN)
  • couchBaseUseName = the username set when configuring Couchbase
  • couchBasePassword = the password set when configuring Couchbase

Once configured, Couchbase must be initialized using the cb_initialize_node.sh script contained in the service configuration folder.

Known Issues

  • The cb_initialize_node.sh script sometimes fails. This may mean that there is not enough memory to initialize the data bucket; try reducing the value of ramQuota in the jndi file.

Statistical Manager (DISMISSED)

Resources

Runtime Resources | Scope | Usage
DataStorage/StorageManager | VO/VRE | StorageManager
Database/Obis2Repository | VRE | Trendylyzer
Database/StatisticalManagerDatabase | INFRA/VO/VRE | Statistical
Database/AquamapsDB | VO/VRE | Algorithms
Database/FishCodesConversion | VO/VRE | Algorithms
Database/FishBase | VO/VRE | Algorithms - TaxaMatch
DataStorage/Storage Manager | INFRA/VO/VRE | All
Gis/Geoserver1..n | VRE | Maps Algorithms
Gis/TimeSeriesDatastore | VO/VRE | Maps Algorithms
Gis/GeoNetwork | VRE | Maps Algorithms
Service/MessageBroker | VO | Service
BiodiversityRepository/CatalogofLife | VO/VRE | Occurrence Algorithms
BiodiversityRepository/GBIF | VO/VRE | Occurrence Algorithms
BiodiversityRepository/ITIS | VO/VRE | Occurrence Algorithms
BiodiversityRepository/WoRDSS | VO/VRE | Occurrence Algorithms
BiodiversityRepository/WoRMS | VO/VRE | Occurrence Algorithms
BiodiversityRepository/OBIS | VO/VRE | Occurrence Algorithms
BiodiversityRepository/NCBI | VO/VRE | Occurrence Algorithms
BiodiversityRepository/SpeciesLink | VO/VRE | Occurrence Algorithms
DataAnalysis/Dataminer | VRE | Required if Dataminer is needed in the VRE
Database/UsersGisTablesDB | VRE | Required if Dataminer and SDI are needed in the VRE

WS Resources | Scope | Usage
Workers | INFRA/VO | Parallel Computations

Generic Resources | Scope | Usage
ISO/MetadataConstants | VO/VRE | Maps Algorithms

Known Issues

Tested on ghn 4.0.0 and StatisticalManager service 1.4.0:

  • install the SM on the same network where the database and the other resources it uses are located. Otherwise production databases would have to be restarted, because direct access to such resources could not be granted.
  • remove lib axis-1.4.jar from gCore/lib
  • replace the library hsqldb-1.8.jar with the library hsqldb-2.2.8.jar in gCore/lib

Additional Installation Steps

  • create a suitable R environment[1]
  • download the gebco file into /home/gcube/gCore/etc/statistical-manager-service-full-XXX/cfg and rename it gebco_08.nc
  • copy the gcube keys into /home/gcube/gCore/etc/statistical-manager-service-full-XXX/cfg/PARALLEL_PROCESSING

Services and Databases used by the Statistical Manager and Data Analysis facilities

GHN

gcube@statistical-manager1.d4science.org

gcube@statistical-manager2.d4science.org

gcube@statistical-manager3.d4science.org

gcube@statistical-manager4.d4science.org

gcube2@statistical-manager.d.d4science.org

TOMCAT

(root user)

thredds.research-infrastructures.eu

wps.statistical.d4science.org

rstudio.p.d4science.research-infrastructures.eu

geoserver.d4science.org

geoserver2.d4science.org

geoserver3.d4science.org

geoserver4.d4science.org

geoserver-dev.d4science-ii.research-infrastructures.eu

geoserver-dev2.d4science-ii.research-infrastructures.eu

geonetwork.geothermaldata.d4science.org

geonetwork.d4science.org

THIRD PARTY SERVICES

(root user)

rstudio.p.d4science.research-infrastructures.eu (sw rstudio, command: rstudio-server restart)

DATABASES

(root user)

geoserver-db.d4science.org

node49.p.d4science.research-infrastructures.eu

biodiversity.db.i-marine.research-infrastructures.eu

db1.p.d4science.research-infrastructures.eu

db5.p.d4science.research-infrastructures.eu

dbtest.research-infrastructures.eu

dbtest3.research-infrastructures.eu

geoserver.d4science-ii.research-infrastructures.eu

geoserver2.i-marine.research-infrastructures.eu

geoserver-db.d4science.org

geoserver-test.d4science-ii.research-infrastructures.eu

node50.p.d4science.research-infrastructures.eu

node49.p.d4science.research-infrastructures.eu

node59.p.d4science.research-infrastructures.eu

obis2.i-marine.research-infrastructures.eu

statistical-manager.d.d4science.org

WORKER NODES

(gcube2 user)

(production)

node3.d4science.org

node4.d4science.org

node11.d4science.org

node12.d4science.org

node13.d4science.org

node14.d4science.org

node15.d4science.org

node16.d4science.org

node18.d4science.org

node20.d4science.org

node21.d4science.org

node23.d4science.org

node27.d4science.org

node28.d4science.org

node29.d4science.org

node30.d4science.org

node31.d4science.org

node32.d4science.org

node33.d4science.org

node34.d4science.org

node35.d4science.org

node36.d4science.org

node37.d4science.org

node38.d4science.org

node39.d4science.org


(development)

node17.d4science.org

node19.d4science.org

node22.d4science.org

TESTING

Test plan for the Statistical Manager.

SDI / GIS Technologies

This section describes the configuration of gCube Spatial Data Infrastructure (SDI), responsible for handling GIS technologies and (meta)data. It comprises various technologies, both from gCube and from third party developers.

A brief summary :

  • gCube technologies
    • sdi-service : utility service for the management of SDI configuration in a context
    • geonetwork : library for the interaction with GeoNetwork service
    • gis-interface : library for the publication of dataset and related metadata
    • ws-thredds [Deprecated] : library for the synchronization of a StorageHub folder with Thredds
    • Gis-Viewer : GUI for the rendering of layers
    • GeoExplorer : GUI for the browsing of metadata in GeoNetwork
  • Third party technologies
    • GeoNetwork is used in contexts where ISO Metadata needs to be managed
    • GeoServer is used for the registration of GIS datasets in certain formats (e.g. Shape files)
    • Thredds is used for the registration of GIS datasets in certain formats (e.g. netcdf)


NB: In order to handle GIS Technologies, developers should rely on java libraries geonetwork and gis-interface, both distributed under subsystem org.gcube.spatial.data. Please note that lots of scenarios do not involve java gCube libraries, so they directly contact third party services after getting context configuration from sdi-service.


For administration purposes, please note that reports on the current SDI configuration can be obtained by contacting the gCube sdi-service interfaces :


gCube Software

In this section we describe the IS resources needed by specific usages of gCube SDI software.

SDI Service

The aim of this service is to manage the third party GIS technologies available in the VRE, so it relies on their proper registration. Please refer to #Third party technologies for more details.

geonetwork library

The geonetwork library relies on the presence of a GeoNetwork service in the context. Please refer to #GeoNetwork Service for further details.

Metadata Publication

In cases where geonetwork library is used to generate ISO metadata, the following Generic Resource must be defined in the current context and filled with common/default metadata values.

  • Secondary Type : ISO
  • Name : MetadataConstants


ISO Metadata are published with a resolver http link generated by "Uri Resolver Manager", so this needs to be configured with a Generic Resource with the following coordinates :

  • Secondary Type : UriResolverMap
  • Name : Uri-Resolver-Map


gis-interface library

Gis-interface publishes (meta)data in the gCube SDI. It is built on top of the #geonetwork library, so that library needs to be properly configured. It also uses a GeoServer service in the context as repository, so such a service should be configured.


GeoExplorer

For the GeoExplorer portlet to work properly, you must copy the following resources from the root scope (/d4science.research-infrastructures.eu/) to the VRE where it must run:

  • Transect
<Type>RuntimeResource</Type>
<Category>Application</Category>
<Name>Transect</Name>
  • Gis Resolver

https://gcube.wiki.gcube-system.org/gcube/URI_Resolver#GIS_Resolver

<Type>RuntimeResource</Type>
<Category>Service</Category>
<Name>Gis-Resolver</Name>
  • Gis Viewer Application
<Type>GenericResource</Type>
<SecondaryType>ApplicationProfile</SecondaryType>
<Name>Gis Viewer Application</Name>

and then you must edit the Generic Resource shown here: https://gcube.wiki.gcube-system.org/gcube/URI_Resolver#Generic_Resource_for_Gis_Viewer_Application


Third party technologies

gCube SDI relies heavily on the capabilities of third party GIS services in order to handle GIS (meta)data. Each of these services has specific configuration needs that should be addressed in provisioning rules, so they are beyond the scope of this page.

In this section we describe which IS resources are needed in a gCube context in order to declare the availability of these services.

gCube SDI software will use these resources for the discovery and further management of these third-party service instances.

GeoNetwork Service

GeoNetwork services are registered in a gCube context with a Service Endpoint with the following coordinates:

  • Category : Gis
  • Platform/Name : geonetwork

NB The resource is expected to define credentials for admin user under an access point with the following characteristics (you can find more details here):

  • Endpoint EntryName = geonetwork
  • property priority (integer value)
  • property suffixes (leave empty or blank)

GeoServer Service

GeoServer services are registered in a gCube context with a Service Endpoint with the following coordinates:

  • Category : Gis
  • Platform/Name : GeoServer

NB The resource is expected to define credentials for admin user under an access point with the following characteristics:

  • Endpoint EntryName = geoserver

Thredds Service

Thredds services are registered in a gCube context with a Service Endpoint with the following coordinates:

  • Category : Gis
  • Platform/Name : thredds

Tabular Data Manager (DISMISSED)

Each operation of the service may need a specific configuration. The following is a list of the resources needed by each operation module.

Operation View

The module requires GIS Technologies to be already configured in the operating scope. See Gis Technologies.

The module requires also the following Generic Resource :

  • Secondary Type : TDMConfiguration

Since the operation needs to put data in a postgis database already connected with Geoserver, a Service Endpoint for such database must be present in the same scope. Constraints for retrieving such Service Endpoint are taken from the Generic Resource described above (values are indicated with their xml Element name as declared in the Generic Resource's body) :

  • Category : <gisDBCategory>
  • Platform/Name : <gisDBPlatformName>
  • AccessPoint/<tdmDataStoreFlag> : true

Resource Catalogue

The steps needed to have a working catalogue running in a given scope, namely a VRE, are the following:

  1. add the CKAN Services to the scope (this step can be avoided if you use the gcat configuration APIs as described below)
    1. add the CKAN Data Catalogue Service Endpoint to the scope (see CKan Data Catalogue instance)
    2. add the CKAN Database Service Endpoint to the scope (see CKan Database);
    3. (optional) add the Zenodo Service Endpoint (see Zenodo API);
  2. add the gCat Service to the scope;
    1. Add the scope in file d4science-smartgears-services/group_vars/gcat_service_production/gcat_service_production.yml in ansible-playbook
    2. Run the playbook as follows:
      1. ./run.sh gcat.yml -i inventory_production/hosts.openstack_isti_production -l gcat_service -e 'gcube_admin_token=<TOKEN>' -t smartgears_conf
         The problem with this command is that it starts a workflow on the conductor for each scope defined in d4science-smartgears-services/group_vars/gcat_service_production/gcat_service_production.yml.
      2. The alternative is to invoke this command:
         ./run.sh gcat.yml -i inventory_production/hosts.openstack_isti_production -l gcat_service_production -e "gcube_admin_token=<TOKEN>" -e 'smartgears_conductor_scope=<NEW_SCOPE>' --tags=smartgears_conf
         Please note that:
           • You must provide the input parameter smartgears_conductor_scope with the scope to be added. The scope must also be added to d4science-smartgears-services/group_vars/gcat_service_production/gcat_service_production.yml in ansible-playbook, as explained in the previous step.
           • You must provide the tag smartgears_conf. The playbook role also invokes the conductor with the tag smartgears_conf. This has been added to avoid adding a scope to the node without invoking the conductor role.
           • The conductor invocation runs the add_workspace_client_to_context workflow to enable gCat to interact with the workspace (e.g. for storing resources).
    3. configure gCat via /configurations (see gCat Configuration API). This step makes it possible to avoid adding the resources to the scope as described in the first step (apart from the Zenodo part). The best way to create the configuration is to read the configuration of another VRE that uses the same CKAN instance, copy it, and change the parts that differ, e.g. the default_organization. Please note that you must be Catalogue-Manager to create or change the configuration.
  3. add the CKAN Connector to the scope (see CKAN Connector);
  4. add the URI Resolver Map Generic Resource to the scope (see Uri-Resolver-Map);
  5. create the CKan Portlet Generic Resource with the URL hosting the catalogue in the VRE (see CKan Portlet resource)
  6. (to configure a catalogue at VO or root VO level) configure the DataCatalogueMapScopesUrls Generic Resource with the URL hosting the catalogue in the VRE (see #DataCatalogueMapScopesUrls);
  7. (automatic) configure the Catalogue Resolver Generic Resource (see Catalogue-Resolver resource);
  8. (automatic) configure the Catalogue Generic Resource used by the social service (see Catalogue Resource);
  9. (automatic) configure the News Feed Generic Resource (see #News Feed & Catalogue);
  10. (optional) define any namespace needed to group extra fields (see Namespaces Resource);
  11. (optional) define the mappings driving the publishing to Zenodo
  12. ...
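
The single-scope playbook invocation of step 2 can be assembled programmatically so that the scope and the tag are never forgotten. The sketch below only builds and prints the command line quoted in the list above; the function name is hypothetical, and the token and scope values are placeholders that must be supplied by the operator.

```python
import shlex


def gcat_playbook_command(token, new_scope):
    """Build the argument list for the single-scope gCat playbook run."""
    return [
        "./run.sh", "gcat.yml",
        "-i", "inventory_production/hosts.openstack_isti_production",
        "-l", "gcat_service_production",
        "-e", f"gcube_admin_token={token}",
        "-e", f"smartgears_conductor_scope={new_scope}",
        "--tags=smartgears_conf",
    ]


if __name__ == "__main__":
    # Placeholders only: replace <TOKEN> and <NEW_SCOPE> before running.
    command = gcat_playbook_command("<TOKEN>", "<NEW_SCOPE>")
    print(shlex.join(command))
```

Keeping the command as a list means it can be handed to subprocess.run without shell quoting problems; shlex.join is used here only to display it. The scope still has to be added to gcat_service_production.yml beforehand, as the steps above explain.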

CKAN Connector

ServiceClass = DataAccess
ServiceName = CkanConnector

This is the service that performs the login operation from the Gateways on CKAN. It runs on SmartGears, so once it is published in the context there is not much left to do. However, it is fundamental.

Generic Resource

The following Generic Resources impact on the Catalogue Service behaviour.

CkanPortlet: this is the Portlet URL

SecondaryType = ApplicationProfile
Name = CkanPortlet
Description = The url of the gcube-ckan-datacatalog portlet for this scope

The content (body) of the resource has to report the url of the catalogue portlet for this context (VRE), e.g.

<url>https://services.research-infrastructures.eu/group/d4science-services-gateway/data-catalogue</url>

Catalogue-Resolver

SecondaryType = ApplicationProfile
Name = Catalogue-Resolver
Description = Used by Catalogue Resolver for mapping VRE NAME with its SCOPE so that resolve correctly URL of kind: 
              https://[CATALOGUE_RESOLVER_SERVLET]/[VRE_NAME]/[entity_context value]/[entity_name value]

See wiki page at: CATALOGUE_Resolver

NOTE: the resource is automatically updated by the Catalogue Resolver
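
The URL shape given in the Description can be illustrated with a short sketch. It only reproduces the pattern https://[CATALOGUE_RESOLVER_SERVLET]/[VRE_NAME]/[entity_context value]/[entity_name value]; the function name and the resolver address used in the example are hypothetical, and percent-encoding the path segments is an assumption rather than documented resolver behaviour.

```python
from urllib.parse import quote


def catalogue_resolver_url(resolver_servlet, vre_name, entity_context, entity_name):
    """Compose a Catalogue Resolver style URL from its four parts."""
    segments = [quote(part, safe="") for part in (vre_name, entity_context, entity_name)]
    return f"https://{resolver_servlet}/" + "/".join(segments)


if __name__ == "__main__":
    # resolver.example.org/ctlg is a made-up servlet address.
    print(catalogue_resolver_url("resolver.example.org/ctlg", "MyVRE", "dataset", "my_item"))
```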

Catalogue

Update this configuration at ROOT VO level. It is used by the social service to properly support gCat social notifications/posts. Temporary solution: the resource is automatically updated by the Catalogue Portlet when the portlet deployed in a new VRE is accessed.

SecondaryType = ApplicationProfile
Name = Catalogue
Description = This is the Item Catalogue application profile for alerting items creation in the infrastructure catalogues
<Body><AppId>service-account-gcat</AppId>...

The above Generic Resource stored at ROOT VO level must be updated by adding an entry of kind:

<EndPoint>
	<Scope>[THE SCOPE]</Scope>
	<URL>[THE PORTLET URL TO THE GATEWAY IN ACT FOR THE SCOPE]</URL>
</EndPoint>

e.g. for /d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry

<EndPoint>
    <Scope>/d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry</Scope>
    <URL>https://eosc-pillar.d4science.org/group/eoscpillarserviceregistry/catalogue</URL>
</EndPoint>

for the SCOPE where the gCat has been added
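
When the resource has to be edited by hand, the new entry can be generated and appended with a few lines of Python. This is a minimal sketch using the standard xml.etree.ElementTree module; the function names are hypothetical, and placing the EndPoint elements directly under the Body element is an assumption about the resource layout.

```python
import xml.etree.ElementTree as ET


def endpoint_entry(scope, url):
    """Build an <EndPoint> element with the <Scope> and <URL> children shown above."""
    entry = ET.Element("EndPoint")
    ET.SubElement(entry, "Scope").text = scope
    ET.SubElement(entry, "URL").text = url
    return entry


def add_endpoint(body, scope, url):
    """Append the entry unless the scope is already listed; return True if added."""
    for existing in body.iter("EndPoint"):
        if existing.findtext("Scope") == scope:
            return False
    body.append(endpoint_entry(scope, url))
    return True


if __name__ == "__main__":
    body = ET.fromstring("<Body><AppId>service-account-gcat</AppId></Body>")
    add_endpoint(
        body,
        "/d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry",
        "https://eosc-pillar.d4science.org/group/eoscpillarserviceregistry/catalogue",
    )
    print(ET.tostring(body, encoding="unicode"))
```

Skipping scopes that are already present keeps repeated runs from producing duplicate entries.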

CKan to Zenodo Mappings

A set of generic resources with SecondaryType = Ckan-Zenodo-Mappings is expected in order to enable the upload to Zenodo. Since each of these generic resources maps a specific CKAN item profile, the required set may vary depending on the VRE. The user requesting the VRE creation is expected to specify the minimum set of these resources to be registered in the context.

DataCatalogueMapScopesUrls

SecondaryType = ApplicationProfile
Name = DataCatalogueMapScopesUrls
Description = EndPoints that map url to scope for the data catalogue portlet instances

This resource is deployed at root level. It contains a list of "exceptions", i.e. how to manage catalogues at VO or root VO level.

DataCatalogueNamespace

SecondaryType = DataCatalogueNamespace
Name = Namespaces Catalogue Categories
Description = This resource defines namespaces for the catalogue categories

This resource has been created at root level. For gCat to work properly, it must be added to every scope where gCat is present.

Ckan

The organization to be assigned to the context must be created on Ckan via gCat by using Create Organization API.

Only a Catalogue-Manager (see Catalogue Roles) can create an organization.

Please note that gCat can create the organization properly only if it has already been added to the context and properly configured.

If you don't create the organization, gCat will not be able to manage items.

You can check whether gCat is properly configured, and what its configuration is, by using the Read Catalogue Configuration API.

Catalogue Badge

Update the following GR at ROOT VO level. It is used by Catalogue Badge

SecondaryType = ApplicationProfile
Name = DataCatalogueMapScopesUrls
Description = EndPoints that map url to scope for the data catalogue portlet instances

You need to add an entry of kind:

<EndPoint>
	<Scope>[THE SCOPE]</Scope>
	<URL>https://[GATEWAY-HOSTNAME]/group/[GATEWAY-NAME]-gateway</URL>
</EndPoint>

e.g. for /d4science.research-infrastructures.eu/SoBigData/TerritoriAperti

<EndPoint>
   <Scope>/d4science.research-infrastructures.eu/SoBigData/TerritoriAperti</Scope>
   <URL>https://territoriaperti.d4science.org/group/territoriaperti-gateway</URL>
</EndPoint>

You need to add the above entry for the Gateway https://territoriaperti.d4science.org where the Catalogue Badge is in use.

The URL https://territoriaperti.d4science.org/group/territoriaperti-gateway is built and used by ckan-util-library to get the VRE SCOPE (i.e. /d4science.research-infrastructures.eu/SoBigData/TerritoriAperti)

String clientURL = gatewaySiteURL+siteLandingPage;
String appPerScopeURL = ApplicationProfileScopePerUrlReader.getScopePerUrl(clientURL);

The scope is needed to discover, at VRE level, the property `SOLR_INDEX_ADDRESS` stored in the ServiceEndpoint `CKanDataCatalogue`.
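
The lookup performed by the two Java lines can be mimicked offline to verify that a new entry resolves to the expected scope. The sketch below is not the ckan-util-library implementation: it is a hypothetical Python re-creation that concatenates the gateway site URL and the landing page, as the Java fragment does, and searches the EndPoint entries for a matching URL. The wrapping Body element in the sample is an assumption.

```python
import xml.etree.ElementTree as ET

# Sample built from the TerritoriAperti entry shown above.
SAMPLE_BODY = """
<Body>
  <EndPoint>
    <Scope>/d4science.research-infrastructures.eu/SoBigData/TerritoriAperti</Scope>
    <URL>https://territoriaperti.d4science.org/group/territoriaperti-gateway</URL>
  </EndPoint>
</Body>
"""


def scope_per_url(body_xml, gateway_site_url, site_landing_page):
    """Return the scope mapped to gateway_site_url + site_landing_page, or None."""
    client_url = (gateway_site_url + site_landing_page).rstrip("/")
    root = ET.fromstring(body_xml.strip())
    for entry in root.iter("EndPoint"):
        url = (entry.findtext("URL") or "").strip().rstrip("/")
        if url == client_url:
            return (entry.findtext("Scope") or "").strip()
    return None


if __name__ == "__main__":
    print(scope_per_url(SAMPLE_BODY,
                        "https://territoriaperti.d4science.org",
                        "/group/territoriaperti-gateway"))
```

A None result for a gateway where the badge is expected to work suggests that the entry is missing or that its URL does not match the gateway landing page.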

Catalogue For GRSF

Update this configuration at ROOT VO level. This resource is used only to support GRSF social posts

SecondaryType = ApplicationProfile 
Name = Catalogue
Description = This is the Item Catalogue application profile for alerting items creation in the infrastructure catalogues
<Body><AppId>org.gcube.datacatalogue.ProductCatalogue</AppId>...

The above Generic Resource stored at ROOT VO level must be updated by adding an entry of kind:

<EndPoint>
	<Scope>[THE SCOPE]</Scope>
	<URL>[THE PORTLET URL TO THE GATEWAY IN ACT FOR THE SCOPE]</URL>
</EndPoint>

e.g. for /d4science.research-infrastructures.eu/FARM/GRSF_Admin

<EndPoint>
	<Scope>/d4science.research-infrastructures.eu/FARM/GRSF_Admin</Scope>
	<URL>https://i-marine.d4science.org/group/grsf_admin/data-catalogue</URL>
</EndPoint>

for the SCOPE where the Catalogue has been added.

Service Endpoint(s)

The following service endpoints are needed by the Service Catalogue to work.

CKanDataCatalogue

Category = Application
Name = CKanDataCatalogue
Description = A Tomcat Server hosting the ckan data catalogue

Among the other properties of the SE, these should be reported:

  • HostedOn (in RunTime) is the url of the ckan instance, e.g. ckan-d4s.d4science.org;
  • Username (in AccessData) is the username of the CKAN SYSAdmin;
  • Property URL_RESOLVER, whose value is equal to the url of the URI-RESOLVER in the context;
  • Encrypted property API_KEY, is the api key of the CKAN SYSAdmin;
  • SOCIAL_POST: (true/false) instructs gCat to create the social post in the VRE. If this property is not present, it is assumed to be false. The value can be overridden by the gCat client in the item creation request.
  • ALERT_USERS_ON_POST_CREATION: (true/false) instructs gCat to ask the social service to notify users about the generated social post. If this property is not present, it is assumed to be false.

CKanDatabase

Category = Database
Name = CKanDatabase
Description = A Postgres Server hosting the ckan database

Among the other properties of the SE, these should be reported:

  • HostedOn (in RunTime) is the machine hosting the postgres CKAN uses (e.g. ckan-pg-d4s.d4science.org);
  • EndPoint (in AccessPoint) is the machine URL hosting the postgres CKAN uses followed by the port number (e.g., ckan-pg-d4s.d4science.org:5432);
  • In AccessData please report the credentials (password must be encrypted) of the user allowed to access the database.

Please note that gCat needs to communicate directly with postgres, hence the gCat host must be enabled on the postgres installation.
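
A quick way to spot a blocked host is to try a plain TCP connection from the gCat machine to the EndPoint value registered in the Service Endpoint. The following sketch uses only the Python standard library; the function names are hypothetical and 5432 is assumed as the default port, as in the example above. A successful connection only shows that the port is reachable: it does not prove that postgres authentication accepts the gCat host.

```python
import socket


def split_endpoint(endpoint, default_port=5432):
    """Split 'host:port' into (host, port); fall back to default_port if absent."""
    host, sep, port = endpoint.rpartition(":")
    if not sep:
        return endpoint, default_port
    return host, int(port)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = split_endpoint("ckan-pg-d4s.d4science.org:5432")
    print(host, port, can_connect(host, port))
```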

Zenodo API

Category = Repository
Platform.Name = Zenodo

A service endpoint defining the Zenodo API address and credentials is expected in order to enable the "Upload to Zenodo" feature. Credentials may vary depending on the context.

Enable view per VRE

In order to enable this special view (which allows the catalogue portlet to render itself on a single organization), access the portal as administrator and enable a special custom field of the VRE. The custom field can be found, on the VRE Page, under "Admin > Pages > Configuration > Site Settings > Custom Field". Set it to true to enable the view.

gCat & SHUB & Catalogue

The GCat_Service must be authorized to operate in the VRE. Add the "gCat" user in the VRE as reported at https://gcube.wiki.gcube-system.org/gcube/StorageHub_REST_API#Add_User_To_Vre (this is a temporary solution; it will be replaced by WORKFLOW).

gCat & Uri-Resolver-Manager

In order to operate properly in a VRE, the GCat_Service uses the Uri-Resolver-Manager, so check that the required GR Uri-Resolver-Map is published at the VRE level.

News Feed & Catalogue

Once all the GR/RR needed to serve a VRE with an instance of the D4Science Catalogue have been added, an entry of the following kind must be added in order to publish social posts via the Social Networking Library:

<EndPoint>
	<Scope>[THE SCOPE]</Scope>
	<URL>[THE RELATIVE URL OF SCOPE SAVED IN THE GATEWAY]</URL>
</EndPoint>

e.g.

<EndPoint>
   <Scope>/d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry</Scope>
   <URL>/group/eoscpillarserviceregistry</URL>
</EndPoint>

into the following Generic Resource:

SecondaryType = ApplicationProfile
Name = News Feed

published at ROOT VO level.
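When several VREs have to be registered, the EndPoint entry can be generated rather than typed by hand. The following is a minimal sketch: the function name is hypothetical, and the only check it performs (the scope must start with "/") is an assumption based on the example above.

```python
import xml.etree.ElementTree as ET


def news_feed_endpoint(scope: str, relative_url: str) -> str:
    """Render one <EndPoint> entry for the News Feed Generic Resource."""
    if not scope.startswith("/"):
        raise ValueError("a scope is expected to start with '/'")
    endpoint = ET.Element("EndPoint")
    ET.SubElement(endpoint, "Scope").text = scope
    ET.SubElement(endpoint, "URL").text = relative_url
    # encoding="unicode" returns a str without the XML declaration.
    return ET.tostring(endpoint, encoding="unicode")


entry = news_feed_endpoint(
    "/d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry",
    "/group/eoscpillarserviceregistry",
)
print(entry)
```

The resulting element still has to be pasted into the body of the Generic Resource; the sketch does not publish anything on the IS.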

News Feed & gCat

see at: DataCatalogueNamespace

Accounting Dashboard & Catalogue

see the wiki page at Add_Google_Analytics_to_the_Accounting_Dashboard

SocialNetworking service

see the wiki page at Social Networking Library

Known Issues

The socialnetworking service must be restarted once Liferay is up.

GFeed (ABANDONWARE / DISMISSED)

The following is a list of minimal requirements for the execution of gFeed Service.

  • Database : the service needs a dedicated DB for its logic and looks in the current context for a DB registered as Service Endpoint with
    • Category : Database
    • Name : Feeder_DB
  • Common configuration : the service loads default plugin configurations from the IS by looking for a Generic Resource registered as
    • Secondary type : configuration
    • Name : gcat-feeder

The following parameters need to be customized for every context in which the resource is published :

Please keep in mind that depending on deployed plugins these requirements may not be enough.

GeoPortal

The following instructions explain how to configure the "Geoportale Nazionale per l'Archeologia".

Interactions among the engines.

[Figure: Geoportal-Service Workflow and Interactions with Engines]


NB : Please note that actual requirements for a specific service instance vary depending on which plugins extensions are deployed in that particular instance.

Health Checks

Three types of health checks are implemented: Service, Mongo, Database. They are exposed via a REST API whose responses comply with the https://microprofile.io/specifications/microprofile-health/ specification.

Service health : checks if the `geoportal` service is up at

https://{geoportal_endpoint}/geoportal-service/srv/health

e.g. in DEV at https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health

Mongo health : checks if the `geoportal` service is able to communicate with MongoDB instance at

https://{geoportal_endpoint}/geoportal-service/srv/health/mongo?context={GCUBE_CONTEXT}
or (to include the collections)
https://{geoportal_endpoint}/geoportal-service/srv/health/mongo?context={GCUBE_CONTEXT}&include_collections=true

e.g. in DEV at https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/mongo?context=/gcube/devsec/devVRE or (to include the collections) https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/mongo?context=/gcube/devsec/devVRE&include_collections=true

Database health : checks if the `geoportal` service is able to communicate with PostGIS instance at

https://{geoportal_endpoint}/geoportal-service/srv/health/database?context={GCUBE_CONTEXT}

e.g. in DEV at https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/database?context=/gcube/devsec/devVRE
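The three URLs share the same pattern, so they can be built by a small helper when scripting the checks. The sketch below only assembles the URLs documented above and interprets a response body under the assumption that it follows the MicroProfile Health layout (a JSON object with a top-level "status" equal to "UP" or "DOWN"). The function names are hypothetical and no HTTP call is made.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def health_url(endpoint: str, check: str = "", context: Optional[str] = None,
               include_collections: bool = False) -> str:
    """Build a health check URL; check is "", "mongo" or "database"."""
    url = f"https://{endpoint}/geoportal-service/srv/health"
    if check:
        url += f"/{check}"
    params = {}
    if context:
        params["context"] = context
    if include_collections:
        params["include_collections"] = "true"
    # safe="/" keeps the slashes of the gCube context unescaped, as in the examples.
    query = urlencode(params, safe="/")
    return f"{url}?{query}" if query else url


def is_up(response_body: str) -> bool:
    """Return True when a MicroProfile Health response reports the status UP."""
    return json.loads(response_body).get("status") == "UP"


print(health_url("geoportal.cloud-dev.d4science.org", "mongo",
                 "/gcube/devsec/devVRE", include_collections=True))
```

The returned URL can then be fetched with any HTTP client and the body passed to is_up.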

Service Requirements

This section states the resources needed by a vanilla geoportal-service instance (meaning no extension is deployed). Specific requirements of plugins extensions are reported in dedicated subsections.

Document Store

Geoportal service relies on a mongoDB instance registered in the gCube IS with the following coordinates :

  • Profile/Category : Database
  • Profile/Platform/Name : mongodb
  • Profile/AccessPoint//Property/Name : GNA_DB
  • Profile/AccessPoint//Property/Value : internal-db
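A quick way to verify that an exported Service Endpoint matches these coordinates is to test the same paths against its XML. The sketch below is an assumption-laden illustration: it expects a document whose root either is the Profile element or contains it as a direct child, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def matches_coordinates(resource_xml: str, category: str, platform: str,
                        prop_name: str, prop_value: str) -> bool:
    """Check Category, Platform/Name and one AccessPoint property of a profile."""
    root = ET.fromstring(resource_xml)
    profile = root if root.tag == "Profile" else root.find("Profile")
    if profile is None:
        return False
    if profile.findtext("Category") != category:
        return False
    if profile.findtext("Platform/Name") != platform:
        return False
    # "AccessPoint//Property" visits Property elements at any depth under AccessPoint.
    for prop in profile.findall("AccessPoint//Property"):
        if prop.findtext("Name") == prop_name and prop.findtext("Value") == prop_value:
            return True
    return False
```

For the Document Store the call would be matches_coordinates(xml, "Database", "mongodb", "GNA_DB", "internal-db").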

Fileset Archive

Geoportal service relies on gCube StorageHub in order to archive registered Filesets. Please refer to the specific section in this page.

UCDs

Current UCD provider implementation relies on a Generic Resource with the following coordinates in order to assess the available UCDs in a VRE :

  • Secondary Type : CMS
  • Name : UCDs

It is expected to declare links to UCD documents, which are then loaded into the application. Its body must be like the following example :

<UCDs>   
    <record label=.. ucdUrl=... />
    <record label=.. ucdUrl=... />
    ...
</UCDs>
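To double-check a body before publishing it, the declared records can be listed with a few lines of Python. This is a minimal sketch, not the service's own loader: the function name is hypothetical and it assumes that every record carries both the label and the ucdUrl attributes.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def list_ucds(body: str) -> List[Tuple[str, str]]:
    """Return (label, ucdUrl) pairs for every <record> found in a <UCDs> body."""
    root = ET.fromstring(body)
    if root.tag != "UCDs":
        raise ValueError("the root element is expected to be <UCDs>")
    pairs = []
    for record in root.iter("record"):
        label, url = record.get("label"), record.get("ucdUrl")
        if label is None or url is None:
            raise ValueError("each record needs both label and ucdUrl")
        pairs.append((label, url))
    return pairs
```

A body that is not well-formed XML makes ET.fromstring raise a ParseError, which is usually enough to catch copy and paste mistakes.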

NB : this resource is strictly dependent on the context, since it declares the supported project collections.

Service Plugins Requirements

In this section we report specific requirements introduced by plugins. Please note that plugins are optionally deployed in geoportal-service, so the following resources are needed only in specific cases.

SDI Plugins

NB : SDI Plugins exploit gCube SDI Resources available in the current context. In order to do this, the SDI should be properly configured in the context.

In summary :

  • SDI Materializer uses a Geoserver instance enabled with gCube Data Transfer service.
  • SDI Indexer uses a postgisDB and Geoserver

SDI Materializer

The SDI Materializer handles registered Filesets by creating layers in the context SDI's Geoserver. In order to do this, in most cases gCube Data Transfer service must be enabled in GeoServer. Please refer to the dedicated sections on SDI, GeoServer and gCube Data Transfer for further details.

SDI Indexer

SDI Indexer creates Geoserver layers in the current context SDI, representing centroids of some projects registered in geoportal-service. In order to do this, it needs a postgis database registered in the IS as a Service Endpoint with the following coordinates :

  • Profile/Category : Database
  • Profile/Platform/Name : postgis
  • Profile/AccessPoint//Property/Name : GNA_DB
  • Profile/AccessPoint//Property/Value : Concessioni

NB : The postgis database will be registered in GeoServer by the plugin, so the database should be reachable from GeoServer.

Please refer to dedicated sections on SDI and GeoServer.

Notification Plugins

The plugin requires a proper configuration in the UCD. Please refer to notifications-plugins to create it.

Other requirements:

1. a service account for geoportal named service-account-geoportal must be created at VRE level in KC (see more at https://support.d4science.org/issues/27108)

2. the clientId and the secret of the service-account-geoportal must be registered in the SE with coordinates:

Name: geoportal
Category: SystemWorkspaceClient

with the AccessPoint

<AccessPoint>
	<Description>service account credentials</Description>
	<Interface>
		<Endpoint EntryName="geoportal">none</Endpoint>
	</Interface>
	<AccessData>
		<Username>geoportal</Username>
		<Password>{ADD HERE THE SECRET}</Password>
	</AccessData>
</AccessPoint>

3. in order to send VRE post via social service, the service-account-geoportal requires a generic resource with coordinates:

Name: Geoportal
SecondaryType: ApplicationProfile

and body

<AppId>service-account-geoportal</AppId>
<ThumbnailURL>https://data.d4science.org/shub/E_OS9QOE5zcVl6UXJCcEsvUUFhMWFTRXY1OXh6TXhFbEplOERhNGhaZ1RLV1VBblErY3lxQW5RbXMrVEM1WC9UQQ==</ThumbnailURL>
<EndPoint>
    <Scope>{SCOPE}</Scope>
    <URL>{GEOPORTAL_DATA_ENTY_URL_IN_THE_SCOPE}</URL>
</EndPoint>

Please refer to https://support.d4science.org/issues/27108 for further details.

Catalogue Binding Plugins

The plugin requires a proper configuration in the UCD. Please refer to catalogue-binding-plugin to create it.

Other requirements:

1. a service account for geoportal named service-account-geoportal is required at VRE level in KC (see point 1 of the Notification Plugins section)

2. the service-account-geoportal must be able to operate with gCat at VRE level, so it must have the role of "Catalogue-Admin". Please assign it via KC.

Mapping from "Geoportal Project" to "Catalogue Dataset" for any UCD

Geoportal-Service requires the Geoportal_Resolver

The Geoportal-Service requires the Geoportal_Resolver published at VRE level. See the Geoportal_Resolver dependencies at Geoportal_Resolver or in the related ticket.

NB. The Geoportal_Resolver requires:

  • the "URI-Resolver" (gCoreEndpoint) published at VO level with option "authorizeChild"
  • the Runtime Resource named "HTTP-URL-Shortener-DL" at VRE level

Export (as PDF) Requirements

The Geoportal system provides the Export as PDF facility if properly configured in the scope. In order to configure it, see at:

GUI Requirements

The SE with the following coordinates has to be added in the proper VRE:

<Category>Service</Category>
<Name>HTTP-URL-Shortener-DL</Name>

For geoportal-data-viewer-app:

The Service Endpoints with the following coordinates have to be added in the proper VRE.

1 - It is used by GNA Viewer to retrieve the list of base maps that should be displayed in the Viewer:

<Category>Application</Category>
<Name>GNABaseMaps</Name>

2 - It is used by GNA Viewer to contact the Geoportal Service with guest/public access (from out of portal, no login required).

<Category>SystemClient</Category>
<Name>geoportal-data-viewer-app</Name>
<Description>IAM Client for geoportal-data-viewer-app</Description>

Generic Resources

For geoportal-data-entry-app:

1. All Generic Resources with

<SecondaryType>GeoNaMetadata</SecondaryType> 

or

<SecondaryType>GeoportalMetadata</SecondaryType> 

must be copied in the proper VRE. They are used by 'geoportal-data-entry-app' portlet (e.g. 'GNA-Data-Entry' in the context of GNA) to build dynamically the web-forms for data entries.

2. The Generic Resource with coordinates:

SecondaryType: ApplicationProfile 
Name: Geoportal-DataEntry-Configs

must be copied in the proper VRE. It is used by 'geoportal-data-entry-app' portlet (e.g. 'GNA-Data-Entry' in the context of GNA) to read the configurations: (i) the permissions on the operations for the roles (Data-Member, Data-Editor, Data-Manager), (ii) list of fields used by the searching facility.

For geoportal-data-viewer-app:

3. The Generic Resource (renamed from GeoNa-Viewer-Profile) with the following coordinates:

   <SecondaryType>ApplicationProfile</SecondaryType>
   <Name>Geoportal-DataViewer-Configs</Name>

   /*having in the body the following AppId*/

   <AppId>geoportal-data-viewer-app</AppId>

has to be copied in the proper VRE. It is used by 'geoportal-data-entry-app' and 'geoportal-data-viewer-app' portlets to read several configurations: (i) common info like portlet URLs in the VRE, (ii) the URL of the centroid layer/s, (iii) list of fields used by the searching facility, and so on.

4. The Generic Resource named "Namespaces Catalogue Categories" must be added in the proper VRE; it is required for the Metadata Form Builder:

see at https://gcube.wiki.gcube-system.org/gcube/ServiceManager_Guide#DataCatalogueNamespace

Resolvers

These are the resources that must be updated when changing the URI-Resolver balancer and/or its hostname:

ServiceEndpoints:

<Category>Service</Category>
<Name>HTTP-URI-Resolver</Name>

<Category>Service</Category>
<Name>Gis-Resolver</Name>

<Category>Application</Category>
<Name>Transect</Name>

<Category>Application</Category>
<Name>CKanDataCatalogue</Name>

<Category>Service</Category>
<Name>Analytics-Resolver</Name>

Generic Resources:

<SecondaryType>ApplicationProfile</SecondaryType>
<Name>Workspace-Explorer-App</Name>

<SecondaryType>ApplicationProfile</SecondaryType>
<Name>Gis Viewer Application</Name>