How to use the Data Miner Pool Manager

Revision as of 16:12, 3 April 2017


DMPM (Data Miner Pool Manager) is a REST service that rationalizes and automates the process of publishing SAI algorithms on DataMiner nodes and keeping the DataMiner cluster updated.

==Overview==

The service accepts an algorithm descriptor, including its dependencies (OS, R, and custom packages), queries the IS for the DataMiners in the current scope, generates (via templating) the Ansible playbook, inventory, and roles for the relevant components (algorithm installer, algorithms, dependencies), and executes the playbook on a DataMiner. In this sense, the service accepts as input, among others, the URL of an algorithm package (including the jar and metadata), extracts the information needed for the installation, installs the script, and asynchronously returns the execution outcome to the caller.
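As a purely illustrative sketch of what the templating step might produce, the fragment below shows the general shape of an Ansible inventory and play; the host name, role names, and comments are assumptions for illustration, not the service's actual output:

<source lang="text">
# inventory (hosts obtained by querying the IS for DataMiners in the current scope)
[dataminers]
dataminer1-d-d4s.d4science.org   # hypothetical host name

# playbook (generated via templating)
- hosts: dataminers
  roles:
    - algorithm-installer    # installs the algorithm package (hypothetical role name)
    - os-dependencies        # dependencies declared with the os: prefix
    - cran-dependencies      # dependencies declared with the cran: prefix
    - github-dependencies    # dependencies declared with the github: prefix
</source>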


==Usage==

DMPM is a SmartGears compliant service. An instance has already been deployed and configured at Development level.

<source lang="text">
http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest/
</source>

In order to allow Ansible to access the DataMiner host, the SSH key of the host where the service is deployed must be correctly configured (authorized) at the DataMiner host level.


==Requirements==

The dependencies declared in the metadata file inside the package must respect the following guidelines:

- R dependencies must have the prefix '''cran:'''

- OS dependencies must have the prefix '''os:'''

- Custom dependencies must have the prefix '''github:'''

If no prefix is specified, the service treats the dependency as an OS one.
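For illustration, a hypothetical dependency list in the metadata file might look as follows; the package names are invented examples, and the exact metadata file layout is not shown here:

<source lang="text">
os:sshpass
cran:data.table
github:example-user/example-installer
libxml2          # no prefix: treated as an OS dependency
</source>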


==Usage and APIs==

Currently the service exposes the following REST methods:

===Adding an Algorithm to a DataMiner===

This functionality installs the algorithm on the specified DataMiner and returns the id of the log, useful to monitor the execution.

<source lang="text">
addAlgorithmToHost(algorithm, hostname, name, description, category, algorithmType, skipJava)
</source>

<source lang="java">
@GET
@Path("/hosts/add")
@Produces("text/plain")
public String addAlgorithmToHost(
    @QueryParam("algorithm") String algorithm,
    @QueryParam("hostname") String hostname,
    @QueryParam("name") String name,
    @QueryParam("description") String description,
    @QueryParam("category") String category,
    @QueryParam("algorithmType") String algorithmType,
    @QueryParam("skipJava") String skipJava) throws IOException, InterruptedException {
  Algorithm algo = this.getAlgorithm(algorithm, null, hostname, name, description, category, algorithmType, skipJava);
  //service.addAlgToIs(algo);
  return service.addAlgorithmToHost(algo, hostname);
}
</source>


It is possible to distinguish between mandatory parameters and optional ones.

Mandatory:

- '''algorithm''': the URL of the algorithm package.

- '''hostname''': the hostname of the DataMiner on which to deploy the script.

Optional (these parameters can be extracted from the metadata file, where available, or overridden by the caller):

- '''name''': name of the algorithm (e.g., ICHTHYOP_MODEL_ONE_BY_ONE)

- '''description''': description of the algorithm

- '''category''': category to which the algorithm belongs (e.g., ICHTHYOP_MODEL)

- '''algorithmType''': set to "transducerers" by default

- '''skipJava''': set to "N" by default
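Since the method is a plain GET, a client only needs to build the request URL from these parameters. The sketch below shows one way to do that, omitting optional parameters that are not supplied; the class and method names are illustrative, not part of the service:

<source lang="java">
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical client-side helper that builds the /hosts/add request URL. */
public class AddAlgorithmRequest {

    /** Builds the full GET URL; null optional parameters are simply omitted. */
    public static String buildUrl(String serviceBase, String gcubeToken,
                                  String algorithm, String hostname,
                                  String name, String category) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("gcube-token", gcubeToken);
        params.put("algorithm", algorithm);   // mandatory: URL of the algorithm package
        params.put("hostname", hostname);     // mandatory: target DataMiner
        if (name != null) params.put("name", name);           // optional
        if (category != null) params.put("category", category); // optional
        StringBuilder sb = new StringBuilder(serviceBase).append("/hosts/add?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) sb.append('&');
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the assembled URL; token and package URL are placeholders.
        System.out.println(buildUrl(
            "http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest",
            "TOKEN_ID", "http://example.org/algo.zip", "TARGET_DATAMINER",
            "ICHTHYOP_MODEL_ONE_BY_ONE", null));
    }
}
</source>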


An example of the usage is the following:

<source lang="text">
http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest/hosts/add?gcube-token=TOKEN_ID&algorithm=URL_TO_ALGORITHM&hostname=TARGET_DATAMINER
</source>

===Monitoring the execution===

'''getLogById''': asynchronously returns the details of the execution.


<source lang="java">
@GET
@Path("/log")
@Produces("text/plain")
public String getLogById(@QueryParam("logUrl") String logUrl) throws IOException {
  LOGGER.debug("Returning Log =" + logUrl);
  return service.getScriptFromURL(service.getURLfromWorkerLog(logUrl));
}
</source>


<source lang="text">
http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest/log?gcube-token=**********************************&logUrl=log_id
</source>
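Because the outcome is returned asynchronously, a client typically polls this endpoint until the log indicates completion. The sketch below shows the polling logic with the HTTP GET abstracted behind a function, so it can be exercised with a stub; the class name and the completion marker are assumptions, since the exact log format is not specified here:

<source lang="java">
import java.util.function.Function;

/** Hypothetical client-side polling of the /log endpoint. */
public class LogPoller {

    /**
     * Repeatedly fetches the log for the given id and returns the first content
     * containing the completion marker, or null after maxAttempts.
     * The fetcher abstracts the HTTP GET to .../rest/log?logUrl=<id>.
     */
    public static String pollUntilDone(Function<String, String> fetcher, String logId,
                                       String completionMarker, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            String log = fetcher.apply(logId);
            if (log != null && log.contains(completionMarker)) {
                return log;
            }
            // A real client would sleep between attempts, e.g. Thread.sleep(5000).
        }
        return null; // still running (or failed) after maxAttempts
    }

    public static void main(String[] args) {
        // Stub fetcher simulating a log that completes on the third poll.
        final int[] calls = {0};
        Function<String, String> stub =
            id -> (++calls[0] < 3) ? "RUNNING" : "PLAY RECAP: ok";
        // Prints the completed log on the third simulated poll.
        System.out.println(pollUntilDone(stub, "log_id", "PLAY RECAP", 10));
    }
}
</source>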

-- "addAlgorithmToVRE" returning immediatly the ID of the log

skipjava ...


-- "getLogById" returning asynchronously the detail of the execution