Difference between revisions of "How to use the Data Miner Pool Manager"

From Gcube Wiki
Revision as of 15:23, 3 April 2017


Data Miner Pool Manager

DMPM is a REST service that rationalizes and automates the process of publishing SAI algorithms on DataMiner nodes and keeps the DataMiner cluster up to date.

Overview

The service accepts an algorithm descriptor, including its dependencies (OS, R, and custom packages), queries the IS for DataMiners in the current scope, generates (via templating) the Ansible playbook, inventory, and roles for the relevant components (algorithm installer, algorithms, dependencies), and executes the Ansible playbook on a DataMiner. Accordingly, the service accepts as input, among other parameters, the URL of an algorithm package (including the jar and its metadata), performs the installation, and asynchronously returns the execution outcome to the caller.
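To make the templating step more concrete, the following is a minimal sketch of how an Ansible inventory in INI format could be rendered from a list of DataMiner hostnames. It is not the service's actual template: the class name `InventorySketch`, the group name `dataminers`, and the example hostnames are hypothetical and chosen only for illustration.

```java
import java.util.List;

final class InventorySketch {

    // Renders a one-group Ansible INI inventory: a "[group]" header
    // followed by one hostname per line.
    static String renderInventory(String group, List<String> hosts) {
        StringBuilder sb = new StringBuilder();
        sb.append('[').append(group).append("]\n");
        for (String host : hosts) {
            sb.append(host).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical hostnames; in the service they would come from the IS query.
        System.out.print(renderInventory("dataminers",
                List.of("dm1.example.org", "dm2.example.org")));
    }
}
```

In the real service the list of hosts would be the result of the IS query for DataMiners in the current scope.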


Usage

DMPM is a SmartGear-compliant service. An instance has already been deployed and configured at the Development level:

http://node2-d-d4s.d4science.org:8080

For Ansible to access a DataMiner host, the SSH key of the host where the service is deployed must be correctly configured on that DataMiner host.


API

Currently the service exposes the following REST methods:

  @GET
  @Path("/hosts/add")
  @Produces("text/plain")
  public String addAlgorithmToHost(
      @QueryParam("algorithm") String algorithm, 
      @QueryParam("hostname") String hostname,
      @QueryParam("name") String name,
      @QueryParam("description") String description,
      @QueryParam("category") String category,
      @QueryParam("algorithmType") String algorithmType,
      @QueryParam("skipJava") String skipJava) throws IOException, InterruptedException {
    Algorithm algo= this.getAlgorithm(algorithm, null, hostname, name, description, category, algorithmType, skipJava);
    //service.addAlgToIs(algo);
    return service.addAlgorithmToHost(algo, hostname);
  }

Mandatory parameters

Optional parameters

The call returns the ID of the execution log.


An example of usage is the following:

http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest/hosts/add?gcube-token=TOKEN_ID&algorithm=URL_TO_ALGORITHM&hostname=TARGET_DATAMINER
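The same request can also be issued programmatically. The sketch below builds the `/hosts/add` URL with URL-encoded query parameters and sends it with the standard `java.net.http.HttpClient` (Java 11 or later). The class name `DmpmAddClient`, the helper names, and the example algorithm URL and hostname are assumptions made for illustration; only the base path and the parameter names are taken from the example URL above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class DmpmAddClient {

    // Base path copied from the example URL; adjust for other deployments.
    static final String BASE =
        "http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest";

    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Builds the /hosts/add request URI; values are encoded because the
    // algorithm parameter is itself a URL.
    static URI buildAddUri(String token, String algorithmUrl, String hostname) {
        return URI.create(BASE + "/hosts/add"
            + "?gcube-token=" + enc(token)
            + "&algorithm=" + enc(algorithmUrl)
            + "&hostname=" + enc(hostname));
    }

    // Sends a GET request and returns the plain-text response body.
    static String get(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Hypothetical values; only the URI is printed here. Calling get(uri)
        // requires a reachable instance and a valid token.
        URI uri = buildAddUri("TOKEN_ID",
            "http://example.org/algo.zip", "dataminer.example.org");
        System.out.println(uri);
    }
}
```

Keeping URL construction separate from the network call makes it possible to check the request without contacting a deployed instance.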


getLogById asynchronously returns the details of the execution:


  @GET
  @Path("/log")
  @Produces("text/plain")
  public String getLogById(@QueryParam("logUrl") String logUrl) throws IOException {
    // TODO Auto-generated method stub
    LOGGER.debug("Returning Log =" + logUrl);
    return service.getScriptFromURL(service.getURLfromWorkerLog(logUrl));
  }


http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest/log?gcube-token=**********************************&logUrl=log_id
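A client would typically take the ID returned by `/hosts/add` and pass it as the `logUrl` parameter of `/log`. The following sketch shows that second step. The class name `DmpmLogClient` and the pluggable `transport` function are assumptions introduced here so the example can run without a live instance; in practice the transport would perform a real HTTP GET.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

final class DmpmLogClient {

    // Base path copied from the example URL; adjust for other deployments.
    static final String BASE =
        "http://node2-d-d4s.d4science.org:8080/dataminer-pool-manager-1.0.0-SNAPSHOT/rest";

    // Builds the /log request URI for a given log ID.
    static URI buildLogUri(String token, String logId) {
        return URI.create(BASE + "/log"
            + "?gcube-token=" + URLEncoder.encode(token, StandardCharsets.UTF_8)
            + "&logUrl=" + URLEncoder.encode(logId, StandardCharsets.UTF_8));
    }

    // Fetches the log through the supplied transport (URI -> response body).
    static String fetchLog(String token, String logId,
                           Function<URI, String> transport) {
        return transport.apply(buildLogUri(token, logId));
    }

    public static void main(String[] args) {
        // Fake transport that only echoes the URI it would have requested.
        Function<URI, String> fake = uri -> "requested " + uri;
        System.out.println(fetchLog("TOKEN_ID", "log_id", fake));
    }
}
```

Because the outcome is produced asynchronously, a caller may need to repeat the request until the log reflects the completed execution; how completion is signalled is not specified here.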

Parameters can be the same as those in the metadata, or they can be overridden.
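One plausible reading of this rule is that values from the algorithm's metadata act as defaults and any query parameter that is actually supplied replaces the corresponding value. The sketch below illustrates that reading only; it is not the logic of the service's `getAlgorithm` method, which is not shown on this page. The class name `ParameterOverride` and the sample keys and values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParameterOverride {

    // Starts from the metadata values and overwrites them with every
    // query parameter that is present (non-null and non-empty).
    static Map<String, String> resolve(Map<String, String> metadata,
                                       Map<String, String> queryParams) {
        Map<String, String> result = new LinkedHashMap<>(metadata);
        for (Map.Entry<String, String> e : queryParams.entrySet()) {
            String value = e.getValue();
            if (value != null && !value.isEmpty()) {
                result.put(e.getKey(), value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("name", "MY_ALGORITHM");   // hypothetical metadata values
        metadata.put("category", "TEST");
        Map<String, String> query = Map.of("category", "PRODUCTION");
        System.out.println(resolve(metadata, query));
    }
}
```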


-- "addAlgorithmToVRE": immediately returns the ID of the log

skipjava ...


-- "getLogById": asynchronously returns the details of the execution