Statistical Algorithms Importer: Java Project

From Gcube Wiki
Revision as of 15:32, 1 June 2020 by Giancarlo.panichi (Talk | contribs)


This page explains how to create a Java project using two alternative approaches: Black-box and White-box integration. The next sections explain how each approach works and which cases it addresses.

Black Box Integration

Java Project, SAI

This is the preferred way for developers who want the executions of their processes to be distributed according to the request load. Each process request runs on one dedicated machine and is allowed to use multi-core processing. Black-box processes usually do not use the e-Infrastructure resources but "live on their own". The Statistical Algorithms Importer (SAI) portlet must be used for this integration.
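As an illustration of the multi-core allowance mentioned above, the following minimal sketch spreads a computation over the cores of the machine using a parallel stream. The class name MultiCoreSum and the sum-of-squares workload are hypothetical and serve only as an example; they are not part of SAI or gCube.

```java
import java.util.stream.LongStream;

// Hypothetical example class (not part of SAI): uses all available cores
// of the dedicated machine through a parallel stream.
class MultiCoreSum {

  // Sums the squares of 1..n; the parallel stream splits the range
  // across the threads of the common fork/join pool.
  static long sumOfSquares(long n) {
    return LongStream.rangeClosed(1, n).parallel().map(i -> i * i).sum();
  }

  public static void main(String[] args) {
    int cores = Runtime.getRuntime().availableProcessors();
    System.out.println("Available cores: " + cores);
    System.out.println("Sum of squares up to 1000: " + sumOfSquares(1_000));
  }
}
```

Whether parallel execution actually speeds up a process depends on the workload; for small inputs a sequential loop is usually just as fast.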

Project Configuration

Define project's metadata
Java Info, SAI
Add the input and output parameters, then select the .jar file and click the "Set Code" button to mark it as the main file to execute.
Important: the fully qualified name of the class to run (including the package path) must be given as the FIRST parameter. It should also be marked as a System parameter, so that it appears neither in the GUI nor among the user's inputs.
For example, if the SimpleProducer class belongs to the org.gcube.dataanalysis package, the default value of the ClassToRun parameter would be org.gcube.dataanalysis.SimpleProducer. If the class is in the default package, as in the SimpleProducer example in the Example Code section, this parameter is not needed.
Java I/O, SAI
Add information about the running environment (e.g. Java version etc.)
Java Interpreter, SAI
After the software creation phase, a Main.R file and a Target folder are created.
Java Create, SAI
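
To clarify what "full class path (including the package path)" means for the ClassToRun parameter, the following sketch prints the fully qualified name of a class as Java itself reports it. The class QualifiedNameDemo is hypothetical and only illustrates the naming rule; it is not needed for the integration.

```java
// Hypothetical helper class, used only to illustrate fully qualified names.
class QualifiedNameDemo {

  // Returns the package-qualified name of a class, which is the form
  // described above for the ClassToRun parameter.
  static String qualifiedName(Class<?> type) {
    return type.getName();
  }

  public static void main(String[] args) {
    // A class inside a package: the name includes the package path.
    System.out.println(qualifiedName(java.util.ArrayList.class));
    // A class in the default package is reported by its simple name only.
    System.out.println(qualifiedName(QualifiedNameDemo.class));
  }
}
```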

Example Code

Java code in sample:
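import java.io.File;
import java.io.FileWriter;

/**
 * Writes the first command-line argument to program.txt.
 *
 * @author Giancarlo Panichi
 */
public class SimpleProducer
{
  public static void main(String[] args)
  {
    // try-with-resources closes the writer even if write() fails
    try (FileWriter fw = new FileWriter(new File("program.txt")))
    {
      fw.write("Check: " + args[0]);
    }
    catch (Exception e)
    {
      // Also reached when no argument is passed (args[0] is missing)
      e.printStackTrace();
    }
  }
}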

Example Download

File:JavaBlackBox.zip

Inheritance of Global and Infrastructure Variables

At each run of the process, a globalvariables.csv file is created in the working directory of the process (i.e. it can be read as ./globalvariables.csv). It contains the following global variables, which allow the process to contact the e-Infrastructure services:

  • gcube_username (the user who ran the computation, e.g. gianpaolo.coro)
  • gcube_context (the VRE the process was run in, e.g. /d4science.research-infrastructures.eu/gCubeApps/RPrototypingLab)
  • gcube_token (the token of the user for the VRE, e.g. 1234-567-890)

The format of the CSV file is like the one of the following example:

"globalvariable","globalvalue"
"gcube_username","gianpaolo.coro"
"gcube_context","/d4science.research-infrastructures.eu/gCubeApps/RPrototypingLab"
"gcube_token","1234-567-890"
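
A process can load these variables with a few lines of Java. The following is a minimal sketch, assuming the file has exactly the two-column quoted layout shown above, is encoded as UTF-8, and contains no commas in variable names; the class GlobalVariablesReader and its methods are hypothetical helpers, not part of the gCube libraries.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of gCube): loads globalvariables.csv into a map.
class GlobalVariablesReader {

  // Parses lines of the form "name","value" and skips the header row.
  static Map<String, String> parse(List<String> lines) {
    Map<String, String> variables = new LinkedHashMap<>();
    for (String line : lines) {
      String trimmed = line.trim();
      // Split on the first comma only (assumes names contain no commas).
      int comma = trimmed.indexOf(',');
      if (comma < 0) {
        continue; // blank or malformed line
      }
      String name = unquote(trimmed.substring(0, comma));
      String value = unquote(trimmed.substring(comma + 1));
      if (name.equals("globalvariable")) {
        continue; // header row
      }
      variables.put(name, value);
    }
    return variables;
  }

  // Removes one pair of surrounding double quotes, if present.
  static String unquote(String field) {
    String s = field.trim();
    if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
      return s.substring(1, s.length() - 1);
    }
    return s;
  }

  public static void main(String[] args) throws IOException {
    List<String> lines =
        Files.readAllLines(Paths.get("globalvariables.csv"), StandardCharsets.UTF_8);
    Map<String, String> variables = parse(lines);
    System.out.println("User: " + variables.get("gcube_username"));
    System.out.println("Context: " + variables.get("gcube_context"));
    // The token is a credential, so it is deliberately not printed.
  }
}
```

The gcube_token value read this way should be treated as a secret and kept out of logs and output files.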

White Box Integration

This is the preferred way for developers who want their processes to fully exploit the e-Infrastructure resources, for example to implement Cloud computing using the e-Infrastructure computational resources. This integration modality also allows full reuse of the Java data mining frameworks integrated by DataMiner, i.e. Knime, RapidMiner, Weka and gCube EcologicalEngine. The Eclipse IDE should be used for this integration.

Step-by-step guide to integrate Java processes as white boxes