Difference between revisions of "Common-accounting-model ABANDONED"

From Gcube Wiki
(Data-access)
(Execution)
Line 27: Line 27:
  
 
* Execution layer is aware of:
 
* Execution layer is aware of:
Statuses of execution jobs (success/fail/pending)
+
Statuses of execution jobs (success/fail)
also GHN hosting node information of every execution node is available to Workflow, harvested through Registry, containing info such as location, cpu load (week, day, hour,...), memory, disk space etc.: the Workflow layer that is more abstract, constructing workflow plans, supporting various adaptors and is aware of jobs as a whole. There is also the Execution layer, also a Service, where the actual execution takes place and is aware of more detailed stuff.
+
also GHN hosting node information of every execution node is available to Workflow, harvested through Registry, containing info such as location, cpu load (week, day, hour,...), memory, disk space etc.
 +
 
  
 
==== Plan ====
 
==== Plan ====
Line 34: Line 35:
 
Specific Plan properties:
 
Specific Plan properties:
  
* cores : the number of a vm's cores that get occupied is based on either the process is multithreaded or not.
+
* adaptorInUse : adaptor in use for the job (e.g. search, data-transformation, etc.)
* inputFilesNumber : this info could be extracted at workflow layer.
+
* vmsUsed : number of the VMs used by the job.
* inputFilesSize : not know at workflow layer, before execution starts, as files are transferred from different sources. Available at execution layer.
+
* jobId : a unique identifier for the job
* jobId, jobName, jobStart, jobEnd, jobStatus: This info could be extracted out of progress report of a job, or directly from every execution engine at execution layer.
+
* jobName : name of the job
* outputFilesNumber, outputFilesSize : same as input.
+
* jobStart : the instant the job starts running
* overallNetworkIn, overallNetworkOut : depends on process demands.
+
* jobEnd : the instant the job ends its execution
* processors : number of processors used per job.
+
* jobStatus: completed/failed
* wallDuration : duration between the instant the job started running and the instant the job ended its execution.
+
* wallDuration : duration between the instant the job starts running and the instant the job ends its execution.
 +
* cores : number of available cores per job.
 +
* processors : number of available processors per job.
  
 
==== Execution Engine ====
 
==== Execution Engine ====
Line 51: Line 54:
 
* usageStart : the earliest usage time of the Execution Engine
 
* usageStart : the earliest usage time of the Execution Engine
 
* usageEnd: the latest usage time of the Execution Engine
 
* usageEnd: the latest usage time of the Execution Engine
* usagePhase: Completed/Ready/Paused/Running/Cancel
+
* usagePhase: completed/failed
 +
* inputFilesNumber : number of input files to the Execution Engine
 +
* inputFilesSize : size of input files to the Execution Engine
 +
* outputFilesNumber : number of output files from the Execution Engine
 +
* outputFilesSize : size of output files from the Execution Engine
 +
* overallNetworkIn : overhead of the input traffic over the network to the Execution Engine
 +
* overallNetworkOut : overhead of the output traffic over the network from the Execution Engine
  
 
=== Service ===
 
=== Service ===

Revision as of 16:00, 23 May 2013

Scope

This library contains the definition of the resource accounting record.

Data-model

The structure of a generic accounting record (Usage Record, UR) will be composed of a set of common fields for all resource types, in particular:

  • id : a unique identifier for the UR
  • consumerId : the user actually consuming the resource (optional, for future purposes)
  • createTime : when the UR was created
  • startTime, endTime : the time window the UR refers to
  • resourceType : the type of resource the UR tracks
  • scope : the scope of the resource
  • resourceOwner : who owns the resource and/or who creates the UR

Furthermore, for each UR there will be a section to be filled with the specific properties per resource type (key-value pairs).
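
As an illustration only, the common fields and the per-type key-value section could be modelled as follows. This is a minimal Python sketch, not the library's actual definition: the class name `UsageRecord`, the constant `RESOURCE_TYPES`, the use of a UUID for `id`, the string-only property values and the two validation rules are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional
import uuid

# The four resource types identified in this document.
RESOURCE_TYPES = {"Execution", "Service", "Data-access", "Storage"}


@dataclass
class UsageRecord:
    """Hypothetical model of a generic Usage Record (UR)."""
    resourceType: str                 # type of resource the UR tracks
    scope: str                        # scope of the resource
    resourceOwner: str                # who owns the resource / creates the UR
    startTime: datetime               # start of the time window
    endTime: datetime                 # end of the time window
    consumerId: Optional[str] = None  # optional, for future purposes
    # Assumption: a random UUID is used as the unique identifier.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    createTime: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    # Resource-type-specific properties as key-value pairs.
    properties: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumed sanity checks; the source does not mandate them.
        if self.resourceType not in RESOURCE_TYPES:
            raise ValueError(f"unknown resource type: {self.resourceType}")
        if self.endTime < self.startTime:
            raise ValueError("endTime must not precede startTime")
```

Keeping the type-specific data in a plain dictionary mirrors the "key-value pairs" wording above and lets one record class serve every resource type.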

Resource Types

The resource types we've identified are: Execution, Service, Data-access and Storage.

Execution

The Execution resource type has two sub-types, reflecting the structure of PE2ng, which is composed of two main layers. The Workflow layer is the more abstract one: it constructs workflow plans, supports various adaptors and is aware of jobs as a whole. The Execution layer, which is also a Service, is where the actual execution takes place, and it is aware of finer-grained details.

Discriminating those layers:

  • Workflow layer is aware of:
      • the number of jobs submitted and the adaptors that were used
      • the execution nodes that will be used (scale out) per job

  • Execution layer is aware of:
      • the statuses of execution jobs (success/fail)
      • the GHN hosting node information of every execution node, which is harvested through the Registry and is also available to the Workflow layer; it contains info such as location, CPU load (week, day, hour, ...), memory, disk space, etc.


Plan

Specific Plan properties:

  • adaptorInUse : adaptor in use for the job (e.g. search, data-transformation, etc.)
  • vmsUsed : number of the VMs used by the job.
  • jobId : a unique identifier for the job
  • jobName : name of the job
  • jobStart : the instant the job starts running
  • jobEnd : the instant the job ends its execution
  • jobStatus: completed/failed
  • wallDuration : duration between the instant the job starts running and the instant the job ends its execution.
  • cores : number of available cores per job.
  • processors : number of available processors per job.
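
The Plan properties above could be assembled as in the following Python sketch. The function name `plan_properties`, its parameters, the ISO 8601 timestamp encoding and the choice of whole seconds for `wallDuration` are assumptions; the source does not specify units or encodings.

```python
from datetime import datetime
from typing import Dict


def plan_properties(job_id: str, job_name: str, adaptor: str,
                    vms_used: int, job_start: datetime, job_end: datetime,
                    succeeded: bool, cores: int,
                    processors: int) -> Dict[str, str]:
    """Build the Plan-specific key-value pairs (hypothetical helper)."""
    if job_end < job_start:
        raise ValueError("jobEnd must not precede jobStart")
    # wallDuration is the time elapsed between jobStart and jobEnd.
    # Assumption: expressed in whole seconds (fractions are truncated).
    wall_seconds = int((job_end - job_start).total_seconds())
    return {
        "adaptorInUse": adaptor,
        "vmsUsed": str(vms_used),
        "jobId": job_id,
        "jobName": job_name,
        "jobStart": job_start.isoformat(),
        "jobEnd": job_end.isoformat(),
        "jobStatus": "completed" if succeeded else "failed",
        "wallDuration": str(wall_seconds),
        "cores": str(cores),
        "processors": str(processors),
    }
```

Deriving `wallDuration` from the two timestamps, rather than accepting it as an input, keeps the three fields consistent by construction.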

Execution Engine

Specific Execution Engine properties:

  • refHost : hostname of the vm
  • refVM : Execution Engine resource id or gHN id
  • usageStart : the earliest usage time of the Execution Engine
  • usageEnd: the latest usage time of the Execution Engine
  • usagePhase: completed/failed
  • inputFilesNumber : number of input files to the Execution Engine
  • inputFilesSize : size of input files to the Execution Engine
  • outputFilesNumber : number of output files from the Execution Engine
  • outputFilesSize : size of output files from the Execution Engine
  • overallNetworkIn : overhead of the input traffic over the network to the Execution Engine
  • overallNetworkOut : overhead of the output traffic over the network from the Execution Engine
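
To make the usage-window and file-count fields concrete, the sketch below derives them from raw observations. It is a hedged Python example: the function name `engine_usage_properties`, the representation of usage as (start, end) intervals, and the reading of the file-size fields as total bytes are all assumptions not stated in the source.

```python
from datetime import datetime
from typing import Dict, Iterable, Sequence, Tuple


def engine_usage_properties(
        intervals: Iterable[Tuple[datetime, datetime]],
        input_sizes: Sequence[int],
        output_sizes: Sequence[int]) -> Dict[str, str]:
    """Derive some Execution Engine properties (hypothetical helper)."""
    intervals = list(intervals)
    if not intervals:
        raise ValueError("at least one usage interval is required")
    # usageStart is the earliest start, usageEnd the latest end.
    usage_start = min(start for start, _ in intervals)
    usage_end = max(end for _, end in intervals)
    return {
        "usageStart": usage_start.isoformat(),
        "usageEnd": usage_end.isoformat(),
        "inputFilesNumber": str(len(input_sizes)),
        # Assumption: sizes are summed and expressed in bytes.
        "inputFilesSize": str(sum(input_sizes)),
        "outputFilesNumber": str(len(output_sizes)),
        "outputFilesSize": str(sum(output_sizes)),
    }
```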

Service

Specific Service properties:

  • callerIP :
  • invocationCount :
  • averageInvocationTime :
  • serviceClass :
  • serviceName :

Data-access

Specific Data-access properties:

  • sourceId: the identifier of the Tree Manager source which is the target of a read/write operation
  • operation : the name of the read/write operation performed via the Tree Manager over a given source
  • treeId : the identifier of a tree within the data source which is the target of a given read/write operation performed via the Tree Manager
  • treeCount : the number of trees within the data source which are accessed/written as the result of a given read/write operation performed via the Tree Manager

Storage

Specific Storage properties:

  • operationType : GET, PUT (update or new file), DELETE
  • targetFile : remote full path of the storage resource
  • fileDimension : size of the storage resource
  • serviceClass: service class used by the client of the storage library at the initialization time of the library
  • serviceName: service name used by the client of the storage library at the initialization time of the library
  • hostname: hostname of the host where the storage library is invoked
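
A possible way to build the Storage properties while restricting `operationType` to the three listed values is sketched below in Python. The names `OperationType` and `storage_properties`, and the assumption that `fileDimension` is a non-negative byte count, are illustrative and not taken from the storage library.

```python
from enum import Enum
from typing import Dict


class OperationType(Enum):
    """The operation types listed for the Storage resource type."""
    GET = "GET"
    PUT = "PUT"        # update or new file
    DELETE = "DELETE"


def storage_properties(operation_type: str, target_file: str,
                       file_dimension: int, service_class: str,
                       service_name: str, hostname: str) -> Dict[str, str]:
    """Build the Storage-specific key-value pairs (hypothetical helper)."""
    # Enum lookup by value raises ValueError for anything but GET/PUT/DELETE.
    operation = OperationType(operation_type)
    # Assumption: fileDimension is a size in bytes and cannot be negative.
    if file_dimension < 0:
        raise ValueError("fileDimension must be non-negative")
    return {
        "operationType": operation.value,
        "targetFile": target_file,
        "fileDimension": str(file_dimension),
        "serviceClass": service_class,
        "serviceName": service_name,
        "hostname": hostname,
    }
```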