Process Optimisation


THIS SECTION OF GCUBE DOCUMENTATION IS OBSOLETE.

Introduction

The gCube Process Optimisation Services (POS) implement core functionality, in the form of libraries and web services, for process scheduling and execution planning. The CSEngine uses gCube POS to deliver optimised process execution in the context of a VRE.

Implementation Overview

POS comprises a core optimisation library (POSLib) and two Web Services (RewriterService and PlannerService) that expose part of the library's functionality. POSLib implements three core components of process optimisation:

Rewriter

Provides structural optimisation of a process. It receives a BPEL process as input, analyses its structure, identifies independent invocations and arranges them in parallel constructs (BPEL flow elements) to accelerate the overall process execution. This is the first optimisation step and takes place before the process arrives at the execution engine.
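The idea of grouping independent invocations can be sketched as follows. This is a minimal illustration, not the POSLib Rewriter: the Invocation model (a name plus the variables it reads and writes) and the FlowGrouping class are hypothetical, and only read-after-write dependencies are considered. Invocations that end up in the same stage do not consume each other's output, so they are candidates for a single BPEL flow element.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of a BPEL invoke activity: a name plus the variables it reads and writes. */
final class Invocation {
    final String name;
    final Set<String> inputs;
    final Set<String> outputs;

    Invocation(String name, List<String> inputs, List<String> outputs) {
        this.name = name;
        this.inputs = new HashSet<>(inputs);
        this.outputs = new HashSet<>(outputs);
    }
}

final class FlowGrouping {
    /**
     * Splits a sequence of invocations into stages. Each invocation is placed in the
     * earliest stage that comes after every stage producing one of its inputs.
     * Only read-after-write dependencies are tracked (an assumption of this sketch).
     */
    static List<List<String>> stages(List<Invocation> sequence) {
        List<List<String>> stages = new ArrayList<>();
        Map<String, Integer> producedAt = new HashMap<>(); // variable -> stage that writes it
        for (Invocation inv : sequence) {
            int stage = 0;
            for (String in : inv.inputs) {
                Integer producer = producedAt.get(in);
                if (producer != null) {
                    stage = Math.max(stage, producer + 1);
                }
            }
            while (stages.size() <= stage) {
                stages.add(new ArrayList<String>());
            }
            stages.get(stage).add(inv.name);
            for (String out : inv.outputs) {
                producedAt.put(out, stage);
            }
        }
        return stages;
    }
}
```

For a sequence in which two lookups both read the query and a third invocation merges their results, the two lookups share the first stage and the merge forms the second.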

Planner

Performs pre-planning of the process execution. It receives an abstract BPEL process and generates various scheduling plans for execution. Generating an executable plan means that all references to abstract services are replaced by invocations of concrete, instantiated services in a gCube infrastructure. The Planner uses information provided by the IS, which holds up-to-date metrics for the resources employed in the grid (machines, services, etc.). This information is the input to various cost functions that calculate the individual execution cost of a candidate plan. The best plans are selected by a custom implementation of the Simulated Annealing algorithm. The outcome of the planning is a set of executable BPEL processes that are passed to the gCube execution engine. Cost calculation can be guided by various weighted optimisation policies that the author of the BPEL process (human or application) passes inside the BPEL description.
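The plan selection step can be illustrated with a generic Simulated Annealing loop. This is a textbook sketch under stated assumptions, not the custom POS implementation: the cost matrix (cost[i][h] is the cost of running abstract service i on host h), the co-location penalty, the cooling schedule and the AnnealingPlanner class are all hypothetical.

```java
import java.util.Random;

final class AnnealingPlanner {
    /** Total cost of a plan: per-service costs plus a penalty for every pair sharing a host. */
    static double planCost(double[][] cost, int[] plan, double coLocationPenalty) {
        double total = 0.0;
        for (int i = 0; i < plan.length; i++) {
            total += cost[i][plan[i]];
            for (int j = 0; j < i; j++) {
                if (plan[j] == plan[i]) {
                    total += coLocationPenalty;
                }
            }
        }
        return total;
    }

    /** Returns the cheapest plan seen while annealing; plan[i] is the host index for service i. */
    static int[] anneal(double[][] cost, double coLocationPenalty, long seed) {
        Random rnd = new Random(seed);
        int services = cost.length;
        int hosts = cost[0].length;
        int[] current = new int[services]; // start with every service on host 0
        double currentCost = planCost(cost, current, coLocationPenalty);
        int[] best = current.clone();
        double bestCost = currentCost;

        // Geometric cooling: worse plans are accepted less often as the temperature drops.
        for (double t = 10.0; t > 1e-3; t *= 0.95) {
            for (int step = 0; step < 50; step++) {
                int[] candidate = current.clone();
                candidate[rnd.nextInt(services)] = rnd.nextInt(hosts); // move one service
                double candidateCost = planCost(cost, candidate, coLocationPenalty);
                double delta = candidateCost - currentCost;
                if (delta <= 0 || rnd.nextDouble() < Math.exp(-delta / t)) {
                    current = candidate;
                    currentCost = candidateCost;
                    if (currentCost < bestCost) {
                        best = current.clone();
                        bestCost = currentCost;
                    }
                }
            }
        }
        return best;
    }
}
```

Because the loop remembers the best plan seen so far, the returned plan is never more expensive than the starting plan. A real planner would return several low-cost plans rather than one.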

ActivePlanner

Provides run-time optimised scheduling of a gCube process. It is invoked by the execution engine before any invocation activity to ensure that the plan generated by the Planner (during pre-planning) is still valid (e.g. the selected service end-point is still reachable) and optimal (according to the user-defined optimisation policies). If either of these criteria is violated, the ActivePlanner re-evaluates the optimal service instance for the current process invocation. It can also work when no pre-planning is available.
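The validity and optimality check described above could look roughly like this. The sketch is illustrative only: the Candidate and ActiveCheck classes are hypothetical, and reachability and cost are assumed to have been measured already.

```java
import java.util.List;

/** Hypothetical view of one Running Instance at invocation time. */
final class Candidate {
    final String endpoint;
    final boolean reachable;
    final double cost;

    Candidate(String endpoint, boolean reachable, double cost) {
        this.endpoint = endpoint;
        this.reachable = reachable;
        this.cost = cost;
    }
}

final class ActiveCheck {
    /**
     * Keeps the pre-planned endpoint if it is still reachable and still the cheapest
     * option; otherwise returns the cheapest reachable candidate. Passing null as
     * preplannedEndpoint models the case where no pre-planning is available.
     * Returns null when no candidate is reachable.
     */
    static Candidate choose(String preplannedEndpoint, List<Candidate> current) {
        Candidate best = null;
        Candidate preplanned = null;
        for (Candidate c : current) {
            if (!c.reachable) {
                continue; // an unreachable instance is never a valid choice
            }
            if (best == null || c.cost < best.cost) {
                best = c;
            }
            if (c.endpoint.equals(preplannedEndpoint)) {
                preplanned = c;
            }
        }
        if (preplanned != null && preplanned.cost <= best.cost) {
            return preplanned;
        }
        return best;
    }
}
```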


The Rewriter and the Planner are also available as Web Services. The ActivePlanner is available only as part of POSLib.

Optimisation Policies

The Planner and ActivePlanner components perform optimised scheduling of abstract BPEL processes based on user-defined policies. Optimisation policies are declared within the BPEL document and can apply to individual partnerLinkTypes or to the whole process.

In a BPEL document, partnerLinkTypes define the classes of Web Services that can participate in multiple roles in a process. A particular instantiation of a partnerLinkType inside the process is denoted by a partnerLink element definition. A process may include several different partnerLinks of the same partnerLinkType, participating in the process with different roles. In practice a partnerLink is the Running Instance whose operations can be invoked during the execution of a process.

The selection of the specific Running Instance to be used in a particular process invocation is driven by the optimisation policy applied either at the process level or at the partnerLink level. Currently POS supports six different optimisation policies:

  • Host load: In this policy the gHN with the lowest system load is used for scheduling the invocation.
  • Fastest CPU: In this policy the gHNs are ranked based on their CPU capabilities and the best one is used for scheduling the invocation.
  • Memory Utilization: gHNs are ranked according to the percentage of available memory as reported by the Java VM. The gHN with the highest percentage is selected.
  • Storage Utilization: gHNs are ranked according to their total available space. The gHN with the most available space is preferred.
  • Reliability: The gHNs are ranked based on their total uptime. The gHN that has been up and running for the longest period is ranked highest. The idea behind this policy is that a gHN that hasn't gone offline for a long period of time is less likely to go down while a process invocation takes place.
  • Network Utilization: This is a so-called "whole plan" optimisation policy. When the Planner evaluates multiple possible scheduling plans, it shows preference to those plans where the Running Instances are located close to each other (based on the reported gHN locality information). Note, though, that the Planner tries to avoid co-scheduling invocations on the same gHN, so as not to overload it. Co-scheduling is a last resort, used only when no other gHNs are available.
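The five per-node policies above each rank gHNs by one metric, which can be sketched as a scoring function per policy. The GhnMetrics fields and the NodePolicy enum are hypothetical names for illustration; they do not reflect the actual POSLib cost functions. Network Utilization is left out because it scores a whole plan rather than a single gHN.

```java
import java.util.List;

/** Hypothetical snapshot of the metrics published for one gHN. */
final class GhnMetrics {
    final String id;
    final double systemLoad;
    final double cpuScore;
    final double freeMemoryPercent;
    final long freeStorageBytes;
    final long uptimeSeconds;

    GhnMetrics(String id, double systemLoad, double cpuScore,
               double freeMemoryPercent, long freeStorageBytes, long uptimeSeconds) {
        this.id = id;
        this.systemLoad = systemLoad;
        this.cpuScore = cpuScore;
        this.freeMemoryPercent = freeMemoryPercent;
        this.freeStorageBytes = freeStorageBytes;
        this.uptimeSeconds = uptimeSeconds;
    }
}

/** Per-node policies; a higher score means a more preferable gHN. */
enum NodePolicy {
    HOST_LOAD { double score(GhnMetrics g) { return -g.systemLoad; } },
    FASTEST_CPU { double score(GhnMetrics g) { return g.cpuScore; } },
    MEMORY_UTILIZATION { double score(GhnMetrics g) { return g.freeMemoryPercent; } },
    STORAGE_UTILIZATION { double score(GhnMetrics g) { return g.freeStorageBytes; } },
    RELIABILITY { double score(GhnMetrics g) { return g.uptimeSeconds; } };

    abstract double score(GhnMetrics g);

    /** Returns the highest-scoring gHN, or null for an empty list. */
    GhnMetrics best(List<GhnMetrics> nodes) {
        GhnMetrics winner = null;
        for (GhnMetrics g : nodes) {
            if (winner == null || score(g) > score(winner)) {
                winner = g;
            }
        }
        return winner;
    }
}
```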

One additional optimisation policy, Monetary Cost, has been reserved but is not yet fully implemented. It instructs the Planner to select the best Running Instances based on the price charged by the RI provider. Currently, though, all services in the two user community infrastructures established with gCube (EM and FARM) are provided free of charge.

Abstract and Concrete Service References

In gCube we distinguish three categories of partnerLinkTypes:

  • concreteGCubeService: A static reference to a specific Running Instance. The Planner will not try to reschedule the assignment of this partnerLinkType. Whenever such a partnerLink is declared in the process, the static reference to the Running Instance is used in the relevant invocations. The user provides a static URL inside the partnerLink element, which is used in every process invocation that involves this partnerLink.
  • concreteExternalService: This in practice is similar to the above but points to a Web Service outside a gCube infrastructure. Thus this service is not a Running Instance of any gCube service deployed in the VRE.
  • abstractGCubeService: These are partnerLinkTypes whose partnerLinks can be rescheduled at any time either by the Planner or the ActivePlanner. The selection of the specific Running Instance to use depends on the optimisation policies declared in the BPEL document and of course on the current state of the VRE infrastructure as it is reflected by the information provided from the IS.
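The difference between the three categories comes down to whether the planner may replace the endpoint. A minimal sketch, in which the ServiceCategory enum and its endpointFor method are hypothetical names and not part of POSLib:

```java
/** The three partnerLinkType categories; only abstract services may be rescheduled. */
enum ServiceCategory {
    CONCRETE_GCUBE_SERVICE(false),
    CONCRETE_EXTERNAL_SERVICE(false),
    ABSTRACT_GCUBE_SERVICE(true);

    final boolean reschedulable;

    ServiceCategory(boolean reschedulable) {
        this.reschedulable = reschedulable;
    }

    /**
     * Picks the endpoint to invoke. staticUrl is the URL declared on the partnerLink,
     * plannedUrl is the Running Instance chosen by the Planner or ActivePlanner
     * (both parameters are assumptions of this sketch).
     */
    String endpointFor(String staticUrl, String plannedUrl) {
        if (reschedulable) {
            if (plannedUrl == null) {
                throw new IllegalStateException("no Running Instance selected yet");
            }
            return plannedUrl;
        }
        if (staticUrl == null) {
            throw new IllegalArgumentException("a concrete service needs a static URL");
        }
        return staticUrl;
    }
}
```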

BPEL Optimisation Extensions

gCube POS functionality depends heavily on the Business Process Execution Language (BPEL) standard. The notation used to represent processes is based on BPEL v1.1. The standard has been extended to include optimisation information such as policy information per partnerLink, the definition of abstract or concrete services, allocation relationships between invocations, etc. The XML schema of the extended BPEL 1.1 can be found here.

Possible values of optimisation policy attribute list

Below is the XML schema of the policy values that can be used in a BPEL document. The names of the policies are self-explanatory and refer to the policies described in the previous paragraphs.

<simpleType name="policyValues">
        <restriction base="string">
		<enumeration value="host_load"/>
		<enumeration value="fastest_cpu"/>
		<enumeration value="network_utilization"/>
		<enumeration value="memory_utilization"/>
		<enumeration value="storage_utilization"/>
		<enumeration value="reliability"/>
		<enumeration value="monetary_cost"/>
	</restriction>
</simpleType>

Note that the order of the policy definitions is important. For instance a policy "network_utilization reliability" will try to satisfy first the requirement for Network utilisation optimisation and then for Reliability. Formally speaking, the order of the policies defines the weight of the respective individual execution cost when the Planner is calculating the total plan cost.
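How the order of the list might translate into weights can be sketched as follows. The actual weighting scheme used by the Planner is not documented here, so the linearly decreasing, normalised weights below are purely an assumption, as are the PolicyWeights class and the per-policy cost map.

```java
import java.util.Map;

final class PolicyWeights {
    /** Assumed scheme: weights n, n-1, ..., 1, normalised so that they sum to 1. */
    static double[] weights(int n) {
        double[] w = new double[n];
        double sum = n * (n + 1) / 2.0;
        for (int i = 0; i < n; i++) {
            w[i] = (n - i) / sum;
        }
        return w;
    }

    /**
     * Combines individual execution costs into a total plan cost.
     * policyList is the space-separated attribute value, earlier names weigh more.
     */
    static double totalCost(String policyList, Map<String, Double> costByPolicy) {
        String[] names = policyList.trim().split("\\s+");
        double[] w = weights(names.length);
        double total = 0.0;
        for (int i = 0; i < names.length; i++) {
            Double cost = costByPolicy.get(names[i]);
            if (cost == null) {
                throw new IllegalArgumentException("no cost for policy " + names[i]);
            }
            total += w[i] * cost;
        }
        return total;
    }
}
```

Under this assumed scheme, with individual costs of 0.3 for network_utilization and 0.9 for reliability, the list "network_utilization reliability" gives a total of about 0.5, while the reversed list gives about 0.7.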

Process-wide policy definition

To define process-wide optimisation policies, use the optimisationPolicy attribute of the BPEL process element. The attribute is a list of optimisation policy names separated by spaces. For example, the process defined by the BPEL excerpt below will be scheduled according to the fastest_cpu policy (with higher weight) and the storage_utilization policy (with lower weight).

<process optimisationPolicy="fastest_cpu storage_utilization" xmlns="http://schemas.xmlsoap.org/ws/2003/03/business-process/" xmlns:plnk="http://schemas.xmlsoap.org/ws/2003/05/partner-link/" xmlns:tns="http://diligentproject.org/searchservice/diligentprocess" targetNamespace="http://diligentproject.org/searchservice/diligentprocess" name="BPELDiligentProcessJAXB1826584379" abstractProcess="no" xmlns:jxb="http://java.sun.com/xml/ns/jaxb" xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://schemas.xmlsoap.org/ws/2003/03/business-process/
C:\Development\workspace\ProcessOptimisation\etc\schema\bpel+.xsd">
	<partnerLinkTypes>
		<partnerLinkType name="BPELDiligentProcess">
			<role name="BPELDiligentProcessProvider"> 
				<portType serviceType="concreteDiligentService" name="tns:BPELDiligentProcess"/>
			</role>
		</partnerLinkType>
		<partnerLinkType name="fulltextindexlookupserviceLT">
			<role name="fulltextindexlookupserviceRole">
				<portType xmlns:fulltextindexlookupservice="http://diligentproject.org/namespaces/index/FullTextIndexLookupService" serviceType="concreteDiligentService" name="fulltextindexlookupservice:FullTextIndexLookupPortType"/>
			</role>

...

If no policy is defined the default used is the host_load optimisation policy.
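Reading the process-wide policy list, including the host_load default, can be sketched with the standard Java DOM API. This is an illustration only: POSLib itself parses BPEL documents with JAXB, and the ProcessPolicyReader class is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

final class ProcessPolicyReader {
    /** Returns the policies of the root process element, or [host_load] if none are declared. */
    static List<String> processPolicies(String bpelXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bpelXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            // getAttribute returns an empty string when the attribute is absent
            String raw = root.getAttribute("optimisationPolicy").trim();
            if (raw.isEmpty()) {
                return Collections.singletonList("host_load");
            }
            return Arrays.asList(raw.split("\\s+"));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse BPEL document", e);
        }
    }
}
```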

Service specific policy definition

The policies defined at the process-wide level apply to the planning of all partnerLinks included in the process, unless a partnerLink-specific policy is defined on the partnerLink element. To define such a policy, use the partnerLinkPolicyType attribute of the BPEL partnerLink element. The usage is similar to that of the policy definitions at the process level. Again, if no policy is defined, the Planner uses the host_load policy by default. If the network_utilization policy is used here, it is ignored, because it applies only at the process level and not at the partnerLink level.

Below is an excerpt from a BPEL partnerLinks definition that demonstrates the above.

...
<partnerLinks>
	<partnerLink partnerLinkType="tns:BPELDiligentProcess" name="client" myRole="BPELDiligentProcessProvider"/>
	<partnerLink xmlns:fulltextindexlookupservice="http://diligentproject.org/namespaces/index/FullTextIndexLookupService" partnerRole="fulltextindexlookupserviceRole" partnerLinkType="fulltextindexlookupservice:fulltextindexlookupserviceLT" name="fulltextindexlookupservicePLfulltextindexlookupserviceLT0" partnerLinkPolicyType="fastest_cpu"/>
	<partnerLink xmlns:sortoperatorservice="http://diligentproject.org/namespaces/searchservice/SortOperatorService" partnerRole="sortoperatorserviceRole" partnerLinkType="sortoperatorservice:sortoperatorserviceLT" name="sortoperatorservicePLsortoperatorserviceLT" partnerLinkPolicyType="storage_utilization fastest_cpu"/>
	<partnerLink xmlns:keeptopoperatorservice="http://diligentproject.org/namespaces/searchservice/KeepTopOperatorService" partnerRole="keeptopoperatorserviceRole" partnerLinkType="keeptopoperatorservice:keeptopoperatorserviceLT" name="keeptopoperatorservicePLkeeptopoperatorserviceLT" partnerLinkPolicyType="fastest_cpu"/>
	<partnerLink xmlns:transformbyxsltoperatorservice="http://diligentproject.org/namespaces/searchservice/TransformByXSLTOperatorService" partnerRole="transformbyxsltoperatorserviceRole" partnerLinkType="transformbyxsltoperatorservice:transformbyxsltoperatorserviceLT" name="transformbyxsltoperatorservicePLtransformbyxsltoperatorserviceLT"/>
	<partnerLink xmlns:joininneroperatorservice="http://diligentproject.org/namespaces/searchservice/JoinInnerOperatorService" partnerRole="joininneroperatorserviceRole" partnerLinkType="joininneroperatorservice:joininneroperatorserviceLT" name="joininneroperatorservicePLjoininneroperatorserviceLT"/>
	<partnerLink xmlns:filterresultsetbyxpathoperatorservice="http://diligentproject.org/namespaces/searchservice/FilterResultSetByXPathOperatorService" partnerRole="filterresultsetbyxpathoperatorserviceRole" partnerLinkType="filterresultsetbyxpathoperatorservice:filterresultsetbyxpathoperatorserviceLT" name="filterresultsetbyxpathoperatorservicePLfilterresultsetbyxpathoperatorserviceLT"/>
</partnerLinks>
...
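The precedence rules for a single partnerLink can be summarised in a few lines of code. This is a sketch with hypothetical names (EffectivePolicy, resolve). It also assumes that a partnerLink whose only declared policy is network_utilization falls back to the process-wide policies, which the text above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class EffectivePolicy {
    /** Splits a space-separated attribute value; null or blank yields an empty list. */
    private static List<String> split(String attribute) {
        List<String> names = new ArrayList<>();
        if (attribute != null && !attribute.trim().isEmpty()) {
            names.addAll(Arrays.asList(attribute.trim().split("\\s+")));
        }
        return names;
    }

    /**
     * partnerLinkPolicy is the partnerLinkPolicyType attribute value (may be null),
     * processPolicy is the optimisationPolicy attribute value (may be null).
     */
    static List<String> resolve(String partnerLinkPolicy, String processPolicy) {
        List<String> own = split(partnerLinkPolicy);
        // network_utilization is a whole-plan policy and is ignored at the partnerLink level
        own.removeAll(Collections.singleton("network_utilization"));
        if (!own.isEmpty()) {
            return own;
        }
        List<String> inherited = split(processPolicy);
        if (!inherited.isEmpty()) {
            return inherited;
        }
        return Collections.singletonList("host_load");
    }
}
```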

Dependencies

POSLib depends on the following components

  • ResourceManager - All queries to the gCore-based Information System are performed through the ResourceManager, taking advantage of the caching functionality that the component implements.
  • Java Architecture for XML Binding (JAXB) - Sun's reference implementation of JAXB is used by the Planner to parse BPEL documents and to extract optimisation related information. It is also used by the Rewriter for reading and reconstructing BPEL processes in order to optimise them.
  • gCore - As with most gCube components, POSLib depends on gCore, not only because it provides the container where the PlannerService and RewriterService are deployed, but also because it indirectly provides access to a set of supporting libraries that are used extensively in various components of the library.

Usage Example

The following subsections contain usage examples for the three main POS components (Planner, Rewriter and ActivePlanner). Although in the current gCube architecture POS is exploited only by the CSEngine, the implemented classes and web services can be used by any other component that needs to optimise BPEL processes. Apart from the three main components described below, the rest of the POS functionality, such as the Cost Functions and other utility classes, can also be used by other sub-systems of a VRE. Interested developers should consult the POS API documentation, included in the POS binaries distribution, for further information.

BPEL Static optimisation

Below is an example of how the Rewriter component can be used, through the POSLib, to structurally optimise a BPEL document.

// java.io classes used by this fragment (the POSLib class imports are omitted)
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileWriter;

// Create a new rewriter
BPELRewriter rewriter = new BPELRewriter();

// Read the BPEL content into a String
FileInputStream fis = new FileInputStream("/samplepath/bpelsample.xml");
int x = fis.available();
byte[] b = new byte[x];
fis.read(b);
fis.close();
String bpelContent = new String(b);

/* Create a StringBuffer to store the content of the BPEL document */
StringBuffer bpelContents = new StringBuffer(bpelContent);

/* Create a new BPELDocument using the contents of the above StringBuffer */
BPELDocument bpelDocument = new BPELDocument(bpelContents);

// Optimise the structure of the process
bpelDocument = (BPELDocument) rewriter.optimiseProcess(bpelDocument);

// Write the output to output.xml
FileWriter fileWriter = new FileWriter("/samplepath/output.xml");
BufferedWriter buffWriter = new BufferedWriter(fileWriter);
buffWriter.write(bpelDocument.getBpelContents().toString());
buffWriter.close();

Process Pre-planning and Dynamic planning using the ActivePlanner

The following example demonstrates how the Planner and ActivePlanner can be used, within a gCube Web-Service, to produce a list of execution plans for a given BPEL Process, and to perform dynamic planning for specific invocations inside the BPEL document.

/* Pass BPEL content to a String */
FileInputStream fis = new FileInputStream("/samplepath/testbpel.xml");
int x= fis.available();
byte b[]= new byte[x];
fis.read(b);
String bpelContent = new String(b);
 
/* Pass the Service's Context to a ServiceContextContainer */	
ServiceContextContainer sc = new ServiceContextContainer(ServiceContext.getContext());
GCubePlanner planner = new GCubePlanner(sc);
 
/* Create stringBuffer to store the content of the bpel document */
StringBuffer bpelContents = new StringBuffer(bpelContent);
 
/* Create a new bpelDocument using the contents of the above StringBuffer */
BPELDocument bpelDocument = new BPELDocument(bpelContents);
 
/* Create a new bpelProcess */
BPELProcess bpelProcess = new BPELProcess();
 
/* Set the bpelDocument of the process */
bpelProcess.setBpelDoc(bpelDocument);
 
//PRE-PLANNING	
BPELPlanList planlist = planner.createPlan(bpelProcess);
 
// print the output 
if(planlist.size() == 0){
	System.out.println("PlanList has no plans...");
}
for(int i=0; i < planlist.size(); i++)
{	
	System.out.println("----------------------------------------------------------------");
	System.out.println("Cost: " + ((DoubleCost)planlist.get(i).getPlanCost()).getValue());
	for(int j=0; j < planlist.get(i).getPlanPairs().size(); j++)
	{
		System.out.println("PortType: " + planlist.get(i).getPlanPairs().get(j).partnerLink.getPartnerLinkType().getPortType());
		System.out.println("DHN ID: " + planlist.get(i).getPlanPairs().get(j).resource.getEndPoint());			
	}
}
 
// create an active planner using the Service's Context
GCubeActivePlanner activePlanner = new GCubeActivePlanner(sc);
 
// "someActivityId" is the name attribute value of the <invoke> element of the invocation for which we perform the active-planning(inside the BPEL document) 
String currentActivity = "someActivityId";
 
RunningInstance ri = null;
 
// refers to the 10th partnerLinkType inside the BPEL document
int target = 9;
ArrayList<RunningInstance> ris = new ArrayList<RunningInstance>();
 
//get the currently best Running Instance for Porttype of the 10th PartnerlinkType, with the given policy	
System.out.println("before evaluateNewRunningInstance: " + planlist.get(0).getPlanPairs().get(target).resource.getEndPoint() + " with cost: " + ((DoubleCost)planlist.get(0).getPlanPairs().get(target).resource.getCost()).getValue());
ri = activePlanner.evaluateNewRunningInstance(planlist.get(0).getPlanPairs().get(target).partnerLink.getPartnerLinkType().getPortType(), planlist.get(0).getPlanPairs().get(target).partnerLink.getPolicyType());
System.out.println("returned: " + ri.getEndPoint() + " with cost: " + ((DoubleCost)planlist.get(0).getPlanPairs().get(target).resource.getCost()).getValue());
 
//get the Running Instances for	Porttype of the 10th PartnerlinkType, from each plan returned from pre-planning	
PlanList planlist2 = TypeConversions.stubPlan2objectPlan(planlist, bpelContent);
ris = (ArrayList<RunningInstance>)activePlanner.extractPreplannedRunningInstances(planlist2, planlist.get(0).getPlanPairs().get(target).partnerLink.getPartnerLinkType().getPortType());
for(int i=0; i < ris.size(); i++)
	System.out.println("extractPreplannedRunningInstances: " + ris.get(i).getEndPoint()+ " with cost: " + ((DoubleCost)planlist.get(0).getPlanPairs().get(target).resource.getCost()).getValue());
 
//get the best Running Instance for the given invocation. If the planlist from pre-planning is still fresh, it is used to produce the result 
ri = activePlanner.suggestNext(planlist2, planlist.get(0).getPlanPairs().get(target).partnerLink.getPartnerLinkType().getPortType(), currentActivity);
System.out.println("returned: " + ri.getEndPoint()+ " with cost: " + ((DoubleCost)planlist.get(0).getPlanPairs().get(target).resource.getCost()).getValue());
 
//get the currently best Running Instance for Porttype of the 10th PartnerlinkType. The policy is retrieved through the provided bpelContent	
ri = activePlanner.suggestNext(planlist.get(0).getPlanPairs().get(target).partnerLink.getPartnerLinkType().getPortType(), bpelContent, currentActivity);
System.out.println("returned: " + ri.getEndPoint()+ " with cost: " + ((DoubleCost)planlist.get(0).getPlanPairs().get(target).resource.getCost()).getValue());
 
//given a previous suggestion of a Running Instance, find a new Running Instance if the previous suggestion is no longer preferable
//(the result is assigned to ri so that the line below prints the new suggestion; this assumes the overload returns a RunningInstance like the ones above)
URI uri = new URI(planlist.get(0).getPlanPairs().get(target).resource.getEndPoint().toString());
ri = activePlanner.suggestNext(uri, planlist.get(0).getPlanPairs().get(target).resource.getCost(), planlist.get(0).getPlanPairs().get(target).partnerLink.getPartnerLinkType().getPortType(), planlist.get(0).getPlanPairs().get(target).resource.getGhn().getId(), (Policy)planlist.get(0).getPlanPairs().get(target).partnerLink.getPolicyType());
System.out.println("returned: " + ri.getEndPoint()+ " with cost: " + ((DoubleCost)planlist.get(0).getPlanPairs().get(target).resource.getCost()).getValue());
 
//find the best Running Instances for each of the PartnerLink in the given List
LinkedList<PartnerLink> li = new LinkedList<PartnerLink>();
li.add(planlist.get(0).getPlanPairs().get(target).partnerLink);
activePlanner.findBestRunningInstance(li, (Policy)planlist.get(0).getPlanPairs().get(target).partnerLink.getPolicyType());
System.out.println("returned: " + ri.getEndPoint()+ " with cost: " + ((DoubleCost)planlist.get(0).getPlanPairs().get(target).resource.getCost()).getValue());

Dynamic planning using the ActivePlanner

[coming soon]