Difference between revisions of "GCube ResultSet (gRS)"
(→Usage Example) |
|||
(32 intermediate revisions by 2 users not shown) | |||
Line 12: | Line 12: | ||
− | * The ResultSet | + | * The ResultSet creates a linked list of pages holding the records. Pages are also referred to as parts, whereas records make up the content of the referenced data. The ResultSet can iterate over the created parts and perform various operations on the structure of the list as well as on the content it holds. Its operations are mostly part-specific, but they can also affect the entire record chain. Once a part has been created and the next one in line has been initialized, that part cannot be altered. Once the authoring of a result set has been declared finished, its entire chain is immutable. |
* The ResultSet Service is merely a WS front-end to the ResultSet library. It creates and manages WS-Resources holding references to instances of the ResultSet components. These instances are responsible for holding state regarding the iteration over the chain of pages and for accessing / modifying / querying the data held. | * The ResultSet Service is merely a WS front-end to the ResultSet library. It creates and manages WS-Resources holding references to instances of the ResultSet components. These instances are responsible for holding state regarding the iteration over the chain of pages and for accessing / modifying / querying the data held. | ||
* The ResultSet Client is a library giving access to both high and low level operations that can be performed on a specific ResultSet component instance and its underlying data. It enables handling of ResultSets that are both wrapped through a WS front-end and that are locally manipulated through direct Java invocations (within the JVM). | * The ResultSet Client is a library giving access to both high and low level operations that can be performed on a specific ResultSet component instance and its underlying data. It enables handling of ResultSets that are both wrapped through a WS front-end and that are locally manipulated through direct Java invocations (within the JVM). | ||
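The linked list of parts and its immutability rules can be pictured with a small, self-contained sketch. This is not gRS code: the class PartChainSketch, its nested Part class and every method name are hypothetical, and records are modelled as plain strings. It only assumes what the list above states, namely that a part is frozen once its successor exists and that the whole chain is frozen once authoring is declared finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/* Hypothetical illustration only; none of these names belong to the gRS library. */
class PartChainSketch {

    /* One part (page) of the chain: a batch of records plus a link to the next part. */
    static final class Part {
        private final List<String> records = new ArrayList<>();
        private Part next;
        private boolean sealed;

        List<String> getRecords() { return Collections.unmodifiableList(records); }
        Part getNext() { return next; }
        boolean isSealed() { return sealed; }
    }

    private final int recsPerPart;
    private final Part head = new Part();
    private Part tail = head;
    private boolean closed;

    PartChainSketch(int recsPerPart) {
        if (recsPerPart < 1) throw new IllegalArgumentException("recsPerPart must be at least 1");
        this.recsPerPart = recsPerPart;
    }

    /* Appends a record, starting a new part when the current one is full. */
    void addRecord(String record) {
        if (closed) throw new IllegalStateException("the result set has been closed");
        if (tail.records.size() == recsPerPart) {
            Part fresh = new Part();
            tail.sealed = true;   // the next part now exists, so this one is frozen
            tail.next = fresh;
            tail = fresh;
        }
        tail.records.add(record);
    }

    /* Declares authoring finished; from now on the whole chain is immutable. */
    void close() {
        tail.sealed = true;
        closed = true;
    }

    Part getHead() { return head; }

    /* Walks the chain from the head and counts its parts. */
    int countParts() {
        int count = 0;
        for (Part p = head; p != null; p = p.getNext()) count++;
        return count;
    }

    public static void main(String[] args) {
        PartChainSketch rs = new PartChainSketch(2);
        for (String r : new String[] {"r1", "r2", "r3"}) rs.addRecord(r);
        rs.close();
        for (Part p = rs.getHead(); p != null; p = p.getNext()) {
            System.out.println(p.getRecords());
        }
    }
}
```

A reader in this model simply follows getNext() from the head, which mirrors iterating over parts; the real library additionally persists parts and exposes them through locators.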
Line 42: | Line 42: | ||
==== Creating an RS ==== | ==== Creating an RS ==== | ||
− | /*A method that creates a new RSXMLWriter | + | The simplest way to create an RS is through a writer (here RSXMLWriter). |
+ | |||
+ | <source lang="java5"> | ||
+ | /*A method that creates a new RSXMLWriter*/ | ||
public static RSXMLWriter createRSWriter() throws Exception{ | public static RSXMLWriter createRSWriter() throws Exception{ | ||
− | |||
RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(); | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(); | ||
+ | ..... | ||
+ | /*One should ALWAYS close the RS he is creating in order to make the full payload available to readers*/ | ||
+ | writer.close(); | ||
+ | return writer; | ||
+ | } | ||
+ | </source> | ||
+ | Alternatively one can choose the part size and the records placed in each part: | ||
+ | |||
+ | <source lang="java5"> | ||
+ | /*A method that creates a new RSXMLWriter*/ | ||
+ | public static RSXMLWriter createRSWriter() throws Exception{ | ||
/*Create a new writer which should have as a condition for paging inserted results either having | /*Create a new writer which should have as a condition for paging inserted results either having | ||
30 records per page or having a total size surpassing 1024 bytes */ | 30 records per page or having a total size surpassing 1024 bytes */ | ||
− | + | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(30,1024); |
+ | ..... | ||
+ | /*One should ALWAYS close the RS he is creating in order to make the full payload available to readers*/ | ||
+ | writer.close(); | ||
+ | return writer; | ||
+ | } | ||
+ | </source> | ||
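The paging condition mentioned in the comment above (a part is complete at 30 records or once its size surpasses 1024 bytes) amounts to a simple disjunction. The sketch below is an assumption about how such a check could look, not the library's actual logic: PagingConditionSketch and partIsFull are hypothetical names, and the exact moment at which gRS evaluates the condition is not stated on this page.

```java
import java.nio.charset.StandardCharsets;

/* Hypothetical illustration only; not part of the gRS API. */
class PagingConditionSketch {
    private final int maxRecords;
    private final int maxBytes;

    PagingConditionSketch(int maxRecords, int maxBytes) {
        this.maxRecords = maxRecords;
        this.maxBytes = maxBytes;
    }

    /* A part is considered full when either threshold is reached. */
    boolean partIsFull(int recordsInPart, int bytesInPart) {
        return recordsInPart >= maxRecords || bytesInPart > maxBytes;
    }

    public static void main(String[] args) {
        PagingConditionSketch policy = new PagingConditionSketch(30, 1024);
        int records = 0;
        int bytes = 0;
        int parts = 1;
        for (int i = 0; i < 100; i++) {
            String payload = "<record id=\"" + i + "\">payload</record>";
            // Start a new part before adding to one that is already full.
            if (policy.partIsFull(records, bytes)) {
                parts++;
                records = 0;
                bytes = 0;
            }
            records++;
            bytes += payload.getBytes(StandardCharsets.UTF_8).length;
        }
        System.out.println("parts used: " + parts);
    }
}
```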
+ | Since the RS has a number of features that must be enabled during initialization we also offer an RS initialization class, used in the following way: | ||
+ | |||
+ | <source lang="java5"> | ||
+ | public static RSXMLWriter createRSWriter() throws Exception{ | ||
+ | RSWriterCreationParams initParams = new RSWriterCreationParams(); | ||
+ | /* You would normally enable some features using the initParams*/ | ||
+ | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams); | ||
+ | ..... | ||
+ | /*One should ALWAYS close the RS he is creating in order to make the full payload available to readers*/ | ||
+ | writer.close(); | ||
+ | return writer; | ||
+ | } | ||
+ | </source> | ||
+ | ==== Populating an RS ==== | ||
+ | Having created an RS writer you are ready to place some content into it through a call to addResults. | ||
+ | |||
+ | <source lang="java5"> | ||
+ | /*A method that creates a new RSXMLWriter and populates it with some results*/ | ||
+ | public static RSXMLWriter createRSWriter() throws Exception{ | ||
/*Create a new writer and set a lifetime property of 1000 millisecs. Also set that the paging | /*Create a new writer and set a lifetime property of 1000 millisecs. Also set that the paging | ||
condition will be having 10 records per page or a total size surpassing the default value*/ | condition will be having 10 records per page or a total size surpassing the default value*/ | ||
− | + | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(new PropertyElementBase[]{new PropertyElementLifeSpanGC(1000)}); |
− | + | writer.setRecsPerPart(10); | |
/*Add a new result with the given id, collection, rank and payload*/ | /*Add a new result with the given id, collection, rank and payload*/ | ||
writer.addResults(new ResultElementGeneric("id1","collection1","rank1","payload")); | writer.addResults(new ResultElementGeneric("id1","collection1","rank1","payload")); | ||
Line 65: | Line 103: | ||
return writer; | return writer; | ||
} | } | ||
− | + | </source> | |
==== Retrieving a locator ==== | ==== Retrieving a locator ==== | ||
− | + | In order to access the content of an RS you create a reference, realized as an RSLocator object. A reference also captures the technology through which the RS will become available. Therefore, when creating an RSLocator you should also set the resource type by providing one of RSResourceLocalType or RSResourceWSRFType. |
+ | |||
+ | After creating the writer you can get a reference to the RS by: | ||
+ | |||
+ | <source lang="java5"> | ||
/*This method retrieves an instance of an RSLocator capable of identifying a ResultSet authored by the provided RSXMLWriter. Depending on the type of | /*This method retrieves an instance of an RSLocator capable of identifying a ResultSet authored by the provided RSXMLWriter. Depending on the type of | ||
resource requested, this identifier can either be an identifier pointing to the local node (filesystem) or an identifier capable of pinpointing the RS | resource requested, this identifier can either be an identifier pointing to the local node (filesystem) or an identifier capable of pinpointing the RS | ||
Line 78: | Line 120: | ||
The service pointed to must be in the localhost. This method can be used when manipulating the RSXMLWriter | The service pointed to must be in the localhost. This method can be used when manipulating the RSXMLWriter | ||
outside container context. Otherwise the following method can be used*/ | outside container context. Otherwise the following method can be used*/ | ||
− | + | locator=writer.getRSLocator(new RSResourceWSRFType("http://localhost:8080/wsrf/services/gcube/common/searchservice/ResultSet")); | |
/*Retrieves a locator identifying the RS through WS-Resource pattern accessible from any node. The | /*Retrieves a locator identifying the RS through WS-Resource pattern accessible from any node. The | ||
default constructor utilizes the container context (which must be available) to compose the locally hosted | default constructor utilizes the container context (which must be available) to compose the locally hosted | ||
Line 85: | Line 127: | ||
return locator; | return locator; | ||
} | } | ||
− | + | </source> | |
− | + | You are also able to create a RSLocator on a RS you are reading through a call to reader.getRSLocator() | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
==== Reading an RS ==== | ==== Reading an RS ==== | ||
− | + | Through an RSLocator you are able to access the respective RS. You can either read the RS contents part by part: | |
+ | |||
+ | <source lang="java5"> | ||
/*This method instantiates a new RSXMLReader to point to the RS identified by the provided | /*This method instantiates a new RSXMLReader to point to the RS identified by the provided | ||
− | RSLocator and iterates over each page retrieving the contents*/ | + | RSLocator and iterates over each page retrieving the contents*/ |
public static void readRS(RSLocator locator) throws Exception { | public static void readRS(RSLocator locator) throws Exception { | ||
PropertyElementBase []props=reader.getProperties(PropertyElementEstimationCount.class,PropertyElementEstimationCount.propertyType); | PropertyElementBase []props=reader.getProperties(PropertyElementEstimationCount.class,PropertyElementEstimationCount.propertyType); | ||
Line 115: | Line 151: | ||
}while(reader.getNextPart()); | }while(reader.getNextPart()); | ||
} | } | ||
+ | </source> | ||
+ | or using an iterator over the records: | ||
− | = | + | <source lang="java5"> |
/*This method instantiates a new RSXMLReader to point to the RS identified by the provided RSLocator and iterates over each record retrieving the | /*This method instantiates a new RSXMLReader to point to the RS identified by the provided RSLocator and iterates over each record retrieving the | ||
contents*/ | contents*/ | ||
Line 127: | Line 165: | ||
} | } | ||
} | } | ||
+ | </source> | ||
+ | '''In case you use RS BLOB, you should first call "makelocal" before reading the RS.''' | ||
==== Extending Base elements ==== | ==== Extending Base elements ==== | ||
Line 134: | Line 174: | ||
* ResultElementBase - Which is used to add / retrieve records from the RS | * ResultElementBase - Which is used to add / retrieve records from the RS | ||
− | These elements must be available to both the producer and the consumer as library elements in order to be used. | + | These elements must be available to both the producer and the consumer as library elements in order to be used. Of course this is not mandatory, and a consumer can employ one element to retrieve the records that a producer inserted using some other element. It is the responsibility of the producer / consumer to synchronize the elements they use. These elements serve mainly as containers for the actual payload that is inserted / retrieved, possibly adding some common handling functionality on top of that payload. |
===== Extending PropertyElementBase ===== | ===== Extending PropertyElementBase ===== | ||
+ | Here we provide an example of extending a property element: | ||
+ | |||
+ | <source lang="java5"> | ||
/*This is an example of extending the PropertyElementBase to derive a Property Element that is used to add WSRF EndpointReferenceType serializations to | /*This is an example of extending the PropertyElementBase to derive a Property Element that is used to add WSRF EndpointReferenceType serializations to | ||
the RS head page*/ | the RS head page*/ | ||
Line 160: | Line 203: | ||
} | } | ||
} | } | ||
− | + | </source> | |
===== Extending ResultElementBase ===== | ===== Extending ResultElementBase ===== | ||
+ | Similarly, here we provide an example of extending a results element: | ||
+ | |||
+ | <source lang="java5"> | ||
/* A Result element that extends the ResultElementBase and can be used to insert and retrieve xml records from an RS*/ | /* A Result element that extends the ResultElementBase and can be used to insert and retrieve xml records from an RS*/ | ||
public class ResultElementFoo extends ResultElementBase{ | public class ResultElementFoo extends ResultElementBase{ | ||
Line 196: | Line 242: | ||
} | } | ||
} | } | ||
+ | </source> | ||
+ | Note that the RS mechanism was initially based on the notion that each record would have at least one attribute (its Record ID). There might still be side effects when using ResultSet Elements with no attributes. | ||
==== Complex Operations ==== | ==== Complex Operations ==== | ||
+ | In this section we provide some information on operations that expose particular features that the developer might find appealing to use. | ||
+ | |||
===== RS creation within Workflows ===== | ===== RS creation within Workflows ===== | ||
In the context of a workflow it might be desirable to send to the asking component a locator capable of identifying the RS that the producer will be populating. This is best done in a non blocking manner in a background thread.<br> | In the context of a workflow it might be desirable to send to the asking component a locator capable of identifying the RS that the producer will be populating. This is best done in a non blocking manner in a background thread.<br> | ||
+ | <source lang="java5"> | ||
/*This method creates an RS and starts a background thread populating the RS. It then returns the locator to the RS that is being populated*/ | /*This method creates an RS and starts a background thread populating the RS. It then returns the locator to the RS that is being populated*/ | ||
public static RSLocator populateRS() throws Exception{ | public static RSLocator populateRS() throws Exception{ | ||
Line 229: | Line 280: | ||
} | } | ||
} | } | ||
− | + | </source> | |
===== Partial Localization ===== | ===== Partial Localization ===== | ||
+ | When reading an RS you are offered the option to first store it locally so as to access it more efficiently; this operation is called localization. Furthermore, you are able to fetch locally only a portion of the RS. We provide an example of that here: | ||
+ | |||
+ | <source lang="java5"> | ||
/*This method localizes a ResultSet keeping the top count records of the RS identified through the RSLocator. The localization is in terms | /*This method localizes a ResultSet keeping the top count records of the RS identified through the RSLocator. The localization is in terms | ||
− | + | of the node that hosts the RS content and not the Resource type. This can be further defined by the RSResourceType*/ | |
public static RSLocator localizePart(RSLocator locator,int count) throws Exception { | public static RSLocator localizePart(RSLocator locator,int count) throws Exception { | ||
RSXMLReader reader=RSXMLReader.getRSXMLReader(locator); | RSXMLReader reader=RSXMLReader.getRSXMLReader(locator); | ||
Line 246: | Line 300: | ||
return readerLocalTop.getRSLocator(); | return readerLocalTop.getRSLocator(); | ||
} | } | ||
− | + | </source> | |
===== RS xPath filtering ===== | ===== RS xPath filtering ===== | ||
+ | Some basic filtering features are provided in the context of the RS: | ||
+ | |||
+ | <source lang="java5"> | ||
/*This method scans through every record of the RS and evaluates the provided xPath expression against each. If the evaluation returns | /*This method scans through every record of the RS and evaluates the provided xPath expression against each. If the evaluation returns | ||
− | + | some result, the record is inserted in a new RS. A new RSXMLReader is created holding the matching record. The xPath expression must | |
− | + | start with a reference to the current node (eg .//[..])*/ | |
public static RSLocator filterRS(RSLocator locator,String xPath) throws Exception { | public static RSLocator filterRS(RSLocator locator,String xPath) throws Exception { | ||
RSXMLReader reader=RSXMLReader.getRSXMLReader(locator); | RSXMLReader reader=RSXMLReader.getRSXMLReader(locator); | ||
return reader.filter(xPath).getRSLocator(); | return reader.filter(xPath).getRSLocator(); | ||
} | } | ||
− | + | </source> | |
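To make the filtering semantics concrete, here is a sketch of the same idea using only the JDK's XPath support. It is not the implementation behind reader.filter: XPathFilterSketch is a hypothetical class, records are modelled as standalone XML strings, and each record is evaluated with its own document node as the context node, which is why the expression starts with a reference to the current node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/* Hypothetical illustration only; not the gRS filter implementation. */
class XPathFilterSketch {

    /* Returns the records for which the expression selects at least one node. */
    static List<String> filter(List<String> records, String expression) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            XPath xpath = XPathFactory.newInstance().newXPath();
            List<String> matching = new ArrayList<>();
            for (String record : records) {
                Document doc = builder.parse(new InputSource(new StringReader(record)));
                NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
                if (hits.getLength() > 0) {
                    matching.add(record);   // the record goes into the "new RS"
                }
            }
            return matching;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
                "<record><title>alpha</title></record>",
                "<record><author>beta</author></record>");
        System.out.println(filter(records, ".//title"));
    }
}
```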
==== RS as Flow Control Mechanism ==== | ==== RS as Flow Control Mechanism ==== | ||
Line 267: | Line 324: | ||
In the context of a service, for example, one can statically define a factory of writers that includes a configurable pool of precreated writers and use this factory to retrieve the writers to use. The code below shows this in pseudo-code. | In the context of a service, for example, one can statically define a factory of writers that includes a configurable pool of precreated writers and use this factory to retrieve the writers to use. The code below shows this in pseudo-code. | ||
− | import org.gcube.searchservice.searchlibrary.rsclient.elements.pool.PoolConfig; | + | <source lang="java5"> |
− | import org.gcube.searchservice.searchlibrary.rsclient.elements.pool.PoolObjectConfig; | + | import org.gcube.common.searchservice.searchlibrary.rsclient.elements.pool.PoolConfig; |
− | import org.gcube.searchservice.searchlibrary.rsclient.elements.pool.RSPoolObject; | + | import org.gcube.common.searchservice.searchlibrary.rsclient.elements.pool.PoolObjectConfig; |
− | import org.gcube.searchservice.searchlibrary.rswriter.RSWriterFactory; | + | import org.gcube.common.searchservice.searchlibrary.rsclient.elements.pool.RSPoolObject; |
− | import org.gcube.searchservice.searchlibrary.rswriter.RSXMLWriter; | + | import org.gcube.common.searchservice.searchlibrary.rswriter.RSWriterFactory; |
+ | import org.gcube.common.searchservice.searchlibrary.rswriter.RSXMLWriter; | ||
public class ExampleService { | public class ExampleService { | ||
Line 303: | Line 361: | ||
} | } | ||
} | } | ||
− | + | </source> | |
===ResultSet Extensions=== | ===ResultSet Extensions=== | ||
+ | The ResultSet has been enhanced with a number of features that one can enable through the initialization class RSWriterCreationParams. | ||
+ | These features include: | ||
+ | * Access Leasing | ||
+ | * Time Leasing | ||
+ | * Forward Access | ||
+ | * Encryption | ||
+ | In addition, through RSWriterCreationParams you are also able to set some of the aforementioned parameters, such as: | ||
+ | * Part size | ||
+ | * Records per part | ||
+ | * Control flow and | ||
+ | * Property elements to be used | ||
+ | |||
+ | Note that when you perform a localization of an RS you essentially create a new RS, and the feature properties of the original RS are not inherited by the new one. | ||
====ResultSet Leasing==== | ====ResultSet Leasing==== | ||
− | Freeing resources hold by a ResultSet(RS) in currently takes place in the | + | <!-- |
+ | Freeing resources hold by a ResultSet(RS) in currently takes place in the of the GarbageCollector. The policy implemented thus far is that the GarbageCollector wakes up periodically and if it finds an RS that has not been modified for too long it removes it. This removing results in freeing two types of resources:a) The disk storage resources and b) the WS resources. The former resources are freed by removing the RS files from the file system while the latter are freed by calling the destroy function on the RS WSRF resource. | ||
+ | --> | ||
Leasing aims to be a mechanism to enhance the resource freeing mechanism of the ResultSet. When creating a new RS the author is able to choose up until when the RS will be available. As a consequence the internal mechanisms of the RS will be able to free the resources held at a convenient point in time. As of now we identify two leasing policies/types: | Leasing aims to be a mechanism to enhance the resource freeing mechanism of the ResultSet. When creating a new RS the author is able to choose up until when the RS will be available. As a consequence the internal mechanisms of the RS will be able to free the resources held at a convenient point in time. As of now we identify two leasing policies/types: | ||
− | *Time leasing. When the RS author creates a resource he | + | *Time leasing. When the RS author creates a resource he is able to set a life time during which the ResultSet will be available. This essentially means that after this period the resultset will not be accessible. Extension is also possible in case it is requested. |
− | *Access leasing. When RS author creates a resource he | + | <source lang="java5"> |
+ | private void TimeLeasingTest() throws Exception{ | ||
+ | // Writer | ||
+ | String content = "Some content"; | ||
+ | RSWriterCreationParams initParams = new RSWriterCreationParams(); | ||
+ | initParams.setExpire_date(new Date(Calendar.getInstance().getTimeInMillis() + 60000)); | ||
+ | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams); | ||
+ | for(int i=0;i<30;i+=1) writer.addResults(new ResultElementGeneric("id" + i,"foo",content)); | ||
+ | writer.close(); | ||
+ | String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator(); | ||
+ | //Reader 1 | ||
+ | RSLocator l = new RSLocator(epr); | ||
+ | RSXMLReader reader=RSXMLReader.getRSXMLReader(l); | ||
+ | int q=0; | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | try{ Thread.sleep(60000); }catch(Exception ex ){} | ||
+ | try{ | ||
+ | l = new RSLocator(epr); | ||
+ | reader=RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | }catch(Exception x){ | ||
+ | Print("Read failed. This is expected!"); | ||
+ | } | ||
+ | } | ||
+ | </source> | ||
+ | |||
+ | |||
+ | *Access leasing. When the RS author creates a resource he is able to set the number of reads allowed. In case the number of reads is exceeded the resource will not be available anymore. | ||
+ | |||
+ | <source lang="java5"> | ||
+ | private void AccessLeasingTest() throws Exception{ | ||
+ | // Writer | ||
+ | String content = "Some content"; | ||
+ | RSWriterCreationParams initParams = new RSWriterCreationParams(); | ||
+ | initParams.setAccessReads(2); //Two access reads allowed | ||
+ | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams); | ||
+ | for(int i=0;i<30;i+=1) writer.addResults(new ResultElementGeneric("id" + i,"foo",content)); | ||
+ | writer.close(); | ||
+ | String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator(); | ||
+ | //Reader 1 | ||
+ | RSLocator l = new RSLocator(epr); | ||
+ | RSXMLReader reader=RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | //Reader 2 | ||
+ | try{ | ||
+ | l = new RSLocator(epr); | ||
+ | reader=RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | }catch(Exception x){ | ||
+ | System.out.println("Read failed. This is correct!"); | ||
+ | } | ||
+ | } | ||
+ | </source> | ||
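The two leasing policies reduce to two checks made when a read starts. The following sketch is a simplified model under stated assumptions, not the gRS implementation: LeaseSketch and tryStartRead are hypothetical names, a null expiry means "no time lease", a negative read budget means "unlimited", and exactly how gRS counts a read is not specified on this page.

```java
import java.time.Instant;

/* Hypothetical illustration only; not part of the gRS API. */
class LeaseSketch {
    private final Instant expireDate;   // null: no time lease (assumption)
    private int readsLeft;              // negative: unlimited reads (assumption)

    LeaseSketch(Instant expireDate, int accessReads) {
        this.expireDate = expireDate;
        this.readsLeft = accessReads;
    }

    /* Returns true if a new read may start at the given instant, consuming one access read. */
    synchronized boolean tryStartRead(Instant now) {
        if (expireDate != null && !now.isBefore(expireDate)) {
            return false;               // time lease expired
        }
        if (readsLeft == 0) {
            return false;               // access lease exhausted
        }
        if (readsLeft > 0) {
            readsLeft--;
        }
        return true;
    }

    public static void main(String[] args) {
        Instant start = Instant.now();
        LeaseSketch lease = new LeaseSketch(start.plusSeconds(60), 2);
        System.out.println(lease.tryStartRead(start));                  // within both leases
        System.out.println(lease.tryStartRead(start.plusSeconds(10)));  // last allowed read
        System.out.println(lease.tryStartRead(start.plusSeconds(20)));  // read budget used up
    }
}
```

Passing the clock value in as a parameter keeps the model testable without sleeping, unlike the examples above, which wait for the real expiry time.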
+ | ====Forward Only ResultSet==== | ||
+ | The author of an RS may require the RS to be read without accessing parts that precede the current one in the parts list. Here is an example of that: | ||
+ | |||
+ | <source lang="java5"> | ||
+ | private void ForwardTest() throws Exception{ | ||
+ | // Writer | ||
+ | String content = "some content"; | ||
+ | RSWriterCreationParams initParams = new RSWriterCreationParams(); | ||
+ | initParams.setAccessReads(2); | ||
+ | initParams.setForward(true); | ||
+ | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams); | ||
+ | for(int i=0;i<300;i+=1) writer.addResults(new ResultElementGeneric("id" + i,"foo",content)); | ||
+ | writer.close(); | ||
+ | String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator(); | ||
+ | //Reader | ||
+ | RSLocator l = new RSLocator(epr); | ||
+ | RSXMLReader reader=RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | try{ | ||
+ | l = new RSLocator(epr); | ||
+ | reader=RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getPreviousPart()) break; | ||
+ | } | ||
+ | }catch(Exception x){ | ||
+ | Print("Read failed. This is correct!"); | ||
+ | } | ||
+ | try{ | ||
+ | l = new RSLocator(epr); | ||
+ | reader=RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | }catch(Exception x){ | ||
+ | Print("Read failed. This is correct!"); | ||
+ | } | ||
+ | System.out.println("Now check under /tmp/resultset that the newly" + | ||
+ | " created RS does not have 300 records"); | ||
+ | } | ||
+ | </source> | ||
+ | Note that when Forward and Access leasing are used at the same time, the RS parts are also deleted during the last access read. | ||
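The forward-only restriction can be modelled as a cursor that refuses to step backwards. This is a sketch under assumptions rather than gRS code: ForwardOnlyCursorSketch is a hypothetical class, parts are modelled as strings in a list, and it is assumed that a backward step on a forward-only RS fails with an exception, as the try/catch in the example above suggests.

```java
import java.util.Arrays;
import java.util.List;

/* Hypothetical illustration only; not part of the gRS API. */
class ForwardOnlyCursorSketch {
    private final List<String> parts;
    private final boolean forwardOnly;
    private int position = 0;

    ForwardOnlyCursorSketch(List<String> parts, boolean forwardOnly) {
        if (parts.isEmpty()) throw new IllegalArgumentException("at least one part is required");
        this.parts = parts;
        this.forwardOnly = forwardOnly;
    }

    String currentPart() { return parts.get(position); }

    /* Moves to the next part; returns false when already at the last part. */
    boolean getNextPart() {
        if (position + 1 >= parts.size()) return false;
        position++;
        return true;
    }

    /* Moves to the previous part; refused outright when the RS is forward-only. */
    boolean getPreviousPart() {
        if (forwardOnly) throw new IllegalStateException("this result set is forward-only");
        if (position == 0) return false;
        position--;
        return true;
    }

    public static void main(String[] args) {
        ForwardOnlyCursorSketch cursor =
                new ForwardOnlyCursorSketch(Arrays.asList("part0", "part1", "part2"), true);
        do {
            System.out.println(cursor.currentPart());
        } while (cursor.getNextPart());
        try {
            cursor.getPreviousPart();
        } catch (IllegalStateException e) {
            System.out.println("Backward read refused: " + e.getMessage());
        }
    }
}
```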
+ | <!-- | ||
=====Implementation Details===== | =====Implementation Details===== | ||
There are two types of resources to be managed. Disk and WSRF resources. Freeing them is essential in the success of the resultset. | There are two types of resources to be managed. Disk and WSRF resources. Freeing them is essential in the success of the resultset. | ||
Line 341: | Line 525: | ||
Altered: Upon getFirstPart call of the RS reader, indicating that a read starts. | Altered: Upon getFirstPart call of the RS reader, indicating that a read starts. | ||
Read: Upon RS read initiation. | Read: Upon RS read initiation. | ||
+ | --> | ||
− | ==== | + | ====ResultSet Security==== |
− | + | The ResultSet framework offers encryption of the records stored. | |
− | + | Upon RS creation a private/public key pair can be created through respective calls in the RS or be provided by an external mechanism. Using these two keys, a content encryption key is produced. The content encryption key and the public key are placed in the header of the RS, and the header is encrypted using the private RS key. The rest of the content of the RS is encrypted with the content encryption key. It is the responsibility of the RS author to pass the private key to the RS reader clients. |
− | + | Upon reading the RS the reader also indirectly has access to the public key so as to alter the RS header (for forward RS etc) but it is never actually retrieved. Here we present an example of how to use the encryption mechanism of the RS: |
− | + | <source lang="java5"> | |
+ | private void EncryptionTest() throws Exception{ | ||
+ | // Writer | ||
+ | String content = "Some content"; | ||
+ | RSWriterCreationParams initParams = new RSWriterCreationParams(); | ||
+ | KeyGenerator kg = new KeyGenerator(); | ||
+ | KeyPair pair = kg.GenKeyPair(); | ||
+ | initParams.setPrivKey(pair.getPrivate()); | ||
+ | initParams.setPubKey(pair.getPublic()); | ||
+ | RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams); | ||
+ | for(int i=0;i<30;i+=1) writer.addResults(new ResultElementGeneric("id" + i,"foo",content)); | ||
+ | writer.close(); | ||
+ | String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator(); | ||
+ | //Reader | ||
+ | RSLocator l = null; | ||
+ | RSXMLReader reader = null; | ||
+ | try{ | ||
+ | l = new RSLocator(epr); | ||
+ | reader = RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | Print("ResultLength: " + res.length); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | }catch(Exception x){ | ||
+ | Print("Read failed. This is correct!"); | ||
+ | } | ||
+ | |||
+ | Print("Starting rs re-reader (2)"); | ||
+ | l = new RSLocator(epr); | ||
+ | l.setPrivKey(pair.getPrivate()); | ||
+ | reader=RSXMLReader.getRSXMLReader(l); | ||
+ | while (true){ | ||
+ | ResultElementBase[] res=reader.getResults(ResultElementGeneric.class); | ||
+ | if(!reader.getNextPart()) break; | ||
+ | } | ||
+ | } | ||
+ | </source> | ||
+ | Note that the encryption has to do with the RS reference (RSLocator) because it has to be known on both sides of the RS reader and writer. | ||
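The general idea of protecting bulk content with a content encryption key, and protecting that key with a key pair, can be sketched with the JDK's own cryptography classes. This is a generic hybrid-encryption illustration and not the gRS scheme: HybridEncryptionSketch is a hypothetical class, the KeyGenerator used here is javax.crypto.KeyGenerator (unrelated to the KeyGenerator in the example above), and the choice of RSA-OAEP for wrapping and AES-GCM for the content is an assumption. In particular, the sketch wraps the content key with the public key and unwraps it with the private key, which may differ from how gRS protects its header.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/* Hypothetical illustration only; a generic hybrid scheme, not the gRS one. */
class HybridEncryptionSketch {

    /* Encrypts a record as a writer would, then decrypts it as a reader holding the private key. */
    static String roundTrip(String record) {
        try {
            // Writer side: a key pair plus a fresh symmetric content encryption key.
            KeyPairGenerator pairGen = KeyPairGenerator.getInstance("RSA");
            pairGen.initialize(2048);
            KeyPair pair = pairGen.generateKeyPair();
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey contentKey = keyGen.generateKey();

            // The wrapped content key plays the role of the protected header entry.
            Cipher wrapper = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
            wrapper.init(Cipher.WRAP_MODE, pair.getPublic());
            byte[] wrappedKey = wrapper.wrap(contentKey);

            // The record itself is encrypted with the content key.
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
            enc.init(Cipher.ENCRYPT_MODE, contentKey, new GCMParameterSpec(128, iv));
            byte[] cipherText = enc.doFinal(record.getBytes(StandardCharsets.UTF_8));

            // Reader side: recover the content key with the private key, then decrypt.
            Cipher unwrapper = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
            unwrapper.init(Cipher.UNWRAP_MODE, pair.getPrivate());
            SecretKey recovered = (SecretKey) unwrapper.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY);
            Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
            dec.init(Cipher.DECRYPT_MODE, recovered, new GCMParameterSpec(128, iv));
            return new String(dec.doFinal(cipherText), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Some content"));
    }
}
```

A reader without the private key cannot unwrap the content key in this model, which corresponds to the first, failing read in the example above.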
− | + | Note that during data transfer the RS will use plain TCP/IP connections for performance reasons. These connections are encrypted through SSL. If for any reason you do not desire encrypted RS communication you can disable this feature by editing the services like this: |
− | + | <environment name="SSLsupport" value="disabled" type="java.lang.String" /> | |
+ | |||
+ | <!-- | ||
====Interface==== | ====Interface==== | ||
Line 368: | Line 592: | ||
We will have default values and function overload to achieve backward compatibility and usage ease. | We will have default values and function overload to achieve backward compatibility and usage ease. | ||
+ | --> | ||
+ | |||
+ | ====Notification for the garbage collected resources==== | ||
+ | |||
+ | Using the RSEPRCache component, a service/component can store the EPR of an RS in this cache and use it again without having to wait for the execution of the query that produced the given RS. An RSEPRCache instance needs to be notified when the Garbage Collector collects an RS WS-resource, in order to update its content. | ||
+ | |||
+ | For this purpose, each deployed ResultSetService will register as a notifier, in the service's initialisation phase, to a topic associated with all the scopes, in which the Running Instance is deployed. In each period of the garbage collection procedure, the Collector aggregates all the RS WS-resources that are being garbage collected, and sends a batch notification message with the corresponding RS EPRs using the aforementioned topic. When a RSEPRCache instance is created, it subscribes as a notification receiver to the topic, and it is able to receive notification messages with the EPRs of garbage-collected RSs, during its lifetime. | ||
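The notification flow described above (a topic, batch messages listing collected EPRs, and caches that evict on receipt) follows a publish/subscribe pattern. The sketch below models it in memory only and is not the gCube IS-Notification API: GcNotificationSketch, Topic, Listener and EprCache are hypothetical names, EPRs are modelled as strings, and the cache is assumed to map a query to the EPR of the RS it produced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/* Hypothetical illustration only; not the gCube notification mechanism. */
class GcNotificationSketch {

    /* Receives batches of EPRs whose ResultSets were garbage collected. */
    interface Listener {
        void onCollected(List<String> eprs);
    }

    /* Stands in for the topic that each ResultSet service publishes to. */
    static final class Topic {
        private final List<Listener> listeners = new ArrayList<>();

        void subscribe(Listener listener) { listeners.add(listener); }

        void publish(List<String> batch) {
            for (Listener listener : listeners) listener.onCollected(batch);
        }
    }

    /* Stands in for an RSEPRCache: maps a query to the EPR of its ResultSet. */
    static final class EprCache implements Listener {
        private final Map<String, String> queryToEpr = new HashMap<>();

        void put(String query, String epr) { queryToEpr.put(query, epr); }

        String get(String query) { return queryToEpr.get(query); }

        @Override
        public void onCollected(List<String> eprs) {
            // Drop every cached entry that points to a reclaimed ResultSet.
            queryToEpr.values().removeAll(eprs);
        }
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        EprCache cache = new EprCache();
        topic.subscribe(cache);
        cache.put("query-1", "epr-A");
        cache.put("query-2", "epr-B");
        // One garbage collection period ends: the collector sends a single batch message.
        topic.publish(Arrays.asList("epr-A"));
        System.out.println(cache.get("query-1"));
        System.out.println(cache.get("query-2"));
    }
}
```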
===ResultSet Cache=== | ===ResultSet Cache=== | ||
Line 389: | Line 620: | ||
</source> | </source> | ||
− | A ResultSet Cache object receives notifications, from every ResultSet service that is deployed in the | + | A ResultSet Cache object receives notifications, from every ResultSet service that is deployed in the same scope, for ResultSet EPRs that were reclaimed and can't be accessed any more. Using gCube [[IS-Notification]] mechanism each ResultSet Cache object subscribes to a topic for notifying about reclaimed ResultSet EPRs, receives batch messages that contain the EPRs of the reclaimed ResultSets and deletes the contents that are related to them(see also [[GCube_ResultSet_%28gRS%29#Notification_for_the_garbage_collected_resources|Notification_for_the_garbage_collected_resources]]). This operation is performed in the background, and the contents of a Cache object remain valid without any action needed from the component that uses the Cache object. |
− | ====Usage | + | ====Usage Examples==== |
<source lang="java5"> | <source lang="java5"> | ||
− | //construct a new RSEPRCache using the default settings | + | //construct a new RSEPRCache using the default settings. myEpr will be used to register to |
+ | //the notifications' topic used for notifying when a ResultSet is removed from RS garbage collector | ||
String myEpr = ServiceHost.getBaseURL().toString() + ServiceContext.getContext().getJNDIName(); | String myEpr = ServiceHost.getBaseURL().toString() + ServiceContext.getContext().getJNDIName(); | ||
RSEPRCache rseprCache = new RSEPRCache(new EndpointReferenceType(new AttributedURI(myEpr)), ServiceContext.getContext()); | RSEPRCache rseprCache = new RSEPRCache(new EndpointReferenceType(new AttributedURI(myEpr)), ServiceContext.getContext()); | ||
Line 445: | Line 677: | ||
reader = RSXMLReader.getRSXMLReader(rsLocator); | reader = RSXMLReader.getRSXMLReader(rsLocator); | ||
</source> | </source> | ||
+ | |||
+ | So creating a RSXMLReader from the EPR retrieved should be performed inside a try-catch statement. | ||
+ | |||
+ | A second constructor is provided, for creating caches with different settings than default: | ||
+ | <source lang="java5"> | ||
+ | /** | ||
+ | * Constructor of RSEPRCache that takes configuration parameters and creates a new cache with | ||
+ | * this configuration. | ||
+ | * | ||
+ | * @param notificationConsumerEPR the EPR of the service that will be used to register to | ||
+ | * the notifications topic used for notifying when a ResultSet is removed from RS garbage collector | ||
+ | * @param sctx the context of the service that will be used to register to the notifications topic | ||
+ | * @param ttl the default amount of time to live for an element from its creation date | ||
+ | * @param tti the default amount of time to live for an element from its last accessed date | ||
+ | * @param maxElementsInMemory the maximum number of elements in memory, before they are evicted | ||
+ | * @param overflowToDisk whether to use the disk store when there is overflow of the memory used | ||
+ | * @throws Exception | ||
+ | */ | ||
+ | public RSEPRCache(EndpointReferenceType notificationConsumerEPR, | ||
+ | GCUBEServiceContext sctx, long ttl, long tti, int maxElementsInMemory, boolean overflowToDisk) | ||
+ | </source> | ||
+ | |||
+ | A third constructor is provided, for the creation of caches by components that are not gCube services(i.e. portlets): | ||
+ | <source lang="java5"> | ||
+ | /** | ||
+ | * Default constructor of RSEPRCache that should be used by gCube components that are not exposed as services. | ||
+ | * It returns a new RSEPRCache object. | ||
+ | * | ||
+ | * @param notificationConsumerEPR the caller's URI expressed as an EPR, that will be used to register | ||
+ | * to the notifications topic used for notifying when a ResultSet is removed from RS garbage collector | ||
+ | * @param sman a security manager that will be used to register to the notifications topic | ||
+ | * @param sc the infrastructure scope of this scope will be used for subscribing to the topic | ||
+ | * @throws Exception | ||
+ | */ | ||
+ | public RSEPRCache(EndpointReferenceType notificationConsumerEPR, GCUBESecurityManager sman, GCUBEScope sc) | ||
+ | </source> | ||
+ | |||
+ | The first argument for the last constructor can be the address of the machine where the component runs, i.e: | ||
+ | <source lang="java5"> | ||
+ | EndpointReferenceType consEpr = new EndpointReferenceType(); | ||
+ | consEpr.setAddress(new Address("http://ariadni.di.uoa.gr")); | ||
+ | RSEPRCache cache = new RSEPRCache(consEpr, sman, sc); | ||
+ | </source> | ||
+ | |||
+ | ====Dependencies==== | ||
+ | |||
+ | * Result Set Service Stubs | ||
+ | |||
+ | The Notification message types are defined in the wsdl of the Result Set Service. | ||
+ | |||
+ | * EHCACHE | ||
+ | |||
+ | Ehcache is used internally to implement the caching mechanisms. | ||
+ | |||
+ | *gCore |
Latest revision as of 11:59, 18 October 2010
Contents
- 1 ResultSet Framework
- 1.1 Introduction
- 1.2 Implementation Overview
- 1.3 Dependencies
- 1.4 Usage Example
- 1.5 ResultSet Extentions
- 1.6 ResultSet Cache
ResultSet Framework
Introduction
The ResultSet Framework provides the enabling mechanism to pass by reference data between components and / or services that can be hosted in the same or remote nodes. It utilizes the gCore Framework for keeping state and offers value adding operations for encapsulating remote and local calls, paging of content, data movement, remote processing and more described in available code documentation.
Implementation Overview
The framework consists of 4 sub components :
- the ResultSet library
- the ResultSet Service
- the ResultSet Client
- the ResultSet Garbage Collector
- The ResultSet creates a linked list of pages holding the records. Pages are also referred to as parts, whereas records make up the content of the referenced data. The ResultSet can iterate over the created parts and perform various operations on the structure of the list as well as on the content it holds. Its operations are mostly part specific but they can also affect the entire record chain. Once a part of the content has been created and the next one in line has been initialized, the part cannot be altered. Once it has been declared that the authoring of a result set has finished, its entire chain is immutable.
- The ResultSet Service is merely a WS front-end to the ResultSet library. It creates and manages WS-Resources holding references to instances of the ResultSet components. These instances are responsible to hold state regarding the iteration over the chain of pages and for accessing / modifying / querying the data held.
- The ResultSet Client is a library giving access to both high and low level operations that can be performed on a specific ResultSet component instance and its underlying data. It enables handling of ResultSets that are wrapped through a WS front-end as well as ResultSets that are manipulated locally through direct Java invocations (within the JVM).
- The ResultSet Garbage Collector is a thread that clears up resources used by the ResultSet Framework (removing file-system resources and destroying created gCore-Resources)
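The part chain described above can be illustrated with a minimal, self-contained sketch. It is not the gRS implementation: `PartSketch` and `ChainSketch` are hypothetical names introduced only for illustration, records are plain strings, and the only paging condition modelled is a fixed number of records per part. It shows a part becoming read-only as soon as the next part is initialized, and the whole chain becoming immutable once authoring is declared finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for one page (part) of a result set; not a gRS class. */
final class PartSketch {
    private final List<String> records = new ArrayList<>();
    private PartSketch next;   // following part in the chain, null while this part is the tail
    private boolean sealed;    // true once the next part exists or the chain has been closed

    void add(String record) {
        if (sealed) throw new IllegalStateException("part can no longer be altered");
        records.add(record);
    }

    /** Initializes the next part in line, which seals this one. */
    PartSketch startNext() {
        sealed = true;
        next = new PartSketch();
        return next;
    }

    void seal() { sealed = true; }
    PartSketch next() { return next; }
    int size() { return records.size(); }
    List<String> records() { return Collections.unmodifiableList(records); }
}

/** Hypothetical writer that pages records into a linked chain of parts. */
final class ChainSketch {
    private final int recsPerPart;
    private final PartSketch head = new PartSketch();
    private PartSketch tail = head;
    private boolean closed;

    ChainSketch(int recsPerPart) {
        if (recsPerPart < 1) throw new IllegalArgumentException("recsPerPart must be positive");
        this.recsPerPart = recsPerPart;
    }

    void addResult(String record) {
        if (closed) throw new IllegalStateException("authoring has finished");
        // Paging condition: open a new part once the current one is full.
        if (tail.size() == recsPerPart) tail = tail.startNext();
        tail.add(record);
    }

    /** Declares the end of authoring; the entire chain is immutable afterwards. */
    void close() {
        tail.seal();
        closed = true;
    }

    PartSketch head() { return head; }

    int partCount() {
        int count = 0;
        for (PartSketch p = head; p != null; p = p.next()) count++;
        return count;
    }
}
```

With two records per part, adding five records yields three parts; any attempt to add a record after `close()` is rejected.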
Dependencies
- ResultSetLibrary
- jdk 1.5
- gCore
- ResultSetService
- jdk 1.5
- gCore
- ResultSetLibrary
- ResultSetClient
- jdk 1.5
- gCore
- ResultSetLibrary
- ResultSetService
- ResultSetGarbageCollector
- jdk 1.5
- gCore
- ResultSetLibrary
- ResultSetService stubs
Usage Example
The ResultSets that can be created are content independent, but depending on the content inserted different operations can be performed. Different readers and writers have therefore been implemented to offer different levels of versatility depending on the content. Here we will mainly focus on XML authoring and retrieval, which is the most commonly used and most thoroughly tested case up to this point. As more types of readers / writers become popular, additional documentation will be made available.
The code examples are not exhaustive in error checking and should only be considered as usage templates.
Creating an RS
The simplest way to create an RS is through a writer (here RSXMLWriter).
<source lang="java5">
/*A method that creates a new RSXMLWriter*/
public static RSXMLWriter createRSWriter() throws Exception{
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter();
    .....
    /*One should ALWAYS close the RS being created in order to make the full payload available to readers*/
    writer.close();
    return writer;
}
</source>
Alternatively one can choose the part size and the records placed in each part:
<source lang="java5">
/*A method that creates a new RSXMLWriter*/
public static RSXMLWriter createRSWriter() throws Exception{
    /*Create a new writer whose paging condition for inserted results is either
      30 records per page or a total part size surpassing 1024 bytes*/
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(30,1024);
    .....
    /*One should ALWAYS close the RS being created in order to make the full payload available to readers*/
    writer.close();
    return writer;
}
</source>
Since the RS has a number of features that must be enabled during initialization we also offer an RS initialization class, used in the following way:
<source lang="java5">
public static RSXMLWriter createRSWriter() throws Exception{
    RSWriterCreationParams initParams = new RSWriterCreationParams();
    /*You would normally enable some features using the initParams*/
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams);
    .....
    /*One should ALWAYS close the RS being created in order to make the full payload available to readers*/
    writer.close();
    return writer;
}
</source>
Populating an RS
Having created an RS writer you are ready to place some content into it through a call to addResults.
<source lang="java5">
/*A method that creates a new RSXMLWriter and populates it with some results*/
public static RSXMLWriter createRSWriter() throws Exception{
    /*Create a new writer and set a lifetime property of 1000 millisecs. Also set the paging
      condition to 10 records per page or a total size surpassing the default value*/
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(new PropertyElementBase[]{new PropertyElementLifeSpanGC(1000)});
    writer.setRecsPerPart(10);
    /*Add a new result with the given id, collection, rank and payload*/
    writer.addResults(new ResultElementGeneric("id1","collection1","rank1","payload"));
    /*Add a new result with the given id, collection, rank and payload*/
    writer.addResults(new ResultElementGeneric("id2","collection1","rank2","payload"));
    /*Add an array of results with the given id, collection, rank and payload*/
    writer.addResults(new ResultElementBase[]{
        new ResultElementGeneric("id3","collection1","rank3","payload"),
        new ResultElementGeneric("id4","collection1","rank4","payload")});
    /*One should ALWAYS close the RS being created in order to make the full payload available to readers*/
    writer.close();
    /*Give back the writer*/
    return writer;
}
</source>
Retrieving a locator
In order to access the content of an RS you create a reference, realized as an RSLocator object. Creating a reference also involves the technology through which the RS will be made available. Therefore, when creating an RSLocator you should also set the resource type by providing one of RSResourceLocalType or RSResourceWSRFType.
After creating the writer you can get a reference to the RS by:
<source lang="java5">
/*This method retrieves an instance of an RSLocator capable of identifying a ResultSet authored by
  the provided RSXMLWriter. Depending on the type of resource requested, this identifier can either
  be an identifier pointing to the local node (filesystem) or an identifier capable of pinpointing
  the RS through a web service using the WS-Resource pattern. One can create many locators (of the
  same or different type) for the same authored RS*/
public static RSLocator getLocator(RSXMLWriter writer) throws Exception{
    RSLocator locator=null;
    /*Retrieves a locator identifying the RS in the local node*/
    //locator=writer.getRSLocator(new RSResourceLocalType());
    /*Retrieves a locator identifying the RS through the WS-Resource pattern, accessible from any node.
      The service pointed to must be in the localhost. This method can be used when manipulating the
      RSXMLWriter outside container context. Otherwise the following method can be used*/
    locator=writer.getRSLocator(new RSResourceWSRFType("http://localhost:8080/wsrf/services/gcube/common/searchservice/ResultSet"));
    /*Retrieves a locator identifying the RS through the WS-Resource pattern, accessible from any node.
      The default constructor utilizes the container context (which must be available) to compose
      the locally hosted ResultSetService that must be deployed*/
    locator=writer.getRSLocator(new RSResourceWSRFType());
    return locator;
}
</source>
You are also able to create a RSLocator on a RS you are reading through a call to reader.getRSLocator()
Reading an RS
Through an RSLocator you are able to access the respective RS. You can either read the RS contents part by part:
<source lang="java5">
/*This method instantiates a new RSXMLReader to point to the RS identified by the provided
  RSLocator and iterates over each page retrieving the contents*/
public static void readRS(RSLocator locator) throws Exception {
    /*The reader must be created before any of its methods are used*/
    RSXMLReader reader=RSXMLReader.getRSXMLReader(locator);
    PropertyElementBase []props=reader.getProperties(PropertyElementEstimationCount.class,PropertyElementEstimationCount.propertyType);
    System.out.println(((PropertyElementEstimationCount)props[0]).toXML());
    while(!reader.isLast()){
        reader.getNumberOfResults();
        /*Retrieve the records in the current page as instances of the provided class
          that extends the ResultElementBase*/
        ResultElementBase []res=reader.getResults(ResultElementGeneric.class);
        reader.getNextPart();
    }
    /*Go back to the first part and read the full payload of every part*/
    reader.getFirstPart();
    do{
        reader.getFullPayload();
    }while(reader.getNextPart());
}
</source>
or using an iterator over the records:
<source lang="java5">
/*This method instantiates a new RSXMLReader to point to the RS identified by the provided
  RSLocator and iterates over each record retrieving the contents*/
public static void iterateRS(RSLocator locator) throws Exception {
    RSXMLIterator iter=RSXMLReader.getRSXMLReader(locator).getRSIterator();
    while(iter.hasNext()){
        ResultElementGeneric elem=(ResultElementGeneric)iter.next(ResultElementGeneric.class);
        /*Retrieve the DocID of the retrieved record*/
        if(elem!=null) elem.getRecordAttributes(ResultElementGeneric.RECORD_ID_NAME);
    }
}
</source>
In case you use an RS BLOB, you should first call "makeLocal" before reading the RS.
Extending Base elements
Two types of elements are currently extendable in the context of the Framework.
- PropertyElementBase - Which is used to add / retrieve property elements describing the authored RSs
- ResultElementBase - Which is used to add / retrieve records from the RS
These elements must be available to both the producer and the consumer as library elements in order to be used. This is not strictly mandatory, however: a consumer can employ one element to retrieve the records that a producer inserted using some other element. It is the responsibility of the producer / consumer to synchronize the elements they use. These elements serve mainly as containers for the actual payload that is inserted / retrieved, possibly adding some common handling functionality on top of that payload.
Extending PropertyElementBase
Here we provide an example of extending a property element:
<source lang="java5">
/*This is an example of extending the PropertyElementBase to derive a Property Element that is
  used to add WSRF EndpointReferenceType serializations to the RS head page*/
public class PropertyElementWSEPR extends PropertyElementBase{
    /*The Type of the Property this Property element produces*/
    public static String propertyType="WS-EPR";
    private String epr=null;

    /*Default constructor required by the PropertyElementBase in order to instantiate
      the class using reflection*/
    public PropertyElementWSEPR(){}

    /*Initializes a new PropertyElementWSEPR with the given EndpointReferenceType serialization*/
    public PropertyElementWSEPR(String epr) throws Exception{
        this.epr=epr;
        setType(PropertyElementWSEPR.propertyType);
    }

    /*The useful property payload from the PropertyElementWSEPR point of view as valid xml*/
    public String toXML() throws Exception{
        return this.epr;
    }

    /*The useful property payload from the PropertyElementWSEPR point of view
      as returned by PropertyElementWSEPR.toXML*/
    public void fromXML(String xml) throws Exception{
        this.epr=xml;
    }
}
</source>
Extending ResultElementBase
Similarly, here we provide an example of extending a results element:
<source lang="java5">
/*A Result element that extends the ResultElementBase and can be used to insert and
  retrieve xml records from an RS*/
public class ResultElementFoo extends ResultElementBase{
    /*The name of the attribute holding a desired record attribute*/
    public static final String RECORD_FOO_NAME="Foo";
    private String payload;

    /*Default constructor necessary for the framework to instantiate the ResultElementFoo
      through reflection*/
    public ResultElementFoo(){}

    /*Creates a new ResultElementFoo with the provided Foo attribute value and payload*/
    public ResultElementFoo(String foo,String payload) throws Exception{
        Vector<RecordAttribute> at=new Vector<RecordAttribute>();
        at.add(new RecordAttribute(ResultElementFoo.RECORD_FOO_NAME,foo));
        setRecordAttributes(at.toArray(new RecordAttribute[0]));
        this.payload=payload;
    }

    private void setPayload(String payload){
        this.payload=payload;
    }

    public String getPayload(){
        return this.payload;
    }

    /*The Attributes that can be defined for a record are handled internally by the
      ResultElementBase and can be set / accessed using the Base element's respective methods*/
    public String getFooValue(){
        return getRecordAttributes(ResultElementFoo.RECORD_FOO_NAME)[0].getAttrValue();
    }

    /*The xml representation of the record payload*/
    public String toXML() throws Exception{
        return payload;
    }

    /*The payload of the result element as returned by ResultElementFoo.toXML*/
    public void fromXML(String xml) throws Exception{
        setPayload(xml);
    }
}
</source>
Note that the RS mechanism was initially based on the notion that each record would have at least one attribute (its Record ID). There might still be side effects when using ResultSet elements with no attributes.
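Both extension examples keep a default constructor because the element is instantiated through reflection and then filled through fromXML. The following sketch illustrates that round trip with plain Java only. `ElementSketch`, `FooElementSketch` and `restore` are hypothetical names used for illustration; they are not part of the gRS API, and the actual framework may perform the instantiation differently.

```java
/** Hypothetical stand-in for an extendable element base; not the gRS ResultElementBase. */
abstract class ElementSketch {
    abstract String toXML();
    abstract void fromXML(String xml);

    /**
     * Rebuilds an element of the requested class from its XML payload.
     * The no-argument constructor is looked up and invoked reflectively,
     * which is why every subclass has to declare one.
     */
    static <T extends ElementSketch> T restore(Class<T> type, String xml) {
        try {
            T element = type.getDeclaredConstructor().newInstance();
            element.fromXML(xml);
            return element;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("element needs an accessible default constructor", e);
        }
    }
}

/** Hypothetical element whose payload is the XML string itself. */
class FooElementSketch extends ElementSketch {
    private String payload;

    /** Default constructor used by ElementSketch.restore. */
    FooElementSketch() {}

    FooElementSketch(String payload) { this.payload = payload; }

    @Override String toXML() { return payload; }

    @Override void fromXML(String xml) { this.payload = xml; }
}
```

A consumer that only knows the element class and the serialized payload can then call `ElementSketch.restore(FooElementSketch.class, xml)` to obtain a populated instance.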
Complex Operations
In this section we provide some information on operations that expose particular features that the developer might find appealing to use.
RS creation within Workflows
In the context of a workflow it might be desirable to send to the asking component a locator capable of identifying the RS that the producer will be populating. This is best done in a non blocking manner in a background thread.
<source lang="java5">
/*This method creates an RS and starts a background thread populating the RS. It then returns
  the locator to the RS that is being populated*/
public static RSLocator populateRS() throws Exception{
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter();
    BackgroundPopulating rst=new BackgroundPopulating(writer);
    rst.start();
    return writer.getRSLocator(new RSResourceLocalType());
}

/*This extends Thread in order to populate the RS that the RSXMLWriter it is initialized with points to*/
public class BackgroundPopulating extends Thread{
    RSXMLWriter writer=null;

    /*The RSXMLWriter to use for populating the RS*/
    public BackgroundPopulating(RSXMLWriter writer){
        this.writer=writer;
    }

    /*Populate the RS*/
    public void run(){
        try{
            writer.addResults(new ResultElementGeneric("id1","collection1","rank1","payload"));
            writer.addResults(new ResultElementGeneric("id2","collection1","rank2","payload"));
            writer.addResults(new ResultElementBase[]{
                new ResultElementGeneric("id3","collection1","rank3","payload"),
                new ResultElementGeneric("id4","collection1","rank4","payload")});
            writer.close();
        }catch(Exception e){
            e.printStackTrace();
        }
    }
}
</source>
Partial Localization
When reading an RS you are offered the option to first store it locally so as to access it more efficiently; this operation is called localization. Furthermore, you are able to fetch locally only a portion of the RS. We provide an example of that here:
<source lang="java5">
/*This method localizes a ResultSet, keeping the top count records of the RS identified through
  the RSLocator. The localization is in terms of the node that hosts the RS content and not the
  Resource type. This can be further defined by the RSResourceType*/
public static RSLocator localizePart(RSLocator locator,int count) throws Exception {
    RSXMLReader reader=RSXMLReader.getRSXMLReader(locator);
    reader.isLocal(); //false
    //RSXMLReader readerTop=reader.keepTop(count);
    //readerTop.isLocal(); //false
    //RSXMLReader readerLocalJVM=readerTop.makeLocal(new RSResourceLocalType());
    //readerLocalJVM.isLocal(); //true
    //RSXMLReader readerLocalWS=readerTop.makeLocal(new RSResourceWSRFType());
    //readerLocalWS.isLocal(); //true
    RSXMLReader readerLocalTop=reader.makeLocal(new RSResourceLocalType(),count);
    readerLocalTop.isLocal(); //true
    return readerLocalTop.getRSLocator();
}
</source>
RS xPath filtering
Some basic filtering features are provided in the context of the RS:
<source lang="java5">
/*This method scans through every record of the RS and evaluates the provided xPath expression
  against each. If the evaluation returns some result, the record is inserted in a new RS. A new
  RSXMLReader is created holding the matching records. The xPath expression must start with a
  reference to the current node (eg .//[..])*/
public static RSLocator filterRS(RSLocator locator,String xPath) throws Exception {
    RSXMLReader reader=RSXMLReader.getRSXMLReader(locator);
    return reader.filter(xPath).getRSLocator();
}
</source>
RS as Flow Control Mechanism
The ResultSet Framework offers the additional functionality of synchronizing result production with result consumption. This functionality is simulated in the following attached code:
ResultSet as Flow Control Mechanism Example code
Pool Of RS Writers
Another available feature is a pool of pre-created writers with already created locators. In cases of heavy container load and of WS based locators, this pool can save a significant amount of time in the creation of a writer. In the context of a service, for example, one can statically define a factory of writers that includes a configurable pool of pre-created writers and use this factory to retrieve the writers to use. The code below shows this in pseudo-code.
<source lang="java5">
import org.gcube.common.searchservice.searchlibrary.rsclient.elements.pool.PoolConfig;
import org.gcube.common.searchservice.searchlibrary.rsclient.elements.pool.PoolObjectConfig;
import org.gcube.common.searchservice.searchlibrary.rsclient.elements.pool.RSPoolObject;
import org.gcube.common.searchservice.searchlibrary.rswriter.RSWriterFactory;
import org.gcube.common.searchservice.searchlibrary.rswriter.RSXMLWriter;

public class ExampleService {
    /**
     * Factory of rs writers
     */
    private static RSWriterFactory factory=null; //the factory that will be used and include the pool

    static{
        PoolConfig config=new PoolConfig(); //the pool configuration
        PoolObjectConfig oConf=new PoolObjectConfig(); //configuration item
        try{
            oConf.FlowControl=false; //the specific writers should not allow flow control
            oConf.MaxSize=20; //the maximum size of the specific writer pool
            oConf.MinSize=8; //the minimum size (threshold to repopulate)
            oConf.ObjectType=RSPoolObject.PoolObjectType.WriterXML; //the type of writer
            oConf.ResourceType=RSPoolObject.PoolObjectResourceType.WSRFType; //the type of locator to pre-construct
            oConf.ServiceEndPoint=null; //not statically created but in service context
            config.add(oConf); //add this object in the pool
        }catch(Exception e){
            log.error("Could not initialize factory pool. Continuing",e);
        }
        factory=new RSWriterFactory(config);
    }

    public String doWork(DoWork params) throws SomeBaseFault{
        Vector<String> results=MyLibrary.search(params.getQuery());
        //background add records
        PropertyElementBase [] props=new PropertyElementBase []{
            new PropertyElementEstimationCount(results.size(),results.size(),results.size())};
        //without the factory this call would have been RSXMLWriter.getRSXMLWriter(props)
        RSXMLWriter writer=factory.getRSXMLWriter(props);
        //return the serialized locator of the RS
        return writer.getRSLocator(new RSResourceWSRFType()).getLocator();
    }
}
</source>
ResultSet Extentions
The ResultSet has been enhanced with a number of features that one can enable through the initialization class RSWriterCreationParams. These features include:
- Access Leasing
- Time Leasing
- Forward Access
- Encryption
In addition, through RSWriterCreationParams you are also able to set some of the aforementioned parameters, such as:
- Part size
- Records per part
- Control flow and
- Property elements to be used
Note that when you perform a localization of an RS you essentially create a new RS, and the feature properties of the original RS are not inherited by the new one.
ResultSet Leasing
Leasing aims to enhance the resource freeing mechanism of the ResultSet. When creating a new RS the author is able to choose until when the RS will be available. As a consequence, the internal mechanisms of the RS are able to free the resources held at a convenient point in time. As of now we identify two leasing policies/types:
- Time leasing. When the RS author creates a resource he is able to set a lifetime during which the ResultSet will be available. This essentially means that after this period the ResultSet will not be accessible. Extension is also possible in case it is requested.
<source lang="java5">
private void TimeLeasingTest() throws Exception{
    // Writer
    String content = "Some content";
    RSWriterCreationParams initParams = new RSWriterCreationParams();
    initParams.setExpire_date(new Date(Calendar.getInstance().getTimeInMillis() + 60000));
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams);
    for(int i=0;i<30;i+=1)
        writer.addResults(new ResultElementGeneric("id" + i,"foo",content));
    writer.close();
    String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator();

    //Reader 1: reads before the lease expires
    RSLocator l = new RSLocator(epr);
    RSXMLReader reader=RSXMLReader.getRSXMLReader(l);
    while (true){
        ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
        if(!reader.getNextPart()) break;
    }

    //Wait until the lease has expired
    try{
        Thread.sleep(60000);
    }catch(Exception ex ){}

    //Reader 2: reads after the lease has expired
    try{
        l = new RSLocator(epr);
        reader=RSXMLReader.getRSXMLReader(l);
        while (true){
            ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
            if(!reader.getNextPart()) break;
        }
    }catch(Exception x){
        System.out.println("Read failed. This is expected!");
    }
}
</source>
- Access leasing. When the RS author creates a resource he is able to set the number of reads allowed. In case the number of reads is exceeded the resource will not be available anymore.
<source lang="java5">
private void AccessLeasingTest() throws Exception{
    // Writer
    String content = "Some content";
    RSWriterCreationParams initParams = new RSWriterCreationParams();
    initParams.setAccessReads(2); //Two access reads allowed
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams);
    for(int i=0;i<30;i+=1)
        writer.addResults(new ResultElementGeneric("id" + i,"foo",content));
    writer.close();
    String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator();

    //Reader 1
    RSLocator l = new RSLocator(epr);
    RSXMLReader reader=RSXMLReader.getRSXMLReader(l);
    while (true){
        ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
        if(!reader.getNextPart()) break;
    }

    //Reader 2
    try{
        l = new RSLocator(epr);
        reader=RSXMLReader.getRSXMLReader(l);
        while (true){
            ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
            if(!reader.getNextPart()) break;
        }
    }catch(Exception x){
        System.out.println("Read failed. This is correct!");
    }
}
</source>
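The two leasing policies can be summarized in a small model. This is a sketch under stated assumptions, not the gRS implementation: `LeaseSketch` and `tryRead` are hypothetical names, the current time is passed in explicitly to keep the example deterministic, and how the framework counts a "read" internally is not modelled.

```java
/** Hypothetical model of time leasing and access leasing; not a gRS class. */
final class LeaseSketch {
    private final long expiresAtMillis; // time lease: absolute expiry, Long.MAX_VALUE for "never"
    private int readsLeft;              // access lease: remaining reads, negative for "unlimited"

    LeaseSketch(long expiresAtMillis, int allowedReads) {
        this.expiresAtMillis = expiresAtMillis;
        this.readsLeft = allowedReads;
    }

    /**
     * Registers one read attempt at the given time.
     * Returns false once the lifetime has passed or the allowed reads are used up.
     */
    synchronized boolean tryRead(long nowMillis) {
        if (nowMillis >= expiresAtMillis) return false; // time lease expired
        if (readsLeft == 0) return false;               // access lease exhausted
        if (readsLeft > 0) readsLeft--;                 // unlimited leases are not counted down
        return true;
    }
}
```

A lease created with `new LeaseSketch(1000L, 2)` grants two reads before time 1000 and rejects the third, while `new LeaseSketch(1000L, -1)` only enforces the expiry time.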
Forward Only ResultSet
The author of an RS may require the RS to be read without accessing parts that precede the current one in the parts list. Here is an example of that:
<source lang="java5">
private void ForwardTest() throws Exception{
    // Writer
    String content = "some content";
    RSWriterCreationParams initParams = new RSWriterCreationParams();
    initParams.setAccessReads(2);
    initParams.setForward(true);
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams);
    for(int i=0;i<300;i+=1)
        writer.addResults(new ResultElementGeneric("id" + i,"foo",content));
    writer.close();
    String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator();

    //Reader: forward read
    RSLocator l = new RSLocator(epr);
    RSXMLReader reader=RSXMLReader.getRSXMLReader(l);
    while (true){
        ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
        if(!reader.getNextPart()) break;
    }

    //Backward read
    try{
        l = new RSLocator(epr);
        reader=RSXMLReader.getRSXMLReader(l);
        while (true){
            ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
            if(!reader.getPreviousPart()) break;
        }
    }catch(Exception x){
        System.out.println("Read failed. This is correct!");
    }

    //Another forward read
    try{
        l = new RSLocator(epr);
        reader=RSXMLReader.getRSXMLReader(l);
        while (true){
            ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
            if(!reader.getNextPart()) break;
        }
    }catch(Exception x){
        System.out.println("Read failed. This is correct!");
    }
    System.out.println("Now check under /tmp/resultset that the newly"
        + " created RS does not have 300 records");
}
</source>
Note that when Forward and Access leasing are used at the same time, the RS parts are also deleted during the last access read.
ResultSet Security
The ResultSet framework offers encryption of the records stored.
Upon RS creation a private/public key pair can be created through respective calls in the RS or be provided by an external mechanism. Using these two keys, a content encryption key is produced. The content encryption key and the public key are placed in the header of the RS, and the header is encrypted using the private RS key. The rest of the content of the RS is encrypted with the content encryption key. It is the responsibility of the RS author to pass the private key to the RS reader clients.
Upon reading the RS the reader also indirectly has access to the public key so as to alter the RS header (for forward RS etc.), but the key is never actually retrieved. Here we present an example of how to use the encryption mechanism of the RS:
<source lang="java5">
private void EncryptionTest() throws Exception{
    // Writer
    String content = "Some content";
    RSWriterCreationParams initParams = new RSWriterCreationParams();
    KeyGenerator kg = new KeyGenerator();
    KeyPair pair = kg.GenKeyPair();
    initParams.setPrivKey(pair.getPrivate());
    initParams.setPubKey(pair.getPublic());
    RSXMLWriter writer=RSXMLWriter.getRSXMLWriter(initParams);
    for(int i=0;i<30;i+=1)
        writer.addResults(new ResultElementGeneric("id" + i,"foo",content));
    writer.close();
    String epr=writer.getRSLocator(new RSResourceWSRFType()).getLocator();

    //Reader without the private key
    RSLocator l = null;
    RSXMLReader reader = null;
    try{
        l = new RSLocator(epr);
        reader = RSXMLReader.getRSXMLReader(l);
        while (true){
            ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
            System.out.println("ResultLength: " + res.length);
            if(!reader.getNextPart()) break;
        }
    }catch(Exception x){
        System.out.println("Read failed. This is correct!");
    }

    //Reader with the private key set on the locator
    System.out.println("Starting rs re-reader (2)");
    l = new RSLocator(epr);
    l.setPrivKey(pair.getPrivate());
    reader=RSXMLReader.getRSXMLReader(l);
    while (true){
        ResultElementBase[] res=reader.getResults(ResultElementGeneric.class);
        if(!reader.getNextPart()) break;
    }
}
</source>
Note that the encryption has to do with the RS reference (RSLocator) because it has to be known on both sides of the RS reader and writer.
Note that during data transfer the RS will use plain TCP/IP connections for performance reasons. These connections are encrypted through SSL. If for any reason you do not desire encrypted RS communication you can disable this feature by editing the services like this:
<environment name="SSLsupport" value="disabled" type="java.lang.String" />
Notification for the garbage collected resources
Using the RSEPRCache component a service/component can store the EPR of a RS, in this cache, and use it again without having to wait for the execution of the query that produced the given RS. A RSEPRCache instance needs to be notified when the Garbage Collector collects an RS WS-resource, in order to update its content.
For this purpose, each deployed ResultSetService will register as a notifier, in the service's initialisation phase, to a topic associated with all the scopes, in which the Running Instance is deployed. In each period of the garbage collection procedure, the Collector aggregates all the RS WS-resources that are being garbage collected, and sends a batch notification message with the corresponding RS EPRs using the aforementioned topic. When a RSEPRCache instance is created, it subscribes as a notification receiver to the topic, and it is able to receive notification messages with the EPRs of garbage-collected RSs, during its lifetime.
ResultSet Cache
The Result Set Cache is a component that caches Result Set EndPointReferences(EPRs) and communicates with the Result Set service instances that are deployed in the infrastructure, to be informed about the Result Set EPRs that are being reclaimed.
Internal Structure
A ResultSet Cache can be used by every component (even if this component is not a web service deployed in the gCube infrastructure) that needs to store EPRs of Result Sets that were once obtained, in order to use them again in the future. For example, a component that forms a query Q, which returns results that are contained in a Result Set with EPR A, can add A to an RSEPRCache object with Q as the key. When this component needs the results for the same query Q in the future, it can get the corresponding ResultSet EPR A through the Cache object, without having to wait for the execution of the query again. In order to add a new ResultSet EPR to an RSEPRCache object, an add method is provided that takes two arguments: a java.lang.String object, which is the EPR to be added, and a java.lang.Object object, which is the key that can be used to retrieve this EPR. In order to get a previously added ResultSet EPR from a Cache object, a get method is provided that takes a java.lang.Object object as an argument and returns the java.lang.String EPR whose key is equal to the argument, using the equals() method that is defined in the actual class of the argument. The EPR to be added using the add method must be expressed as a java.lang.String, in a serialized form. The EPR returned by the get method is also expressed as a java.lang.String, in a serialized form, e.g.:
<source lang="xml">
<ns1:ResultSetResourceReference xmlns:ns1="http://gcube.org/namespaces/searchservice/ResultSetService">
  <ns2:Address xmlns:ns2="http://schemas.xmlsoap.org/ws/2004/03/addressing">
    http://ariadni.di.uoa.gr:8080/wsrf/services/gcube/searchservice/ResultSet
  </ns2:Address>
  <ns3:ReferenceProperties xmlns:ns3="http://schemas.xmlsoap.org/ws/2004/03/addressing">
    <ns1:ResourceKey>287f4870-e7b5-11dd-bf93-8bbaa734be44</ns1:ResourceKey>
  </ns3:ReferenceProperties>
  <ns4:ReferenceParameters xmlns:ns4="http://schemas.xmlsoap.org/ws/2004/03/addressing"/>
</ns1:ResultSetResourceReference>
</source>
A ResultSet Cache object receives notifications from every ResultSet service deployed in the same scope about ResultSet EPRs that were reclaimed and can no longer be accessed. Using the gCube IS-Notification mechanism, each ResultSet Cache object subscribes to the topic that announces reclaimed ResultSet EPRs, receives batch messages containing the EPRs of the reclaimed ResultSets, and deletes the entries related to them (see also Notification_for_the_garbage_collected_resources). This operation is performed in the background, so the contents of a Cache object remain valid without any action from the component that uses it.
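The effect of such a batch message on the cache contents can be sketched as follows. This is only an illustration under assumptions: InvalidatingEprCache and onReclaimed are hypothetical names, the subscription to the IS-Notification topic is left out, and reclaimed EPRs are matched against cached ones by exact string equality, which may differ from how the real component identifies entries.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of notification-driven invalidation; not the RSEPRCache implementation.
class InvalidatingEprCache {
    // Concurrent map, because notifications are assumed to arrive on a background thread.
    private final Map<Object, String> eprByKey = new ConcurrentHashMap<>();

    void add(Object key, String serializedEpr) {
        eprByKey.put(key, serializedEpr);
    }

    String get(Object key) {
        return eprByKey.get(key);
    }

    // Hypothetical callback for one batch message listing reclaimed EPRs.
    // Every entry whose EPR appears in the batch is dropped, whatever its key.
    void onReclaimed(Collection<String> reclaimedEprs) {
        Set<String> reclaimed = new HashSet<>(reclaimedEprs);
        eprByKey.values().removeIf(reclaimed::contains);
    }
}
```

Because the removal works on the map's value view, several queries that were cached against the same Result Set are all invalidated by a single notification.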
Usage Examples
//construct a new RSEPRCache using the default settings. myEpr will be used to register to
//the notifications' topic used for notifying when a ResultSet is removed by the RS garbage collector
String myEpr = ServiceHost.getBaseURL().toString() + ServiceContext.getContext().getJNDIName();
RSEPRCache rseprCache = new RSEPRCache(new EndpointReferenceType(new AttributedURI(myEpr)), ServiceContext.getContext());
...
//get the epr of the Result Set that answers queryQ
String epr = searchMaster.search(queryQ);
//add this epr to the cache with queryQ as the key
rseprCache.add(queryQ, epr);
//perform some actions with the results obtained
...
//perform some other actions
...
//if you want results for the same queryQ, get the epr of the corresponding Result Set
//from the cache.
epr = rseprCache.get(queryQ);
RSLocator rsLocator = null;
RSXMLReader reader = null;
//there is always a possibility that the retrieved EPR refers to a Result Set that was
//reclaimed. A try-catch statement should be used when using the EPR.
try{
    rsLocator = new RSLocator(epr);
    reader = RSXMLReader.getRSXMLReader(rsLocator);
}catch(Exception e){
    //execute the query again in case of failure
    epr = searchMaster.search(queryQ);
    rsLocator = new RSLocator(epr);
    reader = RSXMLReader.getRSXMLReader(rsLocator);
}
//perform actions on the retrieved results
...
When retrieving an EPR from an RSEPRCache object, there is always a small possibility that the Result Set this EPR refers to has been reclaimed but the corresponding notification never reached the RSEPRCache object, due to a network failure. A second, even less likely possibility is that the notification reaches the RSEPRCache object between the invocations:
epr = rseprCache.get(queryQ);
and
rsLocator = new RSLocator(epr);
reader = RSXMLReader.getRSXMLReader(rsLocator);
Therefore, creating an RSXMLReader from the retrieved EPR should always be performed inside a try-catch statement.
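The try-catch advice above can be packaged into a small helper, sketched here without the gCube classes. EprFallback, openWithFallback, runQuery and opener are hypothetical names: opener stands in for building an RSLocator and an RSXMLReader and is assumed to signal a reclaimed Result Set with a RuntimeException, runQuery stands in for a call such as searchMaster.search, and the cache is reduced to a plain Map. Unlike the usage example, this sketch also treats a missing entry as a reason to run the query and stores the fresh EPR back into the cache, which is a design choice of the sketch.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper; a sketch of the fallback pattern, not part of the gRS API.
class EprFallback {

    static <R> R openWithFallback(Map<Object, String> cache,
                                  Object query,
                                  Function<Object, String> runQuery,
                                  Function<String, R> opener) {
        String cachedEpr = cache.get(query);
        if (cachedEpr != null) {
            try {
                // Fast path: the cached EPR still points to a live Result Set.
                return opener.apply(cachedEpr);
            } catch (RuntimeException staleEpr) {
                // The Result Set was reclaimed and the notification was missed
                // or arrived too late: forget the entry and fall through.
                cache.remove(query);
            }
        }
        // Slow path: execute the query again and remember the new EPR.
        String freshEpr = runQuery.apply(query);
        cache.put(query, freshEpr);
        return opener.apply(freshEpr);
    }
}
```

A failure on the slow path is deliberately not caught, since a freshly produced EPR that cannot be opened points to a real problem rather than to a stale cache entry.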
A second constructor is provided for creating caches with settings other than the defaults:
/**
 * Constructor of RSEPRCache that takes configuration parameters and creates a new cache with
 * this configuration.
 *
 * @param notificationConsumerEPR the EPR of the service that will be used to register to
 * the notifications topic used for notifying when a ResultSet is removed from RS garbage collector
 * @param sctx the context of the service that will be used to register to the notifications topic
 * @param ttl the default amount of time to live for an element from its creation date
 * @param tti the default amount of time to live for an element from its last accessed date
 * @param maxElementsInMemory the maximum number of elements in memory, before they are evicted
 * @param overflowToDisk whether to use the disk store when there is overflow of the memory used
 * @throws Exception
 */
public RSEPRCache(EndpointReferenceType notificationConsumerEPR, GCUBEServiceContext sctx, long ttl, long tti, int maxElementsInMemory, boolean overflowToDisk)
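The difference between the ttl and tti parameters documented above can be made concrete with a small sketch. ExpiryRule is a hypothetical class, not the RSEPRCache or Ehcache implementation, and the time unit is left open because the documentation above does not state it; the sketch only assumes that all values share one unit. An element is treated as expired once it is older than ttl counted from its creation, or has been idle for longer than tti counted from its last access, whichever happens first.

```java
// Hypothetical illustration of ttl / tti expiry; not the RSEPRCache or Ehcache code.
class ExpiryRule {
    private final long ttl; // allowed lifetime, counted from creation
    private final long tti; // allowed idle time, counted from the last access

    ExpiryRule(long ttl, long tti) {
        this.ttl = ttl;
        this.tti = tti;
    }

    // All arguments use the same (assumed) time unit as ttl and tti.
    boolean isExpired(long createdAt, long lastAccessedAt, long now) {
        boolean tooOld = now - createdAt > ttl;
        boolean tooIdle = now - lastAccessedAt > tti;
        return tooOld || tooIdle;
    }
}
```

For example, with ttl = 600 and tti = 120, an element created at time 0 and last read at time 100 is still usable at time 150, but is considered expired at time 300 because it has been idle for 200 units.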
A third constructor is provided for the creation of caches by components that are not gCube services (e.g. portlets):
/**
 * Default constructor of RSEPRCache that should be used by gCube components that are not exposed as services.
 * It returns a new RSEPRCache object.
 *
 * @param notificationConsumerEPR the caller's URI expressed as an EPR, that will be used to register
 * to the notifications topic used for notifying when a ResultSet is removed from RS garbage collector
 * @param sman a security manager that will be used to register to the notifications topic
 * @param sc the infrastructure scope that will be used for subscribing to the topic
 * @throws Exception
 */
public RSEPRCache(EndpointReferenceType notificationConsumerEPR, GCUBESecurityManager sman, GCUBEScope sc)
The first argument of the last constructor can be the address of the machine where the component runs, e.g.:
EndpointReferenceType consEpr = new EndpointReferenceType();
consEpr.setAddress(new Address("http://ariadni.di.uoa.gr"));
RSEPRCache cache = new RSEPRCache(consEpr, sman, sc);
Dependencies
- Result Set Service Stubs
The Notification message types are defined in the wsdl of the Result Set Service.
- EHCACHE
Ehcache is used internally to implement the caching mechanisms.
- gCore