XML Indexer

From Gcube Wiki
Revision as of 17:57, 11 December 2007 by Andrea (Talk | contribs) (Usage Examples)

Introduction

The XMLIndexer Service is a generic indexer for homogeneous collections of XML data. The service allows clients to create and populate such collections and to resolve queries against them. There are two types of XMLIndexer, each of which manages a collection of XML documents:

  • GenericXMLIndexer – a GenericXMLIndexer is completely unaware of the collection and the data it handles. It imposes no constraints on them and therefore assumes that clients know the schema of the documents they query. It is useful whenever a (temporary) set of XML data, such as a result set, has to be indexed and queried.
  • MetadataXMLIndexer – a MetadataXMLIndexer is bound to a specific Metadata Collection and is used to index the elements of that collection. When a new indexable Metadata Collection is created, the Metadata Catalog Service also creates a related MetadataXMLIndexer. Each time a Metadata Object is added to or updated in the collection, the Metadata Catalog Service updates the MetadataXMLIndexer by feeding it the new element.


The XMLIndexer follows the Factory pattern and is composed of:

  • the XMLIndexerFactory, which creates new XMLIndexers
  • the GenericXMLIndexer service
  • the MetadataXMLIndexer service

Managed Resources

The XMLIndexerFactory creates one WS-Resource for each XMLIndexer. Since there are two kinds of XMLIndexer, there are also two kinds of WS-Resource that the Factory service can create: the GenericXMLIndexer resource and the MetadataXMLIndexer resource. The state of each Indexer is published in the DIS by means of its WS-ResourceProperties. These resource properties include the creation parameters and, if the Indexer is a GenericXMLIndexer, the SetTerminationTime and CurrentTime WS-ResourceProperties.

MetadataXMLIndexer

A MetadataXMLIndexer operates over a collection of homogeneous XML documents bound to a specific Metadata Collection. Because the managed XML documents are wrapped in the Metadata envelope, each document is identified by a unique ID (the Metadata Object ID). This allows more advanced management than the GenericXMLIndexer offers: a MetadataXMLIndexer can be populated, updated, recreated and queried. The XMLIndexer extends the standard WSRF ImmediateResourceTermination portType, implemented by the DestroyProvider operation provider, which means that it has to be explicitly destroyed.

  • AddElements(Documents[]) --> void
    This operation takes a list of Documents and adds them to the current collection in the eXist 1.1 database. A Document is a pair made of an id and a String representation of the XML document. Passing the id of a document that is already stored updates that document.
  • AddElementsRS(String) --> void
    This operation takes a string representing a reference to an RSLocator (see ResultSetService). It creates an RSReader and stores the elements it reads in the eXist 1.1 database.
  • ExecuteXPath(string) --> string[]
    This operation takes a string representing the XPath. It executes the given XPath on the current collection and returns an array of strings.
  • ExecuteXPathRS(string) --> string
    This operation takes a string representing the XPath. It executes the given XPath on the current collection and returns a string representing a reference to an RSLocator (see ResultSetService).
  • ExecuteXQuery(string) --> string[]
    This operation takes a string representing the XQuery. It executes the given XQuery on the current collection and returns an array of strings.
  • ExecuteXQueryRS(string) --> string
    This operation takes a string representing the XQuery. It executes the given XQuery on the current collection and returns a string representing a reference to an RSLocator (see ResultSetService).
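
The update-by-id behaviour of AddElements and the collection-wide evaluation of ExecuteXPath can be illustrated with a minimal in-memory sketch. It is not the service implementation: the class InMemoryXmlCollection and its methods are hypothetical, a map stands in for the eXist collection, the standard JDK XPath API stands in for the database query engine, and returning the text content of each matching node is a simplifying assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical in-memory stand-in for one indexer collection; not the service code. */
class InMemoryXmlCollection {
    // Document id -> parsed XML. Insertion order is kept so results are stable.
    private final Map<String, Document> documents = new LinkedHashMap<>();

    /** Mirrors AddElements: a document whose id is already present is overwritten. */
    void addElement(String id, String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            documents.put(id, builder.parse(new InputSource(new StringReader(xml))));
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid XML for id " + id, e);
        }
    }

    /** Mirrors ExecuteXPath: evaluates the expression on every stored document. */
    List<String> executeXPath(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> results = new ArrayList<>();
        try {
            for (Document doc : documents.values()) {
                NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    // Assumption: report the text content of each matching node.
                    results.add(nodes.item(i).getTextContent());
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
        return results;
    }

    /** Number of distinct document ids currently stored. */
    int size() {
        return documents.size();
    }
}

class InMemoryXmlCollectionDemo {
    public static void main(String[] args) {
        InMemoryXmlCollection collection = new InMemoryXmlCollection();
        collection.addElement("mo-1", "<record><title>First</title></record>");
        collection.addElement("mo-2", "<record><title>Second</title></record>");
        // Same id as an existing document: the stored document is replaced.
        collection.addElement("mo-1", "<record><title>Updated</title></record>");
        System.out.println(collection.executeXPath("/record/title")); // prints [Updated, Second]
    }
}
```

The RS variants differ only in how data is exchanged: instead of passing documents or results inline, the caller passes or receives a reference to an RSLocator.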

Implementation Detail

The XMLIndexer Service is built as a wrapper around an XML database (eXist 1.1). The first time it starts up, the service creates an embedded database instance. Each XMLIndexer manages one collection of data in that instance. For a GenericXMLIndexer, the collection is always created together with the Indexer. When the CreateMetadataIndexer() operation is invoked on the XMLIndexerFactory, on the other hand, the factory first checks whether an XML collection for that Metadata Collection is already available in the database instance. If it is, the existing collection is used; otherwise a new XML collection is created for that Metadata Collection.
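
The difference between the two creation paths can be sketched as follows. This is an illustrative model only: CollectionRegistrySketch, its method names and the collection naming scheme are assumptions, and a plain map stands in for the embedded database instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical model of the factory's collection handling; names are illustrative. */
class CollectionRegistrySketch {
    // Collection name -> placeholder for the documents stored in that collection.
    private final Map<String, List<String>> collections = new HashMap<>();

    /** Generic indexers always get a brand-new collection, created with the indexer. */
    String createGenericCollection() {
        String name = "generic-" + UUID.randomUUID();
        collections.put(name, new ArrayList<>());
        return name;
    }

    /**
     * Metadata indexers reuse the collection bound to the given Metadata Collection
     * id and create it only when it is missing. Returns true if it was created.
     */
    boolean openMetadataCollection(String metadataCollectionId) {
        String name = "metadata-" + metadataCollectionId;
        if (collections.containsKey(name)) {
            return false; // already available: reuse it
        }
        collections.put(name, new ArrayList<>());
        return true;
    }

    /** Number of collections currently known to the registry. */
    int count() {
        return collections.size();
    }
}

class CollectionRegistryDemo {
    public static void main(String[] args) {
        CollectionRegistrySketch registry = new CollectionRegistrySketch();
        System.out.println(registry.openMetadataCollection("mc-1")); // prints true
        System.out.println(registry.openMetadataCollection("mc-1")); // prints false
    }
}
```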

Dependencies

The Service depends on:

  • eXist 1.1
  • ResultSetService

Usage Examples

This example creates a MetadataXMLIndexer WS-Resource through the XMLIndexerFactory and adds elements to its collection using the ResultSetService:


...

XMLIndexerFactoryServiceAddressingLocator factoryIndexerLocator = new XMLIndexerFactoryServiceAddressingLocator();
MetadataXMLIndexerServiceAddressingLocator serviceIndexerLocator = new MetadataXMLIndexerServiceAddressingLocator();

try {

     // Look up the running XMLIndexerFactoryService instances
     String[] eprs = DISHLSClient.getRunningInstanceManager(credential, epr).getEPRsRIFromClassAndName("MetadataManagement", "XMLIndexer", "diligentproject/metadatamanagement/xmlindexer/XMLIndexerFactoryService", credential, epr);
     if (eprs.length == 0) throw new Exception("no XMLIndexer Factory Service instance retrieved");

     // Contact the first factory instance found
     EndpointReferenceType factoryEPR = new EndpointReferenceType();
     factoryEPR.setAddress(new AttributedURI(eprs[0]));
     XMLIndexerFactoryPortType factoryPortType = factoryIndexerLocator.getXMLIndexerFactoryPortTypePort(factoryEPR);

     // Ask the factory for the MetadataXMLIndexer bound to the Metadata Collection
     CreateMetadataIndexerMessage request = new CreateMetadataIndexerMessage();
     request.setRecreate(true);
     request.setId(collectionID);
     CreateMetadataIndexerResponse response = factoryPortType.createMetadataIndexer(request);

     // Feed the new indexer with the elements referenced by the RSLocator
     MetadataXMLIndexerPortType indexerServicePortTypePort = serviceIndexerLocator.getMetadataXMLIndexerPortTypePort(response.getEndpointReference());
     indexerServicePortTypePort.addElementRS(RSlocator);

} catch (Exception e) {...}

....