Difference between revisions of "Forward Index"
Revision as of 23:11, 24 October 2007
Introduction
The Forward Index stores and retrieves key and value pairs. Values can be retrieved by specifying an interval for the keys. The Forward Index Service design pattern is similar to (or the same as) that of the Full Text Index Service and the Geo Index Service. The forward index supports the following schemas for the key and value pair:
- key: integer, value: string
- key: float, value: string
- key: string, value: string
- key: date, value: string
The schema for an index is given as a parameter when the index is created. The schema must be known in order to instantiate a class that is capable of comparing two keys (one that implements java.util.Comparator). The objects stored in the database can be anything.
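To illustrate why the schema must be known up front, the following is a minimal sketch of choosing a java.util.Comparator from the key type. The names KeyType and KeyComparators are hypothetical and do not come from the Forward Index code base, and the ISO-8601 date format (yyyy-MM-dd) is an assumption, since the page does not state how date keys are written.

```java
import java.time.LocalDate;
import java.util.Comparator;

// Hypothetical enum naming the four key types listed in the schema.
enum KeyType { INTEGER, FLOAT, STRING, DATE }

// Hypothetical helper; not part of the actual Forward Index implementation.
class KeyComparators {
    // Returns a comparator that orders keys (given in string form) by their typed value.
    static Comparator<String> forType(KeyType type) {
        switch (type) {
            case INTEGER:
                return (a, b) -> Long.compare(Long.parseLong(a), Long.parseLong(b));
            case FLOAT:
                return (a, b) -> Double.compare(Double.parseDouble(a), Double.parseDouble(b));
            case DATE:
                // Assumption: dates are written as yyyy-MM-dd.
                return (a, b) -> LocalDate.parse(a).compareTo(LocalDate.parse(b));
            case STRING:
            default:
                return Comparator.naturalOrder();
        }
    }
}
```

With an integer schema the key "9" sorts before "10", whereas with a string schema it sorts after it, which is why interval queries depend on the schema chosen at creation time.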
Implementation Overview
Services
The forward index is implemented through three services. They are all implemented according to the factory-instance pattern:
- An instance of the ForwardIndexManagement Service represents an index and manages this index. The life-cycle of the index is the same as the life-cycle of the management instance; the index is created when the ForwardIndexManagement instance is created, and the index is terminated (deleted) when the ForwardIndexManagement instance resource is removed. The ForwardIndexManagement Service manages the life-cycle and properties of the forward index. It co-operates with instances of the ForwardIndexUpdater Service when feeding content into the index, and with instances of the ForwardIndexLookup Service for getting content from the index. The Content Management service is used for safe storage of an index. A logical file is established in Content Management when the index is created. The index is retrieved from Content Management and established on the local node when an existing forward index is dynamically deployed on a node. The logical file in Content Management is deleted when the ForwardIndexManagement instance is deleted.
- The ForwardIndexUpdater Service is responsible for feeding content into the forward index. The content of the forward index consists of key value pairs. A ForwardIndexUpdater Service resource updates a single index. One index may be updated by several ForwardIndexUpdater Service instances simultaneously. When feeding the index, a ForwardIndexUpdater Service instance is created with the EPR of the ForwardIndexManagement resource connected to the index to update. The ForwardIndexUpdater instance is connected to a ResultSet that contains the content to be fed to the index.
- The ForwardIndexLookup Service is responsible for receiving queries for the index and returning responses that match the queries. When it is created, the ForwardIndexLookup instance gets a reference to the ForwardIndexManagement instance that is managing the index. It can only query this index. Several ForwardIndexLookup instances may query the same index. The ForwardIndexLookup instances get the index from Content Management and establish a local copy of the index on the file system, which is the copy that is queried. The local copy is kept up to date by subscribing to index change notifications that are emitted by the ForwardIndexManagement instance.
It is important to note that none of the three services have to reside on the same node; they are only connected through web service calls and the DILIGENT Content Management System. The following illustration shows the information flow and responsibilities for the different services used to implement the Forward Index:
(illustration will be improved shortly... )
[Placeholder illustration: "So Pretty Index Design..."]
RowSet
The content to be fed into an index must be served as a ResultSet containing XML documents conforming to the ROWSET schema. This is a simple schema, containing key and value pairs. The following is an example of a ROWSET that can be fed into the Forward Index Updater:
The row set "schema"
<rowset>
  <insert>
    <tuple><key></key><value></value></tuple>
    <tuple><key></key><value></value></tuple>
  </insert>
  <delete>
    <key></key>
    <key></key>
  </delete>
</rowset>
The rowset may contain an insert section, a delete section, or both. The key and value pairs (tuples) in the insert section may be repeated one or more times. The key in the delete section may be repeated one or more times.
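As a rough illustration of the rowset layout, the following sketch reads a rowset document with the standard JDK DOM parser and collects the tuples to insert and the keys to delete. The class name RowSetReader is hypothetical; it is not the parser used by the ForwardIndexUpdater Service, whose implementation is not shown on this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for the rowset layout; for illustration only.
class RowSetReader {
    final Map<String, String> inserts = new LinkedHashMap<>();
    final List<String> deletes = new ArrayList<>();

    static RowSetReader parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            RowSetReader result = new RowSetReader();
            // Every <tuple> holds exactly one <key> and one <value>.
            NodeList tuples = doc.getElementsByTagName("tuple");
            for (int i = 0; i < tuples.getLength(); i++) {
                Element tuple = (Element) tuples.item(i);
                String key = tuple.getElementsByTagName("key").item(0).getTextContent();
                String value = tuple.getElementsByTagName("value").item(0).getTextContent();
                result.inserts.put(key, value);
            }
            // Keys directly under <delete> name the tuples to remove.
            NodeList deleteSections = doc.getElementsByTagName("delete");
            for (int i = 0; i < deleteSections.getLength(); i++) {
                NodeList keys = ((Element) deleteSections.item(i)).getElementsByTagName("key");
                for (int j = 0; j < keys.getLength(); j++) {
                    result.deletes.add(keys.item(j).getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed rowset document", e);
        }
    }
}
```

Wrapping each key and value pair in a tuple element is what lets a reader pair them unambiguously, which the flat key/value sequence of the earlier revision did not.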
Test Client ForwardIndexClient
The org.diligentproject.indexservice.clients.ForwardIndexClient test client is used to test the ForwardIndex.
The ForwardIndexClient uses a property file, ForwardIndex.properties, which contains the properties for the ForwardIndexTest client.
The property file contains the following properties:
ForwardIndexManagementFactoryResource=/wsrf/services/diligentproject/index/ForwardIndexManagementFactoryService
Host=dili02.osl.fast.no
ForwardIndexUpdaterFactoryResource=/wsrf/services/diligentproject/index/ForwardIndexUpdaterFactoryService
ForwardIndexLookupFactoryResource=/wsrf/services/diligentproject/index/ForwardIndexLookupFactoryService
geoManagementFactoryResource=/wsrf/services/diligentproject/index/GeoIndexManagementFactoryService
Port=8080
Create-ForwardIndexManagementFactory=true
Create-ForwardIndexLookupFactory=true
Create-ForwardIndexUpdaterFactory=true
The Host and Port properties must be edited to point to the VO of interest.
The test client obtains the EPRs of the factory services and uses the factory services to create the stateful web services:
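The sketch below shows one way such a property file could be read with java.util.Properties and combined into a factory service address. The class FactoryAddress is hypothetical, and the "http://" scheme and the way Host, Port and the resource path are concatenated are assumptions; the page does not describe how the test client builds its addresses.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper; the real ForwardIndexClient may build addresses differently.
class FactoryAddress {
    // Parses property-file text into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Assumption: the address is http://<Host>:<Port><resource path>.
    static String urlFor(Properties props, String resourceKey) {
        return "http://" + props.getProperty("Host") + ":" + props.getProperty("Port")
                + props.getProperty(resourceKey).trim();
    }
}
```

Under these assumptions, changing only Host and Port redirects all three factory addresses to another VO, while the resource paths stay the same.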
- ForwardIndexManagementService: responsible for holding the list of delta files that together make up the index. The service also relays notifications from the ForwardIndexUpdaterService to the ForwardIndexLookupService when new delta files must be merged into the index.
- ForwardIndexUpdaterService: responsible for creating new delta files with tuples that shall be deleted from the index or inserted into the index.
- ForwardIndexLookupService: responsible for looking up queries and returning the answer.
The test client creates one WS-Resource of each type, inserts some data through the updater, and queries the data by using the lookup WS-Resource.
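The idea of an index that is the sum of its delta files can be sketched in memory as follows. Delta and DeltaMerger are hypothetical names, the real delta file format is not described on this page, and applying deletes before inserts within one delta is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a delta file: tuples to insert and keys to delete.
class Delta {
    final TreeMap<String, String> inserts = new TreeMap<>();
    final List<String> deletes = new ArrayList<>();
}

// Hypothetical merger; illustrates the principle only.
class DeltaMerger {
    // Applies the deltas in order to an initially empty sorted index.
    static TreeMap<String, String> merge(List<Delta> deltas) {
        TreeMap<String, String> index = new TreeMap<>();
        for (Delta delta : deltas) {
            // Assumption: within a single delta, deletes are applied before inserts.
            for (String key : delta.deletes) {
                index.remove(key);
            }
            index.putAll(delta.inserts);
        }
        return index;
    }
}
```

In this sketch a lookup instance that receives a notification about a new delta would only need to apply that one delta to its local copy rather than rebuild the whole index.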
Inserting data and deleting tuples
Tuples can be inserted and deleted by:
- insertingPair(key,value) / deletingPair(key): simple methods to insert / delete tuples.
- process(rowSet): method to insert / delete a series of tuples.
- procesResultSet: method to insert / delete a series of tuples in a rowset inserted into a resultSet.
Lookup
Tuples can be queried by:
- getEQ_int(key), getEQ_float(key), getEQ_string(key), getEQ_date(key)
- getLT_int(key), getLT_float(key), getLT_string(key), getLT_date(key)
- getLE_int(key), getLE_float(key), getLE_string(key), getLE_date(key)
- getGT_int(key), getGT_float(key), getGT_string(key), getGT_date(key)
- getGE_int(key), getGE_float(key), getGE_string(key), getGE_date(key)
- getGTandLT_int(keyGT,keyLT), getGTandLT_float(keyGT,keyLT), getGTandLT_string(keyGT,keyLT), getGTandLT_date(keyGT,keyLT)
- getGEandLT_int(keyGE,keyLT), getGEandLT_float(keyGE,keyLT), getGEandLT_string(keyGE,keyLT), getGEandLT_date(keyGE,keyLT)
- getGTandLE_int(keyGT,keyLE), getGTandLE_float(keyGT,keyLE), getGTandLE_string(keyGT,keyLE), getGTandLE_date(keyGT,keyLE)
- getGEandLE_int(keyGE,keyLE), getGEandLE_float(keyGE,keyLE), getGEandLE_string(keyGE,keyLE), getGEandLE_date(keyGE,keyLE)
- getAll
The result is provided to the client by using the Result Set service.
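The comparison semantics behind the lookup method names (GT, GE, LT, LE and their combinations) can be illustrated with a sorted map from the JDK. RangeLookupSketch is a hypothetical class that only mimics a few of the integer-key lookups in memory; it does not call the ForwardIndexLookup Service, and it assumes one value per key, which the page does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory analogue of a few integer-key lookups.
class RangeLookupSketch {
    private final NavigableMap<Integer, String> index = new TreeMap<>();

    void insert(int key, String value) {
        index.put(key, value);
    }

    // Analogue of getGT_int(key): keys strictly greater than key.
    List<String> greaterThan(int key) {
        return new ArrayList<>(index.tailMap(key, false).values());
    }

    // Analogue of getLE_int(key): keys less than or equal to key.
    List<String> lessOrEqual(int key) {
        return new ArrayList<>(index.headMap(key, true).values());
    }

    // Analogue of getGEandLT_int(keyGE, keyLT): keyGE <= key < keyLT.
    List<String> greaterOrEqualAndLessThan(int keyGE, int keyLT) {
        if (keyGE > keyLT) {
            // subMap rejects inverted bounds, so treat them as an empty interval.
            return new ArrayList<>();
        }
        return new ArrayList<>(index.subMap(keyGE, true, keyLT, false).values());
    }
}
```

The inclusive and exclusive flags passed to headMap, tailMap and subMap correspond directly to the E (or equal) suffix in the method names.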