Revision as of 21:36, 4 December 2012
The ic-client is a client library for the Information Collector
service. It helps clients formulate queries for gCube resource descriptions, submit them to the service, and process their results.
Similar facilities for resource discovery are traditionally provided by the gCube Application Framework
(gCF) and the IS Client
library. The ic-client
improves on the latter in a number of ways, most notably:
- it is completely independent from the gCore stack.
- the ic-client can be easily embedded in a variety of client runtimes without out-of-band installation or configuration requirements. Clients may be external to gCube, or they may be 2nd-generation gCube services developed and running on stacks other than gCore. In this sense, the ic-client is a key part of the Featherweight Stack for gCube clients.
- it helps formulate a wider range of queries based only on knowledge of resource schemas.
- simple queries can be configured with custom namespaces and custom result expressions. As a result, clients can retrieve only parts of resource descriptions or arbitrary combinations of parts. Fine-grained results are more easily processed and improve the performance of both clients and service.
- it offers increased flexibility in how query results are processed.
- clients can configure how results ought to be parsed, or else take direct responsibility for parsing them. For example, clients may configure their own JAXB object bindings, while ready-made bindings for whole resource descriptions or specific properties thereof are also available.
The ic-client
is available in our Maven repositories with the following coordinates:
<groupId>org.gcube.resources.discovery</groupId>
<artifactId>ic-client</artifactId>
<version>...</version>
The library depends on a small set of components of the Featherweight Stack. Among these, the following are visible to library clients:
- common-gcore-resources: the object-based implementation of the gCube resource model.
- clients may and normally will use the classes in common-gcore-resources to parse and process query results.
- discovery-client: a layer of interfaces and abstract implementations for queries and the query submission API.
- ic-client customises this layer for queries to the Information Collector.
Note: in what follows, we blur the distinction between the ic-client and the discovery-client. The distinction reflects modular choices in the design of the library but is otherwise of little consequence for its clients. The discovery-client is visible only in the package names of certain components that we discuss below, which start with org.gcube.resources.discovery.client. The components of the ic-client are instead in packages that start with org.gcube.resources.discovery.icclient.
Quick Tour
We introduce the API of the ic-client
through a set of examples.
Note that, in all the examples, we submit queries to the Information Collector
. We then need to make sure that we do so in a given scope, e.g. by binding it to the current thread with the standard idiom:
ScopeProvider.instance.set("...somescope...");
Submitting Predefined Queries
Let us submit a query for, say, service endpoints:
import static org.gcube.resources.discovery.icclient.ICFactory.*;
...
Query query = queryFor(ServiceEndpoint.class);
DiscoveryClient<ServiceEndpoint> client = clientFor(ServiceEndpoint.class);
List<ServiceEndpoint> resources = client.submit(query);
queryFor()
and clientFor()
are static factory methods of the ICFactory
class, which we import for improved legibility of the code (cf. import static ...*).
queryFor()
gives us a predefined Query
for descriptions of service endpoints, which we look up with the corresponding class in the object model of common-gcore-resources
. Had we wanted to query the Information Collector
for, say, hosting nodes then we would have used the HostingNode
class of the object model.
clientFor()
gives us a DiscoveryClient
which can submit the query and parse its results as instances of the ServiceEndpoint
class.
Finally, we ask the client to submit the query to the remote service and collect the parsed results. Once the results are back we can navigate through the ServiceEndpoint
instances to get to the information we need. Again, we need to acquire familiarity with the resource model, nothing that some documentation and a good IDE cannot help with. In what follows, we assume such familiarity.
Thus interaction with the ic-client
is two-phased:
- in the first phase we define the query we want; here we simply pick a predefined one for a given resource type.
- in the second phase we submit the query with a
DiscoveryClient
that returns the results in the form we want.
Note that we use the ServiceEndpoint
class twice: a first time to look up predefined queries and a second time to indicate how we want the results parsed. This appears redundant until we realise that the ic-client
allows us to separate what we want to query from how we want the results to be returned. In this case, ServiceEndpoint
identifies both query and parsing strategy, but this is not always the case, as we show next.
Conditions and Result Expressions
We now customise a predefined query for service endpoints so that it returns only the addresses of endpoints of database services. Since the results are now plain strings our choice of parser is to have none at all.
import static org.gcube.resources.discovery.icclient.ICFactory.*;
...
SimpleQuery query = queryFor(ServiceEndpoint.class);
query.addCondition("$resource/Profile/Category/text() eq 'Database'")
     .setResult("$resource/Profile/AccessPoint/Interface/Endpoint/text()");
DiscoveryClient<String> client = client();
List<String> addresses = client.submit(query);
We look up the predefined query as we did earlier, but this time we type it under a more specific interface, SimpleQuery
, which enables customisations. We first add a condition expressed in XQuery, the query language of the Information Collector
. The condition filters out endpoints that do not give access to databases.
We then customise the result expression to get back only their addresses, chaining setResult()
to addCondition()
. For both tasks, we rely on the schema of service endpoint descriptions to formulate expressions, and adhere to the (documented) convention to use $resource
as a variable ranging over target resources.
Finally, we replace the method ICFactory#clientFor(Class)
with ICFactory#client()
, which returns a client that does not attempt to parse the results using classes from the resource model.
Note that if we needed to chain multiple conditions, we would invoke addCondition()
multiple times. The chain would then be based on the AND operator. To chain conditions with other operators we would instead need to do the chaining ourselves and add the result as a single condition.
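As a minimal sketch of doing the chaining ourselves, the helper below joins conditions with the XQuery or operator so that the outcome can be added as a single condition. The class Conditions and its method anyOf() are hypothetical names introduced only for illustration; they are not part of the ic-client.

```java
// Hypothetical helper, not part of the ic-client: builds one condition out of many.
final class Conditions {

    private Conditions() {}

    // Joins the given XQuery conditions with 'or'. Each condition is wrapped in
    // parentheses so that operators inside it keep their own precedence.
    static String anyOf(String... conditions) {
        StringBuilder joined = new StringBuilder();
        for (String condition : conditions) {
            if (joined.length() > 0) {
                joined.append(" or ");
            }
            joined.append('(').append(condition).append(')');
        }
        return joined.toString();
    }
}
```

The resulting string would then be passed to addCondition() like any other condition, e.g. query.addCondition(Conditions.anyOf(first, second)).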
Retrieving Resource Parts
In the next example, we move to middle ground: rather than looking for whole resource descriptions or individual strings we focus on selected parts of descriptions, e.g. we retrieve all the available access information. We thus return to needing result parsing, but this time pass the class of the resource model that describes the resource properties that we want retrieved, rather than the top-level class:
import static org.gcube.resources.discovery.icclient.ICFactory.*;
...
SimpleQuery query = queryFor(ServiceEndpoint.class);
query.addCondition("$resource/Profile/Category/text() eq 'Database'")
     .setResult("$resource/Profile/AccessPoint");
DiscoveryClient<AccessPoint> client = clientFor(AccessPoint.class);
List<AccessPoint> accesspoints = client.submit(query);
for (AccessPoint point : accesspoints) {
    ...point.name()....point.address()....
}
It's now clear that the resource model classes that we pass to the DiscoveryClient
drive a generic parser embedded in the client. The classes of the model are in fact decorated with JAXB annotations, and these annotations define the binding of classes from XML.
Custom Results
In our next example, we show how different parts of service endpoint descriptions, say access points and identifiers, can be composed together to form the desired results. We then show how these ad-hoc combinations can be conveniently parsed:
import static org.gcube.resources.discovery.icclient.ICFactory.*;
...
SimpleQuery query = queryFor(ServiceEndpoint.class);
query.addCondition("$resource/Profile/Category/text() eq 'Database'")
     .setResult("<perfect>" +
                "<id>{$resource/ID/text()}</id>" +
                "{$resource/Profile/AccessPoint}" +
                "</perfect>");
DiscoveryClient<PerfectResult> client = clientFor(PerfectResult.class);
List<PerfectResult> results = client.submit(query);
for (PerfectResult result : results) {
    ...result.id...result.ap...
}
where PerfectResult
is the simple bean defined as:
@XmlRootElement(name="perfect")
class PerfectResult {
    @XmlElement(name="id")
    String id;
    @XmlElementRef
    AccessPoint ap;
}
Here we give results a custom shape, hence we cannot cherry-pick from the resource model the class to be used for parsing. Rather, we define it as our PerfectResult
, a very simple bean which we decorate with annotations to drive the underlying JAXB parser. Note that the bean reuses AccessPoint
from the resource model, so we do not need to reinvent the wheel. We then pass the bean class to the discovery client and collect results as its instances.
Namespaces
Oftentimes, we need to specify conditions and result expressions that refer to resource properties with qualified names. We can then add namespace prefix declarations to the query and use those prefixes in conditions and result expressions. The following example illustrates the case for a resource property of an instance of the tree-manager
service:
import static org.gcube.resources.discovery.icclient.ICFactory.*;
...
SimpleQuery query = queryFor(ServiceInstance.class);
query.addNamespace("tm",URI.create("http://gcube-system.org/namespaces/data/tm"))
     .addCondition("$resource/Data/tm:Plugin/name/text() eq 'species-tree-plugin'");
DiscoveryClient<ServiceInstance> client = clientFor(ServiceInstance.class);
List<ServiceInstance> resources = client.submit(query);
Note that predefined queries for service instances already include a namespace declaration for the instance properties common to all such instances (e.g. ServiceClass
or ServiceName
). In particular, their namespace is automatically bound to the prefix gcube. Accordingly, a query for all service instances of the tree-manager
could be defined as follows:
import static org.gcube.resources.discovery.icclient.ICFactory.*;
...
SimpleQuery query = queryFor(ServiceInstance.class);
query.addCondition("$resource/Data/gcube:ServiceClass/text() eq 'DataAccess'")
     .addCondition("$resource/Data/gcube:ServiceName/text() eq 'tree-manager-service'");
DiscoveryClient<ServiceInstance> client = clientFor(ServiceInstance.class);
List<ServiceInstance> resources = client.submit(query);
Auxiliary Variables
In our final example, we show how to introduce variables in the query in addition to $resource
. Typically this allows us to characterise complex resource properties that can repeat within resources. For example, we need an auxiliary variable to characterise the service endpoints that have a property with a given name and value:
import static org.gcube.resources.discovery.icclient.ICFactory.*;
...
SimpleQuery query = queryFor(ServiceEndpoint.class);
query.addVariable("$prop", "$resource/Profile/AccessPoint/Properties/Property")
     .addCondition("$prop/Name/text() eq 'dbname'")
     .addCondition("$prop/Value/text() eq 'timeseries'");
DiscoveryClient<ServiceEndpoint> client = clientFor(ServiceEndpoint.class);
List<ServiceEndpoint> resources = client.submit(query);
Here we declare the auxiliary variable $prop to range over the resource's Property elements, and then use it to characterise one such property. Had we not declared the variable, we could have retrieved undesired results, i.e. resource descriptions that have two distinct properties, one with the required name and another with the required value.
Queries
In the examples above, we have introduced two interfaces for queries, Query
and SimpleQuery
. We discuss the interfaces and their implementations below, starting from the most generic of the two.
Freeform Queries
Query
defines a read-only API, exposing only the textual expression of a query.
public interface Query { String expression(); }
The interface is thus only suitable to clients that consume queries. The DiscoveryClient
which submits queries to the Information Collector
is one such client. More generally, this is a good type to move queries from the place of building to the place of consumption.
The simplest Query
implementation is no more than a box for the textual expression of the query:
Query q = new QueryBox("….a query…");
We resort to boxing a query whenever we cannot build it with higher-level facilities, such as those discussed below. Normally, this happens when the query is too complex to be conveniently composed in object-oriented terms. Then we can always write it as a string, accepting full exposure to the details of the target database. For the ic-client
this means knowing the details of the database in which the Information Collector
stores resource descriptions.
Templated Queries
QueryTemplate
is the first implementation of Query
which introduces the notion of building a query. The idea is to derive a query from a template, by filling named holes with equally named parameters. Whoever instantiates a QueryTemplate
provides the template:
QueryTemplate template = new QueryTemplate("….template…");
and whoever uses the instance fills the holes, ignoring other details of the query:
template.addParameter("name","value");
Then, whenever a consumer invokes expression()
on the QueryTemplate
, the parameters are interpolated and the complete query expression is returned.
Note: Besides addParameter(String,String)
, QueryTemplate
lets us manipulate templates with appendParameter(String,String)
, hasParameter(String)
, and parameter(String)
, all of which have the expected semantics.
The approach makes sense whenever there is a natural separation between query creators and query builders. Creators define the template and 'see' the details of the remote database, builders work instead with a simpler view of the query, typically one that hides the database. The main use case here is for library code built on top of the ic-client
. The ic-client
itself predefines all its queries for resource types as QueryTemplate
s.
If we are not writing library code, however, QueryTemplate
s are of little use and we can move on to the facilities discussed next. Otherwise, let us look at templates in more detail.
Templates are strings with empty XML elements, optionally with a def
attribute. Here's a template for an unlikely query language:
all results that satisfy <cond1/> or <cond2 def='that'/> <extra/>
Whenever expression()
is invoked, the empty elements in the template are replaced according to the first rule that applies among the following:
- by the value of an equally named parameter, if one exists
- by the value of the def attribute, if one exists
- by the empty string
For example, after adding the single parameter cond1="this"
to the template above, expression()
returns the string:
all results that satisfy this or that
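The three replacement rules can be sketched in a few lines of plain Java. The class TemplateSketch below is a hypothetical stand-in written for illustration only, with an assumed regular expression for recognising empty elements; it is not the QueryTemplate shipped with the library and may differ from it in details such as attribute quoting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the library's QueryTemplate.
final class TemplateSketch {

    // Matches an empty element such as <cond1/> or <cond2 def='that'/>.
    // Group 1 is the element name, group 2 the optional default value.
    private static final Pattern HOLE =
            Pattern.compile("<(\\w+)(?:\\s+def='([^']*)')?\\s*/>");

    private final String template;
    private final Map<String, String> parameters = new HashMap<String, String>();

    TemplateSketch(String template) {
        this.template = template;
    }

    TemplateSketch addParameter(String name, String value) {
        parameters.put(name, value);
        return this;
    }

    // Interpolates the template, applying the first rule that matches each hole.
    String expression() {
        Matcher matcher = HOLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = parameters.get(matcher.group(1)); // rule 1: named parameter
            if (value == null) {
                value = matcher.group(2);                     // rule 2: def attribute
            }
            if (value == null) {
                value = "";                                   // rule 3: empty string
            }
            // quoteReplacement keeps '$' and '\' in values literal, which matters for XQuery.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Because interpolation happens only inside expression(), parameters can be added or changed at any time before the query is consumed.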
QueryTemplate
s are intended for subclassing rather than directly for clients. Subclasses are expected to present a more typed interface to their clients, where the parameters are added behind dedicated setters, e.g.:
public class MyQuery extends QueryTemplate {
    public MyQuery() { super("..mytemplate"); }
    …. setCond(String cond1) { …this.addParameter("cond1",cond1)…..}
    …. setCond2(String cond2) { …this.addParameter("cond2",cond2)…..}
    …. setExtra(String value) { …this.addParameter("extra",value)…..}
    …...
}
We discuss below one such subclass and the API that it presents to clients.
Note that, for added flexibility, QueryTemplate
defines a constructor that allows a subclass to provide an initial set of parameters:
QueryTemplate(String template, Map<String,String> parameters);
Thus a family of subclasses can share the same generic template, partially interpolate it at construction time, and offer many different specialisations to their clients. We can see this approach in action in the next section.
Simple Queries
As shown in our first examples, a SimpleQuery
allows clients to customise a query in terms of namespaces, variables, conditions, and result expressions, concepts that tend to recur across query languages:
public interface SimpleQuery extends Query {
    SimpleQuery addVariable(String name, String range);
    SimpleQuery addCondition(String condition);
    SimpleQuery addNamespace(String prefix, URI uri);
    SimpleQuery setResult(String expression);
}
Naturally, the ic-client
includes an implementation of SimpleQuery
for the XQuery language, the language of the Information Collector
. XQuery
implements the SimpleQuery
interface over a simple XQuery template, extending the QueryTemplate
discussed above for the purpose:
public class XQuery extends QueryTemplate implements SimpleQuery {…}
The template declares an initial namespace for gCube resources, as well as a single variable, $resource
. The ic-client
then predefines a number of XQuery
instances where the template is specialised in such a way that $resource
ranges over different resource types. When we invoke the method queryFor()
of the ICFactory
we get back the predefined XQuery
for the resource type that we indicate in input. We have shown above how the $resource
variable is used in specifying conditions and result expressions, and how additional namespaces and auxiliary variables can be configured on the query.
Recap
In conclusion, we have two main options for building queries:
- we can use predefined queries and customise them.
- we can box more complex queries inside QueryBoxes.
Furthermore, if we are writing libraries:
- we can define our own QueryTemplates and present them to our clients, typically under ad-hoc APIs like SimpleQuery's.
Discovery
Once we have prepared a Query
we can submit it to the Information Collector
through some implementation of the DiscoveryClient
interface.
Discovery Client
DiscoveryClient
is defined as follows:
public interface DiscoveryClient<R> {
    List<R> submit(Query query) throws DiscoveryException, InvalidResultException;
    Stream<R> submitForStream(Query query) throws DiscoveryException;
}
Both submit()
and submitForStream()
take a Query
and return its results under a given type R
. submit()
collects all the results in a List
, while submitForStream()
streams them, using the API of the streams library.
Both methods can fail with a DiscoveryException
, e.g. if the Information Collector
endpoint is not available, or if the query is malformed. submit()
may also fail if the query is successfully submitted, but some of its results cannot be presented with the type R
. Implementations decide how tolerant they want to be with respect to these failures, whether to fail at the first occurrence of an invalid result, after a number of invalid results, or never.
In contrast, submitForStream()
does not concern itself with invalid results, as these are encountered only as the stream is consumed. Fault handling policies then become a responsibility (and privilege) of consuming clients, which will normally specify them using dedicated facilities in the streams library.
Note: At the time of writing, the Information Collector
does not support data streaming. Accordingly, submitForStream()
will block its clients until all results have arrived at the ic-client
. Streaming is thus limited to the presentation of the results to its clients, and no advantage in terms of client responsiveness and capacity can be expected from its use. However, the method does allow clients to future-proof how their code deals with 'large' queries, i.e. queries that typically yield high numbers of results. When the Information Collector
supports streaming, such clients will not need to change. In addition, clients may choose submitForStream()
for the control it gives them in dealing with invalid results, as discussed above.
Note: Since a DiscoveryClient
is bound to a result type, we should not reuse it across queries that have different result expressions. This is a key difference with respect to the IS Client
library, where queries were bound to result types and service proxies could work with any query. In the ic-client
query results can be parsed in different ways, but different clients are required for different result types.
IC Client
The ICClient
is a first implementation of the DiscoveryClient
interface. Since it implements DiscoveryClient<String>
, it returns results exactly as the Information Collector
produces them and sends them on the wire.
As such, ICClient
does not perform any parsing. Rather, it concentrates on the task of query submission against the remote interface of the Information Collector
.
Note: when we invoke ICFactory#client()
we obtain an instance of ICClient
. If we prefer, we can instantiate it directly without passing through a factory method, which in fact exists only to guarantee consistency with ICFactory#clientFor(Class)
.
Delegate Client
Result parsing comes into play through a second implementation of the DiscoveryClient
interface:
public class DelegateClient<R> implements DiscoveryClient<R> {
    ....
    public DelegateClient(ResultParser<R> parser, DiscoveryClient<String> inner) {...}
    ....
}
A DelegateClient
offloads the task of query submission to another implementation of the DiscoveryClient interface
. It then takes the results and passes them to a ResultParser
, which specialises in parsing them into the required type. Hence, the DelegateClient
is the general-purpose bridge between query submission and result parsing.
If we want to parse the results produced by the Information Collector
into a target type R
, we can arrange for it as follows:
ResultParser<R> parser = ....;
DiscoveryClient<R> client = new DelegateClient<R>(parser, new ICClient());
Note: Besides brokering between parsers and the ICClient
, DelegateClient
also deals with parsing failures, tolerating them as long as their number does not exceed the number of successfully parsed results. When failures do outnumber successes, the client infers that the problem lies with the parser, rather than with some faulty resource descriptions.
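The tolerance policy described in this note can be sketched as follows. This is only an illustration under stated assumptions: the class TolerantParsing and its nested Parser interface are hypothetical, the comparison is made once after all results have been processed (the actual DelegateClient may check earlier), and IllegalStateException stands in for the library's InvalidResultException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a failure-tolerant parsing loop; not the library's DelegateClient.
final class TolerantParsing {

    // Mirrors the shape of the ResultParser interface described in the text.
    interface Parser<R> {
        R parse(String result) throws Exception;
    }

    private TolerantParsing() {}

    // Parses every raw result, skipping those that fail, and gives up only when
    // failures outnumber successes (which suggests the parser itself is at fault).
    static <R> List<R> parseAll(List<String> raw, Parser<R> parser) {
        List<R> parsed = new ArrayList<R>();
        int failures = 0;
        for (String result : raw) {
            try {
                parsed.add(parser.parse(result));
            } catch (Exception e) {
                failures++; // tolerated for now, judged after the loop
            }
        }
        if (failures > parsed.size()) {
            // Stand-in for InvalidResultException (assumption).
            throw new IllegalStateException(
                    failures + " of " + raw.size() + " results could not be parsed");
        }
        return parsed;
    }
}
```

With this policy, a handful of malformed descriptions among many valid ones is silently dropped, whereas a parser that rejects most results surfaces as an error.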
Result Parsers
ResultParser
is defined as follows:
public interface ResultParser<R> { R parse(String result) throws Exception; }
and we are free to implement it as we please. We do not need to worry about exception handling, i.e. we can let any Exception
exit the method. It is the DelegateClient
that handles parsing failures as InvalidResultException
s.
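As a small illustration of a hand-written parser, the sketch below turns endpoint addresses returned as plain strings into java.net.URI instances and simply lets parsing errors propagate. The outer class UriParserSketch and the nested copy of the ResultParser interface are assumptions made to keep the example self-contained; with the library on the classpath we would implement its own ResultParser instead.

```java
import java.net.URI;

// Hypothetical, self-contained sketch; the nested interface mirrors the one in the text.
final class UriParserSketch {

    interface ResultParser<R> {
        R parse(String result) throws Exception;
    }

    // Parses each result as a URI. No try/catch is needed: a malformed address
    // raises URISyntaxException, which is left for the caller to handle.
    static final class UriParser implements ResultParser<URI> {
        @Override
        public URI parse(String result) throws Exception {
            return new URI(result.trim());
        }
    }

    private UriParserSketch() {}
}
```

Such a parser could then be handed to a DelegateClient, or to the clientWith(ResultParser) factory method mentioned in the text.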
In most cases, the recommended strategy for implementing ResultParser
is not to implement it at all.
The ic-client
includes a JAXBParser
that we can more easily use to parse our results. All we need to do is to define a bean class, annotate it with JAXB annotations that drive the JAXBParser
, and pass it to the parser's constructor. Thus we can arrange for, say, having results as ServiceEndpoint
instances as follows:
DiscoveryClient<ServiceEndpoint> client = new DelegateClient<ServiceEndpoint>(new JAXBParser<ServiceEndpoint>(ServiceEndpoint.class), new ICClient());
where ServiceEndpoint
is the JAXB-annotated class of common-gcore-resources
that models service endpoint descriptions, as we have seen in our examples. We can, of course, define our own annotated bean classes too.
Note: The idiom above is precisely what we avoid writing when we invoke ICFactory#clientFor(Class)
. Note also that ICFactory
includes the method clientWith(ResultParser)
, which adapts the idiom above to the case in which we do want to provide an alternative to JAXBParser
.