Difference between revisions of "Search Operators"

Revision as of 11:14, 23 August 2007

Search Operators

Introduction

The Search Operator family of services provides the building blocks of any search operation. Together with services external to Search, they handle the production, filtering and refinement of the available data according to user queries. The various intermediate steps towards producing the final search output are handled by Search Operator services. This section describes only the Search Service internal services listed below, although the Search Operator Framework can also integrate, at a high level, other services that can be used within a search operation.

The following operators are implemented as stateless services: they receive their input and produce their output within a single invocation, without holding any intermediate state. Whenever data must be transferred, either as input to a service or as output of its processing, the ResultSet Framework is employed.

The search operators cover the basic functionality encountered in a typical search operation. A search can be decomposed into indivisible units, each corresponding to one of these operators, and their interaction forms a workflow that produces the net result delivered to the requester. The external source search and service invocation services provide extensibility for future operators by offering a way to invoke a service that is “unknown” to the Search framework and import its results into the search operator workflow. The currently available search operators are listed below.

Example Code

Search Operators Usage Examples

Operators

BooleanOperator

Description

The Boolean Operator is used in conditional execution, specifically to evaluate the condition. It thereby makes it possible to select between alternative execution plans. For example, one plan (say, a projection on a given field of a data set) is followed if a given precondition holds; otherwise an alternative plan is followed (e.g. a projection on another field of the same data set, followed by a sort on that field). Validating the precondition is the responsibility of this service.

The condition is a Boolean expression. It involves comparisons using the operations equal, not_equal, greater_than, lower_than, greater_equal and lower_equal. The operands are either literals (date, string, integer and double literals are supported) or aggregate functions over the results of a search service execution. These aggregate functions include max, min, average, size and sum; each can be applied to a given field of the result set of a search service execution, with the field referenced by an XPath expression.
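
The condition evaluation described above can be sketched roughly as follows. This is a minimal Python illustration and not the service's Java interface: the names AGGREGATES, COMPARISONS, evaluate_condition and choose_plan are hypothetical, records are modelled as plain dictionaries rather than ResultSet records, and a dictionary key stands in for the XPath reference to a field.

```python
import operator
from statistics import mean

# Aggregate functions named in the description, applied to one field's values.
AGGREGATES = {
    "max": max,
    "min": min,
    "average": mean,
    "size": len,
    "sum": sum,
}

# Comparison operations named in the description.
COMPARISONS = {
    "equal": operator.eq,
    "not_equal": operator.ne,
    "greater_than": operator.gt,
    "lower_than": operator.lt,
    "greater_equal": operator.ge,
    "lower_equal": operator.le,
}


def evaluate_condition(records, aggregate, field, comparison, literal):
    """Compare an aggregate over `field` with a literal (hypothetical helper)."""
    values = [record[field] for record in records]
    return COMPARISONS[comparison](AGGREGATES[aggregate](values), literal)


def choose_plan(condition_holds, plan_if_true, plan_if_false):
    """Select one of two alternative execution plans."""
    return plan_if_true if condition_holds else plan_if_false


# Made-up sample data, for illustration only.
records = [{"year": 1999, "hits": 4}, {"year": 2005, "hits": 9}]
holds = evaluate_condition(records, "max", "hits", "greater_than", 5)
plan = choose_plan(holds, "project on title", "project on author, then sort")
print(holds, plan)
```

Keeping the aggregates and comparisons in lookup tables mirrors the fixed vocabulary listed above; how the real service parses and represents the condition is not specified here.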

Dependencies
  • jdk 1.5
  • WS-Core 4.0.4
  • ResultSetClientLibrary
  • SearchLibrary

FilterResultSetByXPathOperator

Description
Dependencies

JoinInnerOperator

Description
Dependencies

KeepTopOperator

Description

The role of the KeepTop Operator is to perform a simple filtering operation on its input ResultSet and to produce as output a new ResultSet that holds a defined number of leading records.
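
As a rough illustration of this behaviour, the following Python sketch keeps a fixed number of leading records from a stream. The function name keep_top is hypothetical, and an ordinary iterator stands in for the ResultSet; the actual service interface is not shown in this document.

```python
from itertools import islice


def keep_top(records, count):
    """Return an iterator over at most `count` leading records (hypothetical helper)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # islice stops after `count` items, so the rest of the input is never read.
    return islice(records, count)


top = list(keep_top(iter(["r1", "r2", "r3", "r4"]), 2))
print(top)
```

Reading lazily means the input does not have to be fully produced before the leading records are available.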

Dependencies
  • jdk 1.5
  • WS-Core 4.0.4
  • ResultSetClientLibrary
  • SearchLibrary

MergeOperator

Description
Dependencies

QueryExtSourceOperatorGoogle

Description
Dependencies

QueryExtSourceOperatorJDBC

Description
Dependencies

QueryExtSourceOperatorOSIRIS

Description
Dependencies

ScannerOperator

Description

The Scanner Operator defines and provides a generic methodology for scanning through a result set produced by another search operation service. It provides the ability to filter records, to retrieve and update element and attribute values, and to remove selected elements and attributes. For this purpose it employs a formal, function-like mathematical language, which is used to define the operation on a given result set. The evaluation is done by an external package called JEP, a parser for mathematical expressions that also allows new custom functions to be defined. Using this ability, we have introduced some functions (do, like, filter, in, select) in order to provide a full-fledged filtering language.

In more detail, the do function takes an arbitrary number of arguments and evaluates them. The filter function receives a boolean expression; if it is true, the respective result set record is removed from the derived result set. The like function performs pattern matching and returns the boolean result of the match. The in function determines whether a variable lies in a given range of numeric values. Finally, the select function selects the specific elements or attributes to be included in the new result set.

Language Semantics: In order for the evaluator to produce a valid result, the filtering expression should contain at least one select or filter function. Besides that, the expression can contain any mathematical expression. More precisely, the evaluator supports the most frequently used operators (!, +, -, *, /, ^, <, >, =, !=) and functions (sin, cos, tan, ln, log, exp, sqrt, abs, rand, mod, ...). Users are also free to define their own temporary variables. However, variables named after leaf elements (the XML elements that have no child elements, only plain text values) cannot be redefined, because they are automatically defined by the evaluator and initialized to their values, which can be either strings or numerics (doubles). For further information about the available functions and operators, see org.nfunk.jep. The syntax of our custom functions is the following:

BNF Syntax

<custom_functions> ::= <filter_fun> | <do_fun> | <like_fun> | <in_fun> | <select_fun>
<filter_fun> ::= <filter_fun_name> <left_parenthesis> <boolean_expression> <right_parenthesis>
<filter_fun_name> ::= 'filter'
<do_fun> ::= <do_fun_name> <left_parenthesis> <do_arguments> <right_parenthesis>
<do_fun_name> ::= 'do'
<do_arguments> ::= <do_argument> <do_args>
<do_args> ::= <comma> <do_arguments> | EMPTY
<do_argument> ::= <expression>
<like_fun> ::= <like_fun_name> <left_parenthesis> <like_object> <comma> <regular_expression> <right_parenthesis>
<like_fun_name> ::= 'like'
<like_object> ::= <element> | <attribute>
<regular_expression> ::= (see java.util.regex.Pattern)
<element> ::= String
<attribute> ::= <element> '_' <attribute_name>
<attribute_name> ::= String
<in_fun> ::= <in_fun_name> <left_parenthesis> <object> <comma> <lower_bound> <comma> <upper_bound> <right_parenthesis>
<in_fun_name> ::= 'in'
<lower_bound> ::= Numeric
<upper_bound> ::= Numeric
<object> ::= <user_defined_variable> | <bound_variable>
<bound_variable> ::= <attribute> | <element>
<user_defined_variable> ::= (any instantiated variable, e.g. a=2)
<select_fun> ::= <select_fun_name> <left_parenthesis> <select_object_list> <right_parenthesis>
<select_fun_name> ::= 'select'
<select_object_list> ::= <bound_variable> <select_args>
<select_args> ::= <comma> <select_object_list> | EMPTY
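
The semantics of the custom functions can be illustrated with a small Python emulation. This is only a sketch under stated assumptions and does not use JEP: the helpers like, in_range and scan are hypothetical, records are flat dictionaries of leaf element values, and both the whole-value matching of like and the inclusive bounds of in are assumptions, since the description above does not specify them.

```python
import re


def like(value, pattern):
    """Pattern match; whole-value matching is an assumption of this sketch."""
    return re.fullmatch(pattern, str(value)) is not None


def in_range(value, lower, upper):
    """The `in` function; inclusive bounds are an assumption of this sketch."""
    return lower <= value <= upper


def scan(records, remove_if=None, select=None):
    """Apply filter-then-select to flat records (hypothetical helper).

    remove_if: predicate playing the role of filter(...); a record for which
               it returns True is dropped, as the description states.
    select:    names of the leaf elements to keep, as in select(...).
    """
    result = []
    for record in records:
        if remove_if is not None and remove_if(record):
            continue
        if select is not None:
            record = {name: record[name] for name in select if name in record}
        result.append(record)
    return result


# Made-up sample records; numeric values are doubles, as in the evaluator.
records = [
    {"title": "Grid search", "year": 2004.0, "lang": "en"},
    {"title": "Graph theory", "year": 1998.0, "lang": "en"},
    {"title": "Recherche", "year": 2006.0, "lang": "fr"},
]

# Corresponds loosely to filter(in(year, 1990, 2000)) combined with
# select(title, year).
kept = scan(
    records,
    remove_if=lambda r: in_range(r["year"], 1990, 2000),
    select=["title", "year"],
)
matching_titles = [r["title"] for r in records if like(r["title"], "Gr.*")]
print(kept)
print(matching_titles)
```

Note that the filter predicate removes the records for which it is true, so the record from 1998 is dropped while the other two keep only their title and year.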

Dependencies

SortOperator

Description
Dependencies

TransformByXSLTOperator

Description
Dependencies