Integration and Interoperability Facilities Framework: HTTP API Framework Specification

From Gcube Wiki
Revision as of 14:29, 21 November 2012 by Rena.tsantouli (Talk | contribs) (Authentication Modes)


Objective

The Application Services Layer offers HTTP APIs that expose a subset of its Java API facilities in support of high-level standards. It consists of a set of servlets that support HTTP methods and operate on top of the Java libraries. A review of the current architecture and principles of these interfaces will guide their formalization into a framework for this layer of the system.

HTTP Front End

Work within the HTTP Front End framework focuses on system software that handles HTTP requests and replies with HTTP responses. The task concerns access points that each cover a series of services, that can pass through ASL for session handling, and/or that expose HTTP standards-compliant interfaces.

Clients

The framework targets clients that submit HTTP requests and handle HTTP responses. Such clients are components, external or internal to the system, that either have come to an agreement with the HTTP service implementers about the data and formats exchanged in requests and responses, or implement standard HTTP specifications offered by the HTTP Front End of the system.

Goals

The task aims at reviewing the existing architecture and extending the principles of the HTTP Front End to promote consistency across its implementing components and extensibility of the framework. At a later stage it focuses on monitoring its adoption across HTTP Front End components.

Roadmap

The roadmap defines the steps through which the review of the existing implementation will lead to the definition of the options and needs for extensions and formalization of the procedures. The final objective is the aggregation of the findings into a framework that will guide the evolution, usage, deployment and evaluation of the components that underlie this layer of the infrastructure. The methodology includes the following steps:

  • Architecture review
    • Clean the current architecture into more distinct components that will promote framework extensibility
    • Identify common functionalities and needs for development of common utility libraries
  • Principles extension
    • Extend principles with rules clarifying areas with needs for functionalities compliant with standards
    • Usage of standard data interchange formats
  • Definition of rules defining scope of application
  • Definition of management rules covering topics for changes, deployment, distribution, etc.
  • Definition of rules for controlling use and development
  • Definition of methods for checking compliance with fundamental rules

Models

The framework distinguishes concerns that relate to the HTTP applications from those that relate to their management, and develops two separate models for structuring them: the Design Model and the Management Model.

  • Design Model: The Design Model for CL defines the architecture, the framework principles and the common functionality needs. It addresses cross-cutting design concerns within the HTTP components that include at least the following issues: authenticated calls, error handling, data interchange formats, Web Standards and their interplay with Infrastructure requirements, configurations. Consistency is more readily and conveniently achieved through shared implementations of common solutions. In this sense, the work within the framework evolution will also be concerned with the delivery of new system components that support the development of HTTP components.
  • Management Model: The model for HTTP Front End components will address at least the following (inter-related) issues:
    • Build outputs: what secondary artifacts are associated with HTTP Front End modules
    • Release cycle: how are HTTP components released with respect to target services
    • Change management: how changes in ASL or other target service API should be handled
    • Profiling and deployment: how should HTTP artifacts be profiled for dynamic deployment
    • Distribution: how should HTTP artifacts be packaged for distribution

Design Model HTTP Front End

Architecture

The architecture of this layer is driven by the needs for provision of access to resources that fall under specific functional categories. More specifically, it consists of a set of servlets for which each access point covers a series of services. In the existing implementation, those components cover mostly the fundamental functionality needed by the end-users. Extensions in the existing framework can either provide access to resources of other functional categories or offer a standard HTTP based API adhering to a widely used specification, to support interoperability needs. All the components within the existing framework use common functionalities implemented for caller authentication.

External Architecture

The HTTP Front End aims at providing access to higher-level functionality, frequently compliant with web standards. Typically, each access point covers a series of services and requires interaction with lower-level components in the system architecture to aggregate resources. The system layer that offers this kind of access to multiple services, as well as a session mechanism that can be exploited across the interactions of different applications, is the ASL (Application Services Layer). Therefore, the HTTP layer is logically placed above it and offers a subset of its functionality. The image below depicts the framework of ‘layers’ of the gCube system and clarifies the place of this set of interfaces in the system architecture.

[Figure: WP11 FrameworkLayers.jpg]

Internal Architecture

The HTTP Front End is composed of web applications based on Java Servlet technology. In revisiting the architecture of the existing ASL HTTP Front End implementation, the need for extensibility drives the decision to split the single existing application into a set of smaller applications that can proliferate within the framework. An HTTP application in the new version of this layer can logically group related functionalities. The grouped functionalities address demanding tasks that need access to multiple services and a sequence of logical steps to complete. In the context of this decision, the architecture of the existing implementation consists of the following new system components:

  • ASL_HTTP_InformationRetrieval: Aggregates mandated functionality to perform a gCube search
    • listing of searchable collections
    • retrieval of information about searchable collections
    • listing of search types
    • listing of languages
    • listing of searchable fields
    • submission of search query
    • support for OpenSearch Specification
    • retrieval of search results
  • ASL_HTTP_ContentAccess: Aggregates functionality for accessing gCube content
    • retrieval of information about content
    • retrieval of content
    • retrieval of metadata
    • retrieval of thumbnails
  • ASL_HTTP_InfrastructureLogin: Aggregates functionality for logging in to an infrastructure scope by using the ASL internal session mechanism
    • user authentication – logging in Infrastructure
    • listing of infrastructure scopes
    • logging in an Infrastructure scope
  • Specifications Implementations: Implementations of HTTP Standards placed in separate application components
  • Implementations offering common functionality to all HTTP applications

Depending on its functionality, each servlet overrides the doGet and doPost methods of HttpServlet.

Principles

The general principles governing the implementation within the framework are as follows:

  • Session management using the built-in session tracking mechanisms of the servlets and the internal ASL Session for intercommunication between applications deployed in the same container.
  • Use of higher-level functionalities hosted in ASL.
  • Provision of sufficient error handling based on appropriate error codes and prescriptive fault messages.
  • Ability to run in any servlet container (not bound to a portal installation) with only minor container configuration.
  • Authentication applied to all invocations.
    • Support for anonymous access is likewise provided for every invocation.

More specifically, within the HTTP layer, servlets are implemented as Java programming language classes that extend HttpServlet. Depending on the functionality offered, a servlet overrides the doHead, doGet and doPost methods and handles the requests based on the following guidelines:

API

Data Interchange Formats

In the interest of interoperability and openness, HTTP responses must use an interchangeable data serialization format. The recommended formats for the applications' HTTP responses are Extensible Markup Language (XML) and JavaScript Object Notation (JSON), a text notation that is better suited for data interchange. Therefore, the content type set in the doHead method of every servlet should be either “application/json” or “text/xml”. If both MIME types are supported, the servlet requires the parameter “type” and renders the results according to the user's selection.
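The selection of the response content type from the “type” parameter could be sketched as below. This is a minimal illustration, not the framework's implementation: the class name ResponseFormat and the accepted parameter values “json” and “xml” are assumptions, and the wiring into a servlet (for example, passing the result to the response's content type setter) is left out so that the snippet stays self-contained.

```java
import java.util.Locale;

/** Hypothetical helper: maps the "type" request parameter to a response content type. */
final class ResponseFormat {
    static final String JSON = "application/json";
    static final String XML = "text/xml";

    private ResponseFormat() {}

    /**
     * Returns the content type for the requested format.
     * Assumption: the accepted values are "json" and "xml" (case-insensitive).
     * A missing or unknown value raises IllegalArgumentException, which a
     * servlet could translate into a 400 Bad Request response.
     */
    static String contentTypeFor(String typeParam) {
        if (typeParam == null) {
            throw new IllegalArgumentException("Missing required parameter 'type'");
        }
        switch (typeParam.trim().toLowerCase(Locale.ROOT)) {
            case "json":
                return JSON;
            case "xml":
                return XML;
            default:
                throw new IllegalArgumentException("Unsupported type: " + typeParam);
        }
    }
}
```

A servlet supporting both formats would call ResponseFormat.contentTypeFor(request.getParameter("type")) before rendering its results.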

Error Handling

HTTP Front End components must return status codes compliant with the Hypertext Transfer Protocol, using appropriate HTTP error codes together with prescriptive messages that describe what went wrong in the system or in the user input.
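One way to keep status codes and messages consistent across servlets is to map a small set of fault categories to HTTP status codes in a single place. The sketch below is illustrative only: the FaultKind categories and the HttpFault class are hypothetical names, and the status constants are taken from java.net.HttpURLConnection so that the example compiles without a servlet container.

```java
import java.net.HttpURLConnection;

/** Hypothetical fault categories; not part of the framework. */
enum FaultKind { INVALID_INPUT, NOT_AUTHENTICATED, RESOURCE_NOT_FOUND, INTERNAL_FAILURE }

/** Pairs an HTTP status code with a prescriptive message for the client. */
final class HttpFault {
    final int status;
    final String message;

    private HttpFault(int status, String message) {
        this.status = status;
        this.message = message;
    }

    /** Maps a fault category and a detail string to a status code and message. */
    static HttpFault of(FaultKind kind, String detail) {
        switch (kind) {
            case INVALID_INPUT:
                return new HttpFault(HttpURLConnection.HTTP_BAD_REQUEST, "Invalid input: " + detail);
            case NOT_AUTHENTICATED:
                return new HttpFault(HttpURLConnection.HTTP_UNAUTHORIZED, "Not authenticated: " + detail);
            case RESOURCE_NOT_FOUND:
                return new HttpFault(HttpURLConnection.HTTP_NOT_FOUND, "Resource not found: " + detail);
            default:
                // INTERNAL_FAILURE and any category added later fall back to 500.
                return new HttpFault(HttpURLConnection.HTTP_INTERNAL_ERROR, "Internal error: " + detail);
        }
    }
}
```

In a servlet, the resulting pair could be handed to response.sendError(fault.status, fault.message).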

Coding Guidelines

- Naming Conventions

Context Management

As stated above, the HTTP Front End uses ASL constructs to access lower-level functionality. ASL calls the remote operations in a context that encompasses more information than the target service endpoint and the input parameters of the calls. In particular, calls always occur in a given scope and are subject to the provision of credentials about the caller. An attempt to call an operation in no particular scope, or to call an operation anonymously within a scope for which anonymous access is not configured, will be rejected. Below, we describe how this contextual information is made available within the calls to an application:

Authenticated Calls

All calls to HTTP Front End components must be authenticated, in the sense that they are performed within a context that contains information about the caller identity and the infrastructure scope. In the case of anonymous access, the caller does not need to identify herself but has to provide information about the scope within which the anonymous call will be performed (see Authentication Modes).

Therefore, the first step of an interaction with any HTTP application is logging in to the system through the Login servlet by providing the right credentials. The Login servlet makes use of status codes and HTTP headers to manage the security policy. It implements both BASIC and Form-based authentication methods, receiving the user's credentials (username and password) and then communicating with the ASLCore component to perform authentication. ASLCore provides a user authentication mechanism that is independent of specific applications and authentication providers. Using a pluggable mechanism to specify authentication modules, it restricts users' access to the applications interacting with it. If the credentials are denied, the servlet returns an SC_UNAUTHORIZED status code to the client.

Once the user is logged in to the system, an internal session within ASL is created for her and can be accessed using the HTTP session id and the username of the caller. This means that in subsequent calls the user has to add a “username” parameter and keep the same jsessionid. In the case of ‘anonymous access’, the user does not need to identify herself by visiting the Login servlet.

Scope Management

All calls to the underlying infrastructure are scoped. Consequently, a user interacting with the HTTP Front End has to log in to a scope before trying to access resources. This is done by visiting the LoginInfrastructureScope servlet and passing the scope to enter as a parameter. The LoginInfrastructureScope servlet interacts with ASLCore to log the user in and keeps this information in the internal ASLSession, so that it does not have to be passed in all subsequent calls to the application. When anonymous access is used, there is no need to log in to an infrastructure scope, but every request must be scoped by adding a ‘scope’ parameter to it.

Session Management

The HTTP Front End makes use of the built-in session tracking mechanisms of the servlets in combination with the internal ASLSession, to allow intercommunication between different applications living in the same container. Consequently, the user's HTTP session can be tracked either with persistent cookies, where the session ID is saved on the client in a cookie called JSESSIONID, or with URL rewriting. Clients that do not support cookies can use URL rewriting, sending the session ID as part of a rewritten URL, encoded in a jsessionid parameter. The HTTP session ID is returned inside an XML response once the user logs in to the system, and all subsequent URL-encoded requests within one application must contain it.

The HTTP session ID, along with the ‘username’ parameter, is used to track the internal ASL session that allows intercommunication between different ASL HTTP applications living in the same container. By offering a session tracking mechanism, the framework allows the user to make subsequent requests without having to log in every time she asks for a resource. For implementations of HTTP standards, where the requests are strictly defined and the parameters are non-negotiable, the contextual information required by the underlying infrastructure is provided through application-specific configurations, which are described in the Configurations section.
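For a client that cannot rely on cookies, a follow-up request after login has to carry both the session ID and the ‘username’ parameter. The sketch below shows one way such a URL could be assembled on the client side. The class name SessionUrl and the example host are hypothetical, the servlet convention of appending “;jsessionid=” as a path parameter is assumed, and the base URL is assumed to carry no query string of its own.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side helper for URL rewriting without cookies. */
final class SessionUrl {
    private SessionUrl() {}

    /**
     * Appends the session ID as a ";jsessionid=" path parameter and the
     * caller's name as the "username" query parameter.
     * Assumption: baseUrl contains no query string or fragment.
     */
    static String rewrite(String baseUrl, String sessionId, String username) {
        return baseUrl
                + ";jsessionid=" + sessionId
                + "?username=" + URLEncoder.encode(username, StandardCharsets.UTF_8);
    }
}
```

For example, SessionUrl.rewrite("http://host.example/app/Search", "ABC123", "jane doe") yields "http://host.example/app/Search;jsessionid=ABC123?username=jane+doe".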

Authentication Modes

Since many Web Standards require open access, the HTTP Front End supports two modes for accessing the system resources: Authenticated and Open Access.

  • Authenticated

In authenticated mode, the user logs in to the system and to an Infrastructure scope, and then continues interacting with the application over HTTP without having to pass the contextual information in every request submitted. Moreover, she can benefit from personalization in the case of functionalities that interact with gCube personalization services.

  • Open Access

A servlet can also support Open Access to the underlying resources. For the user interacting with the application, this means that she has to pass the ‘scope’ parameter in every call she submits to the application.

Servlet implementers do not have to write code for either of the user authentication modes. This common functionality can be accessed directly through the aslHttpAccessManagement component, on which all servlets of the HTTP Front End should depend. Before processing the user's call, the servlet implementer needs to authenticate the user by passing the HttpServletRequest, together with the identifier of the operation the user asked to perform, to the ASL authentication component for HTTP, e.g.:

HttpSession session = request.getSession();

//-- Check if the user is authenticated
AuthenticationResponse authenticationResp = CallAuthenticationManager.authenticateCall(request, operationID);

The operationID is a string that identifies the operation and is used in a generic configuration for the management of Open Access. This configuration controls the open access permission per scope and per operation through HTTP. Therefore, if the user has not previously logged in to the system, the Open Access configuration for the requested scope and for the particular operation is checked. The operationID should be a static final constant declared for every operation (or group of operations) in every servlet. In response, the servlet implementer gets back an AuthenticationResponse object, which indicates whether the user was authenticated. If the user was authenticated, the userId can be retrieved from the returned object; if not, the object provides the message indicating what went wrong. For example:

HttpSession session = request.getSession();

//-- Check if the user is authenticated
AuthenticationResponse authenticationResp = CallAuthenticationManager.authenticateCall(request, operationID);
if (!authenticationResp.isAuthenticated()) {
    response.sendError(401, authenticationResp.getUnauthorizedErrorMessage());
    return;
}

String username = authenticationResp.getUserId();
…
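The decision described above (an existing login takes effect first, otherwise the Open Access configuration is consulted per scope and per operation) can be modelled in a few lines. The sketch below is not the CallAuthenticationManager implementation, whose internals are not shown in this page. The classes OpenAccessPolicy and AuthDecision, the in-memory map standing in for the generic configuration, and the “anonymous” user id are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical result object, loosely mirroring what a caller needs to know. */
final class AuthDecision {
    final boolean authenticated;
    final String userId;       // null when the call is rejected
    final String errorMessage; // null when the call is accepted

    private AuthDecision(boolean authenticated, String userId, String errorMessage) {
        this.authenticated = authenticated;
        this.userId = userId;
        this.errorMessage = errorMessage;
    }

    static AuthDecision ok(String userId) { return new AuthDecision(true, userId, null); }
    static AuthDecision denied(String message) { return new AuthDecision(false, null, message); }
}

/** Hypothetical in-memory stand-in for the Open Access configuration. */
final class OpenAccessPolicy {
    private final Map<String, Set<String>> openOperationsByScope;

    OpenAccessPolicy(Map<String, Set<String>> openOperationsByScope) {
        this.openOperationsByScope = openOperationsByScope;
    }

    /** sessionUser is the logged-in user, or null if no login took place. */
    AuthDecision decide(String sessionUser, String scopeParam, String operationId) {
        if (sessionUser != null) {
            return AuthDecision.ok(sessionUser); // authenticated mode
        }
        if (scopeParam == null) {
            return AuthDecision.denied("No login found and no 'scope' parameter supplied");
        }
        Set<String> openOperations = openOperationsByScope.get(scopeParam);
        if (openOperations != null && openOperations.contains(operationId)) {
            return AuthDecision.ok("anonymous"); // assumed placeholder user id
        }
        return AuthDecision.denied("Open access is not configured for operation '"
                + operationId + "' in scope '" + scopeParam + "'");
    }
}
```

Keeping the per-scope, per-operation check in one component is what lets the individual servlets stay free of authentication code.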

Node configured with Scope

There is an option to configure the node hosting the web applications with a specific scope. In that case, all the web applications deployed in that node are running in the configured scope and there is no need for clients to select a running scope. This means that for anonymous access, the request can be performed using the fundamental parameters for making the calls, without having to deal with the scope parameter. This is mainly used for web applications with interfaces that adhere to specifications and that have strictly defined APIs, not allowing for extensions with supplementary ones.
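Taken together with the authentication modes, this gives three possible sources for the scope of a call: the node configuration, the scope stored in the session after login, and the ‘scope’ request parameter. A minimal sketch of resolving them follows. The precedence order shown (node first, then session, then request parameter) is an assumption derived from the statement that all applications on a configured node run in the configured scope, and the class name ScopeResolver is hypothetical.

```java
import java.util.Optional;

/** Hypothetical helper that picks the scope a call should run in. */
final class ScopeResolver {
    private ScopeResolver() {}

    /**
     * Any argument may be null when that source is not available.
     * Returns an empty Optional when no scope can be determined, in which
     * case the call would have to be rejected.
     */
    static Optional<String> resolve(String nodeScope, String sessionScope, String scopeParam) {
        if (nodeScope != null) {
            return Optional.of(nodeScope);    // node configured with a scope
        }
        if (sessionScope != null) {
            return Optional.of(sessionScope); // user logged in to a scope
        }
        return Optional.ofNullable(scopeParam); // open access: scope sent per request
    }
}
```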

Web Standards and Infrastructure Requirements

Many HTTP standards strictly define the request types and their parameters, which cannot be extended and are non-negotiable. When such standards are implemented to provide interoperable HTTP interfaces, the client cannot specify the details of the environment within which her actions will take place (for instance, information about the gCube scope). Moreover, there are cases where the administrator of an application may wish to expose part of the infrastructure resources through an Open Access protocol, and cases where a protocol requires resources to be exposed in a particular form that does not directly match the structure of the system's resources.

Configurations

In all cases where there is a gap between the requirements of the protocol and those of the infrastructure, the servlets offering the rendering functionality should rely on bridging configurations. These configurations can bind requests to scopes and virtually organize the infrastructure resources according to the protocol's requirements. The configurations of the servlets must be stored as generic resources in the system, so that Administrators can easily configure them. Smaller configurations that are rarely changed and can be represented by name-value pairs can instead be placed in the ‘init’ params of the servlet's web.xml.
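Reading such name-value pairs with sensible defaults could look like the sketch below. It is illustrative only: in a real servlet the values would come from the servlet's init parameters declared in web.xml, whereas here a plain map stands in for them so that the example runs on its own. The class name InitParams and the parameter names “defaultScope” and “maxResults” are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical wrapper around rarely changed name-value configuration. */
final class InitParams {
    private final Map<String, String> values;

    InitParams(Map<String, String> values) {
        this.values = new HashMap<>(values); // defensive copy
    }

    /** Returns the configured value, or defaultValue when absent or empty. */
    String get(String name, String defaultValue) {
        String value = values.get(name);
        return (value == null || value.isEmpty()) ? defaultValue : value;
    }

    /** Returns the configured integer, or defaultValue when absent or malformed. */
    int getInt(String name, int defaultValue) {
        String value = values.get(name);
        if (value == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }
}
```

For instance, new InitParams(Map.of("maxResults", "50")).getInt("maxResults", 10) returns 50, while a missing or malformed entry falls back to 10.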