SmartGears4

SmartGears is a set of Java libraries that turn Servlet-based containers and applications into gCube resources, transparently.

In this document, we motivate SmartGears and explain how it improves over existing gCube solutions. The discussion is relevant to node and infrastructure managers, who perform and maintain Smartgears installations, and to developers, who package or write software for a gCube infrastructure.

Rationale

What does it mean to turn software applications and containers into gCube resources?

We start by revisiting the goals of “Software-As-Resource” (SaR) and “Container-as-Resource” (CaR), i.e. the core mission of gCube’s Enabling Layer. We then take stock of how far we have come towards achieving these goals, motivating Smartgears as our latest attempt to go further still.

Software-as-Resource

A piece of software is a gCube resource if we can manage it in a gCube infrastructure. This means that we can do a number of things with the software, including:

  • discover where it is deployed, so as to use it without hard coded knowledge of its location.
For this, we need to describe each and every software deployment, and publish these descriptions, or profiles, in the infrastructure;
  • monitor and change the status of its deployments, so as to take actions when they are not in an operational status (e.g. redeploy the software, or at least prevent discovery and usage of the deployments).
For this, we need to track their current status, report it in the profiles we publish, and republish the profiles when the status changes;
  • dedicate its deployments to certain groups of users, in the sense that only users in those groups can use them.
We can change the sharing policies of individual deployments at any time, i.e. share them across more or fewer groups. We can also grant different privileges to different types of users within given groups.

Publication, discovery, lifecycle management, and controlled sharing are the pillars of resource management in gCube. Yet relying on humans to compile deployment profiles, publish them in the infrastructure, keep track of and change the status of deployments, or enforce sharing policies is anything but practical. In some cases, it is downright impossible. We need instead automated solutions that live alongside each and every deployment and help us turn it into a resource we can manage. Smartgears is one such solution.

Container-as-Resource

When we pursue SaR, we are not after any possible software. We focus on software that can be used over the network, such as distributed applications and network services. Software deployments then correspond to software endpoints.

Typically, software endpoints run within containers and, in gCube, containers can be resources in their own right, the so-called gCube Hosting Nodes (gHNs).

Managing gHNs is a way to manage multiple endpoints simultaneously (e.g. deactivating a gHN means deactivating a set of endpoints at once). Equally, it is a way to manage underlying hardware resources (e.g. dedicating a gHN to selected groups of users).

This is a notion of "Container-as-Resource" (CaR), and it raises the same requirements as SaR, including publication and discovery, lifecycle management, and controlled sharing. Smartgears helps us meet these requirements too, i.e. turns containers as well as the endpoints therein into gCube resources.

Big Frameworks

Traditionally, enabling SaR and CaR has been the main value proposition of the gCube Core Framework (gCF). The way gCF delivers that value has limitations, however:

  • gCF is a framework to develop SaR: its class hierarchies, interfaces, callbacks, and helper objects guide and simplify the task of writing software that directly meets our management requirements. Simply put, gCF lets us develop SaR specifically and exclusively for gCube. This "closed-world" assumption is not a bad thing insofar as all the SaR we need is part of gCube itself. Indeed, gCF has helped us a great deal to grow gCube consistently. However, it does become a problem when we want to:
  • reuse gCube software externally to the infrastructure;
  • bring existing software into the infrastructure;
  • encourage third parties to develop software for the infrastructure.
It also becomes a problem if we wish to deviate from the type of SaR that we can develop with gCF, which brings us to the next point:
  • with gCF, the only CaR we get is the Globus container, and the only type of SaR we can enable are JAX-RPC services that run in that container. We cannot:
  • develop SaR as a Rest service, or as a plain Web Application;
  • develop SaR using modern standards for Soap services (e.g. JAX-WS);
  • run SaR in popular, modern, and performant application containers (e.g. Tomcat, Jetty, full-blown JEE servers).
Overall, we are severely limited in our choice of development stack. This creates an evolution problem for gCube, as well as an obstacle to its adoption and further growth.
  • since gCF sits right at the top of the stack of the SaR we produce, its APIs are public: if we change them we break at once a large number of gCube services. Again, this inhibits the evolution of the system.

Overall, there are at least two themes in the issues above:

  • age: gCF, Globus, and their technological context are well dated by now;
  • visibility: gCF sits right at the top of the software’s stack and right in the middle of its design.

Less is More

If we look at gCF as an ageing solution, the temptation is to revamp and expand: align with new standards for Soap services, open up to Rest services, and perhaps move towards modern containers. This is substantial work, however, and it would have a dramatic impact on gCube. It would also be short-sighted work, as a cycle of five years may well bring us back where we are now.

We believe instead that visibility is the key problem to address: if gCF weren’t visible to begin with, its age would be of little or no concern. With Smartgears we propose the same net value as gCF, but deliver it in a completely different fashion: we move away from frameworks and make Smartgears invisible to the software, not part of its stack at all. As a result, gCube is invisible too and any software can run in the infrastructure: SaR becomes a nature that software acquires at runtime.

Indeed, Smartgears raises few requirements against the software. As we shall see, all we ask of software is to be based on the Servlet specifications, which define the hooks that we need to track its lifecycle and its use. The software is thus a Web Application and may more specifically be a Soap service, a Rest service, or a generic Web Application. It may adopt different standards and technologies (e.g. JAKARTA-WS, but also Dependency Injection technologies, persistence technologies, etc.). And of course it may run in any container that is Servlet-compliant (Web Containers, Application Servers).

Finally, the evolution of Smartgears is inconsequential for the software: most of the APIs of Smartgears remain private to Smartgears.

Featherweight Relations

Smartgears and the FeatherWeight Stack (FWS) are both solutions based on microlibs, and in fact share a number of them.

Smartgears is the logical counterpart of the FWS: if the first turns Java software into gCube resources, the second enables other Java software to call such resources. Together, Smartgears and the FWS provide a logical replacement of gCF and gCore.

The FWS is a stack, however, i.e. a direct or indirect dependency of clients. In contrast, Smartgears lives in the runtime of the software but does not need to be a compile-time or runtime dependency of it (but see gCube-aware applications).

Requirements

Containers and applications need a minimal set of requirements before SmartGears can turn them into gCube resources:

  • containers must comply with version 6 of the Jakarta Servlet specifications;
  • applications must include one application.yaml configuration file alongside their deployment descriptor (i.e. under /WEB-INF).

In addition:

  • node managers must define a GHN_HOME environment variable that resolves to a location where SmartGears can find a container.ini configuration file.

The Servlet specifications allow SmartGears to intercept relevant events in the lifecycle of individual applications whilst being shared across all applications, in line with the deployment scheme of SmartGears. In particular, the specifications define a ServletContainerInitializer interface that SmartGears implements to be notified of application startup. The specifications also allow programmatic registration of filters and servlets, which SmartGears uses to transparently manage applications without the need for additional configuration in their web.xml descriptor.

Configuration is thus limited to WEB-INF/application.yaml and $GHN_HOME/container.ini, which provide the configuration of, respectively, the application and the container as gCube resources. We discuss their contents in the Appendices.
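
As a quick sanity check of the container-side requirement, a node manager could verify that GHN_HOME resolves to a directory holding container.ini. The following is a minimal sketch that uses only the JDK; the class name GhnHomeCheck is hypothetical and is not part of SmartGears.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of SmartGears.
final class GhnHomeCheck {

    /** Returns the expected container.ini path, or empty if GHN_HOME is unset or blank. */
    static Optional<Path> containerIni(Map<String, String> env) {
        String home = env.get("GHN_HOME");
        if (home == null || home.isBlank()) {
            return Optional.empty();
        }
        return Optional.of(Paths.get(home).resolve("container.ini"));
    }

    public static void main(String[] args) {
        Optional<Path> ini = containerIni(System.getenv());
        if (ini.isEmpty()) {
            System.out.println("GHN_HOME is not defined");
        } else if (Files.isRegularFile(ini.get())) {
            System.out.println("found " + ini.get());
        } else {
            System.out.println("missing " + ini.get());
        }
    }
}
```

Passing the environment as a map keeps the lookup testable without touching the real process environment.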

Distribution

Smartgears is distributed as a tarball that contains the libraries, scripts, and configuration files required to install Smartgears in a given container, and to maintain the installation over time. Instructions on how to download, install and maintain Smartgears are available in the SmartGears Web Hosting Node (wHN) Installation guide and we do not duplicate them here.

It is also distributed as a Docker container from the D4S Docker Hub repository.

Components

The libraries that comprise Smartgears and are bundled in its distribution are also available in our Maven repository, like any other gCube component. They can thus be resolved as Maven dependencies whenever applications need to introspect on their configuration and status as gCube resources. We discuss below this class of gCube-aware applications.

The vast majority of Smartgears libraries are “microlibs”, i.e. Java libraries with:

  • a narrow functional focus, i.e. provide one and only one type of functionality;
  • contained dependencies, i.e. depend only on other microlibs and the JDK.

Using microlibs reflects a commitment to provide support which is:

  • modular: clients assemble microlibs to obtain no more and no less than the functionality they need. This avoids the spurious dependencies that are often introduced by multi-function libraries, and hence limits the impact of evolving individual libraries;
  • unobtrusive: clients do not inherit from microlibs runtime dependencies on common, general-purpose third-party libraries. This limits the risk that such dependencies may clash with versions already available in their runtimes;
  • lightweight: as an implication of modular and non-intrusive support, microlibs do not inflate the size of client runtimes.

SmartGears includes the following microlibs:

  • org.gcube.core:common-smartgears: the main library in SmartGears, contains all the components that provide the management logic required to turn applications and containers into gCube resources. All the other libraries are direct or indirect dependencies of common-smartgears;
  • org.gcube.core:common-events: a general-purpose, annotation-based eventing library used by the components in common-smartgears to synchronise their actions with each other in a loosely coupled manner;
  • org.gcube.core:common-validator: a general-purpose, annotation-based library for object state validation used in common-smartgears to validate configuration objects;
  • org.gcube.core:common-scope: a library with facilities related to gCube scope management;
  • org.gcube.resources:common-gcore-resources: a library with the object bindings for the gCube resource model used in SmartGears to model application and container resource profiles;
  • org.gcube.resources:registry-publisher: a client library for the gCube Registry service used in common-smartgears to publish application and container profiles;
  • org.gcube.core:common-smartgears-app: a library that provides mechanisms to expose selected APIs in common-smartgears to gCube-aware applications.

The distribution of SmartGears includes of course the transitive closure of the libraries above. It also includes the only 3rd party dependencies of SmartGears, namely Slf4j and Logback, which we discuss below in regard to logging.

Deployment Scheme

The installation scripts included in the SmartGears distribution copy the SmartGears libraries in container-specific locations, where they are available to all the applications that run in the container.

This shared deployment scheme for SmartGears is largely dictated by the need to share state about the container across applications, which remain otherwise isolated in the classloader hierarchies defined by the container. A shared installation achieves that in a container-independent manner.

The shared scheme has additional advantages:

  • SmartGears does not need to be embedded in applications, which lowers their packaging requirements and size of distribution. As we've discussed above, requirements on applications are limited to resource descriptors;
  • the management functions provided by SmartGears are uniformly and consistently applied to all applications that run in a given container;
  • SmartGears can be updated simply by updating its shared installation, i.e. without requiring a repackaging of all the applications that run in the container;

Shared installations emphasise the importance of limiting 3rd-party dependencies to a minimum, avoiding altogether very common, general-purpose dependencies that may clash with versions used in application stacks. This requirement would exist even if SmartGears were packaged with applications, but it is stronger still for shared installations, which expose SmartGears libraries to all applications that run in the container, including those that do not need to be managed by SmartGears. We minimise this risk by basing SmartGears on microlibs, as we have discussed above.

Logging

SmartGears uses Slf4j for its own logs, and its distribution includes Logback as the Slf4j binding. The installation scripts then copy a default logback.xml configuration file (see Default Logging Configuration) to the container-specific locations where SmartGears libraries are installed. The default configuration defines a file-based appender for the logs, where the current log file is $GHN_HOME/ghn.log and files are rolled daily and kept for a maximum of 30 days.

Slf4j and Logback are the only concession that SmartGears makes to 3rd-party dependencies. The concession is necessary, as it could only be avoided via a custom logging framework or a custom repackaging of any framework of choice. Unfortunately, we cannot pursue either option in SmartGears because many of its libraries are used also in other contexts (e.g. in the FWS).

As a result, SmartGears may interfere with applications that use Slf4j. While the Slf4j API does not create problems (regardless of the version used by applications), Slf4j bindings might. In particular, we distinguish the following cases:

  1. applications that bundle the Slf4j API and Logback (i.e. make the same choices as SmartGears, regardless of versions) will experience no interference, in that their Logback configuration will be handled separately from SmartGears’;
  2. applications that bundle the Slf4j API and an Slf4j binding other than Logback will be warned that the binding of the Slf4j API is ambiguous, as two different options are available in the classpath. While Slf4j does not guarantee it, our tests show that the application’s binding will prevail and no interference will be observed by the application. In this case, the net effect is the same as in case 1;
  3. applications that bundle the Slf4j API but no Slf4j binding will encounter classloading errors at startup. This configuration is however highly unlikely in practice, because if Slf4j bindings are provided by the container then the Slf4j API should be too, as in the next case;
  4. applications that bundle neither the Slf4j API nor an Slf4j binding (i.e. expect both to be provided by the container) will find their logs in SmartGears’ log file and their configurations, if any, ignored;
  5. applications that bundle an Slf4j binding but do not bundle the Slf4j API (i.e. expect it to be provided by the container) will find their logs where they configure them to be.

Overall:

  • cases 1. and 2. are most likely for applications that are fully unaware of gCube and their use as gCube resources;
  • case 3. should not occur often in practice, as it is likely to be a configuration error of the application;
  • case 4. and case 5. are likely to apply when containers, like SmartGears, adopt Slf4j and applications are packaged with an awareness of this arrangement. A key scenario for this is that of gCube-aware applications, which are packaged and developed with the expectation of a SmartGears-enabled container. Case 4. does not require configuration while case 5 separates application logs from SmartGears logs and, depending on the configuration, from the logs of other applications.
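
The five cases can be read as a small decision table over what the application bundles. The sketch below only restates the list in code form; the class Slf4jCases and its method are hypothetical and do not correspond to anything in SmartGears or Slf4j.

```java
// Hypothetical restatement of the five cases, not SmartGears code.
final class Slf4jCases {

    /**
     * @param bundlesApi       the application packages the Slf4j API
     * @param bundlesBinding   the application packages an Slf4j binding
     * @param bindingIsLogback the packaged binding is Logback (ignored when no binding is bundled)
     */
    static String outcome(boolean bundlesApi, boolean bundlesBinding, boolean bindingIsLogback) {
        if (bundlesApi && bundlesBinding) {
            return bindingIsLogback
                    ? "case 1: no interference, separate Logback configuration"
                    : "case 2: ambiguous-binding warning, application binding expected to prevail";
        }
        if (bundlesApi) {
            return "case 3: classloading errors at startup";
        }
        if (!bundlesBinding) {
            return "case 4: logs in the SmartGears log file, application configuration ignored";
        }
        return "case 5: logs where the application configures them";
    }
}
```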

API Extensions

SmartGears extends all managed applications with servlets that can be used for remote management of the applications. These API extensions are registered at URLs under the gcube root, which is in turn immediately under the application root.

Currently, the available extensions can answer the following HTTP requests:

  • GET /gcube/resource
returns an HTML splash page for the application, which reports its key properties as a resource and includes links to its resource profile and resource configuration;
  • GET /gcube/resource/profile
returns the resource profile for the application, in its standard XML serialisation;
  • GET /gcube/resource/health
returns the current status of the service;
  • GET /gcube/resource/metrics
returns the metrics of the node in prometheus format;
  • GET /gcube/resource/lifecycle
returns the current state of the application as the simple content of a state XML element (e.g. <state>READY</state>);
  • POST /gcube/resource/lifecycle
changes the current state of the application to the simple content of a state XML element provided as the body of the request (e.g. <state>FAILED</state>);
  • GET /gcube/resource/scopes
returns the current scopes of the application as the simple content of one or more scope XML elements nested inside a scopes element (e.g. <scopes> <scope>/gcube/devsec</scope><scope>/gcube/devNext</scope></scopes>);
  • POST /gcube/resource/scopes
adds or removes a scope to or from the application. Scopes are added as the simple content of a scope XML element provided as the body of the request (e.g. <scope>/gcube/devsec</scope>). Scopes are removed analogously, except that the scope element provided in the body has a delete attribute set to true (e.g. <scope delete="true">/gcube/devsec</scope>).
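
The requests listed above can be built with the JDK's java.net.http API. This is a minimal sketch: the class GcubeExtensions, the example application root, and the Content-Type header are assumptions, and any authorization the container may require is left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper, not part of SmartGears.
final class GcubeExtensions {

    /** Resolves an extension path such as "resource/lifecycle" under <appRoot>/gcube/. */
    static URI extension(String appRoot, String path) {
        String root = appRoot.endsWith("/") ? appRoot : appRoot + "/";
        return URI.create(root + "gcube/" + path);
    }

    /** GET /gcube/resource/lifecycle */
    static HttpRequest getLifecycle(String appRoot) {
        return HttpRequest.newBuilder(extension(appRoot, "resource/lifecycle")).GET().build();
    }

    /** POST /gcube/resource/lifecycle with a state element as the body. */
    static HttpRequest postLifecycle(String appRoot, String state) {
        return HttpRequest.newBuilder(extension(appRoot, "resource/lifecycle"))
                .header("Content-Type", "application/xml") // assumption: not stated in the text
                .POST(HttpRequest.BodyPublishers.ofString("<state>" + state + "</state>"))
                .build();
    }

    /** Body for POST /gcube/resource/scopes: adds the scope, or removes it when delete is true. */
    static String scopeBody(String scope, boolean delete) {
        return delete
                ? "<scope delete=\"true\">" + scope + "</scope>"
                : "<scope>" + scope + "</scope>";
    }
}
```

A request built this way can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), for instance against a hypothetical root such as http://localhost:8080/myapp.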

gCube-Aware Applications

While SmartGears can make resources out of arbitrary applications, it does not exclude that some applications may still be designed explicitly as gCube resources. These applications may need to inspect their properties as resources, or the properties of their container (e.g. current status, active sharing policies, configuration, etc), or they may need to react to events in their own resource lifecycle, or again in the lifecycle of their container. Of course, gCube itself is and will continue to be the main source of such gCube-aware applications.

When it comes to developing gCube-aware applications, SmartGears must become visible. For this, SmartGears exposes a selection of its APIs through the ServletContext of the application, the common denominator of all Servlet-based applications.

The APIs are rooted in an ApplicationContext that represents the application as a resource. Through the ApplicationContext the application can get to the resource profile, the resource configuration in application.yaml, the resource lifecycle, and the subscription mechanisms that relate to the resource lifecycle. The ApplicationContext also exposes the ContainerContext, which gives access to similar information and mechanisms about the container in which the application is running.

For further convenience, SmartGears includes a common-smartgears-app library, which simplifies the task of accessing the ApplicationContext from within any implementation stack. The library leverages the annotations in the Servlet 3 specifications to transparently add a ServletContextListener to all the applications that include the library in their distribution.

At application startup, the listener extracts the ApplicationContext from the ServletContext and registers it on a ContextProvider. The provider is statically accessible to the application and can make available the ApplicationContext as follows:

ApplicationContext ctx = ContextProvider.get();

The application can then access all the information and notification mechanisms that are directly or indirectly provided by SmartGears.

gCube-aware applications that use Maven can specify their dependencies to Smartgears as shown in the Appendices.

Authorization

Authorization is transparent for all the Smartgears apps. The TOKEN is automatically read and resolved on every call. In case the developer needs to know the current context of the call or the client that is trying to access the webapp, a ThreadLocal variable is set in the current Thread:

  • SecretManagerProvider.

The SecretManagerProvider.get() method returns a Secret object containing the current context, identifier of the caller and roles.
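
For readers unfamiliar with the thread-local pattern, the sketch below shows why a value set for one call is not visible to other threads. CallerHolder is a hypothetical stand-in written with the JDK only; it is not the SecretManagerProvider API, and the plain String it holds merely stands for the richer Secret object.

```java
// Hypothetical illustration of a thread-local holder, not SmartGears code.
final class CallerHolder {

    // One slot per thread: each request-handling thread sees only its own value.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void set(String caller) { CURRENT.set(caller); }

    static String get() { return CURRENT.get(); }

    static void reset() { CURRENT.remove(); }

    /** Reads the holder from a freshly started thread and returns what that thread saw. */
    static String readFromNewThread() {
        String[] seen = new String[1];
        Thread other = new Thread(() -> seen[0] = get());
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        set("alice");
        System.out.println(get());               // prints alice
        System.out.println(readFromNewThread()); // prints null
        reset();
    }
}
```

The practical consequence is that the value should be read on the thread that serves the call; work handed to other threads needs the value passed explicitly.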

Accounting

Calls received by a Smartgears-enabled service are automatically accounted. The method name used is extracted from the PathInfo (in the case of a REST service) or from the client. A custom method name can also be set internally through the InnerMethodName thread local:

InnerMethodName.instance.set("methodnameToAccount");


Initialization

The SmartGears framework allows services to do some work at initialization time (e.g. search for resources, register to a topic, etc.). To enable this functionality, it is sufficient to create a class that implements the ApplicationManager interface.

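..
import org.gcube.smartgears.ApplicationManager;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
..
 
public class MyAppManager implements ApplicationManager {
 
	private static final Logger logger = LoggerFactory.getLogger(MyAppManager.class);
 
	// the application as a gCube resource, as exposed by SmartGears
	ApplicationContext ctx = ContextProvider.get();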
 
	@Override
	public void onInit() {
	   //do something on init
	}
 
	@Override
	public void onShutdown() {
	   //do something on shutdown
	}
}

The last thing to do is to annotate the service implementation class with @ManagedBy, naming the MyAppManager class.

..
import org.gcube.smartgears.annotations.ManagedBy;
..
@WebService(portName = "TestPort",
serviceName = "testinterface",
targetNamespace = "http://gcube-system.org/test",
endpointInterface = "org.gcube.test.ServiceInterface")
@ManagedBy(MyAppManager.class)
public class FactoryTest implements ServiceInterface {
 
..
 
}

The implemented ApplicationManager keeps state per context. The ApplicationManager related to the current context can be retrieved in the managed class as follows:

..
import org.gcube.smartgears.ApplicationManagerProvider;
 
..
MyAppManager appManager = (MyAppManager)ApplicationManagerProvider.get();
..
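
To illustrate what "state per context" means in practice, the following sketch keeps one manager instance per context name and creates it on first use. It is an assumption-laden illustration written with the JDK only: PerContextRegistry is hypothetical, and nothing here claims to reflect how ApplicationManagerProvider is implemented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical illustration of per-context state, not SmartGears code.
final class PerContextRegistry<T> {

    private final Map<String, T> byContext = new ConcurrentHashMap<>();
    private final Supplier<T> factory;

    PerContextRegistry(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the instance bound to the given context, creating it on first access. */
    T get(String context) {
        return byContext.computeIfAbsent(context, c -> factory.get());
    }
}
```

With such a registry, two calls made in /gcube/devsec share one instance, while a call made in /gcube/devNext gets a separate one, so state accumulated in one context does not leak into another.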

Appendices

Container Descriptor

SmartGears manages containers according to the instructions found in a $GHN_HOME/container.ini descriptor. We can illustrate the structure of this descriptor by example, commenting out elements that are optional:

[node]
; mandatory
; optional fields: mode (=online), publication-frequency-seconds (=60), authorizeChildrenContext (=false)
mode = online
hostname = hostname.it
protocol= http
port = 8080
infrastructure = gcube
authorizeChildrenContext = true
publicationFrequencyInSeconds = 60
 
[properties]
; not mandatory
SmartGearsDistribution = 1.0.0
SmartGearsDistributionBundle = UnBundled
 
[site]
; mandatory
country = it
location = rome

;[proxy]
; not mandatory
;protocol = https
;hostname = proxy
;port = 80
 
 
[authorization]
; mandatory
; optional fields: provider factory (=org.gcube.smartgears.security.defaults.DefaultAuthorizationProviderFactory)
factory = org.gcube.smartgears.security.defaults.DefaultAuthorizationProviderFactory
factory.endpoint = https://accounts.cloud-dev.d4science.org/auth/realms/d4science/protocol/openid-connect/token
credentials.class = org.gcube.smartgears.security.SimpleCredentials
credentials.clientID = testClient
credentials.secret = testSecret

;[persistence]
; not mandatory (default is LocalPersistence writing in the ghn home)
;class = utils.PersistenceWriterTest
;location = /tmp
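
To make the structure of the descriptor concrete, here is a minimal sketch of a reader for this kind of file: section headers in square brackets, key = value pairs, and lines starting with a semicolon treated as comments. The class IniSketch is hypothetical and is not the parser SmartGears uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal INI reader, not the SmartGears parser.
final class IniSketch {

    /** Parses sections and key = value pairs; blank lines and lines starting with ';' are skipped. */
    static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        Map<String, String> current = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith(";")) {
                continue;
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                String name = line.substring(1, line.length() - 1).trim();
                current = sections.computeIfAbsent(name, k -> new LinkedHashMap<>());
                continue;
            }
            int eq = line.indexOf('=');
            if (eq < 0 || current == null) {
                continue; // ignore malformed lines and keys that precede any section
            }
            current.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return sections;
    }
}
```

Under this reading, parse(text).get("node").get("hostname") yields hostname.it for the example above, and a commented-out section header such as ;[proxy] produces no section at all.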

Application Descriptor

SmartGears manages applications that include a WEB-INF/application.yaml descriptor in their distributions. We can illustrate the structure of this descriptor by example, commenting out elements that are optional:

name: test
group: group
version: 1.0.0
#not mandatory
description: pippo
#not mandatory
proxable: true
#not mandatory
excludes:
  - path: /pippo/*
  - handlers: [H1, H2]
    path: /trip
#not mandatory
allowed-secrets:
  - org.gcube.smartgears.security.secrets.GCubeKeyCloakSecretFactory
  - org.gcube.smartgears.security.secrets.LegacyGCubeTokenSecretFactory
#not mandatory
persistence:
  implementationClass: org.gcube.smartgears.persistence.LocalWriter
  writerConfiguration:
    className: org.gcube.smartgears.persistence.LocalWriterConfiguration
    location: /tmp

Default Logging Configuration

<configuration>
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${GHN_HOME}/ghn.log</file>
    <append>true</append>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{0}: %msg%n</pattern>
    </encoder>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>logFile.%d{yyyy-MM-dd}.log</fileNamePattern>
      <maxHistory>30</maxHistory>
    </rollingPolicy>
  </appender>
 
  <logger name="org.gcube" level="INFO" />
 
  <root level="WARN">
    <appender-ref ref="FILE" />
  </root>
</configuration>

Maven Dependencies

gCube-aware applications that use Maven can resolve their dependencies to SmartGears by adding the following to their POMs:

  <dependencyManagement>
    <dependencies>
      ...
      <dependency>
        <groupId>org.gcube.distribution</groupId>
        <artifactId>maven-smartgears-bom</artifactId>
        <version>LATEST</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
      ...
    </dependencies>
  </dependencyManagement>
  ...
  <dependencies>
    ...
    <dependency>
      <groupId>org.gcube.core</groupId>
      <artifactId>common-smartgears-app</artifactId>
    </dependency>
    <dependency>
      <groupId>org.gcube.core</groupId>
      <artifactId>common-smartgears</artifactId>
    </dependency>
    ...
  </dependencies>

The first dependency is to the SmartGears “Bill of Materials” (BOM), a POM-only Maven artefact which specifies dependencies to all the SmartGears libraries. By importing it in the dependencyManagement section of their POMs, applications make sure that dependencies to SmartGears libraries:

  • have provided scope, i.e. will not be bundled with the applications;
  • are the latest available versions, those with which the applications will be built and then managed in production.

The version LATEST matches the latest release of the BOM, which is always kept in sync with the latest release of SmartGears. During gCube builds, LATEST is resolved to and replaced with a specific version number, so that the reproducibility of release builds is always guaranteed.

The other two concrete dependencies are to common-smartgears-app and common-smartgears, and rely on versions and scopes defined in the BOM.