Difference between revisions of "Messaging Infrastructure"
Andrea.manzi (Talk | contribs) (New page: ==gCube Messaging Architecture== Tools to monitor the infrastructure and gather accounting data are important tasks within the infrastructure operation work. As a consequence, D4Science de...) |
Andrea.manzi (Talk | contribs) (→Local Producer) |
Revision as of 15:12, 2 March 2010
gCube Messaging Architecture
Monitoring the infrastructure and gathering accounting data are important tasks within the infrastructure operation work. As a consequence, D4Science decided to implement:
- A monitoring tool based on a messaging system to complement the monitoring tools already available based on the gCube IS;
- An accounting tool also based on a messaging system to satisfy the need to provide accounting information.
These monitoring and accounting tools have been implemented under a common gCube subsystem called gCube Messaging. This section presents the architecture and core components of this subsystem.
The gCube Messaging subsystem is composed of the following eight components:
- Message Broker – receives and dispatches messages;
- Local Producer – provides facilities to send messages from each node;
- Node Monitoring Probes – produce monitoring info for each node;
- Node Accounting Probes – produce accounting info for each node;
- Portal Accounting Probes – produce accounting info for the portal;
- Messages – defines the messages to exchange;
- Messaging Consumer – subscribes to messages from the Message Broker, checks metrics, stores messages, and notifies administrators;
- Messaging Consumer Library – hides the Consumer DB details, helping clients to query for accounting and monitoring information.
Message Broker
Following the work done by the EGEE/LCG projects at CERN using messaging systems, and to make the EGEE and D4Science monitoring and accounting solutions interoperable, D4Science adopted the MSG Broker as its standard message broker service. The MSG Broker URL is disseminated in the gHN container deployed on all gCube nodes of the D4Science production infrastructure. The URL is distributed in the gHN Service Map configuration file.
Local Producer
The Local Producer is the entity deployed on each node of the infrastructure that is responsible for message exchange. It defines the methods to communicate with the Message Broker and is activated at node start-up (if configured to do so).
Figure 2 - gCube local Producer
The Local Producer is structured in two main components:
- An abstract Local Producer interface. This interface is part of the gCore Framework (gCF) and models a local producer, a local probe, and the base message.
- An implementation of the abstract Local Producer. This implementation class has been named GCUBELocalProducer.
The GCUBELocalProducer, at node start-up, sets up two types of connections towards the Message Broker:
- Queue connections: exploited by accounting probes that produce messages consumed by only one consumer;
- Topic connections: exploited by monitoring probes that produce messages consumed by multiple consumers.
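The difference between the two connection types can be illustrated with a small in-memory model. This is only a sketch of the delivery semantics, not of the GCUBELocalProducer or of any JMS API: the `Queue` and `Topic` classes, the round-robin choice of consumer, and the message strings are all assumptions made for illustration.

```python
from collections import deque


class Queue:
    """Point-to-point destination: each message reaches exactly one consumer."""

    def __init__(self):
        self.consumers = deque()

    def send(self, message):
        consumer = self.consumers[0]
        self.consumers.rotate(-1)  # assumed policy: round-robin among consumers
        consumer.append(message)


class Topic:
    """Publish/subscribe destination: each message reaches every subscriber."""

    def __init__(self):
        self.subscribers = []

    def publish(self, message):
        for subscriber in self.subscribers:
            subscriber.append(message)


# Consumers are modelled as plain lists that collect what they receive.
acc_a, acc_b = [], []
accounting = Queue()
accounting.consumers.extend([acc_a, acc_b])
accounting.send("accounting-1")
accounting.send("accounting-2")

mon_a, mon_b = [], []
monitoring = Topic()
monitoring.subscribers.extend([mon_a, mon_b])
monitoring.publish("monitoring-1")
```

In this model each accounting message ends up in only one of the two lists, while the monitoring message is copied to both.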
Configuration
In order to configure the GHN to run the gCube Local Monitor, at least one MessageBroker (an ActiveMQ endpoint) must be configured in one of the ServiceMaps related to the GHN scope, as follows:
<ServiceMap>
    <Service name ="ISICAllQueryPT" endpoint ="http://dlib01.isti.cnr.it:8080/wsrf/services/diligentproject/informationservice/disic/DISICService"/>
    <Service name ="ISICAllRegistrationPT" endpoint ="http://dlib01.isti.cnr.it:8080/wsrf/services/diligentproject/informationservice/disic/DISICRegistrationService"/>
    ......................
    <Service name ="MessageBroker" endpoint ="tcp://ui.grid.research-infrastructures.eu:6166"/>
</ServiceMap>
One parameter can also be added to the GHN configuration (https://wiki.gcore.research-infrastructures.eu/gCube/index.php/Administrator_Guide#Configuring_the_gHN):
- testInterval: the interval in seconds between test executions (default = 1800)
If no MessageBroker entry is present in any of the GHN ServiceMaps, the gCube Local Monitor is not enabled on the GHN.
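The rule above (the Local Monitor is enabled only when a MessageBroker entry exists) can be checked with a short script. This is a sketch using Python's standard XML parser, not the gHN's own start-up logic; the function name `message_broker_endpoints` is hypothetical, and the sample ServiceMap is a shortened copy of the one shown above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the ServiceMap from this section.
SERVICE_MAP = """
<ServiceMap>
  <Service name ="ISICAllQueryPT" endpoint ="http://dlib01.isti.cnr.it:8080/wsrf/services/diligentproject/informationservice/disic/DISICService"/>
  <Service name ="MessageBroker" endpoint ="tcp://ui.grid.research-infrastructures.eu:6166"/>
</ServiceMap>
"""


def message_broker_endpoints(service_map_xml):
    """Return the endpoints of all MessageBroker entries in a ServiceMap."""
    root = ET.fromstring(service_map_xml)
    return [
        service.get("endpoint")
        for service in root.iter("Service")
        if service.get("name") == "MessageBroker"
    ]


endpoints = message_broker_endpoints(SERVICE_MAP)
# Mirrors the rule in the text: no MessageBroker entry, no Local Monitor.
local_monitor_enabled = bool(endpoints)
```

A ServiceMap without a MessageBroker entry yields an empty list, so `local_monitor_enabled` would be `False`.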
Monitoring Probes
The monitoring probes can be of two types: gHN and Running Instance (RI). A gHN probe produces messages related to the gHN node itself while the RI probe produces messages concerning the gCube services running on the gHN. The following probes are currently available:
- GHNDiskProbe – monitors the local available disk space;
- GHNLoadProbe – monitors the CPU load of the gHN;
- GHNMemoryProbe – monitors the memory available on the gHN;
- GHNInformationProbe – gathers information related to the gHN HW;
- GHNNotificationProbe – subscribes to local gHN events (scope changed, scope added, node start, etc.);
- RINotificationProbe – subscribes to local RI events (scope changed, scope added, deployment, etc.).
All the above probes exploit the GCUBELocalProducer to contact the Message Broker and send messages.
Probes can also be grouped according to the type of message that they produce and according to their behaviour:
- Test Probes – perform local tests on the gHN and send messages containing the test results (the first four probes listed above);
- Notification Probes – exploit the gCF local event mechanism to consume events related to GHN/RI actions (GHN Ready, RI/GHN scope changed, etc.) (the last two probes listed above).
Node Accounting Probe
The Node Accounting Probe is in charge of collecting information about local usage of gCube services. The probe is a library deployed on each gHN that exploits the mechanisms offered by gCF to track the usage of the services on the infrastructure. For each incoming method call, gCF produces a log record as follows:
2009-10-30 04:17:52,287 TRACE handlers.GCUBEHandler [ServiceThread-59,trace:82] GCUBEHandler: END CALL (VREMANAGEMENT:SOFTWAREREPOSITORY:get),/testing,Thread[ServiceThread-59,5,main],[0.0080]
Each “END CALL” line contains information about:
- Time
- Running Instance invoked
- Method invoked
- Caller scope
- Caller IP
- Invocation Time
The probe parses these log files at the end of each day and aggregates the information per running instance. In particular, the information is aggregated following this schema: RI -> CallerScope -> CallerIP -> Hourly Interval Time -> Number of Invocations and Average Invocation Time. The information about the invoked method is not parsed, since it has been decided not to expose this granularity of information. At the end of the aggregation process, the probe creates node accounting messages that are sequentially sent to the Message Broker using the Local Producer. For this particular type of message a queue receiver is exploited on the Message Broker side.
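The aggregation step can be sketched as follows. This is not the probe's actual code: the regular expression is inferred from the single sample record shown above, it assumes the parenthesised triple is ServiceClass:ServiceName:method, and it leaves out the caller IP because the sample record does not visibly contain one. The second log line is a hypothetical call added so that the hourly bucket holds more than one record.

```python
import re
from collections import defaultdict

# Pattern inferred from the sample record; an assumption, not a gCF specification.
END_CALL = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d+ .*END CALL "
    r"\(([^:()]+):([^:()]+):([^()]+)\),([^,]+),.*\[([0-9.]+)\]\s*$"
)


def aggregate(lines):
    """Group END CALL records by (RI, caller scope, hour)."""
    totals = defaultdict(lambda: [0, 0.0])  # key -> [invocations, summed time]
    for line in lines:
        match = END_CALL.match(line)
        if match is None:
            continue  # not an END CALL record
        hour, service_class, service_name, _method, scope, elapsed = match.groups()
        entry = totals[(f"{service_class}:{service_name}", scope, hour)]
        entry[0] += 1
        entry[1] += float(elapsed)
    # Report the invocation count and the average invocation time per bucket.
    return {key: (count, total / count) for key, (count, total) in totals.items()}


LOG = [
    "2009-10-30 04:17:52,287 TRACE handlers.GCUBEHandler [ServiceThread-59,trace:82] "
    "GCUBEHandler: END CALL (VREMANAGEMENT:SOFTWAREREPOSITORY:get),/testing,"
    "Thread[ServiceThread-59,5,main],[0.0080]",
    # Hypothetical second call in the same hour.
    "2009-10-30 04:41:03,120 TRACE handlers.GCUBEHandler [ServiceThread-61,trace:82] "
    "GCUBEHandler: END CALL (VREMANAGEMENT:SOFTWAREREPOSITORY:get),/testing,"
    "Thread[ServiceThread-61,5,main],[0.0120]",
]
report = aggregate(LOG)
```

The method name is captured but deliberately discarded, matching the decision not to expose per-method information.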
Portal Accounting Probe
The Portal Accounting Probe is in charge of aggregating information about portal usage. As with node accounting, the portal (in particular the ASL library) produces log records describing the following operations:
- Login
- Browse Collection
- Simple Search
- Advanced Search
- Content Retrieval
This is an example of one log record produced by the D4Science portal:
2009-09-03 12:22:37, VRE -> EM/GCM, USER -> andrea.manzi, ENTRY_TYPE -> Simple_Search, MESSAGE -> collectionName = Earth images AND collectionID = 12345 | collectionName = Landsat 7 AND collectionID = 54321 | term = satellite
All log record types share the following information:
- Time
- User
- VRE
- OperationType
- Message
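A record in this format can be split into its named fields with a few lines of code. The sketch below assumes the layout of the example record (a leading timestamp followed by comma-separated `KEY -> value` pairs, with MESSAGE last); the function name `parse_portal_record` is hypothetical and this is not the probe's own parser.

```python
# The example record from this section.
RECORD = (
    "2009-09-03 12:22:37, VRE -> EM/GCM, USER -> andrea.manzi, "
    "ENTRY_TYPE -> Simple_Search, MESSAGE -> collectionName = Earth images "
    "AND collectionID = 12345 | collectionName = Landsat 7 AND "
    "collectionID = 54321 | term = satellite"
)


def parse_portal_record(line):
    """Split one portal log record into its named fields."""
    # Limit the split to four cuts so that a ", " inside MESSAGE is kept intact.
    time, *pairs = line.split(", ", 4)
    fields = {"TIME": time}
    for pair in pairs:
        key, value = pair.split(" -> ", 1)
        fields[key] = value
    return fields


record = parse_portal_record(RECORD)
# The MESSAGE part of a simple search is a "|"-separated list of entries.
message_parts = record["MESSAGE"].split(" | ")
```

Here `record["ENTRY_TYPE"]` carries the operation type, and `message_parts` holds the two collections and the search term.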
The message part differs between log record types. For example, a simple search record contains information about the collections included in the operation and the term searched. The portal accounting probe aggregates portal accounting information by creating a number of Portal Accounting messages aggregated by: User -> VRE -> OperationType. Each message contains a certain number of records of one OperationType. As in node accounting, at the end of the aggregation process the probe sequentially sends the portal accounting messages to the Message Broker using the Local Producer. For this particular type of message a queue receiver is exploited on the Message Broker side.
Messages
Different message types are defined for monitoring and for accounting.
Monitoring Messages
The monitoring probes (GHN and RI probes) exchange particular types of messages with the Message Broker. These are extensions of the base GCUBEMessage, named GHNMessage and RIMessage respectively. Both contain an object named “Test” that represents either the test performed on the GHN (together with its result) or a notification:
- TestType – either TEST or NOTIFICATION;
- Description – the test/notification description;
- TestNumber – a unique Identifier;
- TestResult – object that stores the TEST results (in case of NOTIFICATION no results are expected);
- Priority – either HIGH or LOW.
The RIMessage also contains information about the ServiceClass and the ServiceName of the Running Instance where the probe is running. At message creation time, depending on the type of messages and type of probes, different combinations of topic names and message selectors are possible:
- GHN Message, TEST probe:
scope.MONITORING.GHN.sourceGHN/MessageType='TEST'
- GHN Message, NOTIFICATION probe:
scope.MONITORING.GHN.sourceGHN/MessageType='NOTIFICATION'
- RI Message, TEST probe:
scope.MONITORING.RI.sourceGHN/MessageType='TEST'
- RI Message, NOTIFICATION probe:
scope.MONITORING.RI.sourceGHN/MessageType='NOTIFICATION'
The monitoring probes, following the above topic structure, send messages for each scope of the GHN/RI. For example, on a gHN running on node pcd4science.cern.ch and port 8080 that belongs to both the /gcube and /gcube/devsec scopes, the GHNDiskProbe will send two messages with the following topic names:
- gcube.MONITORING.GHN.pcd4science_cern_ch:8080
- gcube.devsec.MONITORING.GHN.pcd4science_cern_ch:8080
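The mapping from scope and node address to topic name can be reproduced in a few lines. This is a sketch derived only from the two example names above (slashes in the scope become dots, dots in the host name become underscores); the function name `destination_name` and its parameters are hypothetical.

```python
def destination_name(scope, kind, source_type, host, port):
    """Build a destination name such as gcube.MONITORING.GHN.host:port."""
    # "/gcube/devsec" -> "gcube.devsec"
    scope_part = scope.strip("/").replace("/", ".")
    # "pcd4science.cern.ch" -> "pcd4science_cern_ch", so that the source
    # forms a single dot-separated element of the name.
    source = f"{host.replace('.', '_')}:{port}"
    return f"{scope_part}.{kind}.{source_type}.{source}"


names = [
    destination_name(scope, "MONITORING", "GHN", "pcd4science.cern.ch", 8080)
    for scope in ("/gcube", "/gcube/devsec")
]
```

With these inputs, `names` contains the two topic names listed above.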
Accounting Messages
Node and portal accounting probes use particular types of messages, named respectively NodeAccountingMessage and PortalAccountingMessage.
The NodeAccountingMessage is a specialization of the generic GCUBEMessage. It is used to transfer the details about the invocations received by an RI in a particular scope. It includes:
- RI service name and class
- Caller scope
- Caller IP
- Invocation date
- Hourly records composed of:
  - Time frame
  - Service invocation number
  - Average invocation time
For accounting messages, the JMS destination is a queue. Instead of a topic naming structure, the message follows a queue naming structure:
scope.ACCOUNTING.GHN.SourceGHN
For example:
- gcube.ACCOUNTING.GHN.pcd4science_cern_ch:8080
- gcube.devsec.ACCOUNTING.GHN.pcd4science_cern_ch:8080
The PortalAccountingMessage is a specialization of the generic GCUBEMessage. The type of information to transport is rich and can vary considerably. The basic fields are User and VRE. The message is then structured to contain a list of basic records, specialized as:
- LoginRecord
- AdvancedSearchRecord
- SimpleSearchRecord
- QuickSearch
- GoogleSearch
- BrowseRecord
- ContentRecord
- GenericRecord (for generic operation logs)
All of the above records have only the timestamp information in common. These messages also follow a queue naming structure:
scope.ACCOUNTING.PORTAL.SourceGHN
For example:
- gcube.ACCOUNTING.PORTAL.pcd4science_cern_ch:8080
Messaging Consumer
The Consumer Monitor is a gCube WSRF service that is deployed on the infrastructure to consume messages coming from Message Brokers. The main features of the service are:
- Subscribe to monitoring/accounting messages for different scopes;
- Check monitoring message test results against metrics;
- Store monitoring/accounting messages in a local database;
- Send email notifications to admins in case of abnormal test results;
- Provide a GUI with summary information and query facilities.
This WSRF service exposes public operations that allow queries to the underlying database and the export of information outside the infrastructure.
Figure 3 – gCube Messaging consumer
Following the message topic structure, the Messaging Consumer, at start-up time, creates (1) durable subscriptions towards topics and (2) queue receivers towards queues. The Message Broker server holds messages for a client subscriber once it has formally subscribed, so durable topic subscriptions receive messages published while the subscriber is not active. Subsequent subscriber objects specifying the identity of the durable subscription can resume the subscription in the state in which it was left by the previous subscriber. This means that, using the same subscription ID, the Messaging Consumer can resume the receipt of messages from the Message Broker server, which is useful in case of a node crash or service re-deployment.
The Messaging Consumer also embeds a Message Broker for testing purposes. However, in the production environment a dedicated Message Broker is deployed.
The Messaging Consumer can dynamically run in one or more scopes. According to the topic/queue structure defined, when a scope is added to its RI the service automatically subscribes to the following topics/queues:
- <scope>.MONITORING.GHN.*
- <scope>.MONITORING.RI.*
- <scope>.ACCOUNTING.GHN.*
- <scope>.ACCOUNTING.PORTAL.*
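The four subscription patterns for a scope, and the way a wildcard pattern selects destinations, can be sketched as below. The matching function is a simplified model that assumes `*` stands for exactly one dot-separated name, as in ActiveMQ destination wildcards; the function names are hypothetical and the real matching is done by the Message Broker, not by the consumer.

```python
def subscriptions_for(scope):
    """Return the four wildcard destinations subscribed to for a scope."""
    scope_part = scope.strip("/").replace("/", ".")
    kinds = [
        ("MONITORING", "GHN"),
        ("MONITORING", "RI"),
        ("ACCOUNTING", "GHN"),
        ("ACCOUNTING", "PORTAL"),
    ]
    return [f"{scope_part}.{kind}.{source}.*" for kind, source in kinds]


def matches(pattern, destination):
    """Simplified wildcard match: '*' stands for exactly one name element."""
    wanted = pattern.split(".")
    actual = destination.split(".")
    return len(wanted) == len(actual) and all(
        w == "*" or w == a for w, a in zip(wanted, actual)
    )


patterns = subscriptions_for("/gcube")
hit = matches(patterns[0], "gcube.MONITORING.GHN.pcd4science_cern_ch:8080")
miss = matches(patterns[0], "gcube.devsec.MONITORING.GHN.pcd4science_cern_ch:8080")
```

Under this model the /gcube pattern selects the first destination but not the one published for /gcube/devsec, so each scope is subscribed to separately.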
The Consumer Service can be configured, using the “subscriptions” configuration variable, to subscribe only to a subset of the available information. In addition, the Messaging Consumer can be configured to use JMS message selectors. This means that for each scope 2*nOfSelectors durable subscribers are created using the wildcard (.*) syntax for topic names (all topic names of the same scope and type are subscribed to).
An important functionality of the Messaging Consumer is the capability to send notifications and daily reports to administrators by elaborating on the stored incoming messages. The administrators are selected through a local configuration file, retrieved directly from VOMS, or selected through a configuration file stored on the IS. The Messaging Consumer is configured to send email notifications in two situations:
- When a message of type NOTIFICATION with HIGH priority (e.g. gHN start, shutdown) is received;
- When a message of type TEST is received whose test result exceeds some threshold parameter (e.g. CPU usage, disk quota).
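These two rules can be written as a single predicate. This is a minimal sketch, not the Messaging Consumer's code: the `MonitoringTest` class only mirrors some of the Test fields described earlier, the numeric `result` field and the `threshold` argument are assumptions, and real metrics may compare results differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonitoringTest:
    test_type: str                  # "TEST" or "NOTIFICATION"
    description: str
    priority: str                   # "HIGH" or "LOW"
    result: Optional[float] = None  # assumed numeric; None for notifications


def should_notify(test, threshold):
    """Decide whether an email notification is due for this message."""
    if test.test_type == "NOTIFICATION":
        return test.priority == "HIGH"
    # TEST messages trigger a notification only above the configured threshold.
    return test.result is not None and test.result > threshold


start = MonitoringTest("NOTIFICATION", "gHN start", "HIGH")
disk_ok = MonitoringTest("TEST", "disk usage", "LOW", result=40.0)
disk_full = MonitoringTest("TEST", "disk usage", "LOW", result=95.0)
decisions = [should_notify(t, threshold=90.0) for t in (start, disk_ok, disk_full)]
```

With the hypothetical threshold of 90.0, the gHN start notification and the high disk usage test trigger an email, while the normal test result does not.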
The Messaging Consumer embeds a Jetty web server in order to give access to the database content (for debugging purposes) and to publish daily reports. A number of servlets show the DB content grouped by gHN name. The first version of the report GUI allows the admin to navigate through reports grouped by day, scope, and gHN name, and shows the related messages consumed by the service. In order to include daily/monthly graphs, a first integration with Google Chart has been developed.