Docker environment at D4Science
D4Science docker infrastructure
A production cluster is available, based on Docker Swarm [1]. The cluster consists of:
- three manager nodes
- currently, five worker nodes.
The running services are exposed through two layers of HAPROXY load balancers:
- An L4 layer, used to reach the HTTP/HTTPS services exposed by the L7 layer
- An L7 layer, running in the swarm, configured to dynamically resolve the backend names using the Docker internal DNS service
Provisioning of the Docker Swarm
The Swarm, together with the Portainer [2] and the L7 HAPROXY [3] installations, is managed by Ansible, starting from the role [4]
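The role takes care of everything; purely as an illustration, the swarm bootstrap it automates corresponds roughly to the following standard Docker commands (addresses and tokens are placeholders, and the assumption that the haproxy-public overlay network referenced later in this page is created at this stage is ours):

# On the first manager node: initialize the swarm
docker swarm init --advertise-addr <manager-ip>
# Print the join tokens for additional managers and for workers
docker swarm join-token manager
docker swarm join-token worker
# On every other node: join the swarm with the appropriate token
docker swarm join --token <token> <manager-ip>:2377
# Create the overlay network shared by the L7 HAPROXY instances and the exposed services
docker network create --driver overlay haproxy-public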
The load balancer architecture
L4 load balancers
All requests to the public containerized services pass through two HAPROXY servers in a high availability setup, acting as level 4 (TCP) proxies. The basic configuration is:
frontend LB
    bind *:80
    mode http
    redirect scheme https if !{ ssl_fc }

frontend lb4_swarm
    bind *:443
    mode tcp
    description L4 swarm
    default_backend lb4_swarm_bk

backend lb4_swarm_bk
    mode tcp
    balance source
    hash-type consistent
    option tcp-check
    tcp-check connect port 443 ssl send-proxy
    server docker-swarm1.int.d4science.net docker-swarm1.int.d4science.net:443 check fall 1 rise 1 inter 2s send-proxy sni req.ssl_sni
    server docker-swarm2.int.d4science.net docker-swarm2.int.d4science.net:443 check fall 1 rise 1 inter 2s send-proxy sni req.ssl_sni
    server docker-swarm3.int.d4science.net docker-swarm3.int.d4science.net:443 check fall 1 rise 1 inter 2s send-proxy sni req.ssl_sni
A client is guaranteed to be routed to the same HAPROXY backend instance, as long as that instance is alive.
L7 HAPROXY cluster
- HTTPS requests are handled by three HAPROXY instances.
- Backend hostnames are resolved using the Docker internal DNS.
- Backends can consist of a single instance or of multiple instances (if the service supports such a configuration)
- Sticky sessions can be managed through cookies, source address, etc.
- Rate limiting policies can be applied (a sketch is shown after the configuration snippet below)
A configuration snippet:
frontend http
    bind *:443 ssl crt /etc/pki/haproxy alpn h2,http/1.1 accept-proxy
    bind *:80 accept-proxy
    mode http
    option http-keep-alive
    option forwardfor
    http-request add-header X-Forwarded-Proto https
    # HSTS (63072000 seconds)
    http-response set-header Strict-Transport-Security max-age=63072000
    acl lb_srv hdr(host) -i load-balanced-service.d4science.org
    redirect scheme https code 301 if !{ ssl_fc }
    use_backend lb_stack_name_bk if lb_srv
backend lb_stack_name_bk
    mode http
    option httpchk
    balance leastconn
    http-check send meth HEAD uri / ver HTTP/1.1 hdr Host load-balanced-service.d4science.org
    http-check expect rstatus (2|3)[0-9][0-9]
    dynamic-cookie-key load-balanced-service
    cookie JSESSIONID prefix nocache dynamic
    server-template lb-service-name- 2 lb_stack_name-lb_service_name:8080 check inter 10s rise 2 fall 3 resolvers docker init-addr libc,none
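Rate limiting is not shown in the snippet above. A minimal sketch of a per-client policy, with illustrative names and thresholds rather than the production values, could look like this:

frontend http
    # Track each client IP and measure its HTTP request rate over a 10 second window
    stick-table type ip size 1m expire 10m store http_req_rate(10s)
    http-request track-sc0 src
    # Reject clients that issued more than 100 requests in the last 10 seconds (illustrative threshold)
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }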
Docker Stack
- The services can be deployed into the Docker cluster as stacks [5] using a specially crafted compose file. The Open ASFA case can be used as a working example [6]
- The web services must be connected to the haproxy-public network and set up to use the dnsrr endpoint mode, to be discoverable by HAPROXY
- HAPROXY must be configured to expose the service. Example of a two-instance shinyproxy service:
backend shinyproxy_bck
    mode http
    option httpchk
    balance leastconn
    http-check send meth HEAD uri / ver HTTP/1.1 hdr Host localhost
    http-check expect rstatus (2|3)[0-9][0-9]
    stick on src
    stick-table type ip size 2m expire 180m
    server-template shinyproxy- 2 shinyproxy_shinyproxy:8080 check resolvers docker init-addr libc,none
Docker compose example
- The following is an example of a stack made of two services, where one talks to the other using its Docker service name:
version: '3.6'

services:
  conductor-server:
    environment:
      - CONFIG_PROP=conductor-swarm-config.properties
    image: nubisware/conductor-server
    networks:
      - conductor-network
      - haproxy-public
    deploy:
      mode: replicated
      replicas: 2
      endpoint_mode: dnsrr
      placement:
        constraints: [node.role == worker]
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
    configs:
      - source: swarm-config
        target: /app/config/conductor-swarm-config.properties
    logging:
      driver: "journald"

  conductor-ui:
    environment:
      - WF_SERVER=http://conductor-server:8080/api/
    image: nubisware/conductor-ui
    networks:
      - conductor-network
      - haproxy-public
    deploy:
      mode: replicated
      replicas: 2
      endpoint_mode: dnsrr
      placement:
        constraints: [node.role == worker]
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s

networks:
  conductor-network:
  haproxy-public:
    external: True

configs:
  swarm-config:
    file: ./conductor-swarm-config.properties
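Assuming the compose file above is saved as docker-compose.yml and the stack is called conductor (both names are illustrative), the stack can be deployed and inspected from a manager node with the standard Docker commands:

# Deploy (or update) the stack described by the compose file
docker stack deploy -c docker-compose.yml conductor
# List the stack services and the state of their replicas
docker stack services conductor
docker service ps conductor_conductor-server
# Remove the stack when it is no longer needed
docker stack rm conductor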
Setup Steps for a service
A ticket that asks for a new service should report the following data:
- A DNS name for each publicly available service. The name must be a CNAME of swarm-lb.d4science.org. If the name is not in the d4science.org domain, an _acme-challenge.externaldomain record must also be set as a CNAME of _acme-challenge.d4science.net; otherwise we will not be able to generate a valid TLS certificate (see the example records after this list)
- The number of replicas for the service
- Whether sticky sessions are required and, if so, whether they are cookie based or client source address based
- A URI that will be used to check the status of the service (mandatory if the service is being replicated)
- The HTTP port the service will be listening on, if it is stated in the compose file
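As an illustration, for a hypothetical service published as my-service.example.org, outside the d4science.org domain, the DNS records to be requested would look like the following (all names except swarm-lb.d4science.org and _acme-challenge.d4science.net are placeholders):

; Public name of the service, pointing to the swarm L4 load balancers
my-service.example.org.                   CNAME  swarm-lb.d4science.org.
; Delegation of the ACME DNS challenge, needed to issue the TLS certificate
_acme-challenge.my-service.example.org.   CNAME  _acme-challenge.d4science.net.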