OpenVPN, Traefik and Docker logos.

Building on "OpenVPN Server on Docker Swarm" and "Deploy traefik, prometheus, grafana, portainer and oauth2_proxy with docker-compose", this How-To will show you how to:

  • Launch Traefik as an edge router in docker swarm
  • Launch OpenVPN Server
  • Connect to your docker web container via VPN by using name resolution

This How-To is aimed at advanced users. See Prerequisites.

Rationale

Since I wrote "OpenVPN Server on Docker Swarm", Traefik v2.2 has been released with UDP support. This makes it possible to drop nginx from the configuration.

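In this example, I’ll be using the following docker images:

  • traefik
  • registry.gitlab.com/ix.ai/swarm-launcher
  • registry.gitlab.com/ix.ai/openvpn
  • registry.gitlab.com/ix.ai/spielwiese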

Prerequisites

  • Have a fully functional docker swarm
  • Have a public IP address that points to one of the docker swarm nodes
  • Have an attachable docker swarm network (see Traefik Network below)
  • (optional) Have a DNS entry that points at the IP address (something like vpn.example.com - which will be used in this example)
  • (optional) Have another DNS entry that points at the IP address (something like spielwiese.example.com - which will be used in this example)

Any commands listed below need to be run on a swarm manager. You can see your manager nodes by running: sudo docker node ls -f role=manager.
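To check whether the node you are currently logged in to is itself a manager, a minimal sketch like the following may help. It assumes the docker CLI is on the PATH; the function name is_swarm_manager is made up for this example.

```shell
# Hypothetical helper: succeeds only if the local node is a swarm manager.
is_swarm_manager() {
  # `docker info` reports .Swarm.ControlAvailable as "true" on manager nodes
  [ "$(docker info --format '{{.Swarm.ControlAvailable}}' 2>/dev/null)" = "true" ]
}
```

It can be used as a guard before the other commands, for example: is_swarm_manager || echo "not a swarm manager" >&2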

Certain workloads are designed to run only on docker manager nodes (such as OpenVPN, Traefik). While this can be worked around, it’s out of scope for this How-To.

Files Needed

The stacks created with this How-To:

  • traefik
  • vpn
  • websites

Folders and persistent storage

You will want to store the certificates created for OpenVPN, so that they don’t get lost the first time you restart the container. For this, I’m using a folder shared among all docker manager nodes, mounted under /var/docker.

For Traefik and OpenVPN (as root):

mkdir -p /var/docker/traefik
mkdir -p /var/docker/openvpn

Networks

Make sure you have two attachable swarm networks for the communication between:

  • Traefik <–> OpenVPN (traefik-vpn)
  • OpenVPN <–> Other Containers (vpn)

Also, one non-attachable swarm network for the web services:

  • Traefik <–> Websites (traefik-web)

Create the networks

docker network create --driver overlay --attachable --scope swarm --opt encrypted traefik-vpn
docker network create --driver overlay --attachable --scope swarm --opt encrypted vpn
docker network create --driver overlay --scope swarm --opt encrypted traefik-web

Since we are using registry.gitlab.com/ix.ai/swarm-launcher to start OpenVPN, which is a glorified docker-compose, the networks traefik-vpn and vpn must be attachable from outside the Docker Swarm!
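Whether the networks really are attachable can be verified after creating them. The following is a small sketch, not part of the original setup; the function name check_attachable is hypothetical, and it relies only on the Attachable field of docker network inspect.

```shell
# Hypothetical helper: prints "<network> attachable=<true|false|unknown>" per argument.
check_attachable() {
  for net in "$@"; do
    # Falls back to "unknown" if the network does not exist or docker is unavailable
    printf '%s attachable=%s\n' "$net" \
      "$(docker network inspect --format '{{.Attachable}}' "$net" 2>/dev/null || echo unknown)"
  done
}
```

Running check_attachable traefik-vpn vpn traefik-web should report true for the first two networks and false for traefik-web.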

OpenVPN

I have already covered the detailed configuration of OpenVPN in the post "OpenVPN Server on Docker Swarm", so I will only give a summary here.

Variables

export OVPN_DATA="/var/docker/openvpn"
export ENDPOINT="vpn.example.com"

If you don’t have an FQDN like vpn.example.com, replace it with the public IP address of the swarm node running Traefik.

DNSMASQ

Assuming you haven’t changed the IP pool of the OpenVPN server, the following command configures it to push itself (192.168.255.1) as the DNS server to the clients:

docker run \
  -v "${OVPN_DATA}":/etc/openvpn \
  --log-driver=none \
  --rm \
  registry.gitlab.com/ix.ai/openvpn ovpn_genconfig \
    -u udp://"${ENDPOINT}" \
    -b \
    -Q \
    -n 192.168.255.1
  • -Q enables the DNSMASQ server baked into the image
  • -n 192.168.255.1 tells the clients to use the internal OpenVPN IP for DNS
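
To double-check what will be pushed to the clients, you can inspect the generated configuration. This sketch assumes that ovpn_genconfig writes its result to ${OVPN_DATA}/openvpn.conf and emits lines of the form push "dhcp-option DNS <ip>"; the function name pushed_dns is hypothetical.

```shell
# Hypothetical helper: prints every DNS server that an openvpn.conf pushes to clients.
# Assumes lines of the form: push "dhcp-option DNS 192.168.255.1"
pushed_dns() {
  sed -n 's/^push "dhcp-option DNS \([^"]*\)"$/\1/p' "$1"
}
```

Calling pushed_dns "${OVPN_DATA}/openvpn.conf" should then list 192.168.255.1.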

Initialise the PKI

docker run \
  -v "${OVPN_DATA}":/etc/openvpn \
  --log-driver=none \
  --rm \
  -it \
  registry.gitlab.com/ix.ai/openvpn \
    ovpn_initpki
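
To connect later on, you also need a client configuration. The sketch below assumes that the image still ships the easyrsa and ovpn_getclient commands of the upstream kylemanna/openvpn image it is based on; please check the image documentation before relying on it. The function name make_client and the client name are made up for this example.

```shell
# Hypothetical helper: creates a client certificate and writes <name>.ovpn.
# Assumption: the image provides `easyrsa` and `ovpn_getclient` like its upstream.
make_client() {
  name="$1"
  # Build a client certificate without a passphrase (asks for the CA passphrase)
  docker run -v "${OVPN_DATA}":/etc/openvpn --log-driver=none --rm -it \
    registry.gitlab.com/ix.ai/openvpn easyrsa build-client-full "$name" nopass || return 1
  # Export the client configuration into the current directory
  docker run -v "${OVPN_DATA}":/etc/openvpn --log-driver=none --rm \
    registry.gitlab.com/ix.ai/openvpn ovpn_getclient "$name" > "${name}.ovpn"
}
```

For example, make_client laptop would leave a laptop.ovpn file in the current directory.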

Stack File

See stack-vpn.yml:

version: "3.8"

services:
  openvpn-launcher:
    deploy:
      placement:
        constraints:
          - 'node.role == manager'
    image: registry.gitlab.com/ix.ai/swarm-launcher:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:rw
    environment:
      LAUNCH_IMAGE: registry.gitlab.com/ix.ai/openvpn:latest
      LAUNCH_PULL: 'true'
      LAUNCH_EXT_NETWORKS: 'traefik-vpn vpn'
      LAUNCH_PROJECT_NAME: 'vpn'
      LAUNCH_SERVICE_NAME: 'vpn'
      LAUNCH_CAP_ADD: 'NET_ADMIN'
      LAUNCH_PRIVILEGED: 'true'
      LAUNCH_ENVIRONMENTS: 'OVPN_NATDEVICE=eth0@_@eth1@_@eth2 OVPN_DNSMASQ=1'
      LAUNCH_VOLUMES: '/var/docker/openvpn:/etc/openvpn:rw'

registry.gitlab.com/ix.ai/swarm-launcher converts @_@ to a single space. The OpenVPN container will have the variable OVPN_NATDEVICE="eth0 eth1 eth2". Simply put, in order to communicate with anything outside the OpenVPN container, you need at least one ethX NAT device in the list plus one for every additional network (that’s listed under LAUNCH_EXT_NETWORKS).
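
To make the @_@ convention more tangible, here is a small illustration. It only mimics the conversion described above and is not the swarm-launcher implementation; both function names are hypothetical, and natdevice_value encodes my reading of the rule (eth0 plus one device per network in LAUNCH_EXT_NETWORKS).

```shell
# Illustration only: replaces every "@_@" with a single space.
expand_launch_value() {
  printf '%s\n' "$1" | sed 's/@_@/ /g'
}

# Hypothetical helper: builds the OVPN_NATDEVICE value for the given external networks.
natdevice_value() {
  count=$(( $# + 1 ))   # one base device plus one per external network
  i=0
  out=""
  while [ "$i" -lt "$count" ]; do
    out="${out:+${out}@_@}eth${i}"
    i=$(( i + 1 ))
  done
  printf '%s\n' "$out"
}

natdevice_value traefik-vpn vpn                           # eth0@_@eth1@_@eth2
expand_launch_value "$(natdevice_value traefik-vpn vpn)"  # eth0 eth1 eth2
```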

Deploy the Stack

docker stack deploy vpn -c stack-vpn.yml

Traefik

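Traefik can’t run in dual mode, both with and without swarm. Since the OpenVPN container is started outside of the swarm by swarm-launcher, Traefik in swarm mode can’t discover it through labels, so we need the services.yml file to create a UDP router and its service for port 1194: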

udp:
  routers:
    openvpn-udp:
      entryPoints:
        - openvpn-udp
      service: openvpn-udp
  services:
    openvpn-udp:
      loadBalancer:
        servers:
          - address: "vpn_vpn_1:1194"

Note the name vpn_vpn_1 on line 11 (the last line). This is the name of the docker container running OpenVPN, as shown in the last column (NAMES) of docker ps. If your container has a different name, change it here as well, otherwise connecting to your OpenVPN server will not work.
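
A quick way to compare the two names is to extract the backend host from services.yml and look for it in the output of docker ps. This is only a sketch: the function name backend_hosts is hypothetical, and the sed pattern assumes the address is written exactly as in the file above.

```shell
# Hypothetical helper: prints the host part of every `- address: "host:port"` entry.
backend_hosts() {
  sed -n 's/^ *- address: "\([^:"]*\):[0-9]*"$/\1/p' "$1"
}
```

With the services.yml from above, backend_hosts services.yml prints vpn_vpn_1, which should also appear in docker ps --format '{{.Names}}' on the node running OpenVPN.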

Since we need to get the services.yml file into docker swarm, we will create a docker config:

docker config create traefik_services.yml.v1 services.yml

The docker config could also be defined in stack-traefik.yml itself. I prefer to create it outside of the stack, since I want to version the configs.
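
Docker configs are immutable, so a changed services.yml needs a new config name. The versioning suffix makes this easy to script. The helper below is a sketch under the assumption that config names always end in .v followed by a number; the function name next_config_name is made up.

```shell
# Hypothetical helper: given "name.vN", prints "name.v<N+1>".
next_config_name() {
  base="${1%.v*}"       # strip the trailing ".vN"
  version="${1##*.v}"   # keep only N
  printf '%s.v%d\n' "$base" "$(( version + 1 ))"
}

next_config_name traefik_services.yml.v1   # traefik_services.yml.v2
```

After creating the new config with docker config create "$(next_config_name traefik_services.yml.v1)" services.yml, the new name has to replace the old one in both places in stack-traefik.yml before redeploying the stack.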

Let’s Encrypt

export CF_DNS_API_TOKEN="FOO"
export CF_ZONE_API_TOKEN="BAR"

The CF_DNS_API_TOKEN (API token with DNS:Edit permission) can be restricted to a specific zone (in our case, example.com), but the CF_ZONE_API_TOKEN (API token with Zone:Read permission) needs permissions for all zones.
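
Since the stack file further down refuses to deploy when these variables are missing, it can be handy to check them up front. A minimal sketch follows; the function name require_env is hypothetical, and it only checks that the variables are non-empty in the current shell, not that they are exported or valid.

```shell
# Hypothetical helper: fails if any of the named variables is unset or empty.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}
```

For example: require_env CF_DNS_API_TOKEN CF_ZONE_API_TOKEN && echo "tokens are set"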

Stack File

See stack-traefik.yml:

version: "3.8"

services:
  traefik:
    deploy:
      placement:
        constraints:
          - 'node.role == manager'
      labels:
        traefik.enable: 'true'
        traefik.http.middlewares.default-compress.compress: 'true'
        traefik.http.middlewares.default-headers-https.headers.browserXssFilter: 'true'
        traefik.http.middlewares.default-headers-https.headers.contentTypeNosniff: 'true'
        traefik.http.middlewares.default-headers-https.headers.customRequestHeaders.X-Forwarded-Proto: https
        traefik.http.middlewares.default-headers-https.headers.customResponseHeaders.server: ""
        traefik.http.middlewares.default-headers-https.headers.forceSTSHeader: 'true'
        traefik.http.middlewares.default-headers-https.headers.frameDeny: 'true'
        traefik.http.middlewares.default-headers-https.headers.sslRedirect: 'true'
        traefik.http.middlewares.default-headers-https.headers.stsSeconds: 31536000
        traefik.http.middlewares.default-headers-https.headers.stsPreload: 'true'
        traefik.http.middlewares.default-http.redirectScheme.scheme: https
        traefik.http.middlewares.default-http.redirectScheme.permanent: 'true'
        traefik.http.middlewares.default-https.chain.middlewares: default-headers-https,default-compress
        traefik.http.routers.default-redirect.entrypoints: http
        traefik.http.routers.default-redirect.middlewares: default-http
        traefik.http.routers.default-redirect.rule: "HostRegexp(`{any:.*}`)"
        traefik.http.routers.default-redirect.service: noop@internal
        traefik.http.services.traefik.loadbalancer.server.port: '80'
    image: traefik:latest
    networks:
      - traefik-web
      - traefik-vpn
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
      - target: 1194
        published: 1194
        protocol: udp
        mode: host
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/docker/traefik:/etc/traefik/acme
    configs:
      - source: traefik_services.yml.v1
        target: /services.yml
    environment:
      CF_DNS_API_TOKEN: ${CF_DNS_API_TOKEN?err}
      CF_ZONE_API_TOKEN: ${CF_ZONE_API_TOKEN?err}
    command:
      - --accesslog=true
      - --accesslog.fields.defaultmode=keep
      - --accesslog.fields.headers.defaultmode=keep
      # For the example only. Remove it for production!
      - --certificatesResolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
      - --certificatesResolvers.default.acme.dnsChallenge.provider=cloudflare
      - --certificatesResolvers.default.acme.dnsChallenge.resolvers=1.1.1.1:53,1.0.0.1:53
      - --certificatesResolvers.default.acme.storage=/etc/traefik/acme/acme.json
      - --entrypoints.http.address=:80
      - --entrypoints.https.address=:443
      - --entrypoints.openvpn-udp.address=:1194/udp
      - --log.level=INFO
      - --metrics.prometheus=true
      - --providers.docker.endpoint=unix:///var/run/docker.sock
      - --providers.docker.defaultRule=Host(`{{ index .Labels "ai.ix.fqdn"}}`)
      - --providers.docker.exposedByDefault=false
      - --providers.docker.swarmMode=true
      - --providers.docker.swarmModeRefreshSeconds=3
      - --providers.file.filename=/services.yml
networks:
  traefik-vpn:
    external: true
  traefik-web:
    external: true
configs:
  traefik_services.yml.v1:
    external: true

The ports are published with mode: host, so that Traefik can see the real IP address of the connecting clients.

Deploy the Stack

docker stack deploy traefik -c stack-traefik.yml

Websites

I’ll keep this part simple: only stack-websites.yml. Note the ai.ix.fqdn label - it was configured in the Traefik Stack File with --providers.docker.defaultRule.

export SPIELWIESE_DOMAIN="spielwiese.example.com"
version: "3.8"

services:
  spielwiese:
    image: registry.gitlab.com/ix.ai/spielwiese:latest
    networks:
      - traefik-web
      - vpn
    environment:
      TZ: Europe/Berlin
    deploy:
      restart_policy:
        delay: 5s
      labels:
        ai.ix.fqdn: "${SPIELWIESE_DOMAIN?err}"
        traefik.enable: 'true'
        traefik.http.routers.spielwiese-example-com.entrypoints: https
        traefik.http.routers.spielwiese-example-com.middlewares: default-https
        traefik.http.routers.spielwiese-example-com.tls.certResolver: 'default'
        traefik.http.services.spielwiese-example-com.loadbalancer.server.port: '8000'
networks:
  traefik-web:
    external: true
  vpn:
    external: true

Deploy the Stack

docker stack deploy websites -c stack-websites.yml

That’s it! If you now connect to your OpenVPN service, you should be able to reach the web container by its service name:

curl -v http://websites_spielwiese:8000
*   Trying 10.0.43.6:8000...
* Connected to websites_spielwiese (10.0.43.6) port 8000 (#0)
> GET / HTTP/1.1
> Host: websites_spielwiese:8000
> User-Agent: curl/7.72.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 271
< Content-Type: text/html; charset=utf-8
< Date: Fri, 18 Sep 2020 21:16:39 GMT
< Server: spielwiese 1.1.2-122639551
<
request: GET / HTTP/1.1
hostname: 18ddd83872a4
uname: #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020
ram: 63 GB
remote_addr: [10.0.43.5]:63760
lo: 127.0.0.1
eth0: 10.0.14.95
eth2: 172.18.0.20
eth1: 10.0.43.7
Host: websites_spielwiese:8000
User-Agent: curl/7.72.0
Accept: */*
* Connection #0 to host websites_spielwiese left intact

When connecting from a Linux client, using NetworkManager and systemd-resolved, you will probably not be able to resolve the host:

curl -v websites_spielwiese:8000
* Could not resolve host: websites_spielwiese
* Closing connection 0
curl: (6) Could not resolve host: websites_spielwiese

In that case, add the name of the network (here: vpn) as a domain name:

curl -v websites_spielwiese.vpn:8000
*   Trying 10.0.43.11:8000...
* TCP_NODELAY set
* Connected to websites_spielwiese.vpn (10.0.43.11) port 8000 (#0)
> GET / HTTP/1.1
> Host: websites_spielwiese.vpn:8000
> User-Agent: curl/7.68.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 277
< Content-Type: text/html; charset=utf-8
< Date: Tue, 20 Oct 2020 11:51:01 GMT
< Server: spielwiese 1.1.3-192008834
< 
request: GET / HTTP/1.1
hostname: c80c5e2825c7
uname: #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 2020
ram: 63 GB
remote_addr: [10.0.43.9]:43318
lo: 127.0.0.1
eth0: 10.0.14.50
eth2: 172.18.0.31
eth1: 10.0.43.32
Host: websites_spielwiese.vpn:8000
User-Agent: curl/7.68.0
Accept: */*
* Connection #0 to host websites_spielwiese.vpn left intact
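
If you script against these names from such a client, the suffix can be added automatically. This is a small convenience sketch, assuming the network is called vpn as in this How-To; the function name vpn_fqdn is hypothetical.

```shell
# Hypothetical helper: appends the network name (default: vpn) to names without a dot.
vpn_fqdn() {
  case "$1" in
    *.*) printf '%s\n' "$1" ;;                  # already has a domain part
    *)   printf '%s.%s\n' "$1" "${2:-vpn}" ;;   # add the network name as domain
  esac
}

vpn_fqdn websites_spielwiese   # websites_spielwiese.vpn
```

This allows calls like curl -v "http://$(vpn_fqdn websites_spielwiese):8000".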


Updates

This article has been updated:

  • 2020-10-20: Add the warning about not being able to resolve the service
  • 2020-11-01: Change the docker images from docker hub (ixdotai/[...]) to registry.gitlab.com/ix.ai/[...]