Installing & Configuring Traefik with SSL certs

The problem

After running my Kubernetes services without a ‘proper’ SSL certificate – using the Traefik self-signed certificate and having browsers complain all of the time that there was a security problem – I decided to bite the bullet and set everything up to download and serve a certificate from LetsEncrypt.

For this, I decided to use two different domains. As my company domain is ‘intrasoftware.co.uk’, I left that one for real production use. For my development/test/home Kubernetes services, I used my ‘intrasoftware.uk’ domain.

Setting up a DNS wildcard

All of the domains (I have around 15 in total) are hosted on Ionos.co.uk. These guys provide relatively inexpensive hosting, domain administration and email. I have used them for about 20 years and have never had any problems.

The first thing I wanted to do was to define a wildcard in the DNS for *.intrasoftware.uk and point it to my internal Kubernetes cluster IP address. This was nice and easy; I accessed my domain admin website and created the entry. I then did a ping to ‘anything.intrasoftware.uk’, and the correct IP address popped up – great first stage.
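The same check can be done with dig instead of ping – querying a public resolver directly avoids any internal DNS overrides, and any label under the wildcard should come back with the same cluster IP (192.168.1.71 in my setup):

```shell
# Any name under *.intrasoftware.uk should resolve to the cluster IP.
# Querying 1.1.1.1 directly sidesteps any internal DNS overrides.
dig +short anything.intrasoftware.uk @1.1.1.1
dig +short something-else.intrasoftware.uk @1.1.1.1
```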

Installing and Configuring Traefik

I have used Traefik for my ingress controller since I set up my Kubernetes cluster. It is great and easy to set up and configure. It allows me to define middleware to add extra headers to existing websites and add authentication to sites that don’t have a login page/facility.

To make use of ‘properly’ signed certificates, there are quite a few different steps required both within the Traefik setup and within the IngressRoutes you create for your services.

To install Traefik, I ran the following commands: –

helm repo add traefik https://helm.traefik.io/traefik
helm repo update
helm search repo traefik

I then created a custom ‘values.yaml’ file from the standard values in the helm chart by running the following command: –

helm show values traefik/traefik > traefik-values.yaml

This creates a huge file of values and comments – most of which I never change. So, rather than have a huge file that is very hard to understand, I stripped it down to this: –

globalArguments:
  - "--global.checknewversion=false"
  - "--global.sendanonymoususage=false"

additionalArguments: 
  - "--providers.kubernetesingress.ingressclass=traefik-internal"
  - "--log.level=DEBUG"

deployment:
  enabled: true
  kind: Deployment
  replicas: 1
  annotations: {}
  labels: {}

ingressRoute:
  dashboard:
    enabled: false
  
providers:
  kubernetesCRD:
    enabled: true
    allowCrossNamespace: true
    allowExternalNameServices: true

  kubernetesIngress:
    enabled: true
    allowExternalNameServices: true
    publishedService:
      enabled: false

rbac:
  enabled: true

ports:
  web:
    redirectTo: websecure
  websecure:
    tls:
      enabled: true

service:
  enabled: true
  type: LoadBalancer
  annotations: {}
  labels: {}
  spec:
    loadBalancerIP: 192.168.1.71
  loadBalancerSourceRanges: []
  externalIPs: []

In the config, I left the number of replicas as 1 – this should be set to 3 to give me an HA environment, but I left it as 1 for testing.

The load balancer IP is my Kubernetes cluster IP address – I don’t think this is absolutely necessary, but I like to set it declaratively so I know it uses my one and only address.

The ‘ports’ section is where the magic happens. This tells Traefik to redirect all HTTP requests to HTTPS and enables TLS. The only other settings of note are: –

allowCrossNamespace: true 
allowExternalNameServices: true

These allow an IngressRoute to reference services in a namespace other than its own, and to route to ExternalName services – both of which I needed for everything to work as expected.
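To illustrate what allowCrossNamespace enables, here is a sketch of an IngressRoute living in one namespace while routing to a service in another (the namespaces and service name here are hypothetical, not from my cluster): –

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: grafana
  namespace: default           # the route lives here...
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`grafana.intrasoftware.uk`)
      kind: Rule
      services:
        - name: grafana-service
          namespace: monitoring   # ...but the service lives here
          port: 3000
```

Without allowCrossNamespace set to true, Traefik would refuse to resolve the service reference across the namespace boundary.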

To complete the installation of Traefik, I issued the following command: –

helm install traefik traefik/traefik --values traefik-values.yaml -n traefik --create-namespace

While playing around with the settings in the values file, I had to change the settings a couple of times. To re-apply the settings to the deployment, I kept issuing the following command: –

helm upgrade traefik traefik/traefik --values traefik-values.yaml -n traefik

Once Traefik was installed, the last thing to do was to create a simple IngressRoute to one of the existing services like this: –

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:  
  name: homarr
spec:  
  entryPoints:
    - websecure 
  routes:
    - match: Host(`home.intrasoftware.uk`)
      kind: Rule
      services:
        - name: homarr-service
          port: 7575

This allowed me to access the website via a very nice-looking URL. This still displayed the usual ‘this site is not really secure’ type of message in the browser, and clicking on the certificate button showed it was a Traefik self-signed certificate. This was another great sign, and things looked good.

Install and configure Cert-Manager

The next part of the puzzle is cert-manager from https://cert-manager.io

cert-manager adds certificates and certificate issuers as resource types in Kubernetes clusters, and simplifies the process of obtaining, renewing and using those certificates.

It can issue certificates from a variety of supported sources, including Let’s Encrypt, HashiCorp Vault, and Venafi as well as private PKI.

It will ensure certificates are valid and up to date, and attempt to renew certificates at a configured time before expiry.

As standard, Kubernetes does not have any real knowledge of certificates or how to generate authoritative certificates, but cert-manager does. It creates several Custom Resource Definitions that pull everything together. To install cert-manager, I created a new cert-manager namespace and then created the CRDs with the following commands: –

kubectl create namespace cert-manager

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.1/cert-manager.crds.yaml

Note the version number. At the time I did this, 1.13.1 was the latest stable version.

This took a while as there are quite a few things it creates. Once it was complete, I issued the following command to see if there were any certificates in the store. This would have generated an error before the CRDs were created.

kubectl get certificate

Also, note that the cert-manager.io site has a couple of different pages with commands that create the CRDs. Initially, I ran the standard manifest command, but that led to the helm install failing with errors about duplicate CRDs – so it is important to run the command above and ensure that the helm-compatible CRD manifest is the one installed. You can also let the helm install create the CRDs while it installs everything else, but I created them first.
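A quick way to confirm the CRDs landed (and that there is only one copy of each) is to filter the CRD list: –

```shell
# List only the cert-manager CRDs - expect certificates, certificaterequests,
# issuers, clusterissuers, orders and challenges, each appearing once.
kubectl get crds | grep cert-manager.io
```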

The next thing to do was to create a ‘certmanager-values.yaml’ file. This defines how cert-manager and its pods behave.

installCRDs: false
replicaCount: 1
extraArgs:
  - --dns01-recursive-nameservers=1.1.1.1:53,8.8.8.8:53
  - --dns01-recursive-nameservers-only
podDnsPolicy: None
podDnsConfig:
  nameservers:
    - "1.1.1.1"
    - "8.8.8.8"

It took a while to determine these values and much trial and error.

The first line tells Helm not to install the CRDs – this was because I had already installed them. I left the replicaCount as 1, but it should be set at something like 3, as was the case above for the Traefik install.

Cert-manager checks that the correct DNS records exist before attempting to complete a DNS01 challenge. The extraArgs section tells cert-manager which recursive nameservers to use when checking that the *.intrasoftware.uk challenge records have propagated.

The podDnsPolicy / Config sections were needed to ensure the cert-manager pods only check externally and don’t use any of my existing internal DNS servers. I have my Proxmox servers and my clusters configured to use my internal DNS server – this allows me to override various *.intrasoftware.uk items internally, as I don’t host everything within my K8s clusters.
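As an aside, you can watch for the challenge record yourself by querying the same public resolvers cert-manager has been told to use. The _acme-challenge name is the one the DNS01 challenge uses, and the TXT record only exists while a challenge is in flight: –

```shell
# The webhook creates this TXT record during a DNS01 challenge; it should be
# visible on the public resolvers cert-manager checks, then disappear afterwards.
dig +short TXT _acme-challenge.intrasoftware.uk @1.1.1.1
dig +short TXT _acme-challenge.intrasoftware.uk @8.8.8.8
```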

To install cert-manager, I ran the following commands: –

helm repo add jetstack https://charts.jetstack.io
helm repo update

helm install cert-manager --namespace cert-manager --version v1.13.1 jetstack/cert-manager --values=certmanager-values.yaml

Again, note the version number. This command took quite a long time to run – long enough that I thought it was going to fail. However, after a few minutes, it completed without any errors.

I kept issuing the following command and checking the pods that were created. As I had set the replicaCount to 1, there was 1 cert-manager server pod and 2 support pods for a total of 3.

kubectl get pods -n cert-manager

Ionos Issuer Resolver

LetsEncrypt will only generate certificates for domains it confirms belong to you. This is to stop anyone from asking for a certificate for a domain that isn’t theirs – it makes sense. So, it is necessary to configure ‘Issuer’ cert-manager resources to tell it where it should go to confirm a domain belongs to you. My problem was that most examples I found were for things like Cloudflare, and my domains are held on Ionos.

A little Googling found the answer – a custom resolver specifically written for Ionos – result!

The chart is published at https://fabmade.github.io/cert-manager-webhook-ionos. To install it, I ran the following commands: –

helm repo add cert-manager-webhook-ionos https://fabmade.github.io/cert-manager-webhook-ionos
helm repo update
helm install cert-manager-webhook-ionos cert-manager-webhook-ionos/cert-manager-webhook-ionos -n cert-manager

This created the necessary things within the cert-manager namespace.

The next step was to create a couple of certificate issuers – one for staging/testing and another for production. LetsEncrypt rate-limits its production API, which means you must not repeatedly request certificates from it. If you do, you can be locked out for up to a week – so it is best to use the staging API whilst you are testing and only switch over to production once you know everything works. The staging API is there specifically for testing and has much more generous rate limits. However, staging certificates still make your browser complain that the certificate is not fully trusted – which at least confirms that a staging, and therefore externally provided, certificate is in place.

The two resolvers require a user API key to work. I had to head back to the Ionos Admin Centre and set up an API key. To do this, you have to activate the API on your account. At first, I thought I would be charged extra for this as it uses a sort of shopping cart application to do this, but the cost was £0, so this was great, and a few clicks later, my account had API access.

I clicked the relevant link buttons and created an API key. This gave me both a public and private API key. As these are effectively secret values specific to my domains, I created a secret file with the details in it: –

apiVersion: v1
stringData:
  IONOS_PUBLIC_PREFIX: 9ba3b95ad5314bxxxxxxxxxxxx
  IONOS_SECRET: 1LrARt7q9VwPIyaa3BwvLThKxt9gkLhTUV8QX8ItKBuPUtV0zpIc3zaaExoPxxxxxxxxxxxxxxxx
kind: Secret
metadata:
  name: ionos-secret
type: Opaque

Then I created two more files, one for the staging issuer and one for production: –

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-ionos-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: stf@finchett.com
    privateKeySecretRef:
      name: letsencrypt-ionos-staging-key
    solvers:
      - dns01:
          webhook:
            groupName: acme.fabmade.de
            solverName: ionos
            config:
              apiUrl: https://api.hosting.ionos.com/dns/v1
              publicKeySecretRef:
                key: IONOS_PUBLIC_PREFIX
                name: ionos-secret
              secretKeySecretRef:
                key: IONOS_SECRET
                name: ionos-secret

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-ionos-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: stf@finchett.com
    privateKeySecretRef:
      name: letsencrypt-ionos-prod
    solvers:
      - dns01:
          webhook:
            groupName: acme.fabmade.de
            solverName: ionos
            config:
              apiUrl: https://api.hosting.ionos.com/dns/v1
              publicKeySecretRef:
                key: IONOS_PUBLIC_PREFIX
                name: ionos-secret
              secretKeySecretRef:
                key: IONOS_SECRET
                name: ionos-secret

Other than the words ‘staging’ and ‘prod’, the only difference between these manifests is the server URL – one points at the LetsEncrypt staging API and the other at production.

Once these files were created, I ran the following commands to install things in the cluster:-

kubectl create -f ionos-secret.yaml
kubectl create -f ionos-issuer-staging.yaml
kubectl create -f ionos-issuer-production.yaml

Once those were created, the next things needed were the staging and production Certificate resources themselves. These manifests link to the issuers to have certificates generated by LetsEncrypt, downloaded into the cluster and made available for use within IngressRoutes, etc.

To do this, two more files were created for the certificates: –

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: intrasoftware-uk-staging
spec:
  dnsNames:
    - '*.intrasoftware.uk'
  issuerRef:
    name: letsencrypt-ionos-staging
  secretName: intrasoftware-uk-staging-tls

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: intrasoftware-uk
spec:
  dnsNames:
    - '*.intrasoftware.uk'
  issuerRef:
    name: letsencrypt-ionos-prod
  secretName: intrasoftware-uk-tls

The only differences are the names and the issuer references – one for staging and one for production. Again, these were created in the cluster with the following commands: –

kubectl create -f ionos-certificate-staging.yaml
kubectl create -f ionos-certificate-production.yaml

Once these certificate requests are made, they trigger cert-manager into action, which then triggers the issuers to get to work. It can take quite a few minutes for this to complete and for the certificates to be created and downloaded. You can issue the following command to see if the certificates have appeared: –

kubectl get certificate

As this does take some time, I looked at the logs in the cert-manager server pod. Initially, I noticed that the issuer could not pick up the secret I had created. This was because I had created the secret in the cert-manager namespace, but the certificates were in the default namespace. So, I moved the secret and bingo, the issuer communicated with Ionos and LetsEncrypt, the staging certificate was downloaded, and it moved on to the production certificate. A couple of minutes later, I had both certificates, as shown below: –
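For reference, these are the sort of commands involved – following the controller logs while the challenges run, then re-creating the secret in the right namespace. Since I still had the ionos-secret.yaml manifest, ‘moving’ the secret was just a delete and re-create: –

```shell
# Watch the cert-manager controller while the challenges run
kubectl logs -n cert-manager deploy/cert-manager -f

# The secret needs to live in the same namespace as the Issuer and Certificate
# resources, which in my case was default rather than cert-manager
kubectl delete secret ionos-secret -n cert-manager
kubectl create -f ionos-secret.yaml -n default
```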

kubectl get certificate
NAME                                READY   SECRET                             AGE
intrasoftware-uk                    True    intrasoftware-uk-tls               30h
intrasoftware-uk-staging            True    intrasoftware-uk-staging-tls       30h

The last step was to update my IngressRoutes for each of my deployments. Fortunately, that is nice and easy: –

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefikdashboard
spec:
  entryPoints:
    - websecure
  routes:   
    - match: Host(`traefik.intrasoftware.uk`)
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService
  tls:
    secretName: intrasoftware-uk-staging-tls

The only thing that needed to be added was the TLS section at the bottom. This time, the https://traefik.intrasoftware.uk website opened, and the browser still complained that the certificate was not trusted – but clicking on the certificate showed the issuer was now the LetsEncrypt staging CA, not Traefik. This was good – everything was wired up correctly – so I then updated the IngressRoute to remove ‘-staging’ from the secret name and redeployed it with: –

kubectl delete ingressroute traefikdashboard
kubectl create -f dashboard.yaml

This time, when I accessed the URL, I had a nice padlock in the browser with no warning messages. The browser was happy, and I was happy. When I looked at the certificate, it was a fully-fledged, fully validated, publicly valid certificate for the right external domain.
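The same check can be done from the command line – the issuer line should now show Let’s Encrypt rather than the Traefik default: –

```shell
# Print the issuer and validity dates of the certificate Traefik serves on 443.
# -servername ensures SNI picks the right certificate.
echo | openssl s_client -connect traefik.intrasoftware.uk:443 \
    -servername traefik.intrasoftware.uk 2>/dev/null \
  | openssl x509 -noout -issuer -dates
```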

Overall, this was quite a journey, but I now have valid SSL certificates on all my Kubernetes services.

Stephen

Hi, my name is Stephen Finchett. I have been a software engineer for over 30 years and worked on complex, business critical, multi-user systems for all of my career. For the last 15 years, I have been concentrating on web based solutions using the Microsoft Stack including ASP.Net, C#, TypeScript, SQL Server and running everything at scale within Kubernetes.