Certificate Management
Objective
Deploy the foundational components for managing X.509 certificate infrastructure.
Background
In this and the following sections we will be deploying various systems for managing your network's cryptography. For a detailed discussion of the fundamental concepts and motivations, we strongly recommend reviewing our concept docs on network cryptography for the relevant background information.
Deploy cert-manager
cert-manager is the de facto standard tool for managing X.509 certificates in Kubernetes. It handles the entire certificate lifecycle: provisioning, deployment, and renewal. It works with both public certificate authorities (e.g., Let's Encrypt) and private CAs (e.g., Vault).
We provide a module for deploying cert-manager: kube_cert_manager.
Let's deploy it now:
- Create a new directory adjacent to your kube_vault module called kube_cert_manager.
- Add a terragrunt.hcl to that directory that looks like this.
- Set vpa_enabled to false. We will enable it when we deploy the autoscalers.
- Set self_generated_certs_enabled to true. We will disable insecure self-signed certificates after we set up certificate issuers in the next section.
- Add a module.yaml that enables the aws, kubernetes, helm, and random providers.
- Run terragrunt apply.
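Before applying, it can help to confirm that the new module directory contains both files the steps above ask for. The following is a minimal sketch, not part of the Panfactum tooling: the function name module_ready is hypothetical, and the usage comment assumes you are in the directory that contains kube_vault.

```shell
# Hypothetical helper (not provided by Panfactum): succeed only if the given
# module directory contains both files this guide asks you to create.
module_ready() {
  local dir="$1" file
  for file in terragrunt.hcl module.yaml; do
    if [ ! -f "$dir/$file" ]; then
      echo "missing $dir/$file" >&2
      return 1
    fi
  done
}

# Example usage (assumes the current directory contains kube_vault):
#   mkdir -p kube_cert_manager
#   ...create terragrunt.hcl and module.yaml as described above...
#   module_ready kube_cert_manager && (cd kube_cert_manager && terragrunt apply)
```

The same check applies unchanged to the other modules deployed in this section.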
You should now see the cert-manager pods running:
As you might notice, cert-manager comes with three deployments:
- controller (unlabeled): The primary system for managing the certificate lifecycle
- cainjector: Injects CA data into webhooks to facilitate mTLS
- webhook: Validates and mutates cert-manager CRDs submitted to the Kubernetes API server
Additionally, notice that several CustomResourceDefinitions (CRDs) were installed in the cluster:
These CRDs extend the Kubernetes API so that we can create new types of resources, such as Certificates, which cert-manager will handle.
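If you prefer a terminal to k9s, the pods and CRDs mentioned above can also be listed with kubectl. This is a sketch that assumes kubectl is already configured for the cluster; the helper name only_cert_manager_crds is hypothetical. It relies on the fact that cert-manager's CRDs live in API groups ending in cert-manager.io.

```shell
# Hypothetical helper: keep only CRD names whose API group ends in
# ".cert-manager.io" (for example certificates.cert-manager.io).
only_cert_manager_crds() {
  grep -E '\.cert-manager\.io$'
}

# Against a live cluster (assumes kubectl is configured for it):
#   kubectl get pods -n cert-manager
#   kubectl get crds -o name | only_cert_manager_crds
```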
Deploy Issuers
Creating certificates using cert-manager is a two-step process:
- Create an "issuer" which instructs cert-manager on how to create a certificate.
- Create the certificate by indicating which issuer to use.
You can read more about this process in the cert-manager documentation here.
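To make the second step concrete, here is a minimal sketch that prints a cert-manager Certificate manifest pointing at an issuer. The function name render_certificate and every argument value in the usage comment are hypothetical placeholders; the kind ClusterIssuer is also an assumption (cert-manager supports namespaced Issuers as well), so adjust it to match the issuers in your cluster.

```shell
# Print a minimal Certificate manifest.
# Arguments: <certificate name> <namespace> <issuer name> <dns name>
# The issuer kind below is an assumption; use "Issuer" for a namespaced issuer.
render_certificate() {
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: $1
  namespace: $2
spec:
  secretName: $1-tls
  dnsNames:
    - $4
  issuerRef:
    kind: ClusterIssuer
    name: $3
EOF
}

# Example (placeholder values; requires a configured kubectl and an existing issuer):
#   render_certificate example-cert default example-issuer example.internal | kubectl apply -f -
```

cert-manager then writes the issued certificate and key into the Secret named by secretName.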
We provide a module to set up the issuers required by the Panfactum stack: kube_cert_issuers.
This module creates three different types of issuers:
- An issuer for public certificates signed by Let's Encrypt, used by your public services
- An issuer for internal certificates signed by your Vault cluster [1], used to secure internal network traffic such as webhook traffic
- An issuer for internal intermediate CA certificates signed by your Vault cluster, for use by tools like the service mesh (next section)
Let's deploy it now:
- Create a new directory adjacent to your kube_cert_manager module called kube_cert_issuers.
- Add a terragrunt.hcl to that directory that looks like this.
- Note that the alert_email will receive notifications if your certificates fail to renew. Set it to an email address that is actively monitored to prevent unexpected service disruptions.
- Let's walk through the route53_zones input:
  - When creating certificates for TLS connections to particular domains, cert-manager must be able to create DNS records as part of Let's Encrypt's DNS challenge flow. That means cert-manager needs access to the AWS Route53 zones that host the DNS records for the cluster's environment.
  - In the prior DNS guide, we configured DNS in the aws_registered_domains and aws_delegated_zones modules under production/global. If you are setting up an environment other than production, we cannot reference them in terragrunt dependency blocks. Instead, we must copy the necessary values manually from the module outputs. [2]
  - To retrieve the outputs, change your directory to the desired modules and run terragrunt output. You will see an output that looks like this:
  - The record_manager_role_arn defines the IAM role that can create records in every zone created by the module. The zone_id is the unique internal AWS identifier for the Route53 zone hosting records for the domain.
- Add a module.yaml that enables the aws, vault, and kubernetes providers.
- Run terragrunt apply.
Deploy the First Certificate
Let's ensure that the certificate infrastructure is working as expected.
As context, many projects come with a mechanism to generate their own certificates for ease-of-use. However, in a production setting these self-generated certificates are not ideal:
- They may contain insecure ciphers
- You cannot control their rotation frequency
- They are signed by private keys that can be easily extracted from the cluster
- They are not visible in our centralized observability tooling
In the Panfactum stack, we ensure that all internal utility components use the centrally managed Vault certificates.
Let's replace the self-generated cert-manager webhook certificates [3] with ones generated by Vault:
- Return to the kube_cert_manager module.
- Update self_generated_certs_enabled to false. This will instruct the module to instead use the internal certificate issuer created by kube_cert_issuers.
- Run terragrunt apply.
- After a few moments, the module should successfully update. Using k9s, you should now be able to find your first Certificate resource (:certificate):
- A certificate is an abstract resource that ultimately results in certificate data stored in a corresponding secret:
  Notice that there are three pieces of data in the secret:
  - tls.key: The private key for the certificate
  - tls.crt: The actual public X.509 certificate
  - ca.crt: The public certificate of the certificate authority that signed the tls.crt
- Note that you can decode secrets in k9s by pressing x:
  You can copy the secret data by pressing c.
- Copy the tls.crt certificate to a decoder to view the metadata: [4]
- cert-manager has a companion CLI called cmctl to aid in managing certificates. The Panfactum devenv already bundles it. Let's manually rotate the certificate a few times to ensure the certificate rotation infrastructure is working as intended: cmctl -n cert-manager renew cert-manager-webhook-certs. [5]
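The inspection and rotation check above can also be sketched from a terminal instead of k9s and a web decoder. This assumes kubectl and openssl are installed and that your base64 accepts -d (as GNU coreutils does). The helper names decode_cert and webhook_cert_pem are hypothetical. The namespace and Certificate name are taken from the cmctl command above; the Secret name is read from the Certificate's spec.secretName rather than guessed.

```shell
# Print the subject, issuer, and validity window of a PEM certificate on stdin.
decode_cert() {
  openssl x509 -noout -subject -issuer -dates
}

# Hypothetical helper: print the PEM-encoded tls.crt from the Secret that backs
# the webhook Certificate. Assumes kubectl is configured for the cluster.
webhook_cert_pem() {
  local ns="cert-manager" cert="cert-manager-webhook-certs" secret
  secret="$(kubectl get certificate -n "$ns" "$cert" -o jsonpath='{.spec.secretName}')"
  kubectl get secret -n "$ns" "$secret" -o jsonpath='{.data.tls\.crt}' | base64 -d
}

# Against a live cluster:
#   webhook_cert_pem | decode_cert
#   cmctl -n cert-manager renew cert-manager-webhook-certs
#   webhook_cert_pem | decode_cert   # the validity dates should have changed
```

Comparing the validity dates before and after the renewal is a simple way to confirm that a new certificate was actually issued.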
Congratulations! We have just verified that the internal certificate provisioning process works as intended. We will test public certificates when we deploy our ingress system in a future guide section.
Deploy trust-manager
There is one final component to the certificate infrastructure: trust-manager.
trust-manager is an ancillary cert-manager utility that copies CA trust bundles into every cluster namespace so that they can be consumed by cluster workloads. Because the certificates of our private CA (Vault) are not installed in every operating system by default, this provides a convenient way to make them available.
We provide a module for deploying trust-manager: kube_trust_manager.
Let's deploy it now:
- Create a new directory adjacent to your kube_cert_manager module called kube_trust_manager.
- Add a terragrunt.hcl to that directory that looks like this.
- Set vpa_enabled to false. We will enable it when we deploy the autoscalers.
- Add a module.yaml that enables the aws, kubernetes, and helm providers.
- Run terragrunt apply.
Note that unlike most modules, which receive their own namespace, trust-manager will be deployed to the cert-manager namespace. [6]
Next Steps
Now that internal certificate management is working, we will build upon that foundation to deploy a service mesh which will ensure that all network traffic in the cluster is secured with mTLS.
Footnotes
[1] This module also configures Vault to act as a private CA.
[2] This is because cross-environment dependencies will break the CI permission model where each CI runner only gets access to a single environment at a time.
[3] Many projects that extend Kubernetes include webhook servers that register themselves with the Kubernetes API server to validate or mutate incoming Kubernetes manifests. Read more about how that works here. These webhooks require mTLS to work.
[4] Recall that the certificate is public information, so it is safe to share.
[5] Certificates will automatically rotate as they near expiration, so this is just for testing purposes.
[6] This is a requirement of trust-manager.