
Policy Controller

Objective

Install the Kyverno policy engine, which enables cluster-wide rules for automatically generating, mutating, and validating Kubernetes resources.

Background

Kyverno is a CNCF project which allows us to add many features to vanilla Kubernetes:

  • Native integration with the pull through cache
  • Support for arm64 and spot instances
  • Support for our bin-packing pod scheduler
  • Descriptive labels and automatically injected reflexive environment variables for pods
  • Production-ready security and uptime hardening for your workloads
  • and much more.

Additionally, Kyverno allows you to deploy your own policies that control what and how resources are deployed to your clusters.

Kyverno works by installing admission webhooks that allow it to intercept, mutate, and even reject Kubernetes manifests before they are applied to the cluster. Kyverno's behavior is configured through Kyverno Policies.

For a full architectural breakdown, see their excellent documentation.

Deploy Kyverno

We provide a module for deploying Kyverno, kube_kyverno.

Let's deploy it now:

  1. Create a new directory adjacent to your aws_eks module called kube_kyverno.

  2. Add a terragrunt.hcl to that directory that looks like this (a rough sketch of the typical shape is shown after this list).

  3. Run pf-tf-init to enable the required providers.

  4. Run terragrunt apply.

  5. If the deployment succeeds, you should see the various Kyverno pods running:

    Kyverno pods running
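
If you do not have the reference handy, a terragrunt.hcl for kube_kyverno generally follows the standard Panfactum shape sketched below. Treat this as a minimal sketch rather than the definitive file: the include path, the module source expression, and the empty inputs block are assumptions based on the common Panfactum module layout, so defer to the reference linked in step 2 for the exact contents.

  # terragrunt.hcl for kube_kyverno (illustrative sketch only)

  include "panfactum" {
    # Shared Panfactum terragrunt settings; the file name is the conventional default (assumed)
    path   = find_in_parent_folders("panfactum.hcl")
    expose = true
  }

  terraform {
    # Points at the kube_kyverno module in the Panfactum stack release pinned for this environment (assumed)
    source = include.panfactum.locals.pf_stack_source
  }

  inputs = {
    # Most clusters need no extra inputs here; add overrides only if required
  }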

Deploy Panfactum Policies

While kube_kyverno installs Kyverno itself, it does not deploy any policies by default. To load in our default Kyverno Policies, we provide a module called kube_policies.

Let's deploy the policies now:

  1. Create a new directory adjacent to your kube_kyverno module called kube_policies.

  2. Add a terragrunt.hcl to that directory that looks like this (a sketch is shown after the verification steps below).

  3. Run pf-tf-init to enable the required providers.

  4. Run terragrunt apply.

  5. You can verify that the policies are working as follows:

    1. Examine the containers for any pod in the cluster (press enter when selecting the pod in k9s):

      Containers before pull-through cache policy enabled

      Notice that the image is being pulled directly from the GitHub container registry (ghcr.io) rather than from the pull-through cache.

    2. Delete the pod you just inspected (ctrl+d when selecting the pod in k9s).

    3. Examine the containers for the new pod that gets created to take its place:

      Containers after pull-through cache policy enabled

      Notice that the image is now being pulled from your ECR pull-through cache. This occurred because our Kyverno policies dynamically replaced the images for the pod when it was created.
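
The terragrunt.hcl for kube_policies referenced in step 2 above follows the same general shape as the kube_kyverno one. The sketch below also declares a dependency on the kube_kyverno directory so that the policies are only applied after Kyverno itself is installed; the dependency block and its attributes are illustrative assumptions, not a verbatim copy of the reference file.

  # terragrunt.hcl for kube_policies (illustrative sketch only)

  include "panfactum" {
    path   = find_in_parent_folders("panfactum.hcl")
    expose = true
  }

  terraform {
    source = include.panfactum.locals.pf_stack_source
  }

  # Make sure Kyverno is deployed before the policies that rely on it (assumed ordering)
  dependency "kyverno" {
    config_path  = "../kube_kyverno"
    skip_outputs = true
  }

  inputs = {}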

Run Network Tests

Now that both networking and the policy controller are installed, let's verify that everything is working as intended. The easiest approach is to perform a battery of network tests against the cluster to ensure that pods can both launch successfully and communicate with one another.

Cilium comes with a companion CLI tool that is bundled with the Panfactum devShell. We will use it to verify that Cilium is working as intended:

  1. Run cilium connectivity test --test '!pod-to-pod-encryption'.¹ ²

  2. Wait about 20-30 minutes for the test to complete.

  3. If everything completes successfully, you should receive a message like this:

✅ All 46 tests (472 actions) successful, 18 tests skipped, 0 scenarios skipped.
  4. Unfortunately, the test does not clean up after itself. You should run kubectl delete ns cilium-test to remove the test resources.

Next Steps

Now that the policy engine and basic policies are deployed, let's deploy storage controllers to allow your pods to utilize storage.


Footnotes

  1. Skipping the pod-to-pod-encryption test is required due to this issue.

  2. If you receive an error like Unable to detect Cilium version, assuming vX.X.X for connectivity tests: unable to parse cilium version on pod, that means you tried to run the test before all of the cilium-xxxxx pods were ready. Wait for all of the cilium-xxxxx pods to reach a ready state and then try again.