
Installing BuildKit

Objective

Install BuildKit and build a container image.

Background

BuildKit is a server that efficiently executes container image build instructions (often via a format such as Dockerfiles) and outputs build artifacts into image registries.

The organization that maintains BuildKit, Moby, grew out of Docker in 2017. Today, BuildKit is ubiquitous. It is embedded in the Docker Engine and as a result has become the most popular way to build container images. Moreover, it is used as a core component in many popular projects such as Dagger, Depot and Gitpod.

In the Panfactum stack, we deploy BuildKit into a Kubernetes cluster as a standalone set of servers that can be used by CI / CD pipelines and local developers alike. Centralizing the build processes provides many benefits:

  • The build cache is shared across all users allowing for 10-100x faster build times.
  • Multi-platform container images become trivial to create, allowing developers to easily generate images that run on both amd64 and arm64 architectures.
  • Developers do not each need a powerful machine to build images, as teams can rely on the elastic capacity of the cloud.

Prepare for Installation

Before you install BuildKit, you must first decide where it should run.

In general, you should aim to achieve the following architecture: [1]

BuildKit deployment architecture
  1. BuildKit should run in the lowest environment for the following reasons:

    • All developers should have direct access to the build servers as they will use them in their daily development workflow.
    • Builds are bursty workloads that may cause disruption to workloads running on the same node. For example, they might saturate the network bandwidth by downloading many dependencies all at once.
    • BuildKit runs arbitrary code [2] with elevated privileges, as build processes are given more permissions than runtime processes. If code does escape the build sandbox, it should not be able to access your sensitive systems.
  2. BuildKit runs inside a Kubernetes cluster. If you have multiple clusters in your environment, you only need to deploy BuildKit to one.

  3. BuildKit should push generated images to an Elastic Container Registry (ECR) in the same environment where it itself is hosted so that it does not need to be granted cross-environment permissions.

  4. All environments will have permission to pull images from this registry, as every build should only be done once (not per-environment). This ensures that container behavior will not change as code is promoted across environments. Additionally, images should be configured as immutable so that they cannot be tampered with after creation.

Once you have chosen the cluster you will deploy BuildKit to, you are ready for the installation.

Install the Module

We provide a module, kube_buildkit, for deploying the BuildKit servers.

Let's deploy it now:

  1. In the region with the cluster you wish to deploy BuildKit to, add a new directory called kube_buildkit.

  2. Add a terragrunt.hcl that looks like this.

  3. Run pf-tf-init to enable the required providers.

  4. Run terragrunt apply.

Note that the BuildKit system deployed by this module is horizontally autoscaled to dynamically adjust to your capacity needs. Additionally, it will scale to zero when not in use.

By default, the builders are sized at 2 vCPU / 8 GB RAM, but you can adjust these settings via the module inputs.

Deploy ECR Repositories

We recommend pushing new container images to AWS Elastic Container Registry (ECR) as this is the easiest way to consume images from within your Kubernetes clusters.

We provide a module for configuring ECR with your image repositories: aws_ecr_repos. [3]

Note that a registry is the domain name (e.g., 5555555555.dkr.ecr.us-east-2.amazonaws.com), a repository is a specific namespace for images in the registry (e.g., my_image), and a tag is an arbitrary identifier for a single image in the repository. Together these form a unique image identifier (e.g., 5555555555.dkr.ecr.us-east-2.amazonaws.com/my_image:some_tag), which conveniently also specifies where to push and pull the image from.
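To make the registry / repository / tag split concrete, here is a minimal Python sketch that takes an identifier of the form shown above and separates it into its three parts. The ImageRef type and parse_image_ref function are hypothetical names invented for this illustration; they are not part of the Panfactum tooling. The sketch assumes the simple `<registry>/<repository>:<tag>` shape and does not handle digest references (`@sha256:...`) or identifiers without a tag.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    """Hypothetical container for the three parts of an image identifier."""

    registry: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split '<registry>/<repository>:<tag>' into its three parts."""
    # The registry (a domain name) is everything before the first slash.
    registry, sep, remainder = ref.partition("/")
    if not sep or not registry or not remainder:
        raise ValueError(f"expected '<registry>/<repository>:<tag>', got {ref!r}")
    # The tag follows the last colon. Splitting from the right keeps any
    # slashes inside the repository name intact.
    repository, sep, tag = remainder.rpartition(":")
    if not sep or not repository or not tag:
        raise ValueError(f"missing tag in {ref!r}")
    return ImageRef(registry, repository, tag)


ref = parse_image_ref("5555555555.dkr.ecr.us-east-2.amazonaws.com/my_image:some_tag")
print(ref.registry)
print(ref.repository)
print(ref.tag)
```

Because the registry is a domain name, the same string tells a client both which server to contact and which image to request from it.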

Let's set up a test image repository, so we can demonstrate how to build images with BuildKit:

  1. Adjacent to your kube_buildkit directory, add a new directory called aws_ecr_repos.

  2. Add a terragrunt.hcl that looks like this.

  3. For now, set ecr_repositories to { test = {} }. This creates a test repository that you can push images to. You can add additional repositories later.

  4. Run pf-tf-init to enable the required providers.

  5. Run terragrunt apply.

Configuration Files

To connect your local tooling to both the BuildKit servers and your ECR repositories, you must add some additional configuration to your repository. Specifically, you must create a <buildkit_dir>/config.yaml (reference docs).

Let's do this now:

  1. You should already have a buildkit_dir directory in your repository (defaults to .buildkit). In it, you should see a config.example.yaml. Copy that file to config.yaml.

  2. Update module to the path of the kube_buildkit module you deployed earlier.

  3. Update bastion to the name of the bastion deployed in the Kubernetes cluster running BuildKit. See your <ssh_dir>/config.yaml.

  4. Run pf-update-buildkit --build to build the generated configuration files that will be used to connect to BuildKit and ECR.

Build an Image

Now anyone on your team can use the remote BuildKit servers to build and push multi-platform container images right from their local machine.

Let's test this out.

  1. Create a simple Dockerfile (Dockerfile) in the root of your repository:

    FROM ubuntu/nginx
    
  2. Create an adjacent .dockerignore (.dockerignore). The .dockerignore file follows a syntax similar to .gitignore, and ignored files will not be sent to BuildKit in the build context.

    Setting this up correctly is critically important. Removing unnecessary files from the build context prevents local secrets from accidentally being included in your builds and also improves overall build performance, especially since the build context must be transferred over the network to BuildKit.

    For these reasons, we recommend an ignore-by-default approach:

    # Ignore-by-default
    *
    # Example of including a specific file or directory
    !some/path
    
  3. We provide a utility pf-buildkit-build that will automatically submit your builds to BuildKit. Run it now from the root of your repository: pf-buildkit-build --repo=test --tag=test --context=. --file=./Dockerfile.

    This command automatically takes care of turning on BuildKit (if necessary), selecting the instance with the most free capacity, establishing a secure remote network tunnel, submitting builds for both arm64 and amd64 platform architectures, setting up the cache, pushing to ECR, and generating the multi-platform image manifest.

    If all goes well, you should see output that looks like this:

    statefulset.apps/buildkit-arm64 scaled
    statefulset.apps/buildkit-amd64 scaled
    arm64: Waiting 599 seconds for at least one BuildKit replica to become available...
    amd64: Waiting 599 seconds for at least one BuildKit replica to become available...
    arm64: Waiting 589 seconds for at least one BuildKit replica to become available...
    amd64: #1 [internal] load build definition from ./Dockerfile
    amd64: #1 transferring dockerfile: 29B 0.1s
    arm64: #1 [internal] load build definition from ./Dockerfile
    arm64: #1 transferring dockerfile: 54B 0.2s done
    amd64: #1 transferring dockerfile: 54B 0.2s done
    arm64: #1 DONE 0.2s
    arm64:
    arm64: #2 [internal] load metadata for docker.io/ubuntu/nginx:latest
    amd64: #1 DONE 0.2s
    amd64:
    amd64: #2 [internal] load metadata for docker.io/ubuntu/nginx:latest
    arm64: #2 DONE 0.4s
    amd64: #2 DONE 0.4s
    ...
    amd64: #6 exporting to image
    amd64: #6 pushing layers 1.1s done
    amd64: #6 pushing manifest for 891377197483.dkr.ecr.us-east-2.amazonaws.com/test:test-amd64@sha256:b6e2ca2f347b515a158fdb18bd1cb2f4c7b69ac5f9be9fa037bff62df2b7ffd1
    arm64: #6 pushing manifest for 891377197483.dkr.ecr.us-east-2.amazonaws.com/test:test-arm64@sha256:b501fac8d78b9db276549bab5cc38784e0bfd7758062741f0716abbbbd690740 0.1s done
    arm64: #6 DONE 1.1s
    arm64:
    arm64: #8 exporting cache to Amazon S3
    arm64: #8 preparing build cache for export 0.1s done
    arm64: #8 DONE 0.1s
    amd64: #6 pushing manifest for 891377197483.dkr.ecr.us-east-2.amazonaws.com/test:test-amd64@sha256:b6e2ca2f347b515a158fdb18bd1cb2f4c7b69ac5f9be9fa037bff62df2b7ffd1 0.1s done
    amd64: #6 DONE 1.1s
    amd64:
    amd64: #8 exporting cache to Amazon S3
    amd64: #8 preparing build cache for export 0.1s done
    amd64: #8 DONE 0.1s
    Digest: sha256:3e2cb9822cabdb580c825d5bf3ec53e52df9d18871f1a0a1086f17d57b01f020 710
    Closing build processes...
    
  4. Log into the AWS web console and navigate to ECR. Select the "test" repository under the Private Registry section. Your image should now be available.

    Image is available in ECR

    Notice there are three entries: one for each system architecture (amd64 and arm64) and the multi-platform manifest that is labeled as an Image Index.

    The multi-platform manifest allows you to use a single tag (in this case test), and the right image (either test-amd64 or test-arm64) for your system architecture will automatically be pulled from the registry. This unlocks the ability to create a single deployment that can run on either amd64 or arm64 nodes.

  5. You can now pull and run this image locally as well: podman run --rm -it 891377197483.dkr.ecr.us-east-2.amazonaws.com/test:test. Replace 891377197483.dkr.ecr.us-east-2.amazonaws.com with your registry name (click Copy URI).

    If you run this command from inside the repository, our devShell will automatically take care of registry authentication for you.
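The ignore-by-default .dockerignore recommended in step 2 relies on two ideas: later patterns override earlier ones, and a `!` prefix re-includes a path. The Python sketch below approximates that behavior so you can reason about which files would reach the build context. It is a simplified illustration, not the matcher BuildKit actually uses: is_ignored is a hypothetical helper, and Python's fnmatch differs from the real pattern rules in some details (for example, it has no special handling of `**`).

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if `path` would be left out of the build context.

    Simplified model: the last pattern that matches the path, or one of
    its parent directories, decides the outcome.
    """
    parts = path.split("/")
    # Candidate strings: each parent directory, then the path itself.
    candidates = ["/".join(parts[: i + 1]) for i in range(len(parts))]
    ignored = False
    for raw in patterns:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        negated = line.startswith("!")
        pattern = line[1:] if negated else line
        if any(fnmatchcase(candidate, pattern) for candidate in candidates):
            # A '!' pattern re-includes the path; any other pattern ignores it.
            ignored = not negated
    return ignored


dockerignore = [
    "# Ignore-by-default",
    "*",
    "# Example of including a specific file or directory",
    "!some/path",
]

for candidate_path in ["secrets.env", "some/other.txt", "some/path/app.py"]:
    print(candidate_path, "ignored" if is_ignored(candidate_path, dockerignore) else "included")
```

With the patterns from step 2, only files under some/path are reported as included; everything else, including a stray secrets file at the repository root, stays out of the build context.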

Next Steps

BuildKit is now ready for your organization to use. We recommend taking a look at the following guides for putting it to use:

  • Core techniques for building images

  • Building images in CI pipelines (TODO)

  • Using BuildKit in local development (TODO)

Footnotes

  1. A more detailed architectural overview can be found here.

  2. Whatever you supply in your Dockerfiles.

  3. We also supply aws_ecr_public_repos for deploying public repositories.