Writing good Dockerfiles in 2023
Introduction
Recently I bought an ARM VM at Hetzner for my Kubernetes cluster. While these are cheap, I quickly discovered how neglected ARM images actually are.
To give you a quick rundown:
- Many projects don't offer ARM images
- People hardcode their registries in the build scripts, making out-of-tree builds harder than needed
- People write Dockerfiles that are far too complex (please trust your Docker. It sets a lot of useful things by default!)
- People don't sign things
- Docker and Podman both handle multiarch images somewhat oddly
  - Docker generates two dead references in the manifest that point to nothing
  - Podman needs you to manage the manifest somewhat manually
You may now rightfully ask how to actually do this properly.
Writing a good Dockerfile
A good Dockerfile is actually simple. You can find more specific examples if you search for them, but here are some good guidelines to start from:
- Make use of multistage Dockerfiles.
  - Docker images will be smaller
  - Caching can be more efficient in some cases
  - Your CI may be faster since there is less to upload
- Don't require the .git folder to exist. People may run your script in all kinds of scenarios, for example with your repository checked out as a submodule.
- Don't hide the build process behind a HUGE Makefile
  - This decreases both readability and maintainability
  - It also reduces flexibility
- Only copy the files you need into the Docker image
  - Use a .dockerignore file
- Set a UID and GID for a rootless Docker image
- Try to be minimal
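To make the guidelines above a bit more concrete, here is a rough sketch that writes a multistage Dockerfile and a matching .dockerignore into a scratch directory. Everything specific in it is an assumption for illustration only: the Go toolchain, the `golang:1.21` and `alpine:3.18` base images, the `cmd/` and `internal/` layout, the binary name `app`, and the UID/GID `10001` are hypothetical and not taken from any particular project.

```shell
#!/bin/sh
set -eu

# Scratch directory so the sketch never overwrites an existing Dockerfile.
demo_dir="$(mktemp -d)"

# Multistage Dockerfile: the build stage carries the toolchain, the final
# stage only receives the compiled binary.
cat > "$demo_dir/Dockerfile" <<'EOF'
# Build stage (hypothetical Go project; swap in your own toolchain)
FROM golang:1.21 AS build
WORKDIR /src
# Copy only what the build needs, dependency manifests first for caching
COPY go.mod go.sum ./
RUN go mod download
COPY cmd/ ./cmd/
COPY internal/ ./internal/
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

# Runtime stage: minimal base image, no compiler, no sources
FROM alpine:3.18
COPY --from=build /out/app /usr/local/bin/app
# Fixed numeric UID:GID so the container does not run as root
USER 10001:10001
ENTRYPOINT ["/usr/local/bin/app"]
EOF

# Keep the build context small and independent of the .git folder.
cat > "$demo_dir/.dockerignore" <<'EOF'
.git
.github
*.md
EOF

echo "wrote $demo_dir/Dockerfile and $demo_dir/.dockerignore"
```

Note that nothing in the build reads from .git, and there is no Makefile in between: the build steps are visible right in the Dockerfile.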
Writing a good Docker CI on GitHub Actions
For most people this is the easier part, and yet people often seem to explicitly disable ARM builds for no reason.
An example of a good GitHub Actions workflow would be:
    name: Publish Synapse Docker image
    on:
      push:
    jobs:
      push_to_registries:
        permissions:
          contents: read
          packages: write
          # Needed so cosign can request a GitHub OIDC token for signing
          id-token: write
        name: Push synapse image to repo
        runs-on: ubuntu-latest
        steps:
          - name: Check out the repo
            uses: actions/checkout@v3
            with:
              submodules: "recursive"
          - name: Log in to Docker Hub
            uses: docker/login-action@v2
            with:
              username: ${{ secrets.DOCKER_USERNAME }}
              password: ${{ secrets.DOCKER_PASSWORD }}
          - name: Install Cosign
            uses: sigstore/cosign-installer@v3
          # QEMU emulates arm64 on the amd64 runner, Buildx makes use of it
          - name: Set up QEMU
            uses: docker/setup-qemu-action@v2
          - name: Set up Docker Buildx
            uses: docker/setup-buildx-action@v2
          - name: Docker meta
            id: docker_meta
            uses: docker/metadata-action@v4
            with:
              tags: type=sha,format=long
          - name: Build and push
            id: build_and_push
            uses: docker/build-push-action@v4
            with:
              push: true
              context: .
              sbom: true
              provenance: true
              # Listing both platforms makes the result a multiarch image
              platforms: linux/amd64,linux/arm64
              tags: ${{ steps.docker_meta.outputs.tags }}
          - name: Sign the images with GitHub OIDC Token
            env:
              DIGEST: ${{ steps.build_and_push.outputs.digest }}
              TAGS: ${{ steps.docker_meta.outputs.tags }}
            run: cosign sign --yes "${TAGS}@${DIGEST}"
But what does this actually do, you may ask. It's actually quite simple: first it clones the repo with all its submodules, then it logs in to Docker Hub. After these two basic steps, we install all the required dependencies.
First we install Cosign[1] which we later need to be able to sign our image.
Then we install two components required for a multiarch build. Since we are usually using amd64 runners, QEMU is required to emulate arm64. The Buildx setup is then needed to make use of QEMU in the build.
After installing our dependencies, we can generate the metadata. You can look at its GitHub repo[2] for more details on how to use it. The important part here is the tags section, since this format is needed for cosign.
The metadata can then be used to build the image itself and push it to the registry. The most notable thing about this step is that we define both arm64 and amd64 as platforms. This makes the resulting image a multiarch image.
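If you want to try the same multiarch build outside of CI, the QEMU, Buildx and build steps roughly correspond to the shell function below. This is only a sketch under a few assumptions: the image name `docker.io/example/synapse:dev` and the builder name `multiarch` are made up, and you need to be logged in to your registry already.

```shell
#!/bin/sh
set -eu

# Hypothetical image reference; replace with your own registry/repository.
IMAGE="${IMAGE:-docker.io/example/synapse:dev}"

build_multiarch() {
    # Register QEMU emulation so an amd64 host can run arm64 build steps.
    docker run --privileged --rm tonistiigi/binfmt --install arm64
    # Create the builder (hypothetical name "multiarch") only if it is missing.
    docker buildx inspect multiarch >/dev/null 2>&1 \
        || docker buildx create --name multiarch
    docker buildx use multiarch
    # Build both platforms and push them as a single multiarch image.
    docker buildx build \
        --platform linux/amd64,linux/arm64 \
        --tag "$IMAGE" \
        --push .
}
```

Call `build_multiarch` from the directory that contains your Dockerfile. Expect the arm64 half to be noticeably slower than the amd64 half, since it runs under emulation.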
As a last step, we sign the image using the workflow's GitHub OIDC token. That way, people can make sure they are actually using the image from the right person.
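On the consuming side, checking such a signature could look roughly like the following. Treat it as a sketch: the image reference and the repository URL `https://github.com/example/synapse` are placeholders I made up, and the identity pattern assumes the image was signed from a GitHub Actions workflow in that repository.

```shell
#!/bin/sh
set -eu

# Hypothetical values; substitute the real image and source repository.
IMAGE="${IMAGE:-docker.io/example/synapse:dev}"
REPO_URL="${REPO_URL:-https://github.com/example/synapse}"

verify_image() {
    # Keyless verification: the signing certificate must come from GitHub's
    # OIDC issuer and name a workflow inside the expected repository.
    cosign verify \
        --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
        --certificate-identity-regexp "^${REPO_URL}/" \
        "$1"
}
```

Running `verify_image "$IMAGE"` should fail if the image was signed by a workflow from any other repository, which is the whole point of signing in the first place.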
Bonus: Building multiarch on modern Podman
I also found "Multi-arch build with Podman · Rust stuff" while digging through the internet for material on multiarch. It is fairly complete, but there are two minor improvements you can make:
First, you should add a podman manifest rm ${MANIFEST_NAME} before the create to make sure that the manifest can actually be created.
Secondly, you need to use podman manifest push instead of podman push, otherwise only one of the images is pushed instead of the manifest.
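Put together, the two improvements could look like this. It is a minimal sketch, not a copy of the linked article: the manifest name and the push target are hypothetical, and I assume QEMU user-mode emulation is already set up on the host so the arm64 build can run.

```shell
#!/bin/sh
set -eu

# Hypothetical names; replace with your own manifest and registry path.
MANIFEST_NAME="${MANIFEST_NAME:-synapse-multiarch}"
TARGET="${TARGET:-docker://docker.io/example/synapse:dev}"

podman_multiarch() {
    # Drop a stale manifest list from an earlier run; ignore "not found".
    podman manifest rm "$MANIFEST_NAME" 2>/dev/null || true
    podman manifest create "$MANIFEST_NAME"
    # Build one image per platform and add each to the manifest list.
    for platform in linux/amd64 linux/arm64; do
        podman build --platform "$platform" --manifest "$MANIFEST_NAME" .
    done
    # Push the manifest list together with all images it references.
    podman manifest push --all "$MANIFEST_NAME" "$TARGET"
}
```

Call `podman_multiarch` from the directory that contains your Dockerfile (or Containerfile) after logging in with podman login.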
Sources
- Multi-arch build with Podman · Rust stuff
- Getting up and running with multi-arch Kubernetes clusters - CableSpaghetti
- Signing Containers - Sigstore Documentation