yarn from a container
Image from https://yarnpkg.com/en/
Within a continuous integration / continuous delivery system one of the hardest
problems to deal with as the system and teams grow is build time dependency management.
At first, one version of a few dependencies is totally manageable in traditional ways.
yum install or apk add is the perfect solution for a single team or project.
Over time as the CI system begins to grow and support more and more teams this solution
starts to show weaknesses. Anyone who has had to install nvm, or any of the
other environment-based tool version managers, within a CI system knows the pain of
getting things working right in a stable way. In addition, distributing the configuration
for these tools can be just as challenging, especially when dealing with multiple
versions of a tool.
One way to help address these complications (in addition to a few others) is to contain your tooling in Docker containers. This gives a few advantages over traditional package managers:
- The software is entirely contained within Docker and can be installed or removed without fuss by any job that needs it
- Containers can (and should) be built in house so that they are reproducible at a minimum, and fully audited and tracked within the change management tool if required
- Multiple versions and configurations can exist side by side without any interference
- Containers can be shared between teams and shell wrappers distributed via Homebrew
How can we run yarn in a container for a local project?
TL;DR: Docker volumes
The way the process works is similar to other Docker workflows:
- We build a Docker image that has whatever tool we need
- Then any configuration (no secrets!) is layered into the image
- When we run the container, we mount our current working directory into the container
- The command is executed in the running container, but acting on the local directory that is volume mounted
- The command finishes and all output is captured in our local directory
- The container exits and, since we’ve used --rm, is completely gone
This gives us the effects of the tool without modifying other parts of the filesystem as one does in normal software installation. One issue with this process as stated is that the command to run the container can be a bit unwieldy. To simplify things in this regard we must sacrifice the ‘no changes to the filesystem’ benefit by shipping a shell wrapper. Nonetheless, shipping a shell script or allowing others to create their own is a bit easier than full package / config management.
Anyways, now that we have an idea of how it will work, let’s take a look at how it does work.
Step 1: Build a Docker image for yarn
The first item we need is a Docker image that contains our tooling. In this case
we’re going to install yarn, a common dependency manager for Node, similar to npm.
Since Yarn depends upon Node, we can create a container that has the specific
version of each that is needed by the team using it. In this case we will install
the latest Node and Yarn packages, but pinning them to other versions would be a
fairly simple task.
Let’s take a look at our Dockerfile here:
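A minimal sketch of such a Dockerfile, written out from the shell. The alpine base image and the nodejs and yarn Alpine packages are assumptions based on the Alpine packages this post relies on; the exact contents may differ in your setup:

```shell
# Write a minimal Dockerfile into the current directory.
cat > Dockerfile <<'EOF'
# Small, trusted base image.
FROM alpine:latest

# Install Node and Yarn from the Alpine package repositories.
RUN apk add --no-cache nodejs yarn

# Make yarn the entrypoint so container arguments are passed straight to it.
ENTRYPOINT ["yarn"]
EOF
```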
This super simple Dockerfile will give us a container that has yarn
as an entrypoint. We can now build the image using a command like so:
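A minimal sketch of the build step, assuming the Dockerfile sits in the current directory. The function name build_yarn_image is hypothetical and only there to keep the command reusable:

```shell
# Build the image from the Dockerfile in the given directory
# (defaults to the current directory) and tag it technolog/run-yarn.
build_yarn_image() {
  docker build -t technolog/run-yarn "${1:-.}"
}
```

Running build_yarn_image with no arguments is equivalent to typing docker build -t technolog/run-yarn . by hand.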
This will give us an image named technolog/run-yarn that we can test
with a command like this:
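One way to smoke test the image, sketched under the assumption that yarn is the image's entrypoint so any arguments after the image name go straight to it. The function name yarn_smoke_test is hypothetical:

```shell
# Ask the containerized yarn for its version.
# --rm removes the container as soon as the command finishes.
yarn_smoke_test() {
  docker run --rm technolog/run-yarn --version
}
```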
Excellent, yarn works! However, in this configuration we have no way
to have it operate on a local project, since everything yarn does
is still in the container. We will fix that by using a Docker volume mount.
Step 2: Volume mount the current directory into the Docker container
If we were to just run the container with the example above, nothing is going to happen outside of the container. What we need to do is make the current directory accessible inside the container with a Docker volume. This is a simple task and looks something like:
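A sketch of what that can look like. The mount point /workdir and the function name run_yarn are assumptions for illustration; any path inside the container would do as long as the mount and the working directory agree:

```shell
# Run yarn from the image against the current directory.
run_yarn() {
  # Request a TTY only when stdin is a terminal, so interactive prompts work
  # locally while CI jobs without a terminal still run.
  tty_flag=""
  if [ -t 0 ]; then tty_flag="-t"; fi

  # -v mounts the current directory at /workdir (an assumed path) and
  # -w makes it the working directory, so files yarn writes land locally.
  docker run --rm -i $tty_flag \
    -v "$PWD":/workdir -w /workdir \
    technolog/run-yarn "$@"
}
```

Calling run_yarn init from a project directory would then run yarn init inside the container while writing its results to the local directory. On Linux, files created this way are typically owned by root unless a --user flag is added to the docker run command.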
Here we can see that the file was created locally (./) and contains all of
the info we provided to yarn running in the Docker container. Pretty neat! One
thing you may notice is that the Docker command is growing a bit. This exact
command (or the one you create) doesn’t roll off of the fingers and so can be
hard to have everyone typing the same thing. There are a few solutions to this
minor issue and one of them is using bash aliases like so:
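A sketch of such an alias, assuming the image name used in this post and an assumed /workdir mount point inside the container:

```shell
# Single quotes delay $PWD expansion until the alias is used,
# so the mount always follows the directory you are in at that moment.
alias yarn='docker run --rm -it -v "$PWD":/workdir -w /workdir technolog/run-yarn'
```

With this in a shell startup file, typing yarn install in an interactive shell runs the containerized yarn against the current directory. The -it flags assume a terminal, which is fine for an alias since aliases are mostly an interactive convenience.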
If we are using this command in a lot of places and especially within the build system, a slightly more robust wrapper may be required. Sometimes dealing with the shell and its intricacies is best left to ZSH developers and a script is a more unambiguous approach. What I mean by that is a script that encapsulates the Docker command and is then installed on the user or machine’s path. Let’s take a look at one for yarn:
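A minimal sketch of such a wrapper, written out from the shell. The file name yarn, the /workdir mount point, and the YARN_IMAGE override variable are all assumptions made for illustration:

```shell
# Create the wrapper script in the current directory.
cat > yarn <<'EOF'
#!/bin/sh
# Run yarn from a container against the current directory.

# Allow the image (and tag) to be overridden, for example to pin a version.
image="${YARN_IMAGE:-technolog/run-yarn}"

# Only request a TTY when attached to a terminal, so CI jobs also work.
tty_flag=""
if [ -t 0 ]; then tty_flag="-t"; fi

# exec replaces the wrapper process, so yarn's exit code becomes ours.
exec docker run --rm -i $tty_flag \
  -v "$PWD":/workdir -w /workdir \
  "$image" "$@"
EOF
```

Because the wrapper forwards every argument untouched, it can stand in for a locally installed yarn once it is on the path.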
Now if we make this file executable and run it, we should have a fully working yarn installation within a container:
Step 3: Distribute the software
The final step is getting these commands to be available. The beauty of this solution is that there are many ways to distribute this command. The script can live in a variety of places, depending on your needs:
- In the code repo itself for a smaller team
- Included and added to the path of the build runner
- Distributed locally with Homebrew
- Kept in a separate repo that is added to the path of builds
It depends on your environment, but I prefer to make the scripts available by keeping them all in one repo, cloning that repo, and adding it to the path on a build. This allows the scripts to change with the build and versions to be pinned via git tags if needed. Every team can include the scripts they need and use the version that works for them if they have to pin.
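As a rough sketch of that last approach, a build could start with something like the following. The function name use_ci_tools, the default clone directory, and any repository URL you pass in are hypothetical:

```shell
# Clone a shared scripts repo at a pinned ref and put it first on PATH.
#   $1: repository URL (supply your own)
#   $2: git tag or branch to pin to
#   $3: directory to clone into (default: ./ci-tools)
use_ci_tools() {
  repo_url="$1"
  ref="$2"
  dest="${3:-$PWD/ci-tools}"

  # --branch accepts tags as well as branches; --depth 1 keeps the clone small.
  git clone --quiet --depth 1 --branch "$ref" "$repo_url" "$dest" || return 1

  # Prepend so the wrappers shadow any locally installed tools of the same name.
  PATH="$dest:$PATH"
  export PATH
}
```

Each team can then pass the tag it has pinned to, while teams that do not pin can track a branch instead.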
Step 4: ….
Run the build with whatever tools are required
Step 5: Cleanup
Now that we’re done with the tools, let’s wipe them out completely. We will
do that using Docker’s system prune command:
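A minimal sketch of such a cleanup, wrapped in a hypothetical docker_cleanup function. The exact flags are an assumption chosen to match the behaviour described next:

```shell
# Kill running containers, then prune everything that is no longer in use.
docker_cleanup() {
  # docker ps -q prints only the IDs of running containers.
  running="$(docker ps -q)"
  if [ -n "$running" ]; then
    # Unquoted on purpose: each ID must become its own argument.
    docker kill $running
  fi

  # --all also removes unused (not just dangling) images; --force skips the prompt.
  docker system prune --all --force
}
```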
This will kill any running containers and then prune (delete):
- Any stopped containers
- Any unused networks
- Any unused images
- Any build cache
- Any dangling images
Pretty much anything we would worry about interfering with the next build. If there are containers (such as the drone itself) that must be kept alive, the command is a bit different, but more or less the same.
Building the Docker image repeatably and consistently is key to this whole approach. Changing how the container works depending on who builds it will lead to the same pitfalls as bad dependency management, mainly broken builds.
Here is an example build.sh that I would use for the above container:
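A sketch of what such a script could contain, written out from the shell. The --pull and --no-cache flags are an assumption about what "repeatable" should mean here, not necessarily the exact flags used originally:

```shell
# Write build.sh next to the Dockerfile and make it executable.
cat > build.sh <<'EOF'
#!/bin/sh
# Build the yarn image the same way regardless of who runs the script.
set -eu

image="technolog/run-yarn"

# Build from the directory containing this script, not the caller's directory.
cd "$(dirname "$0")"

# --pull refreshes the base image and --no-cache ignores local layers,
# so the result does not depend on the state of the builder's machine.
docker build --pull --no-cache -t "${image}:latest" .
EOF
chmod +x build.sh
```

Skipping the cache makes builds slower, which is usually an acceptable trade for an image that is rebuilt rarely and used often.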
Once teams begin using this framework, you’ll find each develops a set
of version requirements that may not match those of the other teams. When
you find yourself in this scenario, it is time to begin versioning the images
as well. While :latest should probably always point at the newest version,
it’s also reasonable to create :vX.X tags so teams can pin to
specific versions if desired.
In order to do this, you can add a Docker build argument or environment variable to install a specific version of a piece of software and use that version to tag the image as well. I am going to leave this as an exercise for the user, but the steps would be:
- Read the version in
- Pass that version as a build arg
- In the Dockerfile, read that ARG and install a specific version of the software
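One possible shape of those steps, as a sketch only. The build argument name YARN_VERSION, the function name build_versioned_yarn_image, and the Dockerfile lines shown in the comments are all assumptions:

```shell
# Build and tag an image for one specific yarn package version.
#   $1: package version exactly as apk names it
build_versioned_yarn_image() {
  version="${1:?usage: build_versioned_yarn_image <yarn-version>}"

  # The Dockerfile is assumed to contain:
  #   ARG YARN_VERSION
  #   RUN apk add --no-cache nodejs "yarn=${YARN_VERSION}"
  docker build \
    --build-arg "YARN_VERSION=${version}" \
    -t "technolog/run-yarn:v${version}" \
    .
}
```

Keep in mind that a pinned version has to be available in the package repositories of the chosen base image, so pinning the base image tag alongside the package version is usually part of the same scheme.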
This becomes a bit more complex when sharing between teams and requiring different versions of both node and yarn, but it can be managed with a smart versioning scheme.
This methodology does not encourage just pulling random images from Docker Hub and running
them! You must always use your good judgement when deciding what software to run
in your environment. As you see here, we have used the trusted Alpine Docker image and
then installed yarn from trusted Alpine packages ourselves. We did not rely on a random
Docker image found on the Hub, nor did we install extra software that was not required or
execute untrusted commands (curl | sudo bash). This means as long as we trust Alpine,
we should be able to trust this image, within reason. As my Mum would say:
Downloading unknown or unsigned binaries from the Internet will kill you!
This is a powerful and flexible technique for managing build time dependencies within your continuous integration / continuous delivery system. It is a bit overkill if you have a single dependency and can change it without affecting anything unintended. However, if you, like me, run many versions of software to support many teams’ builds, I think you’ll find this to be a pretty simple and potentially elegant solution.