Robert's Data Science Blog

Building R Docker Images in Azure Container Registry

I have recently written a few posts about Docker images for R. I maintain four recurring images – r-deps, r-minimal, r-base and r-test – that depend on each other in that order.

[Figure: dependencies between the Docker images]

But how do we build and share these images?

We keep images for every version of R we use or have used. A new version of R is typically released twice a year, so until now I have updated the images manually. Now I have made it a priority to find a smarter way of doing this.

I have previously referred to this post about tagging practices – it is really worth a read. I follow the advice there and have two tags for each image: a “moving” tag, which is just the version of R (e.g. 3.5.1), and a unique tag that combines the version of R with a part that differs for every build. The unique tag ensures that we can always trace an image back to a specific build if necessary.

Builds in a registry

The container registry has a number of features for building images “in” the registry, that is, transferring a Dockerfile to a service that builds the image and puts it in the registry. I am going to use a feature that is currently in preview, so don’t rely on everything here being accurate too far into the future.

Getting ready

Check the official docs for how to create an Azure Container Registry. Sign in to the Azure CLI:

az login

Set the subscription to that of the container registry.

az account set --subscription "<subscription>"

Log in to the container registry, where <acr name> is the first part of the login server <acr name>.azurecr.io:

az acr login --name "<acr name>"

Repository

I keep the Dockerfiles in a repository in Azure DevOps with the following files

r-dockerfiles
├── r-base
│   └── Dockerfile
├── r-deps
│   └── Dockerfile
├── r-minimal
│   └── Dockerfile
├── r-test
│   ├── Dockerfile
│   └── run_tests.R
├── task.yaml
└── values.yaml

In the values.yaml we have a number of build variables, including the version of R needed in the tag:

ubuntu: 18.04
rversion: 3.5.0
mrandate: 2018-07-02

As I wrote about in a previous post I hardcode the package repository to match the version of R. When a new version of R is released the rversion and mrandate variables must be updated to match each other.
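Since the two variables have to change together, it can help to update them in a single step. The following is only a sketch: set_r_release is a hypothetical helper (not part of any tool), and the new version and date in the demo are illustrative values, not a recommendation of which snapshot date belongs to which R release.

```shell
#!/bin/sh
# Sketch: update rversion and mrandate in values.yaml in one go.
# set_r_release is a hypothetical helper name, not part of the Azure CLI.
set_r_release() {
    file="$1"; rversion="$2"; mrandate="$3"
    tmp="$(mktemp)"
    # Rewrite only the two lines that must stay in sync; leave the rest alone.
    sed -e "s/^rversion:.*/rversion: $rversion/" \
        -e "s/^mrandate:.*/mrandate: $mrandate/" \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Demo on a throwaway copy holding the values shown above.
demo="$(mktemp)"
printf 'ubuntu: 18.04\nrversion: 3.5.0\nmrandate: 2018-07-02\n' > "$demo"
set_r_release "$demo" "3.5.1" "2018-08-01"   # illustrative values only
cat "$demo"
```

Writing to a temporary file and moving it into place avoids relying on sed's in-place flag, which behaves differently across platforms.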

In order to use the values in a build we have to define the build task in a YAML file, which is what task.yaml does here. If the task is instead created with a specific Dockerfile, as in the official tutorials, values.yaml is ignored. The file is quite a mouthful, but I will go through the parts.

version: 1.0-preview-1
steps:
    - id: r-deps-build
      build: --build-arg UBUNTU_VERSION={{.Values.ubuntu}} -t {{.Run.Registry}}/r-deps:{{.Values.rversion}}-{{.Run.Date}} -t {{.Run.Registry}}/r-deps:{{.Values.rversion}} -f r-deps/Dockerfile .
      when: ["-"]

    - id: r-deps-push
      push:
      - {{.Run.Registry}}/r-deps:{{.Values.rversion}}-{{.Run.Date}}
      - {{.Run.Registry}}/r-deps:{{.Values.rversion}}
      when: ["r-deps-build"]

    - id: r-minimal-build
      build: --build-arg REGISTRY={{.Run.Registry}} --build-arg R_VERSION={{.Values.rversion}} --build-arg MRANDATE={{.Values.mrandate}} -t {{.Run.Registry}}/r-minimal:{{.Values.rversion}}-{{.Run.Date}} -t {{.Run.Registry}}/r-minimal:{{.Values.rversion}} -f r-minimal/Dockerfile .
      timeout: 1200
      when: ["r-deps-build"]

    - id: r-minimal-push
      push:
      - {{.Run.Registry}}/r-minimal:{{.Values.rversion}}-{{.Run.Date}}
      - {{.Run.Registry}}/r-minimal:{{.Values.rversion}}
      when: ["r-minimal-build"]

    - id: r-base-build
      build: --build-arg REGISTRY={{.Run.Registry}} --build-arg R_VERSION={{.Values.rversion}} -t {{.Run.Registry}}/r-base:{{.Values.rversion}}-{{.Run.Date}} -t {{.Run.Registry}}/r-base:{{.Values.rversion}} -f r-base/Dockerfile .
      when: ["r-minimal-build"]

    - id: r-base-push
      push:
      - {{.Run.Registry}}/r-base:{{.Values.rversion}}-{{.Run.Date}}
      - {{.Run.Registry}}/r-base:{{.Values.rversion}}
      when: ["r-base-build"]

    - id: r-test-build
      build: --build-arg REGISTRY={{.Run.Registry}} --build-arg R_VERSION={{.Values.rversion}} -t {{.Run.Registry}}/r-test:{{.Values.rversion}}-{{.Run.Date}} -t {{.Run.Registry}}/r-test:{{.Values.rversion}} -f r-test/Dockerfile .
      timeout: 1200
      when: ["r-base-build"]

    - id: r-test-push
      push:
      - {{.Run.Registry}}/r-test:{{.Values.rversion}}-{{.Run.Date}}
      - {{.Run.Registry}}/r-test:{{.Values.rversion}}
      when: ["r-test-build"]

There is a build step and a push step for each of the four images. This is necessary as each step can only do one thing. In the build step we specify the build arguments and the tags using the registry we are going to push to ({{.Run.Registry}}), the version of R ({{.Values.rversion}}) and other variables.

The unique tag is the version of R and the date/time when the build was started.
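To make the two tags concrete, here is a small local sketch of how they are put together. It assumes that {{.Run.Date}} renders as a UTC timestamp of the form yyyymmdd-hhmmssz, which is what the tags listed later in this post look like. The variable names are made up for the illustration; in the real task the values are filled in by the registry.

```shell
#!/bin/sh
# Sketch: rebuild the two tags locally. In the real task the registry
# renders these values; the variable names here are hypothetical.
R_VERSION="3.5.0"                        # rversion from values.yaml
RUN_DATE="$(date -u +%Y%m%d-%H%M%Sz)"    # assumed equivalent of {{.Run.Date}}

MOVING_TAG="$R_VERSION"                  # overwritten by every new build
UNIQUE_TAG="$R_VERSION-$RUN_DATE"        # different for every build

echo "moving tag: $MOVING_TAG"
echo "unique tag: $UNIQUE_TAG"
```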

What makes this multi-step task a winner for me is that we can specify the dependencies between the steps with when:

  1. r-deps-build has no dependencies and starts when the task is triggered.
  2. r-deps-push and r-minimal-build have to wait for r-deps-build to finish.
  3. r-minimal-push and r-base-build have to wait for r-minimal-build to finish.
  4. And so on…

The final trick is that we set a timeout of 1200 seconds (20 minutes) for the two time-consuming steps, r-minimal-build and r-test-build, instead of accepting the default of 10 minutes.

Create a build task

With the container registry created and the repository in DevOps, a build task can be created. To enable the container registry to access the repository we need a personal access token. On the repository page in Azure DevOps, click the “Clone” link in the top right corner and then “Generate Git credentials”.

[Figure: the “Clone” dialog in Azure DevOps with “Generate Git credentials”]

Armed with a <token> the task is created with this command:

az acr task create \
	--registry "<acr name>" \
	--name "r-builds" \
	--context "<Azure DevOps URL>" \
	--file "task.yaml" \
	--values "values.yaml" \
	--branch "master" \
	--git-access-token "<token>"

Now every push to the master branch will trigger the task. It can also be triggered manually with

az acr task run --registry "<acr name>" --name "r-builds"
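If you trigger the task by hand often, the command can be wrapped in small shell functions together with a lookup of the most recent log. This is only a sketch: the function names run_task and latest_task_log are hypothetical, and it assumes your version of the Azure CLI provides az acr task logs.

```shell
#!/bin/sh
# Sketch: wrap the manual trigger and the log lookup in two functions.
# The function names are hypothetical; the az subcommands belong to the Azure CLI.

# Trigger the task by hand. The CLI normally streams the run log while it runs.
run_task() {
    az acr task run --registry "$1" --name "$2"
}

# Show the log of the most recent run of the task
# (assumes `az acr task logs` is available in your CLI version).
latest_task_log() {
    az acr task logs --registry "$1" --name "$2"
}

# Usage (requires a prior `az login`):
#   run_task "<acr name>" "r-builds"
#   latest_task_log "<acr name>" "r-builds"
```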

Inspection

The Azure CLI has tools to inspect the tasks and images. Some of the output below is trimmed a bit to better fit the page.

The tasks on the registry.

$ az acr task list --registry "<acr name>" --output table
NAME      STATUS    SOURCE REPOSITORY 
--------  --------  -------------------
r-builds  Enabled   <Azure DevOps URL>

The runs related to the registry.

$ az acr task list-runs --registry "<acr name>" -o table
TASK       STATUS     TRIGGER       STARTED               DURATION
---------  ---------  ------------  --------------------  ----------
r-builds   Succeeded  Commit        2019-03-27T12:21:30Z  00:20:51
r-builds   Succeeded  Manual        2019-03-27T10:22:46Z  00:20:38

See all images in registry.

$ az acr repository list --name "<acr name>" --output table
Result
---------
r-base
r-deps
r-minimal
r-test

See all tags for a particular image. Here we also see the multiple tags.

$ az acr repository show-tags --name "<acr name>" --repository r-test --output table
Result
----------------------
3.5.0
3.5.0-20190327-102250z
3.5.1
3.5.1-20190322-130220z
3.5.2
3.5.2-20190327-122135z
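The two kinds of tags can be told apart by their shape. The sketch below splits the listing above into moving and unique tags with grep. The tag list is copied from the output above rather than fetched from a registry, and the patterns assume that unique tags always end in a yyyymmdd-hhmmssz timestamp.

```shell
#!/bin/sh
# Sketch: split a tag listing into moving and unique tags.
# The list is copied from the output above, not fetched from a registry.
tags='3.5.0
3.5.0-20190327-102250z
3.5.1
3.5.1-20190322-130220z
3.5.2
3.5.2-20190327-122135z'

# Moving tags are a bare R version, e.g. 3.5.1.
moving="$(printf '%s\n' "$tags" | grep -E '^[0-9]+\.[0-9]+\.[0-9]+$')"

# Unique tags are the R version followed by the assumed timestamp format.
unique="$(printf '%s\n' "$tags" | grep -E '^[0-9]+\.[0-9]+\.[0-9]+-[0-9]{8}-[0-9]{6}z$')"

echo "Moving tags:"; echo "$moving"
echo "Unique tags:"; echo "$unique"
```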

Finally, in Azure Portal we can see this overview if the resource group with the container registry is on the dashboard.

[Figure: Azure Portal overview of the resource group with the container registry]

Clicking on r-builds will allow inspection of the build task in JSON format.