
How to Dockerize a Django App: Step-by-Step Guide for Beginners

By: Lance Haig
January 8, 2025 at 18:11

One of the best ways to make sure your web apps work well in different environments is to containerize them. Containers let you work in a more controlled way, which makes development and deployment easier. This guide will show you how to containerize a Django web app with Docker and explain why it’s a good idea.

We will walk through creating a Docker container for your Django application. Docker gives you a standardized environment, which makes it easier to get up and running and more productive. This tutorial is aimed at those new to Docker who already have some experience with Django. Let’s get started!

Why containerize your Django application?

Django apps can be put into containers to help you work more productively and consistently. Here are the main reasons why you should use Docker for your Django project:

  • Creates a stable environment: Containers provide a stable environment with all dependencies installed, so you don’t have to worry about “it works on my machine” problems. This ensures that you can reproduce the app and use it on any system or server. Docker makes it simple to set up local environments for development, testing, and production.
  • Ensures reproducibility and portability: A Dockerized app bundles all the environment variables, dependencies, and configurations, so it always runs the same way. This makes it easier to deploy, especially when you’re moving apps between environments.
  • Facilitates collaboration between developers: Docker lets your team work in the same environment, so there’s less chance of conflicts from different setups. Shared Docker images make it simple for your team to get started with fewer setup requirements.
  • Speeds up deployment processes: Docker makes it easier for developers to get started with a new project quickly. It removes the hassle of setting up development environments and ensures everyone is working in the same place, which makes it easier to merge changes from different developers.

Getting started with Django and Docker

Setting up a Django app in Docker is straightforward. You don’t need to do much more than add in the basic Django project files.

Tools you’ll need

To follow this guide, make sure you first have Docker (or Docker Desktop) and Django installed.

If you need help with the installation, you can find detailed instructions on the Docker and Django websites.

How to Dockerize your Django project

The following six steps include code snippets to guide you through the process.

Step 1: Set up your Django project

1. Initialize a Django project. 

If you don’t have a Django project set up yet, you can create one with the following commands:

django-admin startproject my_docker_django_app
cd my_docker_django_app

2. Create a requirements.txt file. 

In your project, create a requirements.txt file to store dependencies:

pip freeze > requirements.txt

3. Update key environment settings.

You need to update a few settings in the settings.py file so they can be supplied as environment variables when the container starts. This allows you to vary these settings depending on the environment you are working in.

  # The secret key
  SECRET_KEY = os.environ.get("DJANGO_SECRET_KEY")

  DEBUG = bool(os.environ.get("DEBUG", default=0))

  ALLOWED_HOSTS = os.environ.get("DJANGO_ALLOWED_HOSTS", "127.0.0.1").split(",")

Step 2: Create a Dockerfile

A Dockerfile is a script that tells Docker how to build your Docker image. Put it in the root directory of your Django project. Here’s a basic Dockerfile setup for Django:

# Use the official Python runtime image
FROM python:3.13  

# Create the app directory
RUN mkdir /app

# Set the working directory inside the container
WORKDIR /app

# Set environment variables 
# Prevents Python from writing pyc files to disk
ENV PYTHONDONTWRITEBYTECODE=1
# Prevents Python from buffering stdout and stderr
ENV PYTHONUNBUFFERED=1 

# Upgrade pip
RUN pip install --upgrade pip 

# Copy the requirements file into the container
COPY requirements.txt /app/

# Install all Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the Django project to the container
COPY . /app/

# Expose the Django port
EXPOSE 8000

# Run Django’s development server
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

Each line in the Dockerfile serves a specific purpose:

  • FROM: Selects the image with the Python version you need.
  • WORKDIR: Sets the working directory of the application within the container.
  • ENV: Sets the environment variables needed to build the application
  • RUN and COPY commands: Install dependencies and copy project files.
  • EXPOSE and CMD: Expose the Django server port and define the startup command.
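
Because COPY . /app/ copies everything in the build context into the image, it's also worth adding a .dockerignore file next to the Dockerfile so that local artifacts and secrets stay out of the image. A minimal sketch, which you should adjust to your project:

# .dockerignore
.git
__pycache__/
*.pyc
.env
venv/
db.sqlite3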

You can build the Django Docker container with the following command:

docker build -t django-docker .

To see your image, you can run:

docker image list

The result will look something like this:

REPOSITORY      TAG       IMAGE ID       CREATED          SIZE
django-docker   latest    ace73d650ac6   20 seconds ago   1.55GB

Although this is a great start in containerizing the application, you’ll need to make a number of improvements to get it ready for production.

  • The runserver command in CMD is only meant for development and should be replaced with a production WSGI server such as Gunicorn.
  • Reduce the size of the image by using a smaller base image.
  • Optimize the image by using a multi-stage build process.

Let’s get started with these improvements.

Update requirements.txt

Make sure to add gunicorn to your requirements.txt. It should look like this:

asgiref==3.8.1
Django==5.1.3
sqlparse==0.5.2
gunicorn==23.0.0
psycopg2-binary==2.9.10

Make improvements to the Dockerfile

The Dockerfile below has changes that solve the three items on the list. The changes to the file are as follows:

  • Updated the FROM python:3.13 image to FROM python:3.13-slim. This change reduces the size of the image considerably, as the image now only contains what is needed to run the application.
  • Added a multi-stage build process to the Dockerfile. When you build applications, there are usually many files left on the file system that are only needed during build time and are not needed once the application is built and running. By adding a build stage, you use one image to build the application and then move the built files to the second image, leaving only the built code. Read more about multi-stage builds in the documentation.
  • Added the Gunicorn WSGI server to enable a production-ready deployment of the application.

# Stage 1: Base build stage
FROM python:3.13-slim AS builder

# Create the app directory
RUN mkdir /app

# Set the working directory
WORKDIR /app 

# Set environment variables to optimize Python
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1 

# Upgrade pip and install dependencies
RUN pip install --upgrade pip 

# Copy the requirements file first (better caching)
COPY requirements.txt /app/

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Production stage
FROM python:3.13-slim

RUN useradd -m -r appuser && \
   mkdir /app && \
   chown -R appuser /app

# Copy the Python dependencies from the builder stage
COPY --from=builder /usr/local/lib/python3.13/site-packages/ /usr/local/lib/python3.13/site-packages/
COPY --from=builder /usr/local/bin/ /usr/local/bin/

# Set the working directory
WORKDIR /app

# Copy application code
COPY --chown=appuser:appuser . .

# Set environment variables to optimize Python
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1 

# Switch to non-root user
USER appuser

# Expose the application port
EXPOSE 8000 

# Start the application using Gunicorn
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "3", "my_docker_django_app.wsgi:application"]

Build the Docker container image again.

docker build -t django-docker .

After making these changes, we can run docker image list again:

REPOSITORY      TAG       IMAGE ID       CREATED         SIZE
django-docker   latest    3c62f2376c2c   6 seconds ago   299MB

You can see a significant improvement in the size of the container.

The size was reduced from 1.55 GB to 299 MB, which leads to faster deployments when images are downloaded and lower storage costs.

You could also use the docker init command to generate a starting Dockerfile and compose.yml file for your application.

Step 3: Configure the Docker Compose file

A compose.yml file allows you to manage multi-container applications. Here, we’ll define both a Django container and a PostgreSQL database container.

The compose file makes use of an environment file called .env, which will make it easy to keep the settings separate from the application code. The environment variables listed here are standard for most applications:

services:
 db:
   image: postgres:17
   environment:
     POSTGRES_DB: ${DATABASE_NAME}
     POSTGRES_USER: ${DATABASE_USERNAME}
     POSTGRES_PASSWORD: ${DATABASE_PASSWORD}
   ports:
     - "5432:5432"
   volumes:
     - postgres_data:/var/lib/postgresql/data
   env_file:
     - .env

 django-web:
   build: .
   container_name: django-docker
   ports:
     - "8000:8000"
   depends_on:
     - db
   environment:
     DJANGO_SECRET_KEY: ${DJANGO_SECRET_KEY}
     DEBUG: ${DEBUG}
     DJANGO_LOGLEVEL: ${DJANGO_LOGLEVEL}
     DJANGO_ALLOWED_HOSTS: ${DJANGO_ALLOWED_HOSTS}
     DATABASE_ENGINE: ${DATABASE_ENGINE}
     DATABASE_NAME: ${DATABASE_NAME}
     DATABASE_USERNAME: ${DATABASE_USERNAME}
     DATABASE_PASSWORD: ${DATABASE_PASSWORD}
     DATABASE_HOST: ${DATABASE_HOST}
     DATABASE_PORT: ${DATABASE_PORT}
   env_file:
     - .env
volumes:
   postgres_data:
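
Note that depends_on only controls startup order; it does not wait for PostgreSQL to be ready to accept connections. If the web container races ahead of the database, one option is to add a healthcheck to the db service and gate django-web on it. A minimal sketch of the relevant additions:

services:
 db:
   # ...configuration as above, plus:
   healthcheck:
     test: ["CMD-SHELL", "pg_isready -U ${DATABASE_USERNAME} -d ${DATABASE_NAME}"]
     interval: 5s
     timeout: 3s
     retries: 5

 django-web:
   # ...configuration as above, but with:
   depends_on:
     db:
       condition: service_healthy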

And the example .env file:

DJANGO_SECRET_KEY=your_secret_key
DEBUG=True
DJANGO_LOGLEVEL=info
DJANGO_ALLOWED_HOSTS=localhost
DATABASE_ENGINE=postgresql_psycopg2
DATABASE_NAME=dockerdjango
DATABASE_USERNAME=dbuser
DATABASE_PASSWORD=dbpassword
DATABASE_HOST=db
DATABASE_PORT=5432
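
Replace your_secret_key with a real value before deploying anywhere. Django ships a helper you can use to generate one:

python -c "from django.core.management.utils import get_random_secret_key; print(get_random_secret_key())"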

Step 4: Update Django settings and configuration files

1. Configure database settings. 

Update settings.py to use PostgreSQL:

  DATABASES = {
       'default': {
           'ENGINE': 'django.db.backends.{}'.format(
               os.getenv('DATABASE_ENGINE', 'sqlite3')
           ),
           'NAME': os.getenv('DATABASE_NAME', 'polls'),
           'USER': os.getenv('DATABASE_USERNAME', 'myprojectuser'),
           'PASSWORD': os.getenv('DATABASE_PASSWORD', 'password'),
           'HOST': os.getenv('DATABASE_HOST', '127.0.0.1'),
           'PORT': os.getenv('DATABASE_PORT', 5432),
       }
   }

2. Set ALLOWED_HOSTS to read from environment files. 

In settings.py, set ALLOWED_HOSTS to:

   # 'DJANGO_ALLOWED_HOSTS' should be a single string of hosts separated by commas.
   # For example: 'DJANGO_ALLOWED_HOSTS=localhost,127.0.0.1,[::1]'
   ALLOWED_HOSTS = os.environ.get("DJANGO_ALLOWED_HOSTS", "127.0.0.1").split(",")

3. Set the SECRET_KEY to read from environment files.

In settings.py, set SECRET_KEY to:

   # SECURITY WARNING: keep the secret key used in production secret!
   SECRET_KEY = os.environ.get("DJANGO_SECRET_KEY")

4. Set DEBUG to read from environment files.

In settings.py, set DEBUG to:

 # SECURITY WARNING: don't run with debug turned on in production!
 DEBUG = bool(os.environ.get("DEBUG", default=0))
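
One caveat with this pattern: bool() treats any non-empty string as truthy, so DEBUG=False or DEBUG=0 in the environment would still enable debug mode; only an unset or empty variable disables it. If that matters for your setup, a stricter parse is a small change (an illustrative alternative, not part of the original snippet):

 # Treat only explicit "on" values as True; anything else disables debug
 DEBUG = os.environ.get("DEBUG", "0").lower() in ("1", "true", "yes")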

Step 5: Build and run your new Django project

To build and start your containers, run:

docker compose up --build

This command will download any necessary Docker images, build the project, and start the containers. Once complete, your Django application should be accessible at http://localhost:8000.

Step 6: Test and access your application

Once the app is running, you can test it by navigating to http://localhost:8000. You should see Django’s welcome page, indicating that your app is up and running. To verify the database connection, try running a migration:

docker compose run django-web python manage.py migrate
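
Other management commands work the same way. For example, to create an admin user in the running stack:

docker compose run django-web python manage.py createsuperuser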

Troubleshooting common issues with Docker and Django

Here are some common issues you might encounter and how to solve them:

  • Database connection errors: If Django can’t connect to PostgreSQL, verify that your database service name matches in compose.yml and settings.py.
  • File synchronization issues: Use the volumes directive in compose.yml to sync changes from your local files to the container (see the sketch after this list).
  • Container restart loops or crashes: Use docker compose logs to inspect container errors and determine the cause of the crash.
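
For local development, a bind mount keeps the container in sync with your working tree. A minimal sketch for the django-web service (development only; don't mount your source tree in production):

services:
 django-web:
   build: .
   volumes:
     - .:/app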

Optimizing your Django web application

To improve your Django Docker setup, consider these optimization tips:

  • Automate and secure builds: Use Docker’s multi-stage builds to create leaner images, removing unnecessary files and packages for a more secure and efficient build.
  • Optimize database access: Configure database pooling and caching to reduce connection time and boost performance.
  • Efficient dependency management: Regularly update and audit dependencies listed in requirements.txt to ensure efficiency and security.

Take the next step with Docker and Django

Containerizing your Django application with Docker is an effective way to simplify development, ensure consistency across environments, and streamline deployments. By following the steps outlined in this guide, you’ve learned how to set up a Dockerized Django app, optimize your Dockerfile for production, and configure Docker Compose for multi-container setups.

Docker not only helps reduce “it works on my machine” issues but also fosters better collaboration within development teams by standardizing environments. Whether you’re deploying a small project or scaling up for enterprise use, Docker equips you with the tools to build, test, and deploy reliably.

Ready to take the next step? Explore Docker’s powerful tools, like Docker Hub and Docker Scout, to enhance your containerized applications with scalable storage, governance, and continuous security insights.

Learn more 

Docker Best Practices: Using ARG and ENV in Your Dockerfiles

By: Jay Schmidt
October 16, 2024 at 14:35

If you’ve worked with Docker for any length of time, you’re likely accustomed to writing or at least modifying a Dockerfile. This file can be thought of as a recipe for a Docker image; it contains both the ingredients (base images, packages, files) and the instructions (various RUN, COPY, and other commands that help build the image).

In most cases, Dockerfiles are written once, modified seldom, and used as-is unless something about the project changes. Because these files are created or modified on such an infrequent basis, developers tend to rely on only a handful of frequently used instructions — RUN, COPY, and EXPOSE being the most common. Other instructions can enhance your image, making it more configurable, manageable, and easier to maintain. 

In this post, we will discuss the ARG and ENV instructions and explore why, how, and when to use them.

ARG: Defining build-time variables

The ARG instruction allows you to define variables that will be accessible during the build stage but not available after the image is built. For example, we will use this Dockerfile to build an image where we make the variable specified by the ARG instruction available during the build process.

FROM ubuntu:latest
ARG THEARG="foo"
RUN echo $THEARG
CMD ["env"]

If we run the build, we will see the echo foo line in the output:

$ docker build --no-cache -t argtest .
[+] Building 0.4s (6/6) FINISHED                                                                     docker:desktop-linux
<-- SNIP -->
 => CACHED [1/2] FROM docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e  0.0s
 => => resolve docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => [2/2] RUN echo foo                                                                                               0.1s
 => exporting to image                                                                                               0.0s
<-- SNIP -->

However, if we run the image and inspect the output of the env command, we do not see THEARG:

$ docker run --rm argtest
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=d19f59677dcd
HOME=/root

ENV: Defining build and runtime variables

Unlike ARG, the ENV command allows you to define a variable that can be accessed both at build time and run time:

FROM ubuntu:latest
ENV THEENV="bar"
RUN echo $THEENV
CMD ["env"]

If we run the build, we will see the echo bar line in the output:

$ docker build -t envtest .
[+] Building 0.8s (7/7) FINISHED                                                                     docker:desktop-linux
<-- SNIP -->
 => CACHED [1/2] FROM docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e  0.0s
 => => resolve docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => [2/2] RUN echo bar                                                                                               0.1s
 => exporting to image                                                                                               0.0s
<-- SNIP -->

If we run the image and inspect the output of the env command, we do see THEENV set, as expected:

$ docker run --rm envtest
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=f53f1d9712a9
THEENV=bar
HOME=/root

Overriding ARG

A more advanced use of the ARG instruction is to serve as a placeholder that is then updated at build time:

FROM ubuntu:latest
ARG THEARG
RUN echo $THEARG
CMD ["env"]

If we build the image, we see that we are missing a value for $THEARG:

$ docker build -t argtest .
<-- SNIP -->
 => CACHED [1/2] FROM docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e  0.0s
 => => resolve docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => [2/2] RUN echo $THEARG                                                                                           0.1s
 => exporting to image                                                                                               0.0s
 => => exporting layers                                                                                              0.0s
<-- SNIP -->

However, we can pass a value for THEARG on the build command line using the --build-arg flag:

$ docker build --build-arg THEARG=foo -t argtest .

Notice that THEARG has now been replaced with foo in the output:

 => CACHED [1/2] FROM docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e  0.0s
 => => resolve docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => [2/2] RUN echo foo                                                                                               0.1s
 => exporting to image                                                                                               0.0s
 => => exporting layers                                                                                              0.0s
<-- SNIP -->

The same can be done in a Docker Compose file by using the args key under the build key. Note that these can be set as a mapping (THEARG: foo) or a list (- THEARG=foo):

services:
  argtest:
    build:
      context: .
      args:
        THEARG: foo

If we run docker compose up --build, we can see the THEARG has been replaced with foo in the output:

$ docker compose up --build
<-- SNIP -->
 => [argtest 1/2] FROM docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04  0.0s
 => => resolve docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => CACHED [argtest 2/2] RUN echo foo                                                                                0.0s
 => [argtest] exporting to image                                                                                     0.0s
 => => exporting layers                                                                                              0.0s
<-- SNIP -->
Attaching to argtest-1
argtest-1  | PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
argtest-1  | HOSTNAME=d9a3789ac47a
argtest-1  | HOME=/root
argtest-1 exited with code 0

Overriding ENV

You can also override ENV at build time; this is slightly different from how ARG is overridden. For example, you cannot supply a key without a value with the ENV instruction, as shown in the following example Dockerfile:

FROM ubuntu:latest
ENV THEENV
RUN echo $THEENV
CMD ["env"]

When we try to build the image, we receive an error:

$ docker build -t envtest .
[+] Building 0.0s (1/1) FINISHED                                                                     docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                                 0.0s
 => => transferring dockerfile: 98B                                                                                  0.0s
Dockerfile:3
--------------------
   1 |     FROM ubuntu:latest
   2 |
   3 | >>> ENV THEENV
   4 |     RUN echo $THEENV
   5 |
--------------------
ERROR: failed to solve: ENV must have two arguments

However, we can remove the ENV instruction from the Dockerfile:

FROM ubuntu:latest
RUN echo $THEENV
CMD ["env"]

This allows us to build the image:

$ docker build -t envtest .
<-- SNIP -->
 => [1/2] FROM docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => => resolve docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => CACHED [2/2] RUN echo $THEENV                                                                                    0.0s
 => exporting to image                                                                                               0.0s
 => => exporting layers                                                                                              0.0s
<-- SNIP -->

Then we can pass an environment variable via the docker run command using the -e flag:

$ docker run --rm -e THEENV=bar envtest
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=638cf682d61f
THEENV=bar
HOME=/root

Although the .env file is usually associated with Docker Compose, it can also be used with docker run.

$ cat .env
THEENV=bar

$ docker run --rm --env-file ./.env envtest
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=59efe1003811
THEENV=bar
HOME=/root

This can also be done using Docker Compose by using the environment key. Note that we use the variable format for the value:

services:
  envtest:
    build:
      context: .
    environment:
      THEENV: ${THEENV}

If we do not supply a value for THEENV, a warning is thrown:

$ docker compose up --build
WARN[0000] The "THEENV" variable is not set. Defaulting to a blank string.
<-- SNIP -->
 => [envtest 1/2] FROM docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04  0.0s
 => => resolve docker.io/library/ubuntu:latest@sha256:8a37d68f4f73ebf3d4efafbcf66379bf3728902a8038616808f04e34a9ab6  0.0s
 => CACHED [envtest 2/2] RUN echo ${THEENV}                                                                          0.0s
 => [envtest] exporting to image                                                                                     0.0s
<-- SNIP -->
 ✔ Container dd-envtest-1    Recreated                                                                               0.1s
Attaching to envtest-1
envtest-1  | PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
envtest-1  | HOSTNAME=816d164dc067
envtest-1  | THEENV=
envtest-1  | HOME=/root
envtest-1 exited with code 0

The value for our variable can be supplied in several different ways, as follows:

  • On the compose command line:
$ THEENV=bar docker compose up

[+] Running 2/0
 ✔ Synchronized File Shares                                                                                          0.0s
 ✔ Container dd-envtest-1    Recreated                                                                               0.1s
Attaching to envtest-1
envtest-1  | PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
envtest-1  | HOSTNAME=20f67bb40c6a
envtest-1  | THEENV=bar
envtest-1  | HOME=/root
envtest-1 exited with code 0
  • In the shell environment on the host system:
$ export THEENV=bar
$ docker compose up

[+] Running 2/0
 ✔ Synchronized File Shares                                                                                          0.0s
 ✔ Container dd-envtest-1    Created                                                                                 0.0s
Attaching to envtest-1
envtest-1  | PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
envtest-1  | HOSTNAME=20f67bb40c6a
envtest-1  | THEENV=bar
envtest-1  | HOME=/root
envtest-1 exited with code 0
  • In the special .env file:
$ cat .env
THEENV=bar

$ docker compose up

[+] Running 2/0
 ✔ Synchronized File Shares                                                                                          0.0s
 ✔ Container dd-envtest-1    Created                                                                                 0.0s
Attaching to envtest-1
envtest-1  | PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
envtest-1  | HOSTNAME=20f67bb40c6a
envtest-1  | THEENV=bar
envtest-1  | HOME=/root
envtest-1 exited with code 0

Finally, when running services directly using docker compose run, you can use the -e flag to override the .env file.

$ docker compose run -e THEENV=bar envtest

[+] Creating 1/0
 ✔ Synchronized File Shares                                                                                          0.0s
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=219e96494ddd
TERM=xterm
THEENV=bar
HOME=/root

The tl;dr

If you need to access a variable during the build process but not at runtime, use ARG. If you need to access the variable both during the build and at runtime, or only at runtime, use ENV.

To decide between them, consider the following flow (Figure 1):

Figure 1: Decision flow for choosing between ARG and ENV.

Both ARG and ENV can be overridden from the command line in docker run and docker compose, giving you a powerful way to dynamically update variables and build flexible workflows.
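
A common pattern combines the two: an ARG captures a value at build time, and an ENV persists it so the running container can see it as well. A minimal sketch:

FROM ubuntu:latest
# Supplied at build time, e.g.: docker build --build-arg APP_VERSION=2.0 .
ARG APP_VERSION=1.0
# Persist the build argument into the runtime environment
ENV APP_VERSION=${APP_VERSION}
CMD ["env"]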

Learn more

Docker for Web Developers: Getting Started with the Basics

September 17, 2024 at 13:38

Docker is known worldwide as a popular application containerization platform, but it also plays a less heralded role: it's a go-to platform among web developers, prized for its speed, flexibility, broad user base, and collaborative capabilities.

Docker has been growing as a modern solution that brings innovation to web development using containerization. With containers, developers and web development projects can become more efficient, save time, and drive fresh creativity. Web developers use Docker for development because it ensures consistency across different environments, eliminating the “it works on my machine” problem. Docker also simplifies dependency management, enhances resource efficiency, supports scalable microservices architectures, and allows for rapid deployment and rollback, making it an indispensable tool for modern web development projects.

In this post, we dive into the benefits of using Docker in businesses from small to large, and review Docker’s broad capabilities, strengths, and features for bolstering web development and developer productivity. 

What is Docker?

Docker is secure, out-of-the-box containerization software offering developers and teams a robust, hybrid toolkit to develop, test, monitor, ship, deploy, and run enterprise and web applications. Containerization lets developers separate their applications from infrastructure so they can run them without worrying about what is installed on the host, giving development teams flexibility and collaborative advantages over virtual machines, while delivering better source code faster. 

The Docker suite enables developers to package and run their application code in lightweight, local, standardized containers that have everything needed to run the application — including an operating system and required services. Docker allows developers to run many containers simultaneously on a host, while also allowing the containers to be shared with others. By working within this collaborative workspace, productive and direct communications can thrive and development processes become easier, more accurate, and more secure. Many of the components in Docker are open source, including Docker Compose, BuildKit, the Docker command-line interface (Docker CLI), containerd, and more. 

As the #1 containerization software for developers and teams, Docker is well-suited for all flavors of development. Highlights include: 

  • Docker Hub: The world’s largest repository of container images, which helps developers and open source contributors find, use, and share their Docker-inspired container images.
  • Docker Compose: A tool for defining and running multi-container applications.
  • Docker Engine: An open source containerization technology for building and containerizing applications.
  • Docker Desktop: Includes the Docker Engine and other open source components; proprietary components; and features such as an intuitive GUI, synchronized file shares, access to cloud resources, debugging features, native host integration, governance, and security features that support Enhanced Container Isolation (ECI), air-gapped containers, and administrative settings management.
  • Docker Build Cloud: A Docker service that lets developers build their container images on a cloud infrastructure that ensures fast builds anywhere for all team members. 

What is a container?

Containers are lightweight, standalone, executable packages of software that include everything needed to run an application: code, runtime, libraries, environment variables, and configuration files. Containers are isolated from each other and can be connected to networks or storage and can be used to create new images based on their current states. 

Docker containers are faster and more efficient for software creation than virtualization, which uses a resource-heavy software abstraction layer on top of computer hardware. Additionally, Docker containers require fewer physical hardware resources than virtual machines and communicate with their host systems through well-defined channels.

Why use Docker for web applications?

Docker is a popular choice for developers building enterprise applications for various reasons, including consistent environments, efficient resource usage, speed, container isolation, scalability, flexibility, and portability. And, Docker is popular for web development for these same reasons. 

Consistent environments

Using Docker containers, web developers can build web applications that provide consistent environments from their development all the way through to production. By including all the components needed to run an application within an isolated container, Docker addresses those issues by allowing developers to produce and package their containers and then run them through various development, testing, and production environments to ensure their quality, security, and performance. This approach helps developers prevent the common and frustrating “but it works on my machine” conundrum, assuring that the code will run and perform well anywhere, from development through deployment.

Efficiency in using resources

With its lightweight architecture, Docker uses system resources more efficiently than virtual machines, allowing developers to run more applications on the same hardware. Docker containers allow multiple containers to run on a single host and gain resource efficiency due to the isolation and allocation features that containers incorporate. Additionally, containers require less memory and disk space to perform their tasks, saving on hardware costs and making resource management easier. Docker also saves development time by allowing container images to be reused as needed. 

Speed

Docker’s design and components also give developers significant speed advantages in setting up and tearing down container environments, allowing needed processes to be completed in seconds due to its lightweight and flexible application architecture. This allows developers to rapidly iterate their containerized applications, increasing their productivity for writing, building, testing, monitoring, and deploying their creations.  

Isolation

Docker’s application isolation capabilities provide huge benefits for developers, allowing them to write code and build their containers and applications simultaneously, with changes made in one not affecting the others. For developers, these capabilities allow them to find and isolate any bad code before using it elsewhere, improving security and manageability.

Scalability, flexibility, and portability

Docker’s flexible platform design also gives developers broad capabilities to easily scale applications up or down based on demand, while also allowing them to be deployed across different servers. These features give developers the ability to manage different workloads and system resources as needed. And, its portability features mean that developers can create their applications once and then use them in any environment, further ensuring their reliability and proper operation through the development cycle to production.

How web developers use Docker

There is a wide range of Docker use cases for today’s web developers, including its flexibility as a local development environment that can be quickly set up to match desired production environments; as an important partner for microservices architectures, where each service can be developed, tested, and deployed independently; or as an integral component in continuous integration and continuous deployment (CI/CD) pipelines for automated testing and deployment.

Other important Docker use cases include the availability of a strong and knowledgeable user community to help drive developer experiences and skills around containerization; its importance and suitability for vital cross-platform production and testing; and deep resources and availability for container images that are usable for a wide range of application needs. 

Get started with Docker for web development (in 6 steps)

So, you want to get a Docker container up and running quickly? Let's dive in using the Docker Desktop GUI. In this example, we will use the Docker version for Microsoft Windows, but there are also Docker versions for Mac and many flavors of Linux.

Step 1: Install Docker Desktop

Start by downloading the installer from the docs or from the release notes.

Double-click Docker Desktop for Windows Installer.exe to run the installer. By default, Docker Desktop is installed at C:\Program Files\Docker\Docker.

When prompted, choose between the WSL 2 and Hyper-V backends on the configuration page. If your system only supports one of the two options, you will not be able to select which backend to use.

Follow the instructions on the installation wizard to authorize the installer and proceed with the installation. When the installation is successful, select Close to complete the installation process.

Step 2: Create a Dockerfile

A Dockerfile is a text-based file containing a script of instructions that describes, in full detail, how to build a Docker container image. The file uses no extension: you create it as a file named Dockerfile in the root of your project (in Docker's getting-started-app example, alongside the package.json file).

A Dockerfile contains details about the container’s operating system, file locations, environment, dependencies, configuration, and more. Check out the useful Docker best practices documentation for creating quality Dockerfiles. 

Here is a basic Dockerfile example for setting up an Apache web server.

Create a Dockerfile in your project:

FROM httpd:2.4
COPY ./public-html/ /usr/local/apache2/htdocs/

Next, run the commands to build and run the Docker image:

$ docker build -t my-apache2 .
$ docker run -dit --name my-running-app -p 8080:80 my-apache2

Visit http://localhost:8080 to see it working.

Step 3: Build your Docker image

The Dockerfile that was just created allows us to start building our first Docker container image. The docker build command initiated in the previous step started the new Docker image using the Dockerfile and related “context,” which is the set of files located in the specified PATH or URL. The build process can refer to any of the files in the context. Docker images begin with a base image that must be downloaded from a repository to start a new image project.

Step 4: Run your Docker container

To run a new container, start with the docker run command, which runs a command in a new container. The command pulls an image if needed and then starts the container. By default, when you create or run a container using docker create or docker run, the container does not expose any of its ports to the outside world. To make a port available to services outside of Docker you must use the --publish or -p flag commands. This creates a firewall rule in the host, mapping a container port to a port on the Docker host to the outside world. 

Step 5: Access your web application

To access a web application running inside a Docker container, you need to publish the container’s port to the host. This can be done using the docker run command with the --publish or -p flag. The format of the --publish command is [host_port]:[container_port].

Here is an example of how to run a container and publish its port using the Docker CLI:

$ docker run -d -p 8080:80 docker/welcome-to-docker

In this command, the first 8080 refers to the host port. This is the port on your local machine that will be used to access the application running inside the container. The second 80 refers to the container port. This is the port that the application inside the container listens on for incoming connections. Hence, the command binds to port 8080 of the host to port 80 on the container system.

After running the container with the published port, you can access the web application by opening a web browser and visiting http://localhost:8080.

You can also use Docker Compose to run the container and publish its port. Here is an example of a compose.yaml file that does this:

services:
 app:
   image: docker/welcome-to-docker
   ports:
     - 8080:80

After creating this file, you can start the application with the docker compose up command. Then, you can access the web application at http://localhost:8080.

Step 6: Make changes and update

Updating a Docker application in a container requires several steps. Using the command-line interface, first stop the container with the docker stop command, then remove it with the docker rm command. Next, start a new, updated container with a new docker run command. The old container must be stopped and removed first because it is already bound to the host port; only one process on the machine, containers included, can listen on a specific port at a time.
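
Using the container and image names from the earlier Apache example, the update cycle might look like this:

$ docker stop my-running-app
$ docker rm my-running-app
$ docker build -t my-apache2 .
$ docker run -dit --name my-running-app -p 8080:80 my-apache2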

Conclusion

In this blog post, we learned about how Docker brings valuable benefits to web developers to speed up and improve their operations and creativity, and we touched on how web developers can get started with the platform on Day One, including basic instructions on setting up Docker quickly to start using it for web development.

Docker delivers streamlined workflows for web development due to its lightweight architecture and broad collaboration, application design, scalability, and other benefits. Docker expands the capabilities of web application developers, giving them flexible tools for everything from building better code to testing, monitoring, and deploying reliable code more quickly. 

Subscribe to our newsletter to stay up-to-date about Docker and its latest uses and innovations. 

Learn more

Deploying a Next.js App on HTTPS with Docker Using NGINX as a Reverse Proxy

By: Adesoji Alu
August 28, 2024 at 12:49
In this tutorial, we’ll walk through setting up a Docker Compose environment where NGINX serves as a reverse proxy for a Next.js application. This setup will handle SSL connections, providing a secure HTTPS endpoint for your application. Additionally, we’ll make the code sections on this page copyable for your convenience. Step 1: Dockerizing Your Next.js […]

Docker Best Practices: Understanding the Differences Between ADD and COPY Instructions in Dockerfiles

Par : Jay Schmidt
8 août 2024 à 13:30

When you search for “Dockerfile best practices,” one of the suggestions you will find is that you always use the COPY instruction instead of the ADD instruction when adding files into your Docker image.

This blog post will explore why this suggestion exists by providing additional detail on the functionality of these two instructions. Once you understand these concepts, you may find scenarios where you can benefit from ignoring the suggestion and using the ADD command instead of COPY.

Understanding file system build context

Before diving into the differences between ADD and COPY, it’s important to understand the concept of build context. The build context is the set of files and directories that are accessible to the Docker engine when building an image. When you run a docker build command, Docker sends the content of the specified context directory (and its subdirectories) to the Docker daemon. This context forms the scope within which the COPY and ADD instructions operate.

COPY instruction

The COPY instruction is straightforward and does exactly what its name implies: It copies files and directories from a source within the build context to a destination layer in the Docker image. This instruction can be used to copy both files and directories, and all paths on the host are relative to the root of the build context.

Syntax:

COPY <src>... <dest>
  • <src>: The source files or directories on the host.
  • <dest>: The destination path inside the Docker image.

Key points

  • Basic functionality: COPY only supports copying files and directories from the host file system. It does not support URLs or automatic unpacking of compressed files.
  • Security: Because COPY only handles local files, it tends to be more predictable and secure than ADD, reducing the risk of unintentionally introducing files from external sources.
  • Use case: Best used when you need to include files from your local build context into the Docker image without any additional processing.

Example:

COPY ./app /usr/src/app
COPY requirements.txt /usr/src/app/

In this example, the contents of the local app directory are copied into the /usr/src/app directory inside the Docker image being built. The second command copies the requirements.txt file into the /usr/src/app directory as well.

ADD instruction

The ADD instruction provides the same functionality that the COPY instruction does, but it also has additional functionality that, if misunderstood, can introduce complexity and potential security risks.

Syntax:

ADD <src>... <dest>
  • <src>: The source files (directories or URLs).
  • <dest>: The destination path inside the Docker image.

Key points

  • Extended functionality: In addition to copying local files and directories from the build context, ADD provides the following advanced functionality:
    • Handle URLs: When supplied as a source, the file referenced by a URL will be downloaded to the current Docker image layer at the supplied destination path.
    • Extract archives: When supplied as a source, ADD will automatically unpack and expand archives to the current Docker image layer at the supplied destination path.
  • Flexibility vs. security: Although ADD is more flexible, it does introduce risk. Downloading external URLs into the build process may allow malicious code or contents to be brought into the process. Using ADD with archives may result in unintended consequences if you do not understand how it handles archives.
  • Use case: ADD should only be used when you need specific functionality that it provides and are willing to manage the potential security issues arising from this usage.

Example:

ADD https://example.com/file.tar.gz /usr/src/app/
ADD my-archive.tar.gz /usr/src/app/

In this example, the build process first downloads https://example.com/file.tar.gz and extracts the contents into /usr/src/app in the Docker image layer. In the next step, it takes the local file my-archive.tar.gz and extracts it into the Docker image layer under /usr/src/app.

When to use COPY vs. ADD

  • For most use cases, COPY is the better choice due to its simplicity and security. This instruction allows you to transfer files and directories from your local context into the Docker image you are building.
  • Use ADD only when you need the additional capabilities it offers, but be mindful of potential security implications.
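
If you do use ADD to fetch a remote file, newer Dockerfile syntax lets you pin the expected content with a checksum, which mitigates some of the risk of pulling external sources (the digest below is a placeholder; substitute the real SHA-256 of the file):

ADD --checksum=sha256:<expected-digest> https://example.com/file.tar.gz /usr/src/app/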

Remote contexts

In addition to traditional file system contexts, Docker also supports remote contexts, which can be particularly useful in cloud environments or for building images from code repositories directly. These include:

  • Git repositories: You can specify a Git repository URL as the build context, allowing Docker to clone the repository and use its content as the context.
docker build https://github.com/username/repository.git#branch
  • Remote URLs: Docker can use remote URLs for the build context. This is useful for building images directly from archives available online.
docker build http://example.com/context.tar.gz
  • OCI images: You can use an OCI image as the build context, which is useful for using pre-built images as the starting point for new builds.
docker build oci://registry.example.com/image:tag

How ADD and COPY behave in remote contexts

Note that both ADD and COPY behave slightly differently when used in a remote context.

Using COPY with remote contexts

COPY still operates within the scope of the build context, and can copy files and directories from the cloned repository into the Docker image. For example, when using a Git repository as the build context, COPY can copy files and directories from the cloned repository into the Docker image. It does not support copying files from URLs or other remote sources directly.

Example with Git repository as build context:

# Using a Git repository as build context
COPY ./src /app/src

In this case, COPY will copy the src directory from the Git repository (the build context) to /app/src in the Docker image.

Example with URL build context:

# Using an archive from a URL
COPY ./src /app/src

In this case, COPY will copy the src directory from the extracted archive (the build context) to /app/src in the Docker image.

Example with OCI image as build context:

# Using an OCI image as build context
COPY /path/in/oci/image /app/path

In this case, COPY will copy the contents from the specified path within the OCI image to the specified destination path in the Docker image.

Using ADD with remote contexts

The ADD instruction can still be used to download files and extract archives as well as copy files from the build context. Note that all the caveats provided about the ADD instruction above apply here as well.

Example with Git repository as build context:

# Using a Git repository as build context
ADD https://example.com/data.tar.gz /data
ADD ./src /app/src

In this example, ADD will download and extract data.tar.gz from the URL into the /data directory in the Docker image. It will also copy the src directory from the Git repository (the build context) to /app/src in the Docker image.

Example with URL build context:

# Using an archive from a URL
ADD https://example.com/data.tar.gz /data
ADD ./src /app/src

In this example, ADD will download and extract data.tar.gz from the URL into the /data directory in the Docker image. It will also copy the src directory from the downloaded/unpacked URL (the build context) to /app/src in the Docker image.

Example with OCI image as build context:

# Using an OCI image as build context
ADD https://example.com/data.tar.gz /data
ADD /path/in/oci/image /app/path

In this scenario, ADD will download and extract data.tar.gz from the URL into the /data directory in the Docker image. It will also copy the contents from the specified path within the OCI image to the specified destination path in the Docker image.

COPY vs. ADD tl;dr:

  • Prefer COPY: For most use cases, COPY is the better choice due to its simplicity and security. Use it to transfer files and directories from your local context or a remote context like a Git repository to the Docker image.
  • Use ADD with caution: Opt for ADD only when you need its additional functionalities, like downloading files from URLs or automatically extracting archives (Figure 1). Always be mindful of the potential security implications when using ADD.
Figure 1: Deciding between COPY and ADD.

Conclusion

Understanding the differences between ADD and COPY instructions in Dockerfiles and how they can be affected by build context can help you build more efficient and secure Docker images. Although COPY offers a straightforward way to include local files, ADD provides additional flexibility with the cost of increased complexity and potential security risks.

Learn more

Introducing Docker Build Checks: Optimize Dockerfiles with Best Practices

July 29, 2024 at 14:32

Today, we’re excited to announce the release of Docker Build checks with Docker Desktop 4.33. Docker Build checks help your team learn and follow best practices for building container images. When you run a Docker Build, you will get a list of warnings for any check violations detected in your build. Taking a proactive approach and resolving Build warnings and issues early will save you time and headaches downstream. 

Why did we create Docker Build checks?

During conversations with developers, we found that many struggle to learn and follow the best practices for building container images. According to our 2024 State of Application Development Report, 35% of Docker users reported creating and editing Dockerfiles as one of their top three tasks. However, 55% of respondents reported that creating Dockerfiles is the task they most often seek support for.

Developers often don’t have the luxury of reading through the Docker Build docs, making the necessary changes to get things working, and then moving on. A Docker Build might “work” when you run docker build, but a poorly written Dockerfile may introduce quality issues. For example, it may:

  • Be hard to maintain or update
  • Contain hidden and unexpected bugs
  • Have sub-optimal performance

In our conversations with Docker users, we heard that they want to optimize their Dockerfiles to improve build performance, aren’t aware of current best practices, and would like to be guided as they build. 

Investigating and fixing build issues wastes time. We created Docker Build checks to empower developers to write well-structured Dockerfiles from the get-go and learn from existing best practices. With Build checks, your team spends less time on build issues and more on innovation and coding.   

Why should you use Docker Build checks? 

You want to write better Dockerfiles and save time! 

We have collected a set of best practices from the community of build experts and codified them into Docker Build tooling. You can use Docker Build checks to evaluate all stages of your local and CI workflows, including multi-stage builds and Bake, and deep dive in the Docker Desktop Builds view. You can also choose which rules to skip. 

You can access Docker Build checks in the CLI and in the Docker Desktop Builds view. 

More than just linting: Docker Build checks are powerful and fast 

Linting tools typically just evaluate the text files against a set of rules. As a native part of Docker Build, the rules in Docker Build checks are more powerful and accurate than just linting. Docker Build checks evaluate the entire build, including the arguments passed in and the base images used. These checks are quick enough to be run in real-time as you edit your Dockerfile. You can quickly evaluate a build without waiting for a full build execution. 

Check your local builds

A good practice is to evaluate a new or updated Dockerfile before committing or sharing your changes. Running docker build will now give you an overview of issues and warnings in your Dockerfile.

Figure 1: A Docker Build with four check warnings displayed.

To get more information about these specific issues, you can specify the debug flag to the Docker CLI with docker --debug build. This information includes the type of warning, where it occurs, and a link to more information on how to resolve it. 

Figure 2: Build debug output for the check warnings.

Quickly check your build

Running these checks during a build is great, but it can be time-consuming to wait for the complete build to run each time when you’re making changes or fixing issues. For this reason, we added the --check flag as part of the build command. 

# The check flag can be added anywhere as part of your build command
docker build . --check
docker build --check .
docker build --build-arg VERSION=latest --platform linux/arm64 . --check

As illustrated in the following figure, appending the flag to your existing build command will do the same full evaluation of the build configuration without executing the entire build. This faster feedback typically completes in less than a second, making for a smoother development process. 

Figure 3: Running check of build.

Check your CI builds

By default, running a Docker build with warnings will not cause the build to fail (return a non-zero exit code). However, to catch any regressions in your CI builds, add the following declarations to instruct the checks to generate errors. 

# syntax=docker/dockerfile:1
# check=error=true

FROM alpine
CMD echo "Hello, world!"

Checking multi-stage builds in CI

During a build, only the specified stage/target, including its dependencies, is executed. We recommend adding a check step for each stage in your workflow to do a complete evaluation of your Dockerfile. This is similar to how you would run automated tests before executing the full build.

If any warnings are detected, it will return a non-zero exit code, which will cause the workflow to fail, therefore catching any issues.

docker build --check .

Checking builds in Docker Build Cloud

Of course, this also works seamlessly with Docker Build Cloud, both locally and through CI. Use your existing cloud builders to evaluate your builds. Your team now has the combined benefit of Docker Build Cloud performance with the reassurance that the build will align with best practices. In fact, as we expand our checks, you should see even better performance from your Docker Build Cloud builds.

Figure 4: Running checks in Docker Build Cloud.

Configure rules

You have the flexibility to configure rules in Build checks with a skip argument. You can also specify skip=all or skip=none to toggle the rules on and off. Here’s an example of skipping the JSONArgsRecommended and StageNameCasing rules:

# syntax=docker/dockerfile:1
# check=skip=JSONArgsRecommended,StageNameCasing

FROM alpine AS BASE_STAGE
CMD echo "Hello, world!"
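
Per the current syntax, multiple check parameters can be combined on one line with semicolons, for example skipping one rule while still treating any remaining violations as errors (worth verifying against the Build checks documentation for your Dockerfile syntax version):

# syntax=docker/dockerfile:1
# check=skip=JSONArgsRecommended;error=true

FROM alpine AS base_stage
CMD echo "Hello, world!"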

Dive deep into Docker Desktop Builds view

In Docker Desktop Builds view, you can see the output of the build warnings. Locating the cause of warnings in Dockerfiles and understanding how to resolve them quickly is now easy.

As with build errors, warnings are shown inline with your Dockerfile when inspecting a build in Docker Desktop:

Figure 5: Build checks warnings in Docker Desktop Builds view.

What’s next? 

More checks

We are excited about the new Build checks to help you apply best practices to your Dockerfiles, but this is just the start. In addition to the current set of checks, we plan on adding even more to provide a more comprehensive evaluation of your builds. Further, we look forward to including custom checks and policies for your Docker builds.

IDE integration

The earlier you identify issues in your builds, the easier and less costly it is to resolve them. We plan to integrate Build checks with your favorite IDEs so you can get real-time feedback as you type.

Figure 6: Check violations displaying in VS Code.

GitHub Actions and Docker Desktop

You can already see Build check warnings in Docker Desktop, and more detailed insights are coming there soon. As you may have heard, we recently announced the beta release of Inspecting Docker Builds in GitHub Actions, and we plan to build on this new functionality to include support for investigating check warnings.

Get started now

To get started with Docker Build checks, upgrade to Docker Desktop 4.33 today and try them out with your existing Dockerfiles. Head over to our documentation for a more detailed breakdown of Build checks. 

Learn more

How to Create Dockerfiles with GenAI 

By: Docker Labs
July 29, 2024 at 14:00

This ongoing Docker Labs GenAI series will explore the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing things as open source so you can play, explore, and hack with us, too.

As we learn how to work more effectively with generative AI, one of the tasks we return to is Dockerfile generation. Since we can equip our AI assistants with tools, can we reframe the problem of content generation as an interaction between LLMs and tools?


If you ask ChatGPT to write a Dockerfile for your project, it will initiate a conversation with you and ask you to extract some details from your project (Figure 1).

Figure 1: Asking ChatGPT for help writing a Dockerfile.

However, we can also equip our assistant with a tool to extract this information from the project directly. Instead of asking the user, the assistant can call a tool.

Using tools

Here’s an alternative framing. Let’s prompt the assistant with two high-level instructions whenever it is asked to create a Dockerfile.

* First, analyze the project to determine how it should be built. 
* Once the analysis is complete, create a Dockerfile to build that project.

In addition to these new prompts, we will also supply the LLM with two function definitions.

[{"name": "analyze_project",
 "description": "Analyze a project to determine how it should be built"},
 {"name": "write_files",
  "description": "write a set of files to my project",
  "parameters": {...}}]

With these two functions, the assistant will have enough agency that it can write a Dockerfile directly to a user’s project. Unfortunately, without best practices, the actual Dockerfile content is never very good. 

Here’s an example of the kind of Dockerfile that gets generated when the assistant knows the kind of project but not much else.

# Use the official Node.js image from the Docker Hub
FROM node:14

# Create and change to the app directory
WORKDIR /usr/src/app

# Copy the package.json and package-lock.json files
COPY package*.json ./

# Install dependencies
RUN npm install

# Copy the rest of the application code
COPY . .

# Expose the port your app runs on
EXPOSE 3000

# Define the command to run your app
CMD ["npm", "start"]

This example shows several standard Dockerfile authoring problems.

  • Uses out-of-date base images.
  • Does not employ multi-stage builds.
  • Skips newer Docker features such as cache mounts.

So, what can we do about this? Let’s extract some best practices for authoring NPM project Dockerfiles.

Write Dockerfiles for NPM projects using three stages.  Do these three steps sequentially.
* the first node dependencies stage should be called "deps" 
  and it should fetch the runtime dependencies using npm ci
  with the --omit=dev flag.
* The second Dockerfile stage should be called build  and it should be based on the deps stage. 
  It should run npm ci and then npm build
* The third Dockerfile stage should select a base image 
  that is recommended by Scout and use that in the FROM line.
  This stage should then do three things.
	1. it copies the node_modules directory from the deps stage.
    2. it copies the dist directory from the build stage.
    3. it then runs npm start

If you need to use a RUN statement containing npm ci always 
add the argument --mount=type=cache,target=/root/.npm to the RUN instruction.  
The --mount argument should be placed between the word RUN and the npm command.
This will cache the npm packages in the docker build cache and speed up the build process.
After selecting a base image, get a recommended tag. This recommendation will change over time but start with what is recommended now.

We’ll include this content in any conversation occurring in the context of an NPM project. We’ll also add a third function.

[{"name": "docker_scout_tag_recommendation",
  "description": "get a recommended tag",
  "parameters": {...}}]

At the time of writing, here’s the output from our assistant. In this example, the assistant is using the GPT-4 model, but we are seeing the same results with other local models.

# Stage 1 - Downloading dependencies
FROM node:22-slim AS deps
WORKDIR /usr/src/app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev

# Stage 2 - Building application
FROM deps AS build
RUN --mount=type=cache,target=/root/.npm npm ci && npm build

# Stage 3 - Using a recommended base image from Scout
FROM node:22-slim
WORKDIR /usr/src/app
COPY --from=deps /usr/src/app/node_modules ./node_modules
COPY --from=build /usr/src/app/dist ./dist
CMD [ "npm", "start" ]

This is a big improvement. The three inputs that underlie this improvement are listed below.

  • The two top-level instructions that anchor the process our assistant should use:
* First, analyze the project to determine how it should be built. 
* Once the analysis is complete, create a Dockerfile to build that project.
  • The Markdown content that summarizes our Docker specific best practices.
  • The functions (analyze_project, write_files, and docker_scout_tag_recommendation).

Distilling best practices is certainly not easy, but a little can go a long way. Still, it is hard to know whether this can be considered an improvement over just using code templates. Our continuing investigation centers around whether generative AI will help us to more easily adapt these models to new knowledge, new features, and new project types.

Conversation loop

It’s instructive to go through the actual messages passed between agents here. It’s particularly interesting to look at the interplay between the Assistant and the Tool agents. Near the end of the exchange, there is an unexpected twist where the write_files tool complains that it was passed an invalid JSON file. In this case, the assistant actually did generate invalid JSON and the tool was right to complain. However, the assistant also responded to this error by fixing the JSON and calling the tool a second time! That was unscripted.

Another interesting example of an unscripted moment is the last message from the assistant. The assistant correctly determines that the task is now complete and generates a summary of the work that was done and why.

Transcript

System:

You are an assistant who specializes in authoring Dockerfiles for projects. Since you are an expert and know about their project, be definitive about recommendations.

User:

  • First, analyze the project to determine how it should be built.
  • Once the analysis is complete, create a Dockerfile to build that project.

Assistant:

[{:function {:arguments "{}" :name "analyze_project"}}]

Tool (analyze_project):

This is an NPM project.

System:

Write Dockerfiles for NPM projects using three stages. Do these three steps sequentially …

… (system prompts excluded for brevity) ..

Assistant:

[{:function {:arguments "{
  \"repository\": \"node\"
}"
             :name "docker_scout_tag_recommendation"}}]

Tool (docker_scout_tag_recommendation):

22-slim

Assistant:

[{:function {:arguments "{
  \"files\": [
    {
      \"path\": \"Dockerfile\",
      \"content\": `
# Stage 1 - the build process
FROM node:22-slim AS deps
WORKDIR /usr/src/app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
# Stage 2 - the production environment
FROM deps AS build
RUN npm ci && npm run build
# Stage 3 - the production environment
FROM node:22-slim
WORKDIR /usr/src/app
COPY --from=deps /usr/src/app/node_modules ./node_modules
COPY --from=build /usr/src/app/dist ./dist
CMD [ \"npm\", \"start\" ]`,
      \"executable\": false
    }
  ]
}"
             :name "write_files"}
  :id "call_6VVCLiXcIBNyEGPQwftR7xEV"}]

Tool:

function call write_files failed call exited with non-zero code (1):  call exited with non-zero code (1): #error {\r\n :cause Unexpected character (‘`’)

Assistant:

I apologize for the error. Let me try that again.

[{:function {:arguments "{
  \"files\": [
    {
      \"path\": \"Dockerfile\",
      \"content\": \"FROM node:22-slim AS deps\\nWORKDIR /usr/src/app\\nCOPY package*.json ./\\nRUN --mount=type=cache,target=/root/.npm npm ci --omit=dev\\n\\nFROM deps AS build\\nRUN npm ci && npm run build\\n\\nFROM node:22-slim\\nWORKDIR /usr/src/app\\nCOPY --from=deps /usr/src/app/node_modules ./node_modules\\nCOPY --from=build /usr/src/app/dist ./dist\\nCMD [ \\\"npm\\\", \\\"start\\\" ]\",
      \"executable\": false
    }
  ]
}"
             :name "write_files"}}]

Tool (write_files):

wrote 1 files: Dockerfile

Assistant:

I have written the Dockerfile for your NPM project. You can find it in your project folder. This Dockerfile uses a multi-stage build process to efficiently build and run your application within a Docker container. The first stage fetches the runtime dependencies, the second stage builds the application, and the final stage runs the application. Please let me know if you need further assistance.

Summary

For a generative AI assistant to improve its ability to generate Dockerfile content, we have identified both tools and stack-specific best practice content as key enablers. Incorporating these elements into a generative AI based assistant can serve as an alternative to code templates. We continue to evaluate whether this approach allows us to more easily adapt to new project types.

As always, feel free to follow along in our public repo and reach out. Everything we’ve discussed in this blog post is available for you to try on your own projects.

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more

Docker Best Practices: Choosing Between RUN, CMD, and ENTRYPOINT

By: Jay Schmidt
July 15, 2024 at 14:00

Docker’s flexibility and robustness as a containerization tool come with a complexity that can be daunting. Multiple methods are available to accomplish similar tasks, and users must understand the pros and cons of the available options to choose the best approach for their projects.

One confusing area concerns the RUN, CMD, and ENTRYPOINT Dockerfile instructions. In this article, we will discuss the differences between these instructions and describe use cases for each.


RUN

The RUN instruction is used in Dockerfiles to execute commands that build and configure the Docker image. These commands are executed during the image build process, and each RUN instruction creates a new layer in the Docker image. For example, if you create an image that requires specific software or libraries installed, you would use RUN to execute the necessary installation commands.

The following example shows how to instruct the Docker build process to update the apt cache and install Apache during an image build:

RUN apt update && apt -y install apache2

RUN instructions should be used judiciously to keep the image layers to a minimum, combining related commands into a single RUN instruction where possible to reduce image size.

CMD

The CMD instruction specifies the default command to run when a container is started from the Docker image. If no command is specified during the container startup (i.e., in the docker run command), this default is used. CMD can be overridden by supplying command-line arguments to docker run.

CMD is useful for setting default commands and parameters that can be easily overridden. It is often used in images as a way of defining default run parameters and can be overridden from the command line when the container is run.

For example, by default, you might want a web server to start, but users could override this to run a shell instead:

CMD ["apache2ctl", "-DFOREGROUND"]

Users can start the container with docker run -it <image> /bin/bash to get a Bash shell instead of starting Apache.  

ENTRYPOINT

The ENTRYPOINT instruction sets the default executable for the container. It is similar to CMD, but unlike CMD it is not overridden by the command-line arguments passed to docker run. Instead, any command-line arguments are appended to the ENTRYPOINT command.

Note: Use ENTRYPOINT when you need your container to always run the same base command, and you want to allow users to append additional commands at the end. 

ENTRYPOINT is particularly useful for turning a container into a standalone executable. For example, suppose you are packaging a custom script that requires arguments (e.g., “my_script extra_args”). In that case, you can use ENTRYPOINT to always run the script process (“my_script”) and then allow the image users to specify the “extra_args” on the docker run command line. You can do the following:

ENTRYPOINT ["my_script"]

Combining CMD and ENTRYPOINT

The CMD instruction can be used to provide default arguments to an ENTRYPOINT if it is specified in the exec form. This setup allows the entry point to be the main executable and CMD to specify additional arguments that can be overridden by the user.

For example, you might have a container that runs a Python application where you always want to use the same application file but allow users to specify different command-line arguments:

ENTRYPOINT ["python", "/app/my_script.py"]
CMD ["--default-arg"]

Running docker run myimage --user-arg executes python /app/my_script.py --user-arg.

The following table provides an overview of these commands and use cases.

Command description and use cases

Command | Description | Use Case
CMD | Defines the default executable of a Docker image. It can be overridden by docker run arguments. | Utility images that let users pass different executables and arguments on the command line.
ENTRYPOINT | Defines the default executable. It can only be overridden with the --entrypoint flag of docker run. | Images built for a specific purpose where overriding the default executable is not desired.
RUN | Executes commands to build image layers. | Building an image.

What is PID 1 and why does it matter?

In the context of Unix and Unix-like systems, including Docker containers, PID 1 refers to the first process started during system boot. All other processes are then started by PID 1, which in the process tree model is the parent of every process in the system. 

In Docker containers, the process that runs as PID 1 is crucial, because it is responsible for managing all other processes inside the container. Additionally, PID 1 is the process that reviews and handles signals from the Docker host. For example, a SIGTERM into the container will be caught and processed by PID 1, and the container should gracefully shut down.

When commands are executed in Docker using the shell form, a shell process (/bin/sh -c) typically becomes PID 1. However, the shell does not properly handle these signals, potentially leading to unclean shutdowns of the container. In contrast, when using the exec form, the command runs directly as PID 1 without involving a shell, which allows it to receive and handle signals directly.

This behavior ensures that the container can gracefully stop, restart, or handle interruptions, making the exec form preferable for applications that require robust and responsive signal handling.
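As a minimal illustration (the application path is hypothetical), here are the two alternative ways of writing the same instruction:

# Shell form: /bin/sh -c becomes PID 1, so a SIGTERM from `docker stop`
# goes to the shell and may never reach the Python process
ENTRYPOINT python /app/server.py

# Exec form: the Python process itself runs as PID 1 and receives
# SIGTERM directly, allowing a graceful shutdown
ENTRYPOINT ["python", "/app/server.py"]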

Shell and exec forms

In the previous examples, we used two ways to pass arguments to the RUN, CMD, and ENTRYPOINT instructions. These are referred to as shell form and exec form. 

Note: The key visual difference is that the exec form is passed as a comma-delimited array of commands and arguments with one argument/command per element. Conversely, shell form is expressed as a string combining commands and arguments. 

Each form has implications for executing commands within containers, influencing everything from signal handling to environment variable expansion. The following table provides a quick reference guide for the different forms.

Shell and exec form reference

Form | Description | Example
Shell form | Takes the form of <INSTRUCTION> <COMMAND>. | CMD echo TEST or ENTRYPOINT echo TEST
Exec form | Takes the form of <INSTRUCTION> ["EXECUTABLE", "PARAMETER"]. | CMD ["echo", "TEST"] or ENTRYPOINT ["echo", "TEST"]

In the shell form, the command is run in a subshell, typically /bin/sh -c on Linux systems. This form is useful because it allows shell processing (like variable expansion, wildcards, etc.), making it more flexible for certain types of commands (see this shell scripting article for examples of shell processing). However, it also means that the process running your command isn’t the container’s PID 1, which can lead to issues with signal handling because signals sent by Docker (like SIGTERM for graceful shutdowns) are received by the shell rather than the intended process.

The exec form does not invoke a command shell. This means the command you specify is executed directly as the container’s PID 1, which is important for correctly handling signals sent to the container. Additionally, this form does not perform shell expansions, so it’s more secure and predictable, especially for specifying arguments or commands from external sources.

Putting it all together

To illustrate the practical application and nuances of Docker’s RUN, CMD, and ENTRYPOINT instructions, along with the choice between shell and exec forms, let’s review some examples. These examples demonstrate how each instruction can be utilized effectively in real-world Dockerfile scenarios, highlighting the differences between shell and exec forms. 

Through these examples, you’ll better understand when and how to use each directive to tailor container behavior precisely to your needs, ensuring proper configuration, security, and performance of your Docker containers. This hands-on approach will help consolidate the theoretical knowledge we’ve discussed into actionable insights that can be directly applied to your Docker projects.

RUN instruction

For RUN, used during the Docker build process to install packages or modify files, choosing between shell and exec form can depend on the need for shell processing. The shell form is necessary for commands that require shell functionality, such as pipelines or file globbing. However, the exec form is preferable for straightforward commands without shell features, as it reduces complexity and potential errors.

# Shell form, useful for complex scripting
RUN apt-get update && apt-get install -y nginx

# Exec form, for direct command execution
RUN ["apt-get", "update"]
RUN ["apt-get", "install", "-y", "nginx"]

CMD and ENTRYPOINT

These instructions control container runtime behavior. Using exec form with ENTRYPOINT ensures that the container’s main application handles signals directly, which is crucial for proper startup and shutdown behavior.  CMD can provide default parameters to an ENTRYPOINT defined in exec form, offering flexibility and robust signal handling.

# ENTRYPOINT with exec form for direct process control
ENTRYPOINT ["httpd"]

# CMD provides default parameters, can be overridden at runtime
CMD ["-D", "FOREGROUND"]

Signal handling and flexibility

Using ENTRYPOINT in exec form and CMD to specify parameters ensures that Docker containers can handle operating system signals gracefully, respond to user inputs dynamically, and maintain secure and predictable operations. 

This setup is particularly beneficial for containers that run critical applications needing reliable shutdown and configuration behaviors. The following table shows key differences between the forms.

Key differences between shell and exec

Aspect | Shell Form | Exec Form
Form | Commands without [] brackets, run by the container's shell (e.g., /bin/sh -c). | Commands with [] brackets, run directly rather than through a shell.
Variable substitution | Inherits environment variables from the shell, such as $HOME and $PATH. | Does not inherit shell environment variables, but behaves the same for ENV instruction variables.
Shell features | Supports sub-commands, piping output, chaining commands, I/O redirection, etc. | Does not support shell features.
Signal trapping & forwarding | Most shells do not forward process signals to child processes. | Directly traps and forwards signals like SIGINT.
Usage with ENTRYPOINT | Can cause issues with signal forwarding. | Recommended due to better signal handling.
CMD as ENTRYPOINT parameters | Not possible with the shell form. | If the first item in the array is not a command, all items are used as parameters for the ENTRYPOINT.

Figure 1 provides a decision tree for using RUN, CMD, and ENTRYPOINT in building a Dockerfile.

Figure 1: Decision tree — RUN, CMD, ENTRYPOINT.

Figure 2 shows a decision tree to help determine when to use exec form or shell form.

Figure 2: Decision tree — exec vs. shell form.

Examples

The following section will walk through the high-level differences between CMD and ENTRYPOINT. In these examples, the RUN command is not included, given that the only decision to make there is easily handled by reviewing the two different formats.

Test Dockerfile

# Use syntax version 1.3-labs for Dockerfile
# syntax=docker/dockerfile:1.3-labs

# Use the Ubuntu 20.04 image as the base image
FROM ubuntu:20.04

# Run the following commands inside the container:
# 1. Update the package lists for upgrades and new package installations
# 2. Install the apache2-utils package (which includes the 'ab' tool)
# 3. Remove the package lists to reduce the image size
#
# This is all run in a HEREDOC; see
# https://www.docker.com/blog/introduction-to-heredocs-in-dockerfiles/
# for more details.
#
RUN <<EOF
apt-get update;
apt-get install -y apache2-utils;
rm -rf /var/lib/apt/lists/*;
EOF

# Set the default command
CMD ab

First build

We will build this image and tag it as ab.

$ docker build -t ab .

[+] Building 7.0s (6/6) FINISHED                                                               docker:desktop-linux
 => [internal] load .dockerignore                                                                              0.0s
 => => transferring context: 2B                                                                                0.0s
 => [internal] load build definition from Dockerfile                                                           0.0s
 => => transferring dockerfile: 730B                                                                           0.0s
 => [internal] load metadata for docker.io/library/ubuntu:20.04                                                0.4s
 => CACHED [1/2] FROM docker.io/library/ubuntu:20.04@sha256:33a5cc25d22c45900796a1aca487ad7a7cb09f09ea00b779e  0.0s
 => [2/2] RUN <<EOF (apt-get update;...)                                                                       6.5s
 => exporting to image                                                                                         0.0s
 => => exporting layers                                                                                        0.0s
 => => writing image sha256:99ca34fac6a38b79aefd859540f88e309ca759aad0d7ad066c4931356881e518                   0.0s
 => => naming to docker.io/library/ab 

Run with CMD ab

Without any arguments, we get a usage block as expected.

$ docker run ab
ab: wrong number of arguments
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
    -n requests     Number of requests to perform
    -c concurrency  Number of multiple requests to make at a time
    -t timelimit    Seconds to max. to spend on benchmarking
                    This implies -n 50000
    -s timeout      Seconds to max. wait for each response
                    Default is 30 seconds
<-- SNIP -->

However, if I run ab and include a URL to test, I initially get an error:

$ docker run --rm ab https://jayschmidt.us
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "https://jayschmidt.us": stat https://jayschmidt.us: no such file or directory: unknown.

The issue here is that the string supplied on the command line — https://jayschmidt.us — is overriding the CMD instruction, and that is not a valid command, resulting in an error being thrown. So, we need to specify the command to run:

$ docker run --rm  ab ab https://jayschmidt.us/
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking jayschmidt.us (be patient).....done


Server Software:        nginx
Server Hostname:        jayschmidt.us
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-ECDSA-AES256-GCM-SHA384,256,256
Server Temp Key:        X25519 253 bits
TLS Server Name:        jayschmidt.us

Document Path:          /
Document Length:        12992 bytes

Concurrency Level:      1
Time taken for tests:   0.132 seconds
Complete requests:      1
Failed requests:        0
Total transferred:      13236 bytes
HTML transferred:       12992 bytes
Requests per second:    7.56 [#/sec] (mean)
Time per request:       132.270 [ms] (mean)
Time per request:       132.270 [ms] (mean, across all concurrent requests)
Transfer rate:          97.72 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       90   90   0.0     90      90
Processing:    43   43   0.0     43      43
Waiting:       43   43   0.0     43      43
Total:        132  132   0.0    132     132

Run with ENTRYPOINT

In this run, we remove the CMD ab instruction from the Dockerfile, replace it with ENTRYPOINT ["ab"], and then rebuild the image.

This is similar to but different from the CMD command — when you use ENTRYPOINT, you cannot override the command unless you use the --entrypoint flag on the docker run command. Instead, any arguments passed to docker run are treated as arguments to the ENTRYPOINT.

$ docker run --rm  ab "https://jayschmidt.us/"
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking jayschmidt.us (be patient).....done


Server Software:        nginx
Server Hostname:        jayschmidt.us
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-ECDSA-AES256-GCM-SHA384,256,256
Server Temp Key:        X25519 253 bits
TLS Server Name:        jayschmidt.us

Document Path:          /
Document Length:        12992 bytes

Concurrency Level:      1
Time taken for tests:   0.122 seconds
Complete requests:      1
Failed requests:        0
Total transferred:      13236 bytes
HTML transferred:       12992 bytes
Requests per second:    8.22 [#/sec] (mean)
Time per request:       121.709 [ms] (mean)
Time per request:       121.709 [ms] (mean, across all concurrent requests)
Transfer rate:          106.20 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       91   91   0.0     91      91
Processing:    31   31   0.0     31      31
Waiting:       31   31   0.0     31      31
Total:        122  122   0.0    122     122

What about syntax?

In the example above, we use ENTRYPOINT ["ab"] syntax to wrap the command we want to run in square brackets and quotes. However, it is possible to specify ENTRYPOINT ab (without quotes or brackets). 

Let’s see what happens when we try that.

$ docker run --rm  ab "https://jayschmidt.us/"
ab: wrong number of arguments
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
    -n requests     Number of requests to perform
    -c concurrency  Number of multiple requests to make at a time
    -t timelimit    Seconds to max. to spend on benchmarking
                    This implies -n 50000
    -s timeout      Seconds to max. wait for each response
                    Default is 30 seconds
<-- SNIP -->

Your first thought will likely be to re-run the docker run command as we did for CMD ab above, that is, giving both the executable and the argument:

$ docker run --rm ab ab "https://jayschmidt.us/"
ab: wrong number of arguments
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
    -n requests     Number of requests to perform
    -c concurrency  Number of multiple requests to make at a time
    -t timelimit    Seconds to max. to spend on benchmarking
                    This implies -n 50000
    -s timeout      Seconds to max. wait for each response
                    Default is 30 seconds
<-- SNIP -->

This is because, with the shell form, the ENTRYPOINT is wrapped in a shell (/bin/sh -c), and any arguments passed on the docker run command line are ignored rather than appended; the entrypoint itself can only be replaced by explicitly adding the --entrypoint flag to the docker run command. The takeaway is to always use ENTRYPOINT when you want to force the use of a given executable in the container when it is run, and to prefer the exec form so that runtime arguments are appended as expected.
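If you genuinely need to replace the entrypoint at runtime, here is a quick sketch (reusing the ab image built above):

# --entrypoint replaces the image's ENTRYPOINT; everything after the
# image name is then passed as arguments to the new entrypoint
$ docker run --rm --entrypoint ab ab "https://jayschmidt.us/"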

Wrapping up: Key takeaways and best practices

The decision-making process involving the use of RUN, CMD, and ENTRYPOINT, along with the choice between shell and exec forms, showcases Docker’s intricate nature. Each command serves a distinct purpose in the Docker ecosystem, impacting how containers are built, operate, and interact with their environments. 

By selecting the right command and form for each specific scenario, developers can construct Docker images that are more reliable, secure, and optimized for efficiency. This level of understanding and application of Docker’s commands and their formats is crucial for fully harnessing Docker’s capabilities. Implementing these best practices ensures that applications deployed in Docker containers achieve maximum performance across various settings, enhancing development workflows and production deployments.

Learn more

Get Started with the Latest Updates for Dockerfile Syntax (v1.7.0)

By: Tonis Tiigi
April 9, 2024 at 15:16


Dockerfiles are fundamental tools for developers working with Docker, serving as a blueprint for creating Docker images. These text documents contain all the commands a user could call on the command line to assemble an image. Understanding and effectively utilizing Dockerfiles can significantly streamline the development process, allowing for the automation of image creation and ensuring consistent environments across different stages of development. Dockerfiles are pivotal in defining project environments, dependencies, and the configuration of applications within Docker containers.

With new versions of the BuildKit builder toolkit, Docker Buildx CLI, and Dockerfile frontend for BuildKit (v1.7.0), developers now have access to enhanced Dockerfile capabilities. This blog post delves into these new Dockerfile capabilities and explains how you can leverage them in your projects to further optimize your Docker workflows.


Versioning

Before we get started, here’s a quick reminder of how Dockerfile is versioned and what you should do to update it. 

Although most projects use Dockerfiles to build images, BuildKit is not limited only to that format. BuildKit supports multiple different frontends for defining the build steps for BuildKit to process. Anyone can create these frontends, package them as regular container images, and load them from a registry when you invoke the build.

With the new release, we have published two such images to Docker Hub: docker/dockerfile:1.7.0 and docker/dockerfile:1.7.0-labs.

To use these frontends, you need to specify a #syntax directive at the beginning of the file to tell BuildKit which frontend image to use for the build. Here we have set it to use the latest of the 1.x.x major version. For example:

#syntax=docker/dockerfile:1

FROM alpine
...

This means that BuildKit is decoupled from the Dockerfile frontend syntax. You can start using new Dockerfile features right away without worrying about which BuildKit version you’re using. All the examples described in this article will work with any version of Docker that supports BuildKit (the default builder as of Docker 23), as long as you define the correct #syntax directive on the top of your Dockerfile.

You can learn more about Dockerfile frontend versions in the documentation. 

Variable expansions

When you write Dockerfiles, build steps can contain variables that are defined using the build arguments (ARG) and environment variables (ENV) instructions. The difference between build arguments and environment variables is that environment variables are kept in the resulting image and persist when a container is created from it.

When you use such variables, you most likely use ${NAME} or, more simply, $NAME in COPY, RUN, and other commands.

You might not know that Dockerfile supports two forms of Bash-like variable expansion:

  • ${variable:-word}: Sets a value to word if the variable is unset
  • ${variable:+word}: Sets a value to word if the variable is set

Up to this point, these special forms were not that useful in Dockerfiles because the default value of ARG instructions can be set directly:

FROM alpine
ARG foo="default value"

If you are an expert in various shell applications, you know that Bash and other tools usually have many additional forms of variable expansion to ease the development of your scripts.

In Dockerfile v1.7, we have added:

  • ${variable#pattern} and ${variable##pattern} to remove the shortest or longest prefix from the variable’s value.
  • ${variable%pattern} and ${variable%%pattern} to remove the shortest or longest suffix from the variable’s value.
  • ${variable/pattern/replacement} to replace the first occurrence of a pattern
  • ${variable//pattern/replacement} to replace all occurrences of a pattern

How these rules are used might not be completely obvious at first. So, let’s look at a few examples seen in actual Dockerfiles.

For example, projects often can’t agree on whether versions for downloading your dependencies should have a “v” prefix or not. The following allows you to get the format you need:

# example VERSION=v1.2.3
ARG VERSION=${VERSION#v}
# VERSION is now '1.2.3'

In the next example, multiple variants are used by the same project:

ARG VERSION=v1.7.13
ADD https://github.com/containerd/containerd/releases/download/${VERSION}/containerd-${VERSION#v}-linux-amd64.tar.gz / 

To configure different command behaviors for multi-platform builds, BuildKit provides useful built-in variables like TARGETOS and TARGETARCH. Unfortunately, not all projects use the same values. For example, in containers and the Go ecosystem, we refer to 64-bit ARM architecture as arm64, but sometimes you need aarch64 instead.

ADD https://github.com/oven-sh/bun/releases/download/bun-v1.0.30/bun-linux-${TARGETARCH/arm64/aarch64}.zip /

In this case, the URL also uses a custom name for the AMD64 architecture. To pass a variable through multiple expansions, use another ARG definition with an expansion from the previous value. You could also write all the definitions on a single line, as ARG allows multiple parameters, but that may hurt readability.

ARG ARCH=${TARGETARCH/arm64/aarch64}
ARG ARCH=${ARCH/amd64/x64}
ADD https://github.com/oven-sh/bun/releases/download/bun-v1.0.30/bun-linux-${ARCH}.zip /

Note that the example above is written in a way that if a user passes their own --build-arg ARCH=value, then that value is used as-is.

Now, let’s look at how new expansions can be useful in multi-stage builds.

One of the techniques described in “Advanced multi-stage build patterns” shows how build arguments can be used so that different Dockerfile commands run depending on the build-arg value. For example, you can use that pattern if you build a multi-platform image and want to run additional COPY or RUN commands only for specific platforms. If this method is new to you, you can learn more about it from that post.

In summarized form, the idea is to define a global build argument and then define build stages that use the build argument value in the stage name while pointing to the base of your target stage via the build-arg name.

Old example:

ARG BUILD_VERSION=1

FROM alpine AS base
RUN …

FROM base AS branch-version-1
RUN touch version1

FROM base AS branch-version-2
RUN touch version2

FROM branch-version-${BUILD_VERSION} AS after-condition

FROM after-condition
RUN …

When using this pattern for multi-platform builds, one of the limitations is that all the possible values for the build-arg need to be defined in your Dockerfile. This is problematic, as we want the Dockerfile to be written so that it can build on any platform rather than being limited to a specific set.

You can see other examples here and here of Dockerfiles where dummy stage aliases must be defined for all architectures, and no other architecture can be built. Instead, the pattern we would like to use is that there is one architecture that has a special behavior, and everything else shares another common behavior.

With new expansions, we can write this to demonstrate running special commands only on RISC-V, which is still somewhat new and may need custom behavior:

#syntax=docker/dockerfile:1.7

ARG ARCH=${TARGETARCH#riscv64}
ARG ARCH=${ARCH:+"common"}
ARG ARCH=${ARCH:-$TARGETARCH}

FROM --platform=$BUILDPLATFORM alpine AS base-common
ARG TARGETARCH
RUN echo "Common build, I am $TARGETARCH" > /out

FROM --platform=$BUILDPLATFORM alpine AS base-riscv64
ARG TARGETARCH
RUN echo "Riscv only special build, I am $TARGETARCH" > /out

FROM base-${ARCH} AS base

Let’s look at these ARCH definitions more closely.

  • The first sets ARCH to TARGETARCH but removes riscv64 from the value.
  • Next, as we described previously, we don’t actually want the other architectures to use their own values but instead want them all to share a common value. So, we set ARCH to common except if it was cleared from the previous riscv64 rule. 
  • Now, if we still have an empty value, we default it back to $TARGETARCH.
  • The last definition is optional, as we would already have a unique value for both cases, but it makes the final stage name base-riscv64 nicer to read.

Additional examples of including multiple conditions with shared conditions, or conditions based on architecture variants can be found in this GitHub Gist page.

Comparing this example to the initial example of conditions between stages, the new pattern isn’t limited to just controlling the platform differences of your builds but can be used with any build-arg. If you have used this pattern before, then you can effectively now define an “else” clause, whereas previously, you were limited to only “if” clauses.

Copy with keeping parent directories

The following feature has been released in the “labs” channel. Define the following at the top of your Dockerfile to use this feature.

#syntax=docker/dockerfile:1.7-labs

When you are copying files in your Dockerfile, for example, do this:

COPY app/file /to/dest/dir/

This example means the source file is copied directly to the destination directory. If your source path was a directory, all the files inside that directory would be copied directly to the destination path.

What if you have a file structure like the following:

.
├── app1
│   ├── docs
│   │   └── manual.md
│   └── src
│       └── server.go
└── app2
    └── src
        └── client.go

You want to copy only files in app1/src, but so that the final files at the destination would be /to/dest/dir/app1/src/server.go and not just /to/dest/dir/server.go.

With the new COPY --parents flag, you can write:

COPY --parents /app1/src/ /to/dest/dir/  

This will copy the files inside the src directory and recreate the app1/src directory structure for these files.

Things get more powerful when you start to use wildcard paths. To copy the src directories for both apps into their respective locations, you can write:

COPY --parents */src/ /to/dest/dir/ 

This will create both /to/dest/dir/app1 and /to/dest/dir/app2, but it will not copy the docs directory. Previously, this kind of copy was not possible with a single command. You would have needed multiple copies for individual files (as shown in this example) or used some workaround with the RUN --mount instruction instead.

You can also use double-star wildcard (**) to match files under any directory structure. For example, to copy only the Go source code files anywhere in your build context, you can write:

COPY --parents **/*.go /to/dest/dir/

If you are thinking about why you would need to copy specific files instead of just using COPY ./ to copy all files, remember that your build cache gets invalidated when you include new files in your build. If you copy all files, the cache gets invalidated when any file is added or changed, whereas if you copy only Go files, only changes in these files influence the cache.

The new --parents flag is not only for COPY instructions from your build context; you can also use it in multi-stage builds when copying files between stages using COPY --from.

Note that with COPY --from syntax, all source paths are expected to be absolute, meaning that if the --parents flag is used with such paths, they will be fully replicated as they were in the source stage. That may not always be desirable, and instead, you may want to keep some parents but discard and replace others. In that case, you can use a special /./ relative pivot point in your source path to mark which parents you wish to copy and which should be ignored. This special path component resembles how rsync works with the --relative flag.

#syntax=docker/dockerfile:1.7-labs
FROM ... AS base
RUN ./generate-lot-of-files -o /out/
# /out/usr/bin/foo
# /out/usr/lib/bar.so
# /out/usr/local/bin/baz

FROM scratch
COPY --from=base --parents /out/./**/bin/ /
# /usr/bin/foo
# /usr/local/bin/baz

This example above shows how only bin directories are copied from the collection of files that the intermediate stage generated, but all the directories will keep their paths relative to the out directory. 

Exclusion filters

The following feature has been released in the “labs” channel. Define the following at the top of your Dockerfile to use this feature:

#syntax=docker/dockerfile:1.7-labs

Another related case when moving files in your Dockerfile with COPY and ADD instructions is when you want to move a group of files but exclude a specific subset. Previously, your only options were to use RUN --mount or try to define your excluded files inside a .dockerignore file. 

.dockerignore files, however, are not a good solution for this problem: they only exclude files from the client-side build context, they do not apply to builds from remote Git/HTTP URLs, and they are limited to one per Dockerfile. You should use them similarly to .gitignore, to mark files that are never part of your project, but not as a way to define your application-specific build logic.

With the new --exclude=[pattern] flag, you can now define such exclusion filters for your COPY and ADD commands directly in the Dockerfile. The pattern uses the same format as .dockerignore.

The following example copies all the files in a directory except Markdown files:

COPY --exclude=*.md app /dest/

You can use the flag multiple times to add multiple filters. The next example excludes Markdown files and also a file called README:

COPY --exclude=*.md --exclude=README app /dest/

Double-star wildcards exclude not only Markdown files in the copied directory but also in any subdirectory:

COPY --exclude=**/*.md app /dest/

As in .dockerignore files, you can also define exceptions to the exclusions with ! prefix. The following example excludes all Markdown files in any copied directory, except if the file is called important.md — in that case, it is still copied.

COPY --exclude=**/*.md --exclude=!**/important.md app /dest/

This double negative may be confusing initially, but note that this is a reversal of the previous exclude rule, and “include patterns” are defined by the source parameter of the COPY instruction.

When using --exclude together with previously described --parents copy mode, note that the exclude patterns are relative to the copied parent directories or to the pivot point /./ if one is defined. See the following directory structure for example:

assets
├── app1
│   ├── icons32x32
│   ├── icons64x64
│   ├── notes
│   └── backup
├── app2
│   └── icons32x32
└── testapp
    └── icons32x32

COPY --parents --exclude=testapp assets/./**/icons* /dest/

This command would create the directory structure below. Note that only directories with the icons prefix were copied, the root parent directory assets was skipped as it was before the relative pivot point, and additionally, testapp was not copied as it was defined with an exclusion filter.

dest
├── app1
│   ├── icons32x32
│   └── icons64x64
└── app2
    └── icons32x32

Conclusion

We hope this post gave you ideas for improving your Dockerfiles and that the patterns shown here will help you describe your build more efficiently. Remember that your Dockerfile can start using all these features today by defining the #syntax line on top, even if you haven’t updated to the latest Docker yet.

For a full list of other features in the new BuildKit, Buildx, and Dockerfile releases, check out the changelogs:

Thanks to community members @tstenner, @DYefimov, and @leandrosansilva for helping to implement these features!

If you have issues or suggestions you want to share, let us know in the issue tracker.

Learn more

5 Benefits of a Container-First Approach to Software Development

By: Ben Cotton
August 14, 2023 at 14:00

Cargo containers completely transformed the shipping industry and enabled the global commerce experience of today. Similarly, software containers simplify application development and deployment, which helps enable the cloud-native software architecture that powers the modern technology we all rely on. Although you can get benefits from containerizing your applications after the fact, you get the most value when you take a container-first approach. 

In this blog post, we discuss insights from Cracking the Code: Effectively Managing All of Those Applications, highlighting five benefits of embracing a container-first approach to software development.

  1. Consistent and reliable software performance

Inconsistency can be a major roadblock to progress. The all too familiar frustration of “it works on my machine” can cause software delivery delays and hinder collaboration. But with containers comes standardization. This ensures that software will perform consistently across the entire development process, regardless of the underlying environment.

Developers and infrastructure engineers also save a lot of time and cognitive energy on configuring and maintaining their environments and workstations. Containers have a small resource footprint, which means your infrastructure can do more with less. And, because each container includes the exact versions of software it needs, you don’t have to worry about conflicting dependencies.

  2. Fewer bugs

Bugs are the bane of every software developer’s existence. However, a container-first approach provides environmental parity. This means that the development, staging, and production environments remain consistent, reducing the likelihood of encountering bugs caused by disparities in underlying conditions. With containers, businesses can significantly reduce debugging time and enhance the overall quality of their software, leading to higher user satisfaction and a stronger competitive edge.

  3. Faster developer onboarding

The learning curve for new developers can often be steep, especially when dealing with complex software environments. Containers revolutionize developer onboarding by providing a replica of the exact environment in which an application will be tested and executed. This is irrespective of the developer’s local operating system or installed libraries. With containers, developers can hit the ground running, accelerating their productivity and contributing to the project’s success from day one.

  4. A more secure supply chain

The Consortium for Information & Software Quality estimates that poor software quality has cost the United States economy $2.41 trillion. Two of the top causes are criminals exploiting vulnerabilities and supply chain problems with third-party software. Containers can help.

Because the Dockerfile is a recipe for creating the container, you can use it to produce a software bill of materials (SBOM). This makes clear what dependencies — including the specific version — go into building the container. Cryptographically signed SBOMs let you verify the provenance of your dependencies, so you can be sure that the upstream library you’re using is the actual one produced by the project.

Using the SBOM, you can also monitor your fleet of containers for known vulnerabilities. When a new vulnerability is discovered, you can quickly tell which of your containers are affected, which makes the response quicker. Containers also provide isolation, micro-segmentation, and other zero-trust techniques, which reduce your attack surface and limit the impact of exploited vulnerabilities. A short SBOM command sketch follows this list.

  5. Improved productivity for faster time-to-market

The standardization, consistency, and security containers bring directly impact software delivery time. With fewer issues to deal with (bugs, compatibility issues, maintenance, etc.), developers can focus on more meaningful tasks and ultimately deliver solutions to customers faster. All of this helps development teams work more efficiently, collaborate effectively, and deliver higher-quality software.
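Returning to the SBOM point above, here is a minimal sketch of what this can look like in practice. The image name is a placeholder, and it assumes BuildKit attestations and Docker Scout are available in your setup.

# Build and push an image with an SBOM attestation attached
$ docker buildx build --sbom=true -t myorg/myapp:1.0 --push .

# Ask Docker Scout which known vulnerabilities affect the image
$ docker scout cves myorg/myapp:1.0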

Learn more

Dive deeper into the world of containers and the benefits of adopting a container-first model in your software development by downloading the full white paper, Cracking the Code: Effectively Managing All of Those Applications.

Using Heredocs in Dockerfiles: Simplify Your Image-Building Process

By: Ajeet Raina
March 26, 2023 at 12:41
When it comes to software development, Docker has emerged as a game-changer in recent years. Docker is an open-source platform that enables developers to build, package, and deploy applications in a consistent manner, regardless of the underlying infrastructure. One of the key components of a Docker image is the Dockerfile, which specifies the instructions for […]