5 Docker Best Practices for Faster Builds and Smaller Images

By applying a few smart Docker practices, you can build images faster and keep them clean, compact, and production-ready.




 

Introduction

 
You've written your Dockerfile, built your image, and everything works. But then you notice the image is over a gigabyte, rebuilds take minutes for even the smallest change, and every push or pull feels painfully slow.

This isn't unusual. These are the default outcomes when you write Dockerfiles without thinking about base image choice, build context, and caching. You don't need a complete overhaul to fix it. A few focused changes can shrink your image by 60 to 80% and turn most rebuilds from minutes into seconds.

In this article, we'll walk through five practical techniques for making your Docker images smaller, faster to build, and more efficient.

 

Prerequisites

 
To follow along, you'll need:

  • Docker installed
  • Basic familiarity with Dockerfiles and the docker build command
  • A Python project with a requirements.txt file (the examples use Python, but the principles apply to any language)

 

Selecting Slim or Alpine Base Images

 
Every Dockerfile starts with a FROM instruction that picks a base image. That base image is the foundation your app sits on, and its size becomes your minimum image size before you've added a single line of your own code.

For example, the official python:3.11 image is a full Debian-based image loaded with compilers, utilities, and packages that most applications never use.

# Full image — everything included
FROM python:3.11

# Slim image — minimal Debian base
FROM python:3.11-slim

# Alpine image — even smaller, musl-based Linux
FROM python:3.11-alpine

 

Now build an image from each and check the sizes:

docker images | grep python

 

You’ll see several hundred megabytes of difference just from changing one line in your Dockerfile. So which should you use?

  • slim is the safer default for most Python projects. It strips out unnecessary tools but keeps the C libraries that many Python packages need to install correctly.
  • alpine is even smaller, but it uses a different C library (musl instead of glibc) that can cause compatibility issues with certain Python packages. You may spend more time debugging failed pip installs than you save on image size.

Rule of thumb: start with python:3.1x-slim. Switch to alpine only if you're certain your dependencies are compatible and you need the extra size reduction.
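To see the difference yourself, you can pull all three variants and compare them side by side. This is a sketch using standard docker CLI flags; exact sizes vary by platform and release, so treat any numbers you see as approximate:

```
docker pull python:3.11
docker pull python:3.11-slim
docker pull python:3.11-alpine

# List only the python repository, showing just the tag and size columns
docker images python --format "table {{.Tag}}\t{{.Size}}"
```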

 

Ordering Layers to Maximize Cache Reuse

 
Docker builds images layer by layer, one instruction at a time. Once a layer is built, Docker caches it. On the next build, if nothing has changed that would affect a layer, Docker reuses the cached version and skips rebuilding it.

The catch: if a layer changes, every layer after it is invalidated and rebuilt from scratch.

This matters a lot for dependency installation. Here's a common mistake:

# Bad layer order — dependencies reinstall on every code change
FROM python:3.11-slim

WORKDIR /app

COPY . .                          # copies everything, including your code
RUN pip install -r requirements.txt   # runs AFTER the copy, so it reruns whenever any file changes

 

Every time you change a single line in your script, Docker invalidates the COPY . . layer and reinstalls all your dependencies from scratch. On a project with a heavy requirements.txt, that's minutes wasted per rebuild.

The fix is simple: copy the things that change least, first.

# Good layer order — dependencies cached unless requirements.txt changes
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .           # copy only requirements first
RUN pip install --no-cache-dir -r requirements.txt   # install deps — this layer is cached

COPY . .                          # copy your code last — only this layer reruns on code changes

CMD ["python", "app.py"]

 

Now when you change app.py, Docker reuses the cached pip layer and only re-runs the final COPY . . instruction.

Rule of thumb: order your COPY and RUN instructions from least-frequently-changed to most-frequently-changed. Dependencies before code, always.

 

Utilizing Multi-Stage Builds

 
Some tools are only needed at build time — compilers, test runners, build dependencies — but they end up in your final image anyway, bloating it with things the running application never touches.

Multi-stage builds solve this. You use one stage to build or install everything you need, then copy only the finished output into a clean, minimal final image. The build tools never make it into the image you ship.

Here's a Python example where we want to install dependencies but keep the final image lean:

# Single-stage — build tools end up in the final image
FROM python:3.11-slim

WORKDIR /app

RUN apt-get update && apt-get install -y gcc build-essential
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python", "app.py"]

 

Now with a multi-stage build:

# Multi-stage — build tools stay in the builder stage only

# Stage 1: builder — install dependencies
FROM python:3.11-slim AS builder

WORKDIR /app

RUN apt-get update && apt-get install -y gcc build-essential

COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: runtime — clean image with only what's needed
FROM python:3.11-slim

WORKDIR /app

# Copy only the installed packages from the builder stage
COPY --from=builder /install /usr/local

COPY . .

CMD ["python", "app.py"]

 

The gcc and build-essential tools, needed to compile some Python packages, are gone from the final image. The app still works because the compiled packages were copied over, while the build tools themselves were left behind in the builder stage, which Docker discards. This pattern is even more impactful in Go or Node.js projects, where a compiler toolchain or hundreds of megabytes of node_modules can be excluded entirely from the shipped image.
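To illustrate the same idea in Go, here is a minimal sketch. It assumes a module with a single main package at the repository root; the distroless base image and output path are illustrative choices, not requirements:

```dockerfile
# Stage 1: builder — Go toolchain, source, and compilation
FROM golang:1.22 AS builder

WORKDIR /src

# Copy module files first so the dependency download layer is cached
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# Build a static binary so it can run on a minimal base image
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: runtime — just the compiled binary, no toolchain
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app /app
ENTRYPOINT ["/app"]
```

The final image contains only the binary; the entire Go toolchain stays behind in the discarded builder stage.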

 

Cleaning Up Within the Installation Layer

 
When you install system packages with apt-get, the package manager downloads package lists and caches files that you don't need at runtime. If you delete them in a separate RUN instruction, they still exist in the intermediate layer, and Docker's layer system means they still contribute to the final image size.

To actually remove them, the cleanup must happen in the same RUN instruction as the install.

# Cleanup in a separate layer — cached files still bloat the image
FROM python:3.11-slim

RUN apt-get update && apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/* # too late: the files are already committed in the layer above

# Cleanup in the same layer — nothing is committed to the image
FROM python:3.11-slim

RUN apt-get update && apt-get install -y curl \
    && rm -rf /var/lib/apt/lists/*

 

The same logic applies to other package managers and temporary files.

Rule of thumb: any apt-get install should be followed by && rm -rf /var/lib/apt/lists/* in the same RUN command. Make it a habit.
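On Alpine-based images, the equivalent habit is even simpler, since apk has a built-in flag for this. A quick sketch:

```dockerfile
# Alpine equivalent: --no-cache skips writing the package index cache
# entirely, so there is nothing to clean up afterwards
FROM python:3.11-alpine
RUN apk add --no-cache curl
```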

 

Implementing .dockerignore Files

 
When you run docker build, Docker sends everything in the build directory to the Docker daemon as the build context. This happens before any instructions in your Dockerfile run, and it often includes files you almost certainly don't want in your image.

Without a .dockerignore file, you're sending your entire project folder: .git history, virtual environments, local data files, test fixtures, editor configs, and more. This slows down every build and risks copying sensitive files into your image.

A .dockerignore file works much like .gitignore: it tells Docker which files and folders to exclude from the build context.

Here's a sample, albeit truncated, .dockerignore for a typical Python data project:

# Python
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
*.egg-info/

# Virtual environments
.venv/
venv/
env/

# Data files (don't bake large datasets into images)
data/
*.csv
*.parquet
*.xlsx

# Jupyter
.ipynb_checkpoints/
*.ipynb

...

# Tests
tests/
pytest_cache/
.coverage

...

# Secrets — never let these into an image
.env
*.pem
*.key

 

This substantially reduces the data sent to the Docker daemon before the build even starts. On large data projects with parquet files or raw CSVs sitting in the project folder, this can be the single biggest win of all five practices.

There's also a security angle worth noting. If your project folder contains .env files with API keys or database credentials, forgetting .dockerignore means those secrets could end up baked into your image — especially if you have a broad COPY . . instruction.

Rule of thumb: always add .env and any credential files to .dockerignore, along with data files that don't need to be baked into the image. For anything sensitive that a build or container genuinely needs, use Docker secrets instead.
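One quick way to sanity-check what actually reaches the daemon is a throwaway build that copies the context and prints its size. This is a sketch; it reads the Dockerfile from stdin with -f -, which requires a reasonably recent Docker with BuildKit (the default in current releases):

```
docker build --no-cache --progress=plain -f - . <<'EOF'
FROM busybox
COPY . /ctx
RUN du -sh /ctx
EOF
```

Run it before and after adding your .dockerignore and compare the reported context sizes.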

 

Summary

 
None of these practices require advanced Docker knowledge; they're habits more than techniques. Apply them consistently and your images will be smaller, your builds faster, and your deploys cleaner.

 

  • Slim/Alpine base image: ensures smaller images by starting with only essential OS packages.
  • Layer ordering: avoids reinstalling dependencies on every code change.
  • Multi-stage builds: exclude build tools from the final image.
  • Same-layer cleanup: prevents the apt cache from bloating intermediate layers.
  • .dockerignore: reduces the build context and keeps secrets out of images.

 
Happy coding!
 
 

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.

