Dockerfile Best Practices: 10 Anti-Patterns to Stop Shipping
A tour of the ten Dockerfile mistakes that bloat images, leak secrets, and break caching, with the fix for each, plus the multi-stage pattern that solves most of them at once.
A working Dockerfile is not the same as a good one. Most production Dockerfiles ship several of the patterns below. Fixing them cuts image size, speeds up rebuilds, and closes real security holes.
1. Running as root
# Bad
CMD ["node", "server.js"]
If the process gets compromised, the attacker has root inside the container. Add a user:
# Good
RUN useradd -m -u 1001 app
USER app
CMD ["node", "server.js"]
On Alpine, it’s adduser -D -u 1001 app. Same idea.
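For completeness, the Alpine variant of the same block:

```dockerfile
# Alpine's BusyBox adduser equivalent of the useradd line above
RUN adduser -D -u 1001 app
USER app
CMD ["node", "server.js"]
```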
2. Using latest as a base image tag
FROM node:latest
latest moves without warning. Your build is fine today, broken tomorrow, and you have no record of what changed. Pin to a digest or at least a major+minor version:
FROM node:20.11-alpine
# or for true immutability:
FROM node@sha256:abc123...
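To find the digest for a tag you already trust (a sketch; both commands need Docker on the host):

```shell
# Resolve a tag to its content digest without pulling the image
docker buildx imagetools inspect node:20.11-alpine

# Or, for an image you have already pulled locally
docker inspect --format '{{index .RepoDigests 0}}' node:20.11-alpine
```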
3. COPY . . before dependency install
# Bad — invalidates the cache every time any file changes
COPY . .
RUN npm install
Dockerfile layers cache top-down. If you copy your whole source before installing deps, every code edit busts the dependency cache and triggers a full reinstall. Split it:
# Good
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
Now npm ci re-runs only when package.json or the lockfile changes.
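The same split works in any ecosystem. A Python sketch of the identical idea:

```dockerfile
# Copy only the dependency manifest first so the pip layer stays cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Source code comes last: editing it leaves the layers above untouched
COPY . .
```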
4. Shipping build tools in the runtime image
Python, Node, and Ruby projects often need build tools (gcc, python3-dev, make) to compile native dependencies. But they don’t need those tools at runtime. A 1.2 GB image becomes 180 MB with multi-stage:
# Build stage
FROM python:3.12-slim AS build
RUN apt-get update && apt-get install -y gcc python3-dev
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt
# Runtime stage — clean, no compilers
FROM python:3.12-slim
COPY --from=build /install /usr/local
COPY . /app
WORKDIR /app
CMD ["python", "app.py"]
The runtime image has no gcc, no dev headers, no pip cache, no surface area for an attacker to compile a backdoor in-place.
5. ADD when you mean COPY
ADD auto-extracts tarballs and fetches URLs. That behavior is almost never what you want, and when it bites you, it bites hard (silently extracting an attacker-controlled tarball, for instance). Rule of thumb: always use COPY unless you have a specific reason not to.
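A sketch of the difference (file names are illustrative):

```dockerfile
# COPY does exactly one thing: copy files from the build context
COPY app.tar.gz /opt/app.tar.gz   # the archive arrives intact

# ADD silently unpacks local archives, which is usually a surprise
ADD app.tar.gz /opt/              # contents extracted into /opt/
```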
6. Baking secrets into layers
# Very bad
ARG NPM_TOKEN
RUN echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" > .npmrc \
&& npm install # token is now baked into this layer
Anyone who pulls your image can recover the token. Deleting the file in a later RUN doesn’t help: it still exists in the earlier layer. And ARG values used during a RUN are visible in docker history. Use BuildKit secrets:
# syntax=docker/dockerfile:1.4
RUN --mount=type=secret,id=npm_token \
NPM_TOKEN=$(cat /run/secrets/npm_token) npm install
Or skip build-time secrets entirely and inject credentials at runtime through your orchestrator (Kubernetes secrets, Docker secrets, etc.); nothing sensitive should be present at build time.
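Passing the secret to the build looks like this (a usage sketch; assumes BuildKit, and that the token lives in an NPM_TOKEN environment variable on the host):

```shell
# The secret is mounted only for the duration of the RUN;
# it never becomes a layer and never appears in `docker history`
docker build --secret id=npm_token,env=NPM_TOKEN -t myapp .
```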
7. Not cleaning apt caches
# Bad — tens of megabytes of leftover package lists
RUN apt-get update && apt-get install -y curl
Every apt-get update downloads package indexes into /var/lib/apt/lists/ and leaves them behind. Clean them up in the same RUN layer:
RUN apt-get update \
&& apt-get install -y --no-install-recommends curl \
&& rm -rf /var/lib/apt/lists/*
The --no-install-recommends flag alone often saves 100+ MB on Debian bases.
8. One huge RUN that you can’t cache
Or, the opposite — ten tiny RUNs. Each RUN is a layer. Too few and every edit forces a full rebuild. Too many and you have dozens of tiny layers with overhead. The right grouping is by cache lifetime: commands that change at the same rate go in the same RUN.
- System packages → one RUN (rarely changes).
- Language deps → one RUN (changes with the lockfile).
- Application code → copied last (changes every commit).
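Sketched as a Dockerfile skeleton (package names are placeholders):

```dockerfile
# Layer 1: system packages (rarely change)
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*

# Layer 2: language deps (change with the lockfile)
COPY package.json package-lock.json ./
RUN npm ci

# Layer 3: application code (changes every commit)
COPY . .
```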
9. No .dockerignore
Without one, COPY . . ships your .git/, node_modules/, .env.local, and __pycache__/ into the image. At best this bloats the build context (slow). At worst, it leaks secrets.
Minimal .dockerignore for most projects:
.git
node_modules
__pycache__
.venv
.env*
*.log
.DS_Store
dist
build
coverage
10. No HEALTHCHECK
Kubernetes, ECS, and most orchestrators have their own probe system, so some people skip HEALTHCHECK in the Dockerfile itself. But for docker-compose, local runs, and any orchestrator that does respect it, a Dockerfile healthcheck makes the container self-report:
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s \
CMD curl -fsS http://localhost:8080/health || exit 1
Without this, an orchestrator can happily route traffic to a container whose process is up but hung.
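Once a HEALTHCHECK is defined, the container reports its own status, which you can query (a sketch; web is a hypothetical container name):

```shell
# Reports "starting", "healthy", or "unhealthy"
docker inspect --format '{{.State.Health.Status}}' web
```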
The multi-stage pattern solves most of this
The single highest-leverage change in any Dockerfile is going multi-stage. It forces you to separate build tools from runtime, which automatically fixes anti-patterns 4, 7, and partly 6. A decent starting template:
# syntax=docker/dockerfile:1.7
FROM node:20.11-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci # full install: the build step below needs devDependencies
FROM node:20.11-alpine AS build
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build && npm prune --omit=dev
FROM node:20.11-alpine AS runtime
WORKDIR /app
RUN adduser -D -u 1001 app
COPY --from=build --chown=app:app /app/dist ./dist
COPY --from=build --chown=app:app /app/node_modules ./node_modules
USER app
EXPOSE 8080
HEALTHCHECK CMD wget -qO- http://localhost:8080/health || exit 1
CMD ["node", "dist/server.js"]
Three stages, non-root user, pinned base, healthcheck, and the runtime image has no build tools. On a typical Node service, this pattern drops the final image from ~900 MB to ~150 MB.
Auditing someone else’s Dockerfile
If you’ve inherited a Dockerfile and don’t know where to start, scan for these in order: base image tag (is it pinned?), presence of USER, presence of multi-stage, and layer ordering (deps before code?). Those four questions identify roughly 80% of the issues in most real-world Dockerfiles.
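A rough shell sketch of that scan. It covers the first three checks mechanically (layer ordering still needs eyeballing); the inline Dockerfile is a made-up fixture, so point f at a real file instead, and the warning labels are invented here:

```shell
# Fixture Dockerfile standing in for an inherited one
f=$(mktemp)
cat > "$f" <<'EOF'
FROM node:latest
COPY . .
RUN npm install
CMD ["node", "server.js"]
EOF

warnings=""
# Check 1: is the base image pinned, or floating on :latest?
grep -qE '^FROM [^ ]+:latest' "$f" && warnings="$warnings unpinned-base"
# Check 2: is there a USER instruction, or does it run as root?
grep -q '^USER ' "$f" || warnings="$warnings no-user"
# Check 3: more than one FROM means multi-stage
[ "$(grep -c '^FROM ' "$f")" -gt 1 ] || warnings="$warnings single-stage"
echo "flags:$warnings"
```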
Or paste the whole file into our Docker Audit tool — it runs the checklist above and flags each issue with a suggested fix. Free, 5 runs per day.
The underlying principle
Every line in a Dockerfile should answer two questions: when does this layer’s cache get invalidated? and does this belong in the runtime image? If you can’t answer both cleanly, the line is probably in the wrong place.