AI-Assisted CI/CD

GitHub Actions YAML is the thing most developers copy from Stack Overflow and then never fully understand. Docker multi-stage builds, deployment pipelines, automated review — AI handles all of it well if you know how to ask. This guide covers the complete pipeline, from local Dockerfile to production deployment workflow.

Last reviewed: Apr 22 2026


Part 1: GitHub Actions from Scratch

GitHub Actions YAML has a steep learning curve. The syntax is obscure, the context variables are non-obvious, and the error messages are often useless. AI shortens this dramatically — not just by writing the YAML for you, but by explaining the reasoning behind every stanza so you understand what you've got.

The Right Prompt Pattern

The most common mistake when asking AI for a GitHub Actions workflow is underspecifying the context. "Write a CI workflow for my Node.js app" produces a generic workflow that may not match your project structure, test command, or deployment target. Specify everything:

You

Write a GitHub Actions CI workflow for my project. Here are the specifics:

  • Node.js 20, npm workspaces (root + packages/api + packages/web)
  • Run on: push to main, push to any branch with an open PR
  • Steps needed: install deps, type-check (tsc --noEmit), lint (ESLint), run tests (Vitest), build (tsc)
  • Tests require a PostgreSQL 16 database — use a service container
  • Cache node_modules between runs using actions/cache
  • If any step fails, I want the full output, not just a summary
  • Do NOT deploy — this is CI only

After writing the workflow, explain what each section does in plain English. I want to understand it, not just copy it.

The "explain what each section does" instruction is not just pedagogy — it forces AI to produce a workflow it can actually justify, rather than one assembled from training patterns that happen to look right. If the explanation is wrong or vague on a section, the YAML for that section is probably wrong too.

name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: ['**']

jobs:
  ci:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: testdb
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Cache dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.npm
            node_modules
            packages/*/node_modules
          key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-npm-

      - name: Install dependencies
        run: npm ci

      - name: Type check
        run: npm run typecheck --workspaces

      - name: Lint
        run: npm run lint --workspaces

      - name: Test
        env:
          DATABASE_URL: postgresql://test:test@localhost:5432/testdb
        run: npm run test --workspaces

      - name: Build
        run: npm run build --workspaces

Iterating on a Workflow

The first workflow AI generates is rarely final. You'll discover edge cases: a step that needs an environment variable, a service that takes longer to be ready than the health check allows, a caching key that's too broad and restores stale modules. Iterate by pasting the workflow back with a specific failure description:

You

This workflow fails intermittently with "ECONNREFUSED" when the test step tries to connect to PostgreSQL. The health check passes, but the connection still fails for the first few tests. How do I fix the timing issue? Here's the current workflow: [paste]

Claude

The issue is that the health check confirms PostgreSQL is accepting connections, but the test runner starts immediately — sometimes before the database is fully initialized. Add a small wait step after the service starts, or use pg_isready in a retry loop before running tests:
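Claude's suggested retry loop can be sketched as a small shell helper used at the top of the test step (the name `wait_for` and the timings are illustrative, not part of any standard tool):

```shell
# wait_for: retry a command until it succeeds or attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command...>
wait_for() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0   # command succeeded
    sleep "$delay"
    i=$((i + 1))
  done
  return 1                             # gave up
}

# In the workflow's test step, before running the suite:
#   wait_for 30 1 pg_isready -h localhost -p 5432 -U test || exit 1
```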

This is the debugging pattern that makes AI useful for CI work: paste the error, paste the context, get a focused fix. Don't try to anticipate all edge cases in the original prompt — iterate toward the final workflow through real failures.

Workflow Templates Worth Having AI Generate

Beyond the basic CI workflow, several GitHub Actions patterns come up repeatedly. Ask AI for each one specifically, with your project's details.
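One commonly requested template is a test matrix across Node versions; a sketch (adapt the versions and commands to your project):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [18, 20, 22]   # each version runs as a separate job
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test
```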

Always Pin Action Versions

When AI generates workflows, it may use actions/checkout@v4 or it may use actions/checkout@main. Always use pinned versions (@v4 or, for security-critical contexts, the full commit SHA). Floating refs like @main can introduce breaking changes or supply-chain risks without notice.
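Pinning to a commit SHA looks like this (the SHA shown is illustrative; resolve the current one from the action's releases page rather than copying it):

```yaml
steps:
  # Full-SHA pin with a version comment — immune to a moved or hijacked tag
  - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
```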


Part 2: Docker and Containers

Dockerfile syntax is another area where the gap between "technically valid" and "production appropriate" is large. AI can produce a Dockerfile in seconds — the question is whether it uses multi-stage builds, non-root users, appropriate base images, and efficient layer caching. Specify these requirements explicitly.

Generating a Production Dockerfile

You

Write a production-ready Dockerfile for a Node.js 20 TypeScript Express API. Requirements:

  • Multi-stage build: build stage (compile TS) and production stage (run JS only)
  • Production image should not include TypeScript, devDependencies, or source files
  • Run as a non-root user
  • Use node:20-alpine as the base (small image)
  • Expose port 3000
  • The app reads config from environment variables — no .env file baked in
  • Include a HEALTHCHECK instruction that hits GET /health

Explain each layer decision, especially the cache optimization choices.

# ── Build stage ──────────────────────────────────────────
FROM node:20-alpine AS builder

WORKDIR /app

# Copy package files first — separate layer so npm ci only reruns
# when dependencies change, not when source changes
COPY package*.json ./
RUN npm ci

# Copy source and compile
COPY tsconfig.json ./
COPY src ./src
RUN npm run build

# ── Production stage ──────────────────────────────────────
FROM node:20-alpine AS production

WORKDIR /app

# Create non-root user before copying files
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Copy only production dependencies manifest, then install
# (excludes devDependencies)
COPY package*.json ./
RUN npm ci --omit=dev && npm cache clean --force

# Copy compiled output from build stage only — no source, no TS
COPY --from=builder /app/dist ./dist

# Switch to non-root user
USER appuser

EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD wget -qO- http://localhost:3000/health || exit 1

CMD ["node", "dist/index.js"]

The explanation AI provides for each layer is where you'll catch issues specific to your setup — if your app uses a native module that requires build tools in production, the --omit=dev npm ci stage will need build-base installed in the production image. That kind of project-specific knowledge won't be in AI's first pass; it comes out when you iterate.
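If that situation applies, the production install step might look like this (bcrypt is a hypothetical example of a native module; `.build-deps` is an apk virtual package group that can be removed in one step after compilation):

```dockerfile
# Toolchain needed only while npm ci compiles native modules
RUN apk add --no-cache --virtual .build-deps python3 make g++ \
 && npm ci --omit=dev \
 && npm cache clean --force \
 && apk del .build-deps
```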

Docker Compose for Local Development

You

Write a docker-compose.yml for local development of the same Node.js API. Include:

  • The API service, using a development Dockerfile stage that runs ts-node-dev with hot reload
  • PostgreSQL 16 with a named volume for persistence
  • A pgAdmin service so I can inspect the database in the browser
  • Environment variables loaded from a .env file that I'll gitignore
  • The API should not start until PostgreSQL passes a health check

services:
  api:
    build:
      context: .
      target: development
    volumes:
      - ./src:/app/src        # hot reload: source changes reflected immediately
      - /app/node_modules     # prevent host node_modules from overriding container's
    ports:
      - "3000:3000"
    env_file:
      - .env
    environment:
      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
    depends_on:
      postgres:
        condition: service_healthy
    command: npx ts-node-dev --respawn --transpile-only src/index.ts

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-dev}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-dev}
      POSTGRES_DB: ${POSTGRES_DB:-appdb}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-dev}"]
      interval: 5s
      timeout: 5s
      retries: 5

  pgadmin:
    image: dpage/pgadmin4:latest
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@admin.com
      PGADMIN_DEFAULT_PASSWORD: admin
    ports:
      - "5050:80"
    depends_on:
      - postgres

volumes:
  postgres_data:
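The compose file's target: development assumes a development stage that the production Dockerfile in the previous section doesn't define. A minimal sketch of one:

```dockerfile
# ── Development stage (sketch) ── runs ts-node-dev for hot reload
FROM node:20-alpine AS development

WORKDIR /app

COPY package*.json ./
RUN npm ci                  # keep devDependencies (ts-node-dev, typescript)

COPY tsconfig.json ./
COPY src ./src

CMD ["npx", "ts-node-dev", "--respawn", "--transpile-only", "src/index.ts"]
```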

Don't Use Latest in Production

AI often uses image: postgres:latest or dpage/pgadmin4:latest. Always pin to a specific major version (postgres:16-alpine) in any file that runs in CI or production. Floating :latest tags will silently upgrade, breaking your pipeline at the worst possible moment.

Optimizing Image Size

When an image is large, paste the Dockerfile and ask AI to diagnose the cause:

You

My production Docker image is 1.2GB. Here's the Dockerfile. Identify the likely causes of the large size and rewrite the Dockerfile to minimize it. Don't sacrifice functionality — just remove what production doesn't need.

Common findings: using node:20 instead of node:20-alpine (difference: ~800MB), installing devDependencies in the production stage, including source files, leaving build artifacts (compiled test files, type declaration outputs), and running npm install without npm cache clean --force afterward.
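Before asking, docker history shows which layers dominate the size, so you can paste the worst offenders along with the Dockerfile (assumes your image is tagged myapp:latest):

```shell
# Per-layer sizes; the largest CreatedBy entries are the suspects
docker history --format "table {{.Size}}\t{{.CreatedBy}}" myapp:latest
```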


Part 3: Automated AI Review in CI

Running AI code review on every PR automatically is one of the highest-value CI integrations available. Unlike linters and type checkers, AI can catch semantic issues: missing error handling, authorization gaps, subtle logic errors, and hallucinated API calls that pass compilation but fail at runtime.

Basic Automated PR Review

name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history needed for accurate diff

      - name: Get PR diff
        id: diff
        run: |
          DIFF=$(git diff origin/${{ github.base_ref }}...HEAD -- \
            '*.ts' '*.tsx' '*.js' '*.jsx' \
            ':!*.test.*' ':!*.spec.*' ':!dist/*' ':!node_modules/*')
          # Store multi-line output safely
          echo "diff<<EOF" >> $GITHUB_OUTPUT
          echo "$DIFF" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: Run AI review
        id: review
        if: steps.diff.outputs.diff != ''
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          REVIEW=$(curl -s https://api.anthropic.com/v1/messages \
            -H "x-api-key: $ANTHROPIC_API_KEY" \
            -H "content-type: application/json" \
            -H "anthropic-version: 2023-06-01" \
            -d "$(jq -n \
              --arg diff "${{ steps.diff.outputs.diff }}" \
              '{
                model: "claude-sonnet-4-6",
                max_tokens: 2000,
                messages: [{
                  role: "user",
                  content: ("Review this PR diff for a TypeScript/Node.js application.\n\nFocus only on real issues:\n- Bugs or logic errors\n- Missing error handling on failure paths\n- Security concerns (injection, auth gaps, data exposure)\n- API calls that may be hallucinated or incorrect\n- Database operations missing transaction safety\n\nDo not comment on style, formatting, or matters of preference.\nIf there are no real issues, say \"No issues found.\"\nBe concise — one line per finding.\n\n" + $diff)
                }]
              }')" | jq -r '.content[0].text')
          echo "review<<EOF" >> $GITHUB_OUTPUT
          echo "$REVIEW" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: Post review comment
        if: steps.review.outputs.review != '' && steps.review.outputs.review != 'No issues found.'
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '## AI Code Review\n\n' + `${{ steps.review.outputs.review }}`
            });

Two design decisions worth noting: the diff excludes test files (AI review of test files generates more noise than signal), and the comment is only posted when there are actual findings (no "No issues found" comment spam in every PR).
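A third guard worth considering, assuming you don't want reviews on work-in-progress: skip draft PRs with a job-level condition (sketch):

```yaml
jobs:
  review:
    # Drafts are skipped entirely; the job re-runs when the PR is marked ready
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
```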

Targeted Security Review

A more focused variant that runs only on PRs touching security-sensitive files:

name: Security Review

on:
  pull_request:
    paths:
      - 'src/routes/auth**'
      - 'src/middleware/**'
      - 'src/routes/**'
      - 'migrations/**'
      - 'src/db/**'

jobs:
  security-review:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get security-relevant diff
        id: diff
        run: |
          DIFF=$(git diff origin/${{ github.base_ref }}...HEAD -- \
            'src/routes/**' 'src/middleware/**' 'src/db/**' 'migrations/**')
          echo "diff<<EOF" >> $GITHUB_OUTPUT
          echo "$DIFF" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: Security review
        id: review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          REVIEW=$(curl -s https://api.anthropic.com/v1/messages \
            -H "x-api-key: $ANTHROPIC_API_KEY" \
            -H "content-type: application/json" \
            -H "anthropic-version: 2023-06-01" \
            -d "$(jq -n \
              --arg diff "${{ steps.diff.outputs.diff }}" \
              '{
                model: "claude-sonnet-4-6",
                max_tokens: 2000,
                messages: [{
                  role: "user",
                  content: ("Perform a security review of this diff.\n\nCheck for:\n1. SQL injection — are queries parameterized?\n2. Authorization gaps — authenticated but not authorized to access this resource?\n3. Sensitive data in logs (passwords, tokens, PII)\n4. Input validation — is external input validated before use?\n5. Path traversal — user-supplied strings in file paths?\n6. Missing rate limiting on auth endpoints\n7. Insecure direct object references\n\nFor each issue: severity (critical/high/medium), location, description, suggested fix.\nIf nothing found, say \"No security issues found.\"\n\n" + $diff)
                }]
              }')" | jq -r '.content[0].text')
          echo "review<<EOF" >> $GITHUB_OUTPUT
          echo "$REVIEW" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: Post security review
        uses: actions/github-script@v7
        with:
          script: |
            const review = `${{ steps.review.outputs.review }}`;
            const hasIssues = !review.includes('No security issues found');
            const prefix = hasIssues ? '⚠️ ' : '✓ ';
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '## ' + prefix + 'Security Review\n\n' + review
            });

What Automated Review Catches vs. Misses

Automated AI review is good at: hallucinated API calls, missing null checks, obvious missing validation, SQL injection patterns, and logging sensitive data. It misses: business logic authorization errors where the check exists but uses the wrong rules, race conditions that require understanding your concurrency model, and anything requiring knowledge of how your specific data model is used in production. Automated review supplements human review; it doesn't replace it.

Cost Management for AI in CI

API costs in CI add up. A team pushing 20 PRs a day, each reviewed with a 2000-token response, runs to roughly $5–10/day at Sonnet pricing — manageable, but worth monitoring. A few controls help: cap max_tokens per request, skip review on draft PRs, and bound the size of the diff you send.
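The diff-size bound can be a small guard step ahead of the review call (the 30,000-character cap is an arbitrary assumption; tune it to your token budget):

```yaml
      - name: Check diff size
        id: size
        run: |
          CHARS=$(printf '%s' "${{ steps.diff.outputs.diff }}" | wc -c)
          if [ "$CHARS" -gt 30000 ]; then
            echo "Diff too large ($CHARS chars); skipping AI review"
            echo "skip=true" >> "$GITHUB_OUTPUT"
          fi

      # then gate the review step on it:
      #   if: steps.diff.outputs.diff != '' && steps.size.outputs.skip != 'true'
```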


Part 4: Deployment Workflows

Deployment workflows are where mistakes are most expensive. AI can draft them quickly, but the verification process matters more here than anywhere else — a deployment workflow that runs in production without manual review is a liability.

Staging Deploy on Merge

You

Write a GitHub Actions workflow that deploys to a staging environment when a PR is merged to main. The deployment:

  • Builds a Docker image and pushes to GitHub Container Registry (ghcr.io)
  • Tags the image with the commit SHA and "staging"
  • SSHes into a staging server and runs: docker pull, docker stop the old container, docker run the new one
  • Runs a smoke test after deploy (GET /health should return 200)
  • Posts a Slack notification with the result (success or failure)
  • The SSH key and Slack webhook are stored as GitHub secrets

name: Deploy to Staging

on:
  push:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - uses: actions/checkout@v4

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push image
        id: push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:staging

      - name: Deploy to staging server
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.STAGING_HOST }}
          username: ${{ secrets.STAGING_USER }}
          key: ${{ secrets.STAGING_SSH_KEY }}
          script: |
            docker pull ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
            docker stop app-staging || true
            docker rm app-staging || true
            docker run -d \
              --name app-staging \
              --restart unless-stopped \
              -p 3000:3000 \
              --env-file /etc/app/staging.env \
              ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}

      - name: Smoke test
        id: smoke
        run: |
          sleep 10  # give container time to start
          STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
            https://staging.yourdomain.com/health)
          if [ "$STATUS" != "200" ]; then
            echo "Smoke test failed: HTTP $STATUS"
            exit 1
          fi
          echo "Smoke test passed"

      - name: Notify Slack
        if: always()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": "${{ job.status == 'success' && '✅' || '❌' }} Staging deploy *${{ job.status }}* — <${{ github.server_url }}/${{ github.repository }}/commit/${{ github.sha }}|${{ github.sha }}>"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

Production Deploy with Manual Approval

Production deploys should never be fully automated without a human gate. GitHub Actions supports this through environments with required reviewers:

You

Extend the staging deploy workflow to also support production deploys, triggered by pushing a version tag (v*.*.*). Production deploy should require manual approval from a GitHub environment called "production" before it runs. The production deploy is identical to staging except it deploys to a different server and uses different secrets.

on:
  push:
    tags: ['v*.*.*']

jobs:
  # Build job runs first without environment restriction
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest

  # Deploy job requires approval via GitHub environment settings
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment: production     # ← requires reviewers configured in repo settings
    steps:
      - name: Deploy to production
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: ${{ secrets.PROD_USER }}
          key: ${{ secrets.PROD_SSH_KEY }}
          script: |
            docker pull ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}
            docker stop app-prod || true
            docker rm app-prod || true
            docker run -d \
              --name app-prod \
              --restart unless-stopped \
              -p 3000:3000 \
              --env-file /etc/app/production.env \
              ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}

The environment: production key connects the job to a GitHub environment that can be configured with required reviewers, wait timers, and deployment branch restrictions — all through the repository settings UI, without any workflow changes.
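The same configuration can be scripted; a hypothetical sketch using the REST endpoint for environments (OWNER, REPO, and the reviewer id 12345 are placeholders, and gh must be authenticated with repo admin rights):

```shell
# Create/update the "production" environment with a required reviewer
cat <<'EOF' | gh api --method PUT "repos/OWNER/REPO/environments/production" --input -
{ "reviewers": [ { "type": "User", "id": 12345 } ] }
EOF
```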


Part 5: Pipeline Debugging

CI failures are particularly hard to debug: you can't attach a debugger, the error often comes from a different OS than your laptop, and the feedback loop (push, wait for runner, read logs, push again) is slow. AI shortens this loop significantly.

The Failure Debugging Pattern

Copy the full failure log — not just the error line, but the complete step output — and paste it with the workflow file:

You

My GitHub Actions workflow is failing. Here's the complete log from the failing step and the relevant workflow section:

Error log:
Error: buildx failed with: ERROR [production 4/5] RUN npm ci --omit=dev && npm cache clean --force
npm warn reify: To address all issues (including breaking changes), run:
npm warn reify npm audit fix --force
npm error code ENOTFOUND
npm error errno ENOTFOUND
npm error network request to https://registry.npmjs.org/express failed

Workflow section: [paste]

What's causing this and how do I fix it?

Claude

The ENOTFOUND error means the Docker build container can't reach the npm registry. This happens in multi-stage builds when the build runs in a network-isolated context. The production stage's npm ci has no internet access.

The fix: move the production dependency install to a separate stage that runs before network isolation, then copy node_modules into the production stage rather than installing there...

Common CI Failure Patterns

Certain CI failures come up repeatedly: environment differences between your laptop and the runner, stale or mismatched cache restores, out-of-memory kills on default runners, and service containers that aren't ready when tests start. Paste the error and the workflow together; AI recognizes these patterns quickly.

act — Run GitHub Actions Locally

act (github.com/nektos/act) runs GitHub Actions workflows locally using Docker. When debugging CI failures, use act to iterate without the push-wait-read loop. Ask AI to help install and configure act for your specific workflow — it has quirks with service containers and secrets that AI can help navigate.
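A typical invocation, assuming Docker is running and your secrets live in a local file you gitignore:

```shell
# Simulate the pull_request trigger, running only the "ci" job
act pull_request -j ci --secret-file .secrets
```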


Part 6: Workflow Maintenance

GitHub Actions workflows age. Action versions go out of date, runners deprecate, and the patterns that made sense six months ago may have better alternatives. A few practices keep your pipeline healthy over time.

Keeping Action Versions Current

Dependabot can automatically open PRs for action version updates — add this to .github/dependabot.yml:

version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    groups:
      actions:
        patterns:
          - "actions/*"
          - "docker/*"

Ask AI to generate this file with the appropriate grouping for your workflows — grouping related action updates into a single PR reduces review overhead.

Auditing an Existing Workflow

Paste your existing workflows and ask AI to audit them for problems and improvements:

You

Here are our current GitHub Actions workflows. Audit them for:

  • Security issues (unpinned actions, overly broad permissions, secrets exposure risks)
  • Performance (unnecessary steps, missing caching, things that could run in parallel)
  • Reliability (steps that could intermittently fail, missing error handling)
  • Outdated patterns (actions or approaches that have better modern equivalents)

Prioritize findings by impact. [paste workflows]

Run this audit once a quarter, or before a major deployment. Workflows accumulate technical debt silently — a step that was "good enough" a year ago may now be the bottleneck in your build or a security gap that a newer action version closes.

Generating Workflow Documentation

Workflows are often undocumented — everyone on the team knows they exist, nobody remembers exactly what triggers what. AI can generate clear documentation from the YAML:

You

Generate documentation for our GitHub Actions workflows. For each workflow: what triggers it, what it does step-by-step, what secrets it needs, what permissions it requires, and what happens if it fails. Write it as a README section that a new developer could use to understand our pipeline. [paste all workflow files]


Getting Started

1. Start with a CI workflow for your current project

Pick a real project and ask AI to generate a CI workflow with your exact stack, test command, and any service dependencies. Use the detailed prompt pattern from Part 1 — specify everything. Run it, fix what breaks, and understand every section before moving on.

2. Add a production Dockerfile

Use the multi-stage pattern from Part 2. Check the resulting image size with docker images. If it's over 200MB, ask AI to diagnose why. The first version is never optimal — iterate to understand the layer caching.

3. Add automated AI review to one repository

Start with the basic PR review workflow from Part 3. Run it for two weeks on real PRs. Note what it catches and what it misses. Tune the prompt based on your actual failure patterns.

4. Audit your existing workflows

If you have existing CI/CD pipelines, use the audit prompt from Part 6. Most workflows that are more than a year old have at least one pinning issue, one caching miss, and one step that could run in parallel. Fix those before building new workflows on top of them.

Related Guides

CLI-First AI Development

The local complement to this guide: shell-based AI automation, pre-commit hooks, and terminal workflows that feed into your CI pipeline.

Working with AI in a Team

How automated CI review fits into the broader team PR workflow: tiered review, what AI misses, and how human review and automated review divide responsibility.

Build a REST API from Spec to Deployment

A complete project that includes Docker and a production deployment step, showing CI/CD patterns in the context of a full build.