Migrating a Legacy Codebase with AI

The Migration Mindset

Legacy migrations tend to fail for the same reason: developers start changing code before they understand it. They refactor a module, break something downstream they didn't know existed, spend two days debugging, lose confidence, and the migration stalls.

AI amplifies this risk. It's fast enough to rewrite an entire module in minutes — which means it's fast enough to introduce bugs across your codebase in minutes. The speed is a liability if you don't have a safety net.

The correct order for an AI-assisted migration is:

Comprehension

Use AI to understand the existing code. Map it. Document it. Know what it does before you touch it.

Test scaffolding

Generate tests for the existing behavior. These tests are your safety net — they tell you when a change breaks something.

Incremental transformation

Migrate one module at a time. Run the tests after each change. Small steps, verified continuously.

Pattern modernization

Update language patterns, framework idioms, and coding conventions across the migrated code.

Validation and cleanup

Verify the migration preserved behavior. Remove dead code. Update documentation.

The Cardinal Rule

Never let AI change code you haven't tested yet. Understanding first, tests second, changes third. This order is non-negotiable. Skipping ahead is how migrations go wrong.

Phase 1: Comprehension

Before changing a single line, you need to understand what the codebase does, how it's structured, and where the hidden dependencies live. AI is exceptionally good at this — it can read thousands of lines and produce explanations, summaries, and dependency maps faster than any human.

The Codebase Overview

You

I'm about to start migrating a legacy codebase. Before I change anything, I need to understand it. Here's the project structure:

[paste output of find src -type f -name '*.js' | head -60]

Give me a high-level architecture overview: what are the major modules, how do they relate, and what does each directory appear to be responsible for?

This first prompt gives you a map. Not a detailed one — just enough to know which neighborhoods exist before you start walking the streets.

Module-Level Documentation

For each major module, paste the source and ask AI to explain it:

You

Here's src/services/billing.js. This is a legacy file with no documentation.

[paste entire file]

Explain: what does this module do, what are its public functions, what external services does it call, and what are the non-obvious side effects?

The "non-obvious side effects" part is critical. Legacy code is full of functions that quietly write to a database, send an email, update global state, or modify an argument in place. AI will often catch these because it reads every line — something humans tend not to do when scanning unfamiliar code.

Dependency Mapping

You

Here are all the import/require statements in the project:

[paste output of grep -rn "require\|import" src/ --include="*.js"]

Create a dependency map: which modules depend on which other modules? Identify any circular dependencies. Which modules are most depended-on (changing them would affect the most other files)?

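Migration order follows this map. Start with leaf modules: the ones that import nothing else from the project. Migrate them, verify they work, then move inward to the modules that import them. Modules that sit on top of many others, such as top-level routes, go last. The most-depended-on modules are where a mistake spreads furthest, so they need the strongest test coverage before you touch them.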
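A dependency map in this form can also be checked mechanically. The sketch below assumes the map has already been turned into a plain object (module name to the list of project modules it imports). The module names come from the examples in this guide; the edges between them and the helper names `countDependents` and `findCycle` are illustrative.

```typescript
// Dependency map: each module lists the project modules it imports.
// Module names are borrowed from this guide; the edges are assumed.
type DependencyMap = Record<string, string[]>;

const deps: DependencyMap = {
  'routes/api.js': ['utils/format.js', 'services/billing.js', 'middleware/auth.js'],
  'services/billing.js': ['repos/invoiceRepo.js', 'lib/stripe.js'],
  'middleware/auth.js': ['repos/userRepo.js'],
  'utils/format.js': [],
  'repos/invoiceRepo.js': [],
  'lib/stripe.js': [],
  'repos/userRepo.js': [],
};

// Count how many modules import each module (its fan-in).
function countDependents(graph: DependencyMap): Map<string, number> {
  const counts = new Map<string, number>();
  for (const name of Object.keys(graph)) counts.set(name, 0);
  for (const imports of Object.values(graph)) {
    for (const target of imports) counts.set(target, (counts.get(target) ?? 0) + 1);
  }
  return counts;
}

// Return one circular dependency as a path (a -> b -> a), or null if none exists.
function findCycle(graph: DependencyMap): string[] | null {
  const state = new Map<string, 'visiting' | 'done'>();
  const stack: string[] = [];
  const visit = (name: string): string[] | null => {
    if (state.get(name) === 'done') return null;
    if (state.get(name) === 'visiting') return [...stack.slice(stack.indexOf(name)), name];
    state.set(name, 'visiting');
    stack.push(name);
    for (const next of graph[name] ?? []) {
      const cycle = visit(next);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(name, 'done');
    return null;
  };
  for (const name of Object.keys(graph)) {
    const cycle = visit(name);
    if (cycle) return cycle;
  }
  return null;
}
```

Running both functions on the map that AI produced is a cheap cross-check: a cycle it did not mention, or a fan-in count that disagrees with its summary, is a sign the map needs another look.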

Identifying Risk Areas

You

Here's the billing service module. I need to migrate this codebase. Before I start, identify:

1. Code that directly accesses the database (SQL queries, ORM calls)
2. Code that calls external APIs or services
3. Code that uses global mutable state
4. Code that uses framework-specific features that won't exist after migration
5. Code that has implicit dependencies (relying on execution order, global variables, monkey-patching)

This prompt produces your risk register. Every item on this list needs a test before migration and careful attention during migration. These are the places where things will break.

Pro Tip: Save the Comprehension Output

Save every explanation, dependency map, and risk assessment in a docs/migration/ folder. This documentation serves three purposes: it helps you during the migration, it helps the next developer understand the codebase, and it gives AI context in future sessions so you don't re-explain the project from scratch.

Phase 2: Test Scaffolding

This is the phase most developers want to skip. Don't. Tests written before migration are the only reliable way to know whether your changes preserved the existing behavior. Without them, you're guessing.

The good news: AI is extremely effective at generating tests for existing code, even code that wasn't designed to be testable.

Characterization Tests

Characterization tests capture what the code actually does — not what it's supposed to do, not what the spec says, but what the running software produces for a given input. These tests aren't assertions about correctness; they're assertions about current behavior.

You

Here's src/services/billing.js:

[paste full file]

Write characterization tests that capture the current behavior of every public function. These tests aren't about what the code should do — they're about what it actually does.

For each function: test with typical inputs, boundary values, and null/undefined. Mock all external dependencies (database, APIs). Use the existing behavior as the expected result, even if it looks wrong.

⚠ "Even If It Looks Wrong"

This is important. If the legacy code returns null instead of throwing an error, your characterization test should assert that it returns null. You're not testing correctness — you're documenting current behavior. If the migration changes this behavior, you want the test to fail so you can decide intentionally whether the change is acceptable.
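A minimal sketch of what such a test can look like. The function `calculateLateFee` is invented for illustration and stands in for a piece of legacy billing code; the expected values are copied from what the code returns, not from any spec.

```typescript
// Hypothetical legacy function, invented for this example. Note the quirks:
// negative amounts return null instead of throwing, and the fee is truncated, not rounded.
function calculateLateFee(amount: number, daysLate: number): number | null {
  if (amount < 0) return null;
  if (daysLate <= 0) return 0;
  return Math.floor(amount * 0.015 * daysLate * 100) / 100;
}

interface CharacterizationCase {
  label: string;
  args: [number, number];
  expected: number | null; // recorded from the running code, even where it looks wrong
}

const cases: CharacterizationCase[] = [
  { label: 'typical input', args: [200, 10], expected: 30 },
  { label: 'boundary: zero days late', args: [200, 0], expected: 0 },
  { label: 'truncates instead of rounding', args: [99.99, 1], expected: 1.49 },
  { label: 'negative amount returns null', args: [-5, 3], expected: null },
];

// Returns one message per case whose current output no longer matches the record.
function runCharacterization(): string[] {
  const failures: string[] = [];
  for (const c of cases) {
    const actual = calculateLateFee(...c.args);
    if (actual !== c.expected) failures.push(`${c.label}: expected ${c.expected}, got ${actual}`);
  }
  return failures;
}
```

If a later migration step makes the function throw on negative amounts, the last case fails, and that failure is the prompt to decide whether the new behavior is wanted.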

Testing Untestable Code

Legacy code often has tight coupling, global state, and side effects that make it hard to test. AI can help you identify the seams:

You

This function is hard to test because it directly calls the database and sends emails. Without changing the function's behavior, how can I test it? Show me how to mock the dependencies and write a test.

[paste the function]

AI will usually suggest dependency injection, module mocking (jest.mock / vi.mock), or extracting side effects into separate functions. The key constraint is "without changing the function's behavior" — you want the tests to work with the code as-is, before you start refactoring.
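One way to honor that constraint without a test framework is to stub methods on the shared objects the function already uses. This is a sketch under assumptions: `db`, `mailer`, `chargeCustomer`, and `withStub` are all hypothetical names, and it only works when the legacy code reaches its dependencies through a mutable object. With jest.mock or vi.mock the idea is the same, but the replacement happens at the module level.

```typescript
// Hypothetical legacy module: the function reaches straight for shared
// `db` and `mailer` objects instead of receiving them as parameters.
const db = {
  insertInvoice(customerId: string, amount: number): number {
    throw new Error('real database not available in tests');
  },
};
const mailer = {
  send(to: string, subject: string): void {
    throw new Error('real mail server not available in tests');
  },
};

function chargeCustomer(customerId: string, amount: number): { invoiceId: number } {
  const invoiceId = db.insertInvoice(customerId, amount);
  mailer.send(customerId, `Invoice ${invoiceId}`);
  return { invoiceId };
}

// Temporarily replace one method on a shared object, run the callback, then restore it.
function withStub<T, K extends keyof T, R>(target: T, key: K, fake: T[K], run: () => R): R {
  const original = target[key];
  target[key] = fake;
  try {
    return run();
  } finally {
    target[key] = original;
  }
}

// The test leaves chargeCustomer untouched and swaps out only its dependencies.
const sent: string[] = [];
const charged = withStub(db, 'insertInvoice', () => 42, () =>
  withStub(
    mailer,
    'send',
    (to, subject) => {
      sent.push(`${to}: ${subject}`);
    },
    () => chargeCustomer('cust-1', 100),
  ),
);
```

The `finally` block matters: it restores the real method even when the code under test throws, so one failing test does not leak a stub into the next.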

Coverage Strategy

You don't need 100% coverage before migrating. Focus your testing effort on:

Public API surface — Every function that other modules call. These contracts must not change.
Database operations — Any code that reads or writes data. Incorrect data is the worst migration outcome.
Business logic — Calculations, validations, transformations. The rules that the business depends on.
Integration points — Code that calls external APIs, sends messages, or triggers side effects.

Skip coverage for: pure UI rendering (it will change anyway), configuration files, and simple getter/setter functions. Spend your testing time where the risk is highest.

The Test Confidence Threshold

You're ready to start migrating a module when you trust its tests enough to revert a change on their word. If a test failure after migration would make you think "something real broke" rather than "the test is probably wrong," your tests are good enough.

Phase 3: Incremental Transformation

The strangler fig pattern: build the new system around the old one, replacing pieces incrementally until the old system is gone. AI accelerates this process dramatically — but the discipline of "one module at a time" is still essential.

Migration Order

Start from the leaves of your dependency tree and work inward:

                    ┌─── utils/format.js     ← start here (leaf)
                    │
   routes/api.js ───┤─── services/billing.js ← then here
                    │         │
                    │         ├── repos/invoiceRepo.js  ← and here
                    │         └── lib/stripe.js         ← and here
                    │
                    └─── middleware/auth.js   ← then here
                              │
                              └── repos/userRepo.js    ← start here (leaf)

   routes/api.js ← migrate last (depends on everything above)

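Leaf modules go first because they import nothing else from the project, so each one can be migrated and tested in isolation. If you break one, the tests for the modules that import it fail right away. Modules at the top of the tree go last because they touch everything beneath them, and they are safest to convert once all of that is already migrated and verified.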
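A minimal sketch of computing this order, assuming the dependency map is available as a plain object. A depth-first post-order walk emits each module only after everything it imports, which puts leaves first and the top-level route last. The edges mirror the diagram above; `migrationOrder` is a hypothetical helper name.

```typescript
type DependencyMap = Record<string, string[]>;

// Depth-first post-order: a module is emitted only after everything it imports.
function migrationOrder(graph: DependencyMap): string[] {
  const order: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();
  const visit = (name: string): void => {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      throw new Error(`circular dependency involving ${name}`);
    }
    state.set(name, 'visiting');
    for (const dep of graph[name] ?? []) visit(dep);
    state.set(name, 'done');
    order.push(name);
  };
  for (const name of Object.keys(graph)) visit(name);
  return order;
}

const order = migrationOrder({
  'routes/api.js': ['utils/format.js', 'services/billing.js', 'middleware/auth.js'],
  'services/billing.js': ['repos/invoiceRepo.js', 'lib/stripe.js'],
  'middleware/auth.js': ['repos/userRepo.js'],
});
// order: utils/format.js, repos/invoiceRepo.js, lib/stripe.js, services/billing.js,
//        repos/userRepo.js, middleware/auth.js, routes/api.js
```

The function throws on a circular dependency rather than guessing. A cycle has no leaf to start from, so it has to be broken by hand, or its members migrated together in one commit.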

The Single-Module Migration Prompt

You

Migrate this module from CommonJS to ESM with TypeScript. Here's the current file:

[paste src/repos/userRepo.js]

Here's the type information from how it's used across the codebase:

[paste grep results: grep -rn "userRepo\." src/ --include="*.js"]

Requirements:

- Convert require to import, module.exports to export
- Add TypeScript types to all functions (infer from usage)
- Keep the exact same public API — same function names, same parameters, same return values
- Don't change any logic. This is a syntax migration, not a refactor.

The last line is the most important constraint: "Don't change any logic." Syntax migrations and logic changes must be separate steps. If you combine them, you can't tell whether a failing test is caused by the migration or the refactor.
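The "same public API" requirement can be spot-checked before the tests even run. The sketch below compares the exported names of the two module versions, along with each function's declared parameter count. `diffPublicApi` and the two sample objects are hypothetical. `Function.length` ignores default and rest parameters, so treat the result as a hint rather than proof.

```typescript
type ModuleLike = Record<string, unknown>;

// Report exports that disappeared, appeared, or changed their declared parameter count.
function diffPublicApi(legacy: ModuleLike, migrated: ModuleLike): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(legacy)) {
    if (!(name in migrated)) {
      problems.push(`missing export: ${name}`);
      continue;
    }
    const before = legacy[name];
    const after = migrated[name];
    if (typeof before !== typeof after) {
      problems.push(`${name}: was ${typeof before}, now ${typeof after}`);
    } else if (
      typeof before === 'function' &&
      typeof after === 'function' &&
      before.length !== after.length
    ) {
      problems.push(`${name}: parameter count changed from ${before.length} to ${after.length}`);
    }
  }
  for (const name of Object.keys(migrated)) {
    if (!(name in legacy)) problems.push(`unexpected new export: ${name}`);
  }
  return problems;
}

// Hypothetical before/after shapes of a repository module.
const legacyUserRepo = {
  findById: (id: string) => null,
  save: (user: object, options: object) => true,
};
const migratedUserRepo = {
  findById: (id: string) => null,
  save: (user: object) => true,
};
const apiProblems = diffPublicApi(legacyUserRepo, migratedUserRepo);
// apiProblems: ['save: parameter count changed from 2 to 1']
```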

The Verify-and-Continue Loop

After each module migration:

# 1. Run the module's own tests
$ npm test -- tests/repos/userRepo.test.ts

# 2. Run tests for modules that depend on it
$ npm test -- tests/routes/api.test.ts tests/middleware/auth.test.ts

# 3. Run the full test suite
$ npm test

# 4. If everything passes → commit and move to next module
$ git add -A && git commit -m "migrate: userRepo to ESM + TypeScript"

# 5. If tests fail → read the failure, fix the migration, re-run
# Do NOT skip failing tests. Each one is telling you something.

One module per commit. If something goes wrong three modules later, you can bisect to find exactly which migration broke it.

Pro Tip: The Two-Branch Strategy

Keep the legacy code on main and migrate on a migration branch. Merge frequently from main into migration to stay current with bugfixes. Only merge migration into main when a meaningful chunk is complete and all tests pass. This lets the team keep shipping on main while the migration progresses without blocking anyone.

Phase 4: Pattern Modernization

Once the structural migration is complete (new language, new module system, new framework), you can modernize the patterns within the migrated code. This is where AI really shines — pattern transformation across many files is tedious for humans and fast for AI.

Callbacks to async/await

// Before: error-first callback
function getUser(id, callback) {
  db.query('SELECT * FROM users WHERE id = ?', [id], (err, rows) => {
    if (err) return callback(err, null);
    if (rows.length === 0) return callback(null, null);
    callback(null, rows[0]);
  });
}

// After: async/await. Assumes db.query returns a promise when called without
// a callback; a callback-only driver needs a promise wrapper first.
async function getUser(id: string): Promise<User | null> {
  const rows = await db.query('SELECT * FROM users WHERE id = ?', [id]);
  return rows.length > 0 ? rows[0] : null;
}

You

Convert this callback-based module to async/await. Every function that takes a callback should become an async function. Error-first callbacks become try/catch. Preserve all the logic — just change the async pattern.

[paste module]
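The conversion only works if the underlying call returns a promise. When the legacy API is callback-only, a thin wrapper bridges the two styles while the rest of the module is converted. In this sketch `legacyQuery` is a hypothetical stand-in for a callback-based database call, not a real driver API.

```typescript
type Callback<T> = (err: Error | null, result: T | null) => void;
type UserRow = { id: string };

// Hypothetical stand-in for a legacy callback API such as db.query.
function legacyQuery(sql: string, params: unknown[], cb: Callback<UserRow[]>): void {
  if (sql.trim() === '') {
    cb(new Error('empty query'), null);
    return;
  }
  cb(null, [{ id: String(params[0]) }]);
}

// Bridge: the error-first callback becomes promise rejection or resolution.
function queryAsync(sql: string, params: unknown[]): Promise<UserRow[]> {
  return new Promise((resolve, reject) => {
    legacyQuery(sql, params, (err, rows) => {
      if (err) reject(err);
      else resolve(rows ?? []);
    });
  });
}

// The converted function: a rejection surfaces as a thrown error at the await.
async function getUser(id: string): Promise<UserRow | null> {
  const rows = await queryAsync('SELECT * FROM users WHERE id = ?', [id]);
  return rows.length > 0 ? rows[0] : null;
}
```

Keeping the wrapper separate from the converted function means the callback version and the async version can coexist until every caller has moved over.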

Class Components to Hooks

// Before: class component
class UserProfile extends React.Component {
  constructor(props) {
    super(props);
    this.state = { user: null, loading: true };
  }

  componentDidMount() {
    fetchUser(this.props.id).then(user =>
      this.setState({ user, loading: false })
    );
  }

  componentDidUpdate(prevProps) {
    if (prevProps.id !== this.props.id) {
      this.setState({ loading: true });
      fetchUser(this.props.id).then(user =>
        this.setState({ user, loading: false })
      );
    }
  }

  render() {
    if (this.state.loading) return <Spinner />;
    return <div>{this.state.user.name}</div>;
  }
}

// After: function component with hooks
import { useEffect, useState } from 'react';

function UserProfile({ id }: { id: string }) {
  const [user, setUser] = useState<User | null>(null);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    setLoading(true);
    fetchUser(id).then(user => {
      setUser(user);
      setLoading(false);
    });
  }, [id]);

  if (loading) return <Spinner />;
  return <div>{user?.name}</div>;
}

AI handles this transformation reliably because it's a well-defined pattern conversion. The prompt to use:

You

Convert this class component to a functional component with hooks. Map lifecycle methods to useEffect. Convert this.state to useState. Convert this.props to destructured props with TypeScript types. Keep the rendering logic identical.

CommonJS to ESM

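// Before: CommonJS
const express = require('express');
const { validate } = require('../utils/validate');
const userRepo = require('../repos/userRepo');

// ...

module.exports = router;

// After: ESM. module.exports = x maps to a default export, so the router stays
// a default export and userRepo stays a default import (assuming userRepo.js
// also kept its default export when it was migrated).
import express from 'express';
import { validate } from '../utils/validate.js';
import userRepo from '../repos/userRepo.js';

// ...

export default router;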

This is a mechanical transformation that AI usually gets right, but watch for these gotchas that it sometimes misses:

File extensions — Relative ESM imports in Node.js need the full file extension (./validate.js, not ./validate). AI often omits it.
Default vs named exports — module.exports = x corresponds to a default export; module.exports.x = x corresponds to a named export. AI sometimes confuses them.
Dynamic requires — require(variable) has no synchronous ESM equivalent. The closest replacement is dynamic import(), which returns a promise, so these call sites need manual handling.
__dirname and __filename — These don't exist in ESM. Derive them from import.meta.url, for example with fileURLToPath from node:url.
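The first gotcha is mechanical enough to fix with a small script. The sketch below appends .js to relative specifiers that have no extension. `addJsExtensions` is a hypothetical helper built on a regex rather than a parser, so it will mishandle directory imports (./utils meaning ./utils/index.js) and file names that contain a dot (./user.service). Review its output before committing.

```typescript
// Add ".js" to relative import/export specifiers that have no file extension.
// Bare specifiers such as 'express' are left alone.
function addJsExtensions(source: string): string {
  const pattern = /(\bfrom\s+|\bimport\s*\(\s*|\bimport\s+)(['"])(\.{1,2}\/[^'"]+)\2/g;
  return source.replace(pattern, (match, prefix: string, quote: string, specifier: string) => {
    const lastSegment = specifier.slice(specifier.lastIndexOf('/') + 1);
    const hasExtension = /\.[A-Za-z0-9]+$/.test(lastSegment);
    return hasExtension ? match : `${prefix}${quote}${specifier}.js${quote}`;
  });
}

const before = [
  "import express from 'express';",
  "import { validate } from '../utils/validate';",
  "import userRepo from '../repos/userRepo.js';",
  "import './setup';",
].join('\n');

const after = addJsExtensions(before);
// '../utils/validate' and './setup' gain ".js"; the other two lines are unchanged.
```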

jQuery to Vanilla JS

// Before: jQuery
$('.submit-btn').on('click', function() {
  var data = {
    name: $('#name-input').val(),
    email: $('#email-input').val()
  };
  $.ajax({
    url: '/api/users',
    method: 'POST',
    data: JSON.stringify(data),
    contentType: 'application/json',
    success: function(res) { $('#result').text('Saved!'); },
    error: function() { $('#result').text('Error'); }
  });
});

// After: vanilla DOM + fetch (TypeScript).
// Note: $('.submit-btn') binds every matching element; querySelector binds only the first.
document.querySelector('.submit-btn')?.addEventListener('click', async () => {
  const data = {
    name: document.querySelector<HTMLInputElement>('#name-input')?.value,
    email: document.querySelector<HTMLInputElement>('#email-input')?.value,
  };
  const result = document.querySelector('#result');
  try {
    const res = await fetch('/api/users', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(data),
    });
    // fetch resolves on 4xx/5xx; $.ajax would have called `error`, so throw to match.
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    if (result) result.textContent = 'Saved!';
  } catch {
    if (result) result.textContent = 'Error';
  }
});

⚠ Test After Each Pattern Migration

Run your test suite after converting each module's patterns, not just after converting them all. Pattern transformations are where subtle behavioral changes hide. For example, $.ajax calls its error callback for HTTP 4xx and 5xx responses, while fetch rejects only on network failure and resolves normally for those statuses. Your characterization tests from Phase 2 are there to catch exactly this.

Phase 5: Validation & Cleanup

The migration is structurally complete. Now you verify it, clean up the debris, and make sure no legacy patterns snuck through.

Behavioral Verification

You

Here's the original module and the migrated version. Compare them line by line and identify any behavioral differences — places where the migrated version would produce a different result for the same input.

Original: [paste old version]

Migrated: [paste new version]

This prompt catches differences your tests might have missed. AI is patient with line-by-line comparison in a way most human reviewers are not, but treat each reported difference as a lead to confirm with a test, not as a verdict.
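A reported difference is easiest to confirm by running both versions on the same inputs. The sketch below does that for a pair of hypothetical functions (`legacyParse`, `migratedParse`) and records whether each call returned or threw. Outcomes are compared through JSON.stringify, which is an assumption that suits plain data but cannot tell NaN from null.

```typescript
type Outcome = { kind: 'returned'; value: unknown } | { kind: 'threw'; message: string };

// Capture what a call did: its return value or the error it threw.
function observe<A extends unknown[]>(fn: (...args: A) => unknown, args: A): Outcome {
  try {
    return { kind: 'returned', value: fn(...args) };
  } catch (err) {
    return { kind: 'threw', message: err instanceof Error ? err.message : String(err) };
  }
}

// Run both versions on the same inputs and list every input where they disagree.
function compareBehavior<A extends unknown[]>(
  legacy: (...args: A) => unknown,
  migrated: (...args: A) => unknown,
  inputs: A[],
): string[] {
  const differences: string[] = [];
  for (const args of inputs) {
    const before = JSON.stringify(observe(legacy, args));
    const after = JSON.stringify(observe(migrated, args));
    if (before !== after) differences.push(`${JSON.stringify(args)}: ${before} -> ${after}`);
  }
  return differences;
}

// Hypothetical pair: the migrated version throws where the legacy one returned null.
const legacyParse = (raw: string): number | null => {
  const n = parseInt(raw, 10);
  return isNaN(n) ? null : n;
};
const migratedParse = (raw: string): number => {
  const n = Number.parseInt(raw, 10);
  if (Number.isNaN(n)) throw new Error(`invalid quantity: ${raw}`);
  return n;
};

const differences = compareBehavior(legacyParse, migratedParse, [['3'], ['12abc'], ['abc']]);
// One difference is reported, for the input ['abc'].
```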

Dead Code Removal

You

Here are all the exported functions in src/utils/helpers.ts:

[paste the exports]

Here's every place this module is imported across the project:

[paste: grep -rn "from.*helpers" src/ --include="*.ts"]

Which exported functions are never imported anywhere? These are dead code candidates.

After identifying dead code, don't delete it immediately. Comment it out with a note and a date. If nothing breaks after two weeks, delete it. If someone needs it, they'll find it in git history — but the comment buys you a safety window.
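The same cross-check can be scripted so it does not depend on AI reading the grep output correctly. This sketch assumes single-line named imports. Namespace imports (import * as helpers), default imports, re-exports, and uses inside helpers.ts itself are not detected, so the result is a list of candidates, not a delete list. The function and sample names are hypothetical.

```typescript
// Collect the names pulled in by `import { a, b as c } from '...'` lines.
function importedNames(importLines: string[]): Set<string> {
  const names = new Set<string>();
  for (const line of importLines) {
    const match = /import\s+(?:type\s+)?\{([^}]*)\}\s+from/.exec(line);
    if (!match) continue;
    for (const part of match[1].split(',')) {
      // "slugify as toSlug" counts as a use of "slugify".
      const original = part.trim().split(/\s+as\s+/)[0].replace(/^type\s+/, '').trim();
      if (original) names.add(original);
    }
  }
  return names;
}

// Exported names that no import line mentions.
function deadCodeCandidates(exported: string[], importLines: string[]): string[] {
  const used = importedNames(importLines);
  return exported.filter((name) => !used.has(name));
}

// Hypothetical grep output, in the file:line:text form that grep -rn prints.
const grepOutput = [
  "src/routes/api.ts:1:import { formatDate } from '../utils/helpers';",
  "src/services/billing.ts:2:import { slugify as toSlug, formatDate } from '../utils/helpers';",
];
const candidates = deadCodeCandidates(['formatDate', 'slugify', 'legacyPad'], grepOutput);
// candidates: ['legacyPad']
```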

Legacy Pattern Detection

You

Scan this migrated codebase for any remaining legacy patterns that should have been converted:

[paste directory listing or key files]

Look for: require() statements that should be import, callback-style async code, var instead of const/let, class components that should be functional, any types that should be specific, and jQuery selectors that should be vanilla JS.

This is a sweep for stragglers — the patterns you missed or that were introduced during the migration itself. Run this once across the full codebase as a final check.
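A rough first pass at this sweep can be done with regular expressions before involving AI at all. The patterns below are heuristics invented for this sketch: they also match inside strings and comments and will miss multi-line constructs, so use them to find files worth pasting into the prompt.

```typescript
interface Finding {
  line: number;
  pattern: string;
  text: string;
}

// Heuristic patterns only; none of them understands JavaScript syntax.
const legacyPatterns: Array<{ name: string; regex: RegExp }> = [
  { name: 'require() call', regex: /\brequire\s*\(/ },
  { name: 'var declaration', regex: /\bvar\s+[A-Za-z_$]/ },
  { name: 'explicit any', regex: /:\s*any\b/ },
  { name: 'jQuery selector', regex: /\$\(\s*['"]/ },
  { name: 'React class component', regex: /\bextends\s+(React\.)?(Pure)?Component\b/ },
  { name: 'error-first callback', regex: /\(\s*err(or)?\s*,/ },
];

// Check every line of a file against every pattern.
function scanForLegacyPatterns(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    for (const { name, regex } of legacyPatterns) {
      if (regex.test(text)) findings.push({ line: index + 1, pattern: name, text: text.trim() });
    }
  });
  return findings;
}

const sample = [
  "const fs = require('fs');",
  'var count = 0;',
  'function save(data: any) {}',
  'const ok = 1;',
].join('\n');
const findings = scanForLegacyPatterns(sample);
// Three findings: lines 1, 2, and 3.
```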

Documentation Update

The comprehension documents you created in Phase 1 are now outdated — they describe the legacy version. Update them to reflect the migrated codebase:

You

Here's the original architecture doc I wrote during migration planning:

[paste Phase 1 documentation]

Here's the current state of the migrated codebase:

[paste updated file structure and key modules]

Update the architecture documentation to reflect the migrated state. Note what changed and why.

Real-World Considerations

When the Codebase Is Too Big

AI context windows have limits. If your codebase is 200,000+ lines, you can't paste it all into one session. The approach:

Work at the module level. Paste one module at a time, never the whole codebase. Provide imports and type signatures from other modules as context.
Use a summary document. Create a one-page architecture overview that you paste at the start of every session. AI doesn't need to see every line — it needs to understand the structure.
Use CLI tools for large-scale changes. Claude Code and Aider can read your project files directly, which avoids the copy-paste problem entirely.

When Tests Don't Exist and Can't Be Written

Some legacy code is so tightly coupled, so full of global state, and so dependent on external services that writing tests before migration is impractical. In that case:

Manual smoke testing. Create a checklist of user-facing behaviors and verify them manually after each change.
Golden file testing. Capture the output of key operations (API responses, report generation, data exports) as "golden files." After migration, run the same operations and diff the output.
Feature flags. Deploy both old and new code paths, route a percentage of traffic to the new path, and compare results.
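Golden file testing can be sketched in a few lines. The version below keeps the golden text in memory to stay self-contained; in a real project it would be written to a file once from the legacy code and read back after each migration step. Keys are sorted during serialization so that key order alone never produces a diff. `stableStringify` and `checkAgainstGolden` are hypothetical helper names.

```typescript
// Serialize with sorted object keys so key order alone never causes a diff.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`;
  if (value !== null && typeof value === 'object') {
    const record = value as Record<string, unknown>;
    const entries = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`);
    return `{${entries.join(',')}}`;
  }
  // JSON.stringify yields undefined for undefined and functions; record those as null.
  return JSON.stringify(value) ?? 'null';
}

// Compare a fresh result against the stored golden text; null means they match.
function checkAgainstGolden(golden: string, actual: unknown): string | null {
  const serialized = stableStringify(actual);
  return serialized === golden
    ? null
    : `golden mismatch:\n  expected ${golden}\n  actual   ${serialized}`;
}

// Captured once from the legacy code path (sample data invented for illustration).
const golden = stableStringify({ invoiceId: 17, total: 120.5, lines: [{ sku: 'A1', qty: 2 }] });

// Produced by the migrated code path: same data, different key order.
const afterMigration = { total: 120.5, lines: [{ qty: 2, sku: 'A1' }], invoiceId: 17 };
const mismatch = checkAgainstGolden(golden, afterMigration);
// mismatch is null: the outputs are equivalent.
```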

When the Framework Is Changing

Migrating from Express to Fastify, from AngularJS to React, or from Django to FastAPI is a different challenge from modernizing within the same framework. AI can help, but the risk is higher because the architecture itself is changing.

The approach: extract business logic from framework code first. If your billing calculation lives inside an Express route handler, extract it into a pure function. Then the framework migration becomes a UI/routing concern, and your business logic is already tested independently.

You

This Express route handler mixes routing logic with business logic. Extract the business logic into a separate pure function that I can test independently and reuse when I migrate to Fastify.

[paste route handler]
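The result of such an extraction tends to look like the sketch below: a pure function that owns the calculation, and a handler reduced to reading input and writing output. `RequestLike` and `ResponseLike` are minimal stand-ins defined here so the example runs without Express or Fastify installed, and the invoice calculation itself is invented for illustration.

```typescript
interface LineItem {
  unitPriceCents: number;
  quantity: number;
}

// Pure business logic: no request, no response, no framework types.
function calculateInvoiceTotalCents(items: LineItem[], taxRate: number): number {
  const subtotal = items.reduce((sum, item) => sum + item.unitPriceCents * item.quantity, 0);
  return Math.round(subtotal * (1 + taxRate));
}

// Minimal structural stand-ins for the framework's request and response objects.
interface RequestLike {
  body: { items: LineItem[]; taxRate: number };
}
interface ResponseLike {
  json(payload: unknown): void;
}

// The handler shrinks to parsing input and shaping output.
function invoiceTotalHandler(req: RequestLike, res: ResponseLike): void {
  const totalCents = calculateInvoiceTotalCents(req.body.items, req.body.taxRate);
  res.json({ totalCents });
}
```

Only the handler has to be rewritten when the framework changes. The pure function and its tests carry over untouched.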

When the Team Is Still Shipping Features

Migrations that block feature development get cancelled. The two-branch strategy from Phase 3 handles this — but you also need to communicate the migration plan to your team:

Announce the migration order. "I'm migrating the billing module this week. If you need to change billing code, coordinate with me so we don't conflict."
Migrate in vertical slices. Instead of "all repos, then all routes, then all middleware," migrate one complete feature path at a time. This gives the team a working feature in the new system sooner.
Set a merge cadence. "I merge completed modules into main every Friday. Feature work continues normally."

The Migration Succeeds

Understand before you change. Test before you migrate. Migrate one piece at a time. Verify after every step. AI makes each of these phases faster — but the discipline of doing them in order is what makes the migration succeed.

Migration Guide — Summary

Phase 1: Comprehension — Use AI to map the codebase, document modules, identify dependencies, and flag risk areas. Understand before you touch.
Phase 2: Test scaffolding — Generate characterization tests that capture current behavior. Focus on public APIs, database operations, and business logic.
Phase 3: Incremental transformation — Migrate from leaf modules inward. One module per commit. Syntax first, logic changes later. Run tests after every change.
Phase 4: Pattern modernization — Callbacks to async/await, classes to hooks, CommonJS to ESM, jQuery to vanilla. AI handles these mechanical transformations reliably.
Phase 5: Validation — Line-by-line comparison, dead code removal, legacy pattern sweep, documentation update.
Cardinal rule — Never let AI change code that doesn't have tests yet.
