
The Plugin Conveyor Belt Behind Babel

Ever wonder how Babel really works under the hood? The Plugin Conveyor Belt Behind Babel digs into the pipeline powering all those JavaScript transforms.

Code Cracking
20m read
#Babel #JavaScript #plugins #buildtools


We’re examining how Babel’s core transform pipeline turns raw JavaScript into transformed code by pushing it through a conveyor belt of plugins. Babel is a JavaScript compiler used across modern build systems to parse, transform, and generate code. At the center of this process is packages/babel-core/src/transformation/index.ts, a small orchestrator that wires configuration, plugins, and code generation into one flow.

I’m Mahmoud Zalt, an AI solutions architect, and we’ll use this file as a case study in how to design clean, extensible pipelines: keep orchestration thin, make extension points rich, and let plugins do the heavy lifting.

How Babel’s transform pipeline is structured

The file packages/babel-core/src/transformation/index.ts is Babel’s core transformation entry point. It doesn’t know how to rename a variable or turn JSX into function calls. Instead, it runs a pipeline:

  1. Normalize configuration and input code into a File object.
  2. Run a sequence of plugin passes over the file’s AST (abstract syntax tree).
  3. Generate output code and source maps from the transformed AST.
  4. Return a structured FileResult with code, AST, metadata, and dependencies.
Project (babel)
└── packages/
    └── babel-core/
        └── src/
            └── transformation/
                ├── index.ts        (orchestrates transform pipeline)
                ├── plugin-pass.ts  (PluginPass implementation)
                ├── block-hoist-plugin.ts
                ├── normalize-opts.ts
                ├── normalize-file.ts
                └── file/
                    ├── file.ts     (File abstraction)
                    └── generate.ts (code generation)
  
index.ts sits at the center, orchestrating specialized helpers around it.

A helpful mental model is an assembly line. The raw material is source code. normalizeFile unpacks it into a standardized File (AST, options, scope, path). Each plugin is a station on the line, inspecting and modifying pieces as they pass. Finally, generateCode re-assembles everything into finished code and a source map.
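The assembly-line shape can be sketched in a few lines. This is a hedged, minimal illustration of the pattern (hypothetical names like `SimpleFile` and `runPipeline`, not Babel's actual API): normalize raw input into a shared object, let each station mutate it in turn, then emit the result.

```typescript
type Station<T> = (file: T) => void;

interface SimpleFile {
  source: string;
  tokens: string[];
}

// Stand-in for normalizeFile: wrap raw input in a structured object.
function normalize(source: string): SimpleFile {
  return { source, tokens: source.split(/\s+/).filter(Boolean) };
}

function runPipeline(source: string, stations: Station<SimpleFile>[]): string {
  const file = normalize(source);
  for (const station of stations) {
    station(file); // each station mutates the shared file, like a plugin pass
  }
  // Stand-in for generateCode: reassemble the finished output.
  return file.tokens.join(" ");
}

// One example station: a "plugin" that upper-cases every token.
const upperCase: Station<SimpleFile> = file => {
  file.tokens = file.tokens.map(t => t.toUpperCase());
};

console.log(runPipeline("let x = 1", [upperCase])); // "LET X = 1"
```

The real pipeline is far richer, but the shape is the same: the orchestrator owns the sequence, and the stations own the behavior.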

The public entry point is the run function, which coordinates this process:

export type FileResult = {
  metadata: Record<string, any>;
  options: Record<string, any>;
  ast: t.File | null;
  code: string | null;
  map: GeneratorResult["map"];
  sourceType: Exclude<"script" | "module" | "unambiguous", "unambiguous">;
  externalDependencies: Set<string>;
};

export function* run(
  config: ResolvedConfig,
  code: string,
  ast?: t.File | t.Program | null,
): Handler<FileResult> {
  const file = yield* normalizeFile(
    config.passes,
    normalizeOptions(config),
    code,
    ast,
  );
  // ... transform + generate + return FileResult
}
  • run is a generator function (function*) that integrates with gensync, allowing the same implementation to run in sync or async mode.
  • FileResult describes exactly what downstream tools care about: transformed code, AST (optional), metadata, source type, and external dependencies.

With the key actors in place—run, File, plugins, and code generation—we can focus on the central lesson: this file is a compact masterclass in plugin pipeline design.

Inside the plugin conveyor belt

The primary lesson from this file is how to keep a plugin-driven pipeline small, composable, and robust while delegating real work to plugins. Babel does this in three main ways:

  1. Composing plugin passes and visitors into a single traversal.
  2. Handling plugin lifecycle hooks without leaking async complexity.
  3. Wrapping errors with context that both humans and tools can use.

1. One traversal, many plugin behaviors

The core of the conveyor belt is transformFile, which builds and executes plugin passes:

function* transformFile(file: File, pluginPasses: PluginPasses): Handler<void> {
  const async = yield* isAsync();

  for (const pluginPairs of pluginPasses) {
    const passPairs: [Plugin, PluginPass][] = [];
    const passes = [];
    const visitors = [];

    for (const plugin of pluginPairs.concat([loadBlockHoistPlugin()])) {
      const pass = new PluginPass(file, plugin.key, plugin.options, async);

      passPairs.push([plugin, pass]);
      passes.push(pass);
      // FIXME: plugin.visitor may be undefined
      visitors.push(plugin.visitor!);
    }

    // ... pre hooks, traversal, post hooks
  }
}

In practice:

  • pluginPasses is a list of plugin groups. Each group is a checkpoint on the conveyor belt.
  • For each plugin in the group, Babel creates a PluginPass, which holds per-run state: a reference to the file, options, and whether this run is async.
  • It collects each plugin’s visitor, an object that says “which AST node types do I care about, and what should happen when we see them?”
  • It appends a special block-hoisting plugin via loadBlockHoistPlugin() to handle Babel’s hoisting semantics.

Instead of traversing the AST once per plugin, Babel merges all visitors in a group into a single composite visitor and traverses once:

const visitor = traverse.visitors.merge(
  visitors,
  passes,
  file.opts.wrapPluginVisitorMethod,
);

traverse(file.ast.program, visitor, file.scope, null, file.path, true);

The conveyor belt is the traversal. Each station is a visitor merged into the composite. As AST nodes flow along the belt, every relevant plugin gets a chance to react, but the tree is only walked once per group.

Why this matters: Traversing large ASTs dominates cost. Merging visitors keeps plugins modular while minimizing redundant work.

There is one explicit rough edge: // FIXME: plugin.visitor may be undefined next to plugin.visitor!. A safer pattern would only collect defined visitors, allowing plugins that exist solely for pre/post hooks without forcing non-null assertions and aligning runtime behavior with types.
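Both ideas, merging visitors and only collecting the ones that exist, can be sketched together. This is a toy illustration with hypothetical shapes (`Node`, `Visitor`, `mergeVisitors`, `walk`); Babel's real merging lives in `traverse.visitors.merge` and is considerably more sophisticated.

```typescript
type Node = { type: string; children?: Node[] };
type Visitor = Record<string, (node: Node) => void>;
type Plugin = { key: string; visitor?: Visitor };

// Merge many visitors into one composite, chaining handlers per node type.
function mergeVisitors(visitors: Visitor[]): Visitor {
  const merged: Visitor = {};
  for (const visitor of visitors) {
    for (const [type, fn] of Object.entries(visitor)) {
      const prev = merged[type];
      merged[type] = prev ? node => { prev(node); fn(node); } : fn;
    }
  }
  return merged;
}

// One walk over the tree dispatches to every interested handler.
function walk(node: Node, visitor: Visitor): void {
  visitor[node.type]?.(node);
  for (const child of node.children ?? []) walk(child, visitor);
}

let idCount = 0;
const plugins: Plugin[] = [
  { key: "count-ids", visitor: { Identifier: () => { idCount++; } } },
  { key: "hooks-only" }, // no visitor: skipped instead of asserted non-null
];

// The safer pattern: collect only the visitors that actually exist.
const visitors = plugins.flatMap(p => (p.visitor ? [p.visitor] : []));
walk(
  { type: "Program", children: [{ type: "Identifier" }, { type: "Identifier" }] },
  mergeVisitors(visitors),
);
console.log(idCount); // 2
```

Two plugins, one traversal, and a hooks-only plugin that no longer needs a non-null assertion.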

2. Lifecycle hooks that hide async complexity

Each plugin can implement pre and post hooks—setup before traversal and cleanup after. Babel supports both synchronous and asynchronous plugins, but callers may invoke the transform synchronously. The orchestration file reconciles this with isAsync and maybeAsync:

for (const [plugin, pass] of passPairs) {
  if (plugin.pre) {
    const fn = maybeAsync(
      plugin.pre,
      `You appear to be using an async plugin/preset, but Babel has been called synchronously`,
    );

    // eslint-disable-next-line @typescript-eslint/no-floating-promises
    yield* fn.call(pass, file);
  }
}

The pattern is simple and powerful:

  • isAsync() tells transformFile whether this run call is executing in async mode.
  • maybeAsync wraps pre/post hooks, allowing them to be async when the transform is async, and throwing a clear error if an async plugin is used in a purely synchronous call.

The async concern is localized:

const async = yield* isAsync();

After that, the orchestration logic reads as if everything were synchronous. Gensync and maybeAsync handle the dual-mode complexity behind the scenes.
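The guard that `maybeAsync` provides can be approximated in a few lines. This is a hedged sketch (a hypothetical `maybeAsyncSketch` helper, not Babel's implementation): in a synchronous run, a hook that returns a promise is a configuration error, so fail loudly with a clear message instead of silently dropping the promise.

```typescript
function maybeAsyncSketch<A extends unknown[], R>(
  fn: (...args: A) => R | Promise<R>,
  message: string,
  isAsyncRun: boolean,
) {
  return (...args: A): R | Promise<R> => {
    const result = fn(...args);
    if (!isAsyncRun && result instanceof Promise) {
      // Async plugin in a sync call: surface it immediately.
      throw new Error(message);
    }
    return result;
  };
}

const syncHook = maybeAsyncSketch(() => "ready", "async plugin in sync call", false);
console.log(syncHook()); // "ready"

const badHook = maybeAsyncSketch(async () => "ready", "async plugin in sync call", false);
try {
  badHook();
} catch (e) {
  console.log((e as Error).message); // "async plugin in sync call"
}
```

The payoff is a precise error at the exact misuse site, rather than a mysteriously incomplete transform.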

One small duplication stands out: the long error-message string passed to maybeAsync is repeated verbatim for both the pre and post hooks. Extracting it into a constant would make this core path slightly clearer and easier to maintain.

3. Errors that respect humans and tools

In a plugin-heavy pipeline, failures are inevitable. The way this file wraps errors is a practical pattern worth copying:

const opts = file.opts;
try {
  yield* transformFile(file, config.passes);
} catch (e) {
  e.message = `${opts.filename ?? "unknown file"}: ${e.message}`;
  if (!e.code) {
    e.code = "BABEL_TRANSFORM_ERROR";
  }
  throw e;
}

let outputCode, outputMap;
try {
  if (opts.code !== false) {
    ({ outputCode, outputMap } = generateCode(config.passes, file));
  }
} catch (e) {
  e.message = `${opts.filename ?? "unknown file"}: ${e.message}`;
  if (!e.code) {
    e.code = "BABEL_GENERATE_ERROR";
  }
  throw e;
}

This wrapper does three important things:

  1. Prefixes messages with filenames. Developers immediately see which file broke. If no filename is available, it falls back to "unknown file" instead of omitting context.
  2. Attaches machine-readable error codes. Codes like "BABEL_TRANSFORM_ERROR" and "BABEL_GENERATE_ERROR" let build tools categorize failures without brittle string matching.
  3. Separates transform and generate phases. It becomes obvious whether a plugin corrupted the AST (transform error) or the generator hit an issue (generate error).

Why this matters: A few extra fields on an exception can turn opaque plugin failures into actionable signals for both humans and CI systems.
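The pattern generalizes into a small helper. This is a sketch under stated assumptions (a hypothetical `wrapPhase` function and `CodedError` shape, not Babel's API), showing the two moves: prefix human context, attach a stable machine-readable code, and rethrow.

```typescript
interface CodedError extends Error {
  code?: string;
}

function wrapPhase<T>(phase: () => T, filename: string | undefined, code: string): T {
  try {
    return phase();
  } catch (e) {
    const err = e as CodedError;
    err.message = `${filename ?? "unknown file"}: ${err.message}`; // for humans
    if (!err.code) err.code = code; // for tools; never clobber an existing code
    throw err;
  }
}

try {
  wrapPhase(() => { throw new Error("unexpected token"); }, "src/app.js", "BABEL_TRANSFORM_ERROR");
} catch (e) {
  const err = e as CodedError;
  console.log(err.message); // "src/app.js: unexpected token"
  console.log(err.code);    // "BABEL_TRANSFORM_ERROR"
}
```

Note the `if (!err.code)` guard: an error that already carries a more specific code keeps it, so inner layers stay authoritative.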

Finally, run constructs the FileResult in one place:

return {
  metadata: file.metadata,
  options: opts,
  ast: opts.ast === true ? file.ast : null,
  code: outputCode === undefined ? null : outputCode,
  map: outputMap === undefined ? null : outputMap,
  sourceType: file.ast.program.sourceType,
  externalDependencies: flattenToSet(config.externalDependencies),
};
  • AST and code emission are controlled by options (opts.ast, opts.code), so callers can trade performance for introspection.
  • flattenToSet turns nested dependency collections into a Set, giving tools a clean, de-duplicated view of external dependencies.
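Based on how it is used here, flattenToSet plausibly looks something like the following sketch (an assumed shape, not the actual Babel helper): walk the nested groups and let Set membership do the de-duplication.

```typescript
function flattenToSetSketch<T>(groups: Iterable<Iterable<T>>): Set<T> {
  const out = new Set<T>();
  for (const group of groups) {
    for (const item of group) out.add(item); // Set membership de-duplicates
  }
  return out;
}

const deps = flattenToSetSketch([
  ["./a.css", "./b.svg"],
  ["./b.svg", "./c.json"], // duplicate is collapsed
]);
console.log([...deps]); // ["./a.css", "./b.svg", "./c.json"]
```

Downstream tools get one clean membership question ("does this transform depend on X?") instead of scanning nested arrays.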

At this point, we’ve seen how the orchestrator composes plugins, hides async, and formats errors. The remaining question is how this design behaves under real-world load.

Performance and operational realities

Under production workloads—thousands of files, many plugins, large ASTs—the core cost centers are exactly where this file spends its time:

  • AST traversal (traverse(file.ast.program, visitor, ...)).
  • Plugin visitor callbacks.
  • Visitor merging (traverse.visitors.merge).

In rough terms, traversal cost scales with:

  • N: number of AST nodes in the file.
  • V: average number of visitors interested in each node type.

Total work is about O(N * V). More code increases N, more plugins increase V, and heavy visitors inflate the constant factors.

The main factors and their impact on runtime:

  • File size (AST nodes), e.g. minified bundles or generated code: roughly linear increase in traversal time.
  • Plugin count, e.g. presets with many transforms: more visitors per node; higher merge and dispatch cost.
  • Plugin behavior, e.g. heavy work in visitors or hooks: dominates per-node cost; can cause significant spikes.

To keep this conveyor belt healthy, a few metrics map directly onto the orchestrator’s responsibilities:

  • babel_transform_duration_ms – end-to-end time for one run call (per file), with attention to P95/P99.
  • babel_ast_traversal_nodes_count – number of nodes visited per transform, to correlate file size with duration.
  • babel_plugins_per_transform_count – number of active plugins for each file, to reveal configuration bloat.
  • babel_transform_errors_total – count of failures, labeled by error.code, to separate transform from generate issues.

Why these metrics: They reflect exactly what this file controls: how long the belt runs, how much it processes, how many stations it passes, and how often it fails.
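Hooking these metrics in does not require touching the pipeline itself; a wrapper around the transform call is enough. This is a hedged sketch with a hypothetical in-memory recorder (adapt the names and sink to your telemetry stack):

```typescript
type Metrics = {
  durations: number[];                  // feeds babel_transform_duration_ms
  errorsByCode: Record<string, number>; // feeds babel_transform_errors_total
};

const metrics: Metrics = { durations: [], errorsByCode: {} };

function instrumentedTransform<T>(transform: () => T): T {
  const start = Date.now();
  try {
    return transform();
  } catch (e) {
    const code = (e as { code?: string }).code ?? "UNKNOWN";
    metrics.errorsByCode[code] = (metrics.errorsByCode[code] ?? 0) + 1;
    throw e;
  } finally {
    metrics.durations.push(Date.now() - start); // recorded on success and failure
  }
}

console.log(instrumentedTransform(() => "ok")); // "ok"
```

Because failures are labeled by error.code, the same counter cleanly separates "BABEL_TRANSFORM_ERROR" from "BABEL_GENERATE_ERROR" without string matching.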

Concurrency-wise, the design is intentionally simple: each run call mutates a single File in place. There is no shared state inside the orchestrator, so higher layers can safely run many run calls in parallel across files, typically in separate workers or threads.

The real scaling risks live in plugin implementations:

  • Plugins that perform heavy synchronous work in pre/post or visitors will stall the entire transform for that file.
  • Plugins that touch global state or make network calls introduce contention and flakiness the orchestrator cannot manage.

The index.ts file doesn’t try to solve those; instead, it provides a predictable, well-instrumented conveyor belt that makes plugin behavior visible and debuggable.

Pipeline patterns you can reuse

Babel’s index.ts is small, but the design lesson is clear: a good plugin pipeline keeps orchestration thin and extension points rich. The file normalizes inputs into a File, runs a series of plugin passes via a single shared traversal per group, wraps errors with filename and machine-readable codes, and returns a FileResult that downstream tools can rely on.

Boiled down, here are concrete patterns you can apply in your own systems:

  1. Keep orchestration thin, make extension points rich.
    Let a central function define the high-level steps (normalize → process → emit), and push behavior into plugins, strategies, or callbacks. This keeps the core stable while allowing the ecosystem to evolve.
  2. Traverse once, compose many behaviors.
    When several components need to see the same data structure, prefer a single traversal with merged visitors or handlers. It’s often both simpler and faster than multiple independent passes.
  3. Design errors for humans and machines.
    Prefix messages with contextual details such as filenames, and attach stable error codes. These small additions make CI failures and plugin bugs far easier to diagnose and automate around.
  4. Hide async complexity behind focused helpers.
    Centralize sync/async reconciliation (like isAsync and maybeAsync) instead of spreading it across the pipeline. The orchestrator should read as straightforward control flow, even when it supports both modes.

If you treat your own transformation pipelines—whether they process code, data, or events—with the same discipline, you get systems that are easier to extend, reason about, and operate at scale. The next time you build a “do X with a bunch of plugins” feature, it’s worth asking: how close is it to Babel’s conveyor belt?

Full Source Code

Direct source from the upstream repository.

packages/babel-core/src/transformation/index.ts

babel/babel • main


Thanks for reading! I hope this was useful. If you have questions or thoughts, feel free to reach out.

Content Creation Process: This article was generated via a semi-automated workflow using AI tools. I prepared the strategic framework, including specific prompts and data sources. From there, the automation system conducted the research, analysis, and writing. The content passed through automated verification steps before being finalized and published without manual intervention.

Mahmoud Zalt

About the Author

I’m Zalt, a technologist with 16+ years of experience, passionate about designing and building AI systems that move us closer to a world where machines handle everything and humans reclaim wonder.

Let's connect if you're working on interesting AI projects, looking for technical advice or want to discuss anything.
