We’re examining how the ffmpeg CLI keeps control while running long, intensive jobs. FFmpeg is a command-line workhorse for transcoding and processing media, often running for hours under heavy CPU and disk load, yet it must react instantly to signals, keyboard input, and monitoring. The core of that behavior lives in fftools/ffmpeg.c, the main orchestrator for the binary.
In this article we’ll treat ffmpeg.c as a case study in designing a robust, observable, and interruptible CLI. I’m Mahmoud Zalt, an AI solutions architect, and we’ll focus on one lesson: how to design a thin control layer around heavy work so your tool stays responsive and debuggable instead of turning into an opaque, fragile monolith.
We’ll map the file’s responsibilities, dissect the main transcode loop, see how signals and interrupts propagate safely, look at how FFmpeg extends foreign types with metadata, and study its dual-mode progress reporting. Along the way, we’ll extract patterns you can borrow for your own long-running tools and services.
The scene: one file, many responsibilities
ffmpeg.c is the front door of the ffmpeg CLI. It owns main, sets up the process, coordinates transcoding, prints progress, handles signals and keyboard input, and tears everything down.
FFmpeg/
  fftools/
    ffmpeg.c            <-- main CLI orchestration
    ffmpeg.h            (InputFile, OutputFile, FilterGraph, FrameData, ...)
    ffmpeg_sched.h      (Scheduler API)
    ffmpeg_utils.h      (helpers, sizes, error merging)
    graph/graphprint.h  (filter graph printing)
ffmpeg.c in the FFmpeg source tree.
Think of this file as the air traffic control tower of the FFmpeg process. It doesn’t decode or encode itself – the FFmpeg libraries and scheduler do that – but it decides when work starts, how it’s observed, and how it stops.
Its main responsibilities cluster into a few themes:
- Process lifecycle: main, transcode, ffmpeg_cleanup
- Signals and terminal handling: term_init, sigterm_handler, read_key
- Interactive control: check_keyboard_interaction
- Metadata attachments: frame_data*, packet_data*
- Observability: print_stream_maps, print_report, benchmarking helpers
The transcode loop as control surface
With the landscape in place, the heart of control is the transcode function. It doesn’t do heavy media work; it guards it.
static int transcode(Scheduler *sch)
{
    int ret = 0;
    int64_t timer_start, transcode_ts = 0;
    print_stream_maps();
    atomic_store(&transcode_init_done, 1);
    ret = sch_start(sch);
    if (ret < 0)
        return ret;
    if (stdin_interaction)
        av_log(NULL, AV_LOG_INFO, "Press [q] to stop, [?] for help\n");
    timer_start = av_gettime_relative();
    while (!sch_wait(sch, stats_period, &transcode_ts)) {
        int64_t cur_time = av_gettime_relative();
        if (received_nb_signals)
            break;
        if (stdin_interaction)
            if (check_keyboard_interaction(cur_time) < 0)
                break;
        print_report(0, timer_start, cur_time, transcode_ts);
    }
    ret = sch_stop(sch, &transcode_ts);
    for (int i = 0; i < nb_output_files; i++) {
        int err = of_write_trailer(output_files[i]);
        ret = err_merge(ret, err);
    }
    term_exit();
    print_report(1, timer_start, av_gettime_relative(), transcode_ts);
    return ret;
}
We can read this as a compact story:
- Print stream mappings so the user sees what will happen.
- Raise transcode_init_done to mark that steady state is beginning.
- Start the scheduler, which drives decoding, encoding, and filtering.
- Enter a loop that waits on the scheduler, checks for signals and keyboard commands, and emits progress reports.
- On exit, stop the scheduler, write all file trailers, restore the terminal, and print a final report.
The key design choice is that transcode owns the control surface, not the work. It decides whether to continue, how to respond to signals and keys, and when to report. The scheduler and libraries focus on media processing.
| Concern | Component | Effect |
|---|---|---|
| Decoding / encoding / filtering | Scheduler + FFmpeg libs | Throughput and correctness |
| Reacting to SIGINT / SIGTERM | received_nb_signals, decode_interrupt_cb | Safe, predictable shutdown |
| Interactive keyboard commands | check_keyboard_interaction | Runtime control and debugging |
| Progress and stats output | print_report | Human and machine observability |
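To make the split between control and work concrete, here is a minimal sketch of the same loop shape in plain C. It is not FFmpeg code: Worker, worker_wait, ControlState, and run_control_loop are hypothetical names, and the "work" is simulated by a tick counter so the control flow is easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical worker interface: NOT FFmpeg's Scheduler API. */
typedef struct Worker {
    int remaining_ticks;   /* simulated amount of work left */
    int ticks_done;        /* progress counter used for reporting */
} Worker;

/* Simulates "wait for up to one period"; returns true once all work is done. */
static bool worker_wait(Worker *w)
{
    if (w->remaining_ticks > 0) {
        w->remaining_ticks--;
        w->ticks_done++;
    }
    return w->remaining_ticks == 0;
}

typedef struct ControlState {
    int stop_requested;    /* would be set by a signal handler or key command */
    int reports_emitted;
} ControlState;

/* The control loop decides whether to continue; it never does the work itself.
 * Returns 0 when the work finished, 1 when it was interrupted. */
static int run_control_loop(Worker *w, ControlState *cs)
{
    assert(w && cs);
    while (!worker_wait(w)) {
        if (cs->stop_requested)
            break;
        fprintf(stderr, "progress: %d ticks\n", w->ticks_done);
        cs->reports_emitted++;
    }
    /* Teardown and the final report run on both exit paths. */
    fprintf(stderr, "final: %d ticks\n", w->ticks_done);
    return cs->stop_requested ? 1 : 0;
}
```

The loop body contains only decisions and reporting, so adding a new stop condition or a new report does not touch the worker.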
Signals, interrupts, and safe shutdown
The transcode loop checks received_nb_signals and relies on an interrupt callback, so the next question is how FFmpeg turns OS events into those simple checks without leaving the process half-dead.
Signal handler with a hard-stop escape hatch
static volatile int received_sigterm = 0;
static volatile int received_nb_signals = 0;
static atomic_int transcode_init_done = 0;
static volatile int ffmpeg_exited = 0;
static void sigterm_handler(int sig)
{
    int ret;
    received_sigterm = sig;
    received_nb_signals++;
    term_exit_sigsafe();
    if (received_nb_signals > 3) {
        ret = write(2, "Received > 3 system signals, hard exiting\n",
                    strlen("Received > 3 system signals, hard exiting\n"));
        if (ret < 0) { /* ignore */ }
        exit(123);
    }
}
This handler:
- Records the last signal and increments received_nb_signals.
- Restores terminal settings via term_exit_sigsafe(), which avoids unsafe operations inside a signal handler.
- After more than three signals, emits a short message using write (signal-safe) and calls exit(123) to force termination.
This models a big red “panic” button: FFmpeg tries to land cleanly when you hit Ctrl+C, but if you keep slamming it, it chooses a hard exit over leaving the process in a mysterious state.
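If we wanted the same "count, then bail out" behavior in our own tool, a minimal sketch might look like the following. The names on_stop_signal, install_stop_handlers, and HARD_EXIT_AFTER are hypothetical. The sketch also makes one deliberate substitution: it calls _exit rather than exit, because POSIX lists _exit (and write) as async-signal-safe, while exit runs atexit handlers and flushes stdio.

```c
#include <signal.h>
#include <unistd.h>

/* sig_atomic_t is the integer type the C standard allows a handler to write. */
static volatile sig_atomic_t last_signal = 0;
static volatile sig_atomic_t signal_count = 0;

#define HARD_EXIT_AFTER 3  /* hypothetical threshold, mirroring the "> 3" check */

static void on_stop_signal(int sig)
{
    static const char msg[] = "too many signals, hard exiting\n";

    last_signal = sig;
    signal_count++;

    if (signal_count > HARD_EXIT_AFTER) {
        /* Only async-signal-safe calls here: write() and _exit(). */
        ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
        (void)ignored;
        _exit(123);
    }
}

/* Installs the handler for SIGINT and SIGTERM; returns 0 on success.
 * signal() keeps the sketch within ISO C; on POSIX, sigaction() gives
 * more predictable semantics. */
static int install_stop_handlers(void)
{
    if (signal(SIGINT, on_stop_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_stop_signal) == SIG_ERR)
        return -1;
    return 0;
}
```

The main loop then only needs to poll signal_count, exactly as transcode polls received_nb_signals.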
Interruptible I/O via a decode callback
A signal alone doesn’t break a blocking network read or slow protocol. FFmpeg wires a tiny callback into its I/O layer so long operations periodically ask, “should I abort?”
static int decode_interrupt_cb(void *ctx)
{
    return received_nb_signals > atomic_load(&transcode_init_done);
}
const AVIOInterruptCB int_cb = { decode_interrupt_cb, NULL };
Any FFmpeg I/O context using int_cb periodically calls decode_interrupt_cb. If it returns non-zero, the operation aborts (typically with AVERROR_EXIT). The comparison against transcode_init_done is the subtle part:
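- Before steady state, transcode_init_done == 0, so a single signal makes the callback return non-zero and aborts startup I/O quickly.
- After transcode sets transcode_init_done to 1, the first signal no longer trips the callback (1 > 1 is false). The main loop sees it through received_nb_signals and shuts down gracefully; only a second signal aborts in-flight I/O.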
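The two phases are easiest to see with a toy version of the mechanism. The sketch below is not libavformat code: StopState, should_abort, slow_read, and ERR_INTERRUPTED are hypothetical, and the "blocking read" is simulated by a byte-by-byte copy that polls the callback between chunks.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical callback type, modeled on the idea behind AVIOInterruptCB. */
typedef int (*interrupt_cb)(void *opaque);

typedef struct StopState {
    volatile sig_atomic_t signals;  /* how many stop requests have arrived */
    int init_done;                  /* 0 during startup, 1 in steady state */
} StopState;

/* Same comparison as decode_interrupt_cb: abort once the signal count
 * exceeds the phase value. */
static int should_abort(void *opaque)
{
    const StopState *s = opaque;
    return s->signals > s->init_done;
}

#define ERR_INTERRUPTED (-1)  /* hypothetical error code */

/* Simulated slow read: one byte per "chunk", polling the callback each time. */
static int slow_read(const char *src, char *dst, size_t len,
                     interrupt_cb cb, void *opaque)
{
    assert(src && dst);
    for (size_t i = 0; i < len; i++) {
        if (cb && cb(opaque))
            return ERR_INTERRUPTED;
        dst[i] = src[i];
    }
    return (int)len;
}
```

With init_done at 0, one signal is enough to interrupt slow_read; with init_done at 1, the read survives the first signal and is interrupted by the second.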
Normalizing platform-specific shutdown
On Windows, console events (Ctrl+C, closing the terminal, logoff) don’t arrive as POSIX signals. FFmpeg registers a control handler that translates relevant events into calls to sigterm_handler, then waits in the handler until ffmpeg_exited is set during ffmpeg_cleanup. The rest of the code only deals with received_nb_signals and the interrupt callback.
This is the pattern to copy: normalize OS-specific shutdown hooks into a small internal signaling API, then teach the rest of the codebase to read that, not raw platform events.
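A rough sketch of that normalization layer, under a few assumptions: request_stop, install_stop_hooks, and stop_requested are hypothetical names, and the Windows branch uses the Win32 SetConsoleCtrlHandler API. It leaves out the "wait until cleanup has finished" step that FFmpeg performs in its Windows handler. Windows also runs console handlers on a separate thread, so a production version would likely use an atomic counter instead of sig_atomic_t.

```c
#include <signal.h>

static volatile sig_atomic_t stop_requests = 0;

/* The single internal entry point that every platform hook funnels into. */
static void request_stop(void)
{
    stop_requests++;
}

#ifdef _WIN32
#include <windows.h>

static BOOL WINAPI console_ctrl_handler(DWORD ctrl_type)
{
    switch (ctrl_type) {
    case CTRL_C_EVENT:
    case CTRL_BREAK_EVENT:
    case CTRL_CLOSE_EVENT:
    case CTRL_LOGOFF_EVENT:
    case CTRL_SHUTDOWN_EVENT:
        request_stop();
        return TRUE;   /* event handled */
    default:
        return FALSE;  /* let the next handler decide */
    }
}

static int install_stop_hooks(void)
{
    return SetConsoleCtrlHandler(console_ctrl_handler, TRUE) ? 0 : -1;
}
#else
static void posix_signal_handler(int sig)
{
    (void)sig;
    request_stop();
}

static int install_stop_hooks(void)
{
    if (signal(SIGINT, posix_signal_handler) == SIG_ERR)
        return -1;
    return signal(SIGTERM, posix_signal_handler) == SIG_ERR ? -1 : 0;
}
#endif

/* Portable code only ever asks this question. */
static int stop_requested(void)
{
    return stop_requests > 0;
}
```

Everything outside the #ifdef sees one counter and one query function, which is the property worth preserving as more platforms are added.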
Extending frames and packets with metadata
Process-level control is only part of the story. ffmpeg.c also needs finer-grained control over how individual frames and packets are tracked, without violating FFmpeg’s copy-on-write behavior for core types.
The constraint: don’t touch library structs
AVFrame and AVPacket belong to the FFmpeg libraries. The CLI often needs extra per-frame information – encoder parameters, wall-clock timestamps, or analysis hints – but it can’t modify these structs directly or casually hang arbitrary pointers off them.
The chosen solution is a small FrameData struct referenced via AVBufferRef stored in AVFrame.opaque_ref. Conceptually, each frame gets a ref-counted backpack where the CLI can store its own metadata.
static int frame_data_ensure(AVBufferRef **dst, int writable)
{
    AVBufferRef *src = *dst;
    if (!src || (writable && !av_buffer_is_writable(src))) {
        FrameData *fd = av_mallocz(sizeof(*fd));
        if (!fd)
            return AVERROR(ENOMEM);
        *dst = av_buffer_create((uint8_t *)fd, sizeof(*fd),
                                frame_data_free, NULL, 0);
        if (!*dst) {
            av_buffer_unref(&src);
            av_freep(&fd);
            return AVERROR(ENOMEM);
        }
        if (src) {
            const FrameData *fd_src = (const FrameData *)src->data;
            memcpy(fd, fd_src, sizeof(*fd));
            fd->par_enc = NULL;
            fd->side_data = NULL;
            fd->nb_side_data = 0;
            if (fd_src->par_enc) {
                int ret = 0;
                fd->par_enc = avcodec_parameters_alloc();
                ret = fd->par_enc ?
                      avcodec_parameters_copy(fd->par_enc, fd_src->par_enc) :
                      AVERROR(ENOMEM);
                if (ret < 0) {
                    av_buffer_unref(dst);
                    av_buffer_unref(&src);
                    return ret;
                }
            }
            if (fd_src->nb_side_data) {
                int ret = clone_side_data(&fd->side_data, &fd->nb_side_data,
                                          fd_src->side_data, fd_src->nb_side_data, 0);
                if (ret < 0) {
                    av_buffer_unref(dst);
                    av_buffer_unref(&src);
                    return ret;
                }
            }
            av_buffer_unref(&src);
        } else {
            fd->dec.frame_num = UINT64_MAX;
            fd->dec.pts = AV_NOPTS_VALUE;
            for (unsigned i = 0; i < FF_ARRAY_ELEMS(fd->wallclock); i++)
                fd->wallclock[i] = INT64_MIN;
        }
    }
    return 0;
}
The control story here is about ownership and isolation:
- Explicit lifetime: frame_data_free knows how to free every nested field, and that code runs when the last AVBufferRef is released. The metadata’s lifetime is tied to the frame’s.
- Copy-on-write safety: If a caller needs writable metadata but the backing buffer is shared, FFmpeg allocates a new FrameData, deep-copies nested data, and drops the old ref. No two frames accidentally share mutable metadata.
- Ergonomic access: Helper wrappers like frame_data(frame) and packet_data(pkt) hide this machinery; callers either get a pointer or NULL on error.
This is a clean example of a decorator-style attachment: extend behavior with a ref-counted side object rather than modifying the original type or using global side channels. Control over memory and ownership stays local and explicit.
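For readers who don't link against libavutil, the same "ensure it exists, copy if shared" logic can be sketched with a hand-rolled reference count. Everything here is a hypothetical stand-in (Meta, MetaBuf, meta_ensure, and friends), not FFmpeg's types. The count is a plain int, so the sketch is single-threaded only, and unlike frame_data_ensure it leaves the caller's original reference intact when a copy fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins: NOT libavutil's AVBufferRef or the CLI's FrameData. */
typedef struct Meta {
    long  frame_num;
    char *label;        /* nested heap-owned field that needs a deep copy */
} Meta;

typedef struct MetaBuf {
    int  refcount;      /* plain int: single-threaded sketch only */
    Meta meta;
} MetaBuf;

static MetaBuf *meta_new(void)
{
    MetaBuf *b = calloc(1, sizeof(*b));
    if (b) {
        b->refcount = 1;
        b->meta.frame_num = -1;   /* "unset" sentinel */
    }
    return b;
}

static MetaBuf *meta_ref(MetaBuf *b)
{
    assert(b);
    b->refcount++;
    return b;
}

/* Drops one reference; frees nested fields when the last one goes away. */
static void meta_unref(MetaBuf **pb)
{
    MetaBuf *b = *pb;
    *pb = NULL;
    if (b && --b->refcount == 0) {
        free(b->meta.label);
        free(b);
    }
}

/* Ensures *pb exists and, if writable is set, is not shared. Returns 0 or -1. */
static int meta_ensure(MetaBuf **pb, int writable)
{
    MetaBuf *src = *pb, *dst;

    if (src && !(writable && src->refcount > 1))
        return 0;                 /* already usable as-is */

    dst = meta_new();
    if (!dst)
        return -1;
    if (src) {
        dst->meta.frame_num = src->meta.frame_num;
        if (src->meta.label) {
            size_t n = strlen(src->meta.label) + 1;
            dst->meta.label = malloc(n);
            if (!dst->meta.label) {
                meta_unref(&dst);
                return -1;        /* caller keeps its original reference */
            }
            memcpy(dst->meta.label, src->meta.label, n);
        }
        meta_unref(&src);         /* drop our reference to the shared copy */
    }
    *pb = dst;
    return 0;
}
```

After meta_ensure(&b, 1) on a shared buffer, b points at a private deep copy, so writes through b can no longer be observed through the other holder.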
Progress for humans and machines
A responsive CLI that you can’t observe is still hard to operate. FFmpeg’s answer is print_report, which translates internal state into both human-readable and machine-readable progress.
Observability is the ability to understand a system’s internal state from its outputs: logs, metrics, and traces. Here, print_report is the central observability hook inside the transcode loop.
bitrate = pts != AV_NOPTS_VALUE && pts && total_size >= 0 ?
          total_size * 8 / (pts / 1000.0) : -1;
speed = pts != AV_NOPTS_VALUE && t != 0.0 ?
        (double)pts / AV_TIME_BASE / t : -1;
if (total_size < 0) av_bprintf(&buf, "size=N/A time=");
else                av_bprintf(&buf, "size=%8.0fKiB time=", total_size / 1024.0);
if (pts == AV_NOPTS_VALUE)
    av_bprintf(&buf, "N/A ");
else
    av_bprintf(&buf, "%s%02"PRId64":%02d:%02d.%02d ",
               hours_sign, hours, mins, secs, (100 * us) / AV_TIME_BASE);
if (bitrate < 0) {
    av_bprintf(&buf, "bitrate=N/A");
    av_bprintf(&buf_script, "bitrate=N/A\n");
} else {
    av_bprintf(&buf, "bitrate=%6.1fkbits/s", bitrate);
    av_bprintf(&buf_script, "bitrate=%6.1fkbits/s\n", bitrate);
}
if (nb_frames_dup || nb_frames_drop)
    av_bprintf(&buf, " dup=%"PRId64" drop=%"PRId64,
               nb_frames_dup, nb_frames_drop);
av_bprintf(&buf_script, "dup_frames=%"PRId64"\n", nb_frames_dup);
av_bprintf(&buf_script, "drop_frames=%"PRId64"\n", nb_frames_drop);
if (speed < 0) {
    av_bprintf(&buf, " speed=N/A");
    av_bprintf(&buf_script, "speed=N/A\n");
} else {
    av_bprintf(&buf, " speed=%4.3gx", speed);
    av_bprintf(&buf_script, "speed=%4.3gx\n", speed);
}
print_report maintains two views in parallel:
- A single, human-friendly status line (buf) printed on stderr.
- A key-value style log (buf_script) written to progress_avio, which scripts and monitoring tools can parse.
These include metrics like output size, encoded time, bitrate, frame duplication/drop counts, and processing speed. The transcode loop calls print_report on every iteration, so operators and automation see a continuous, low-friction view of progress.
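The arithmetic and the two-view idea fit in a few lines of standard C. The sketch below is an illustration, not FFmpeg's implementation: Progress and format_progress are hypothetical, the key names are made up rather than the ones -progress emits, and the N/A handling from the excerpt is left out for brevity. snprintf with an explicit size plays the role of the bounded buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical snapshot of progress counters. */
typedef struct Progress {
    int64_t total_bytes;   /* bytes written so far */
    int64_t out_time_us;   /* media time produced, in microseconds */
    double  elapsed_s;     /* wall-clock seconds since start */
} Progress;

/* Formats one human line and one key=value block from the same snapshot.
 * Returns 0 on success, -1 if either buffer was too small. */
static int format_progress(const Progress *p,
                           char *human, size_t human_sz,
                           char *script, size_t script_sz)
{
    double kbps, speed;
    int n;

    assert(p && human && script);

    /* bytes * 8 = bits; microseconds / 1000 = milliseconds;
     * bits per millisecond is numerically equal to kbit/s. */
    kbps  = p->out_time_us > 0 ? p->total_bytes * 8 / (p->out_time_us / 1000.0) : -1;
    /* media seconds produced per wall-clock second */
    speed = p->elapsed_s > 0 ? p->out_time_us / 1e6 / p->elapsed_s : -1;

    n = snprintf(human, human_sz, "size=%.0fKiB bitrate=%.1fkbits/s speed=%.2fx",
                 p->total_bytes / 1024.0, kbps, speed);
    if (n < 0 || (size_t)n >= human_sz)
        return -1;

    n = snprintf(script, script_sz,
                 "total_size=%lld\nbitrate_kbps=%.1f\nspeed=%.2f\n",
                 (long long)p->total_bytes, kbps, speed);
    if (n < 0 || (size_t)n >= script_sz)
        return -1;
    return 0;
}
```

For example, 1,024,000 bytes covering 8 seconds of media after 4 seconds of wall-clock time gives 8,192,000 bits / 8,000 ms = 1024.0 kbit/s and a speed of 2.00x.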
Keeping reporting cheap and safe
Because it runs in the hot path, print_report has to avoid becoming the bottleneck or a source of instability:
- It walks output streams once per report, so cost scales with the number of streams, not frames.
- It uses AVBPrint, a bounded print buffer, to avoid buffer overflows in formatted output.
- It reads cross-thread counters via atomics, so progress isn’t racing with encoder threads.
This design gives you control and visibility without sacrificing performance. You can monitor speed and frame drops as health signals, wire -progress into dashboards, and still keep the core loop lean.
Design patterns to reuse
We’ve followed FFmpeg’s control story from process entry to shutdown, through metadata handling and progress reporting. The common thread is a thin, explicit control layer around heavyweight work. Here are concrete patterns you can apply to your own CLIs and services.
1. Keep orchestration thin and explicit
Model the main loop as a control surface, not a work queue. In FFmpeg, transcode owns:
- Entry into steady state (transcode_init_done).
- Checks for signals and keyboard commands.
- Calls to pure reporting functions like print_report.
Apply the same idea by centralizing “should we continue?” logic in one loop that delegates real work to a scheduler or worker layer.
2. Treat interrupts as a design constraint, not an afterthought
Shutdown paths deserve the same design attention as startup paths. FFmpeg:
- Normalizes platform-specific events into a simple counter of received signals.
- Wires that counter into blocking I/O via decode_interrupt_cb.
- Provides a hard-exit escape hatch after repeated signals.
This makes interrupt behavior predictable instead of “best effort.” For anything that might run under supervisors, orchestrators, or user terminals, that’s essential.
3. Extend foreign types with attached metadata, not globals
The FrameData “backpack” is a pattern you’ll need whenever you integrate with a library that owns its core types. The steps are:
- Define a small struct for your metadata.
- Attach it via a ref-counted handle or side pointer.
- Centralize allocation, copy-on-write, and freeing in helper functions.
That keeps extensions local, testable, and compatible with the library’s semantics.
4. Make progress machine-readable from the start
FFmpeg’s dual output – one line for humans, structured fields for tools – is easy to copy. Even if you only log to stderr initially, consider emitting a parallel stream of stable key-value pairs or JSON. That small decision pays off when you later add dashboards and alerts.
5. Refactor around seams instead of rewriting the world
ffmpeg.c shows its age: long functions, many globals, deep field access. Yet it remains reliable at massive scale. The realistic path to improving a similar orchestrator is incremental:
- Extract focused helpers (for example, command parsing out of check_keyboard_interaction).
- Gradually route global state through context structs passed into key functions.
- Add accessor helpers instead of deep chains like ost->filter->graph->index.
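As a small illustration of the last two steps, here is what an accessor and a context struct might look like. The types are simplified, hypothetical stand-ins for the ost->filter->graph chain, not the real FFmpeg structs, and the -1 sentinel is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ost->filter->graph chain. */
typedef struct Graph     { int index; }      Graph;
typedef struct Filter    { Graph *graph; }   Filter;
typedef struct OutStream { Filter *filter; } OutStream;

/* Accessor: the one place that knows the shape of the chain and
 * handles missing links. Returns -1 when there is no graph. */
static int out_stream_graph_index(const OutStream *ost)
{
    if (!ost || !ost->filter || !ost->filter->graph)
        return -1;
    return ost->filter->graph->index;
}

/* Context struct: state that might otherwise be global, passed explicitly. */
typedef struct ReportCtx {
    int nb_reports;
    int last_graph_index;
} ReportCtx;

static void report_stream(ReportCtx *ctx, const OutStream *ost)
{
    assert(ctx);
    ctx->last_graph_index = out_stream_graph_index(ost);
    ctx->nb_reports++;
}
```

Callers stop caring how many pointers sit between a stream and its graph, which makes later restructuring of those types a local change.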
FFmpeg’s ffmpeg.c is ultimately a blueprint: a large, battle-tested CLI that stays responsive, observable, and extensible by keeping a clear control layer on top of heavy work. If you’re building tools that run for minutes or hours, borrowing these patterns will make your systems easier to operate – and far easier to evolve.
