Zalt Blog

Deep Dives into Code & Architecture

AT SCALE

How Linux Shapes Its Module World

By Mahmoud Zalt

We’re examining how the Linux kernel organizes its loadable modules around a single internal contract. The module subsystem takes binaries from user space, validates them before they run in the kernel, wires them into symbol tables, exposes knobs in sysfs, and tracks their lifetime. At the center of this is one header file, kernel/module/internal.h, which defines how all these pieces agree to work together. I’m Mahmoud Zalt, an AI solutions architect, and we’ll look at how this header keeps a configurable, feature-heavy subsystem from collapsing into a pile of #ifdefs, and how you can apply the same contract-first thinking in your own code.

We’ll first build a mental model of the module loader, then see how internal.h acts as a stable contract through conditional stubs and helpers, how it tames global state, how its choices impact performance and operations, and finally what patterns you can reuse in your own architectures.

The module loader’s internal contract

The Linux kernel’s module subsystem lives under kernel/module/. Multiple source files collaborate to load, verify, map, expose, and track modules, but they all meet at internal.h, the header that defines their shared vocabulary.

kernel/
  module/
    +-- internal.h   (internal interface: structs, globals, helpers)
    +-- main.c       (core module loader implementation, uses internal.h)
    +-- sysfs.c      (sysfs integration, implements mod_sysfs_* declared in internal.h)
    +-- signing.c    (module_sig_check, mod_verify_sig implementation)
    +-- decompress.c (module_decompress implementation)
    +-- livepatch.c  (copy_module_elf, set_livepatch_module implementation)

[user-space modprobe]
          |
          v
   load_module()  -- uses -->  struct load_info
          |
          +--> symbol resolution  --> __start___ksymtab .. __stop___ksymtab
          +--> address lookup    --> mod_tree_root / modules list
          +--> sysfs exposure    --> mod_sysfs_setup
          +--> security checks   --> module_sig_check, module_enable_*
internal.h sits between the core loader and feature‑specific implementations.

The heart of this contract is struct load_info. It aggregates the information the loader needs about a module’s ELF layout, symbol and string tables, kallsyms offsets, decompression buffers, and some configuration‑specific details.

Mental model: struct load_info is the shipping manifest for a module. It doesn’t hold all the bytes forever, but it tells the loader where everything is and how to unpack it safely.

struct load_info {
	const char *name;
	/* pointer to module in temporary copy, freed at end of load_module() */
	struct module *mod;
	Elf_Ehdr *hdr;
	unsigned long len;
	Elf_Shdr *sechdrs;
	char *secstrings, *strtab;
	unsigned long symoffs, stroffs, init_typeoffs, core_typeoffs;
	bool sig_ok;
#ifdef CONFIG_KALLSYMS
	unsigned long mod_kallsyms_init_off;
#endif
#ifdef CONFIG_MODULE_DECOMPRESS
#ifdef CONFIG_MODULE_STATS
	unsigned long compressed_len;
#endif
	struct page **pages;
	unsigned int max_pages;
	unsigned int used_pages;
#endif
	struct {
		unsigned int sym;
		unsigned int str;
		unsigned int mod;
		unsigned int vers;
		unsigned int info;
		unsigned int pcpu;
		unsigned int vers_ext_crc;
		unsigned int vers_ext_name;
	} index;
};

Everything else in internal.h either fills this manifest (signature checks, decompression), consumes it (layout, sysfs), or connects loaded modules to global registries. The primary lesson in this file is simple but strict: define a clear internal contract once, then adapt configuration complexity to that contract instead of letting it leak everywhere.
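
To make the manifest idea concrete, here is a minimal user-space sketch. Everything in it (toy_section, toy_manifest, toy_manifest_init) is hypothetical and far simpler than the kernel's ELF handling. It only illustrates the pattern: validate a borrowed buffer once, then record indices instead of copying data. Index 0 doubles as "not found," following ELF's convention of reserving section 0 as the null section.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an ELF section header. */
struct toy_section {
	const char *name;
	size_t offset;	/* start of the section inside the image */
	size_t size;
};

/* Hypothetical manifest: borrows the image and records where things are. */
struct toy_manifest {
	const unsigned char *image;
	size_t len;
	const struct toy_section *sections;
	unsigned int nsections;
	struct {
		unsigned int sym;
		unsigned int str;
	} index;
};

/* Returns the section index, or 0 (the reserved null section) if absent. */
static unsigned int toy_find_section(const struct toy_manifest *m,
				     const char *name)
{
	for (unsigned int i = 1; i < m->nsections; i++)
		if (strcmp(m->sections[i].name, name) == 0)
			return i;
	return 0;
}

/* Validate every section against the image bounds, then fill the index. */
static int toy_manifest_init(struct toy_manifest *m)
{
	assert(m != NULL && m->sections != NULL);

	for (unsigned int i = 1; i < m->nsections; i++) {
		const struct toy_section *s = &m->sections[i];

		/* Written to avoid overflow in offset + size. */
		if (s->offset > m->len || s->size > m->len - s->offset)
			return -EINVAL;
	}
	m->index.sym = toy_find_section(m, ".symtab");
	m->index.str = toy_find_section(m, ".strtab");
	return 0;
}
```

Later stages would read m->index.sym rather than searching by name again, which is the same role the index block plays in struct load_info.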

One contract across many configurations

With that mental model in place, the interesting question is how internal.h supports features like livepatching, signature verification, decompression, kallsyms, and module stats without sprinkling #ifdefs across every call site. The answer is that the header defines a stable API and uses configuration‑dependent stubs and helpers as adapters.

The symbol abstraction: isolating architectural quirks

Consider how a kernel symbol’s address is represented. Architectures that select CONFIG_HAVE_ARCH_PREL32_RELOCATIONS store 32-bit place-relative offsets instead of full-width values and pointers, but users of symbols should not have to care.

struct kernel_symbol {
#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
	int value_offset;
	int name_offset;
	int namespace_offset;
#else
	unsigned long value;
	const char *name;
	const char *namespace;
#endif
};

static inline unsigned long kernel_symbol_value(const struct kernel_symbol *sym)
{
#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
	return (unsigned long)offset_to_ptr(&sym->value_offset);
#else
	return sym->value;
#endif
}

kernel_symbol stores either 32-bit offsets or full-width values and pointers, and kernel_symbol_value() normalizes that difference: under PREL32, offset_to_ptr() adds the stored offset to the address of the field itself to recover the absolute address. All relocation complexity is confined to one helper. Each architecture gets the representation it needs, while the rest of the subsystem keeps calling kernel_symbol_value() without extra branches or awareness of PREL32.
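
The self-relative encoding is easy to try in user space. The sketch below is an illustration under assumptions, not kernel code: rel_to_ptr, rel_symbol, demo_image, and the two helpers are hypothetical names, and the pointer-to-integer arithmetic relies on the flat address space that mainstream platforms provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical analogue of offset_to_ptr(): field address + stored offset. */
static inline void *rel_to_ptr(const int32_t *off)
{
	return (void *)((uintptr_t)off + (intptr_t)*off);
}

struct rel_symbol {
	int32_t value_offset;	/* distance from this field to the target */
};

/* Keeps symbol and target together so the 32-bit offset always fits. */
struct demo_image {
	struct rel_symbol sym;
	long payload;
};

static inline void rel_symbol_set(struct rel_symbol *sym, const void *target)
{
	intptr_t delta = (intptr_t)((uintptr_t)target -
				    (uintptr_t)&sym->value_offset);

	/* A 32-bit offset only reaches targets within +/- 2 GiB. */
	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	sym->value_offset = (int32_t)delta;
}

/* Callers see an absolute address and never touch the encoding. */
static inline uintptr_t rel_symbol_value(const struct rel_symbol *sym)
{
	return (uintptr_t)rel_to_ptr(&sym->value_offset);
}
```

Because the offset is relative to the field's own address, copying a whole demo_image elsewhere keeps the symbol pointing at the copy's payload, with no fixups needed. That position independence is what makes the encoding attractive.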

Feature switches as safe, unconditional functions

The same pattern shows up more clearly with optional subsystems. The kernel uses many CONFIG_* flags, but internal.h gives call sites a simple rule: the function name and signature exist regardless of configuration; what changes is the implementation.

Livepatch ELF handling is a straightforward example:

#ifdef CONFIG_LIVEPATCH
int copy_module_elf(struct module *mod, struct load_info *info);
void free_module_elf(struct module *mod);
#else
static inline int copy_module_elf(struct module *mod, struct load_info *info)
{
	return 0;
}

static inline void free_module_elf(struct module *mod) { }
#endif

From the loader’s perspective, copy_module_elf() and free_module_elf() are always callable. With livepatch enabled, they manage extra ELF state; without it, they are cheap no‑ops that still respect the control‑flow contract: “you’re allowed to call me, I won’t break you.”

Decompression follows the same idea but chooses an explicit error code instead of a silent success:

#ifdef CONFIG_MODULE_DECOMPRESS
int module_decompress(struct load_info *info, const void *buf, size_t size);
void module_decompress_cleanup(struct load_info *info);
#else
static inline int module_decompress(struct load_info *info,
				    const void *buf, size_t size)
{
	return -EOPNOTSUPP;
}

static inline void module_decompress_cleanup(struct load_info *info) { }
#endif

Two things hold in both configurations:

  • The function names and signatures are configuration‑independent.
  • The failure mode when the feature is off is well‑defined: -EOPNOTSUPP means “this capability isn’t compiled in,” not “some unrelated internal error.”
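
The same adapter pattern can be sketched outside the kernel. In the example below, FEATURE_UNPACK, blob_unpack, and load_blob are hypothetical names, and EOPNOTSUPP is assumed to come from a POSIX-style errno.h. The point is that the call site contains no #ifdef, and "feature off" is a distinct, recognizable error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical build flag: define FEATURE_UNPACK to link a real unpacker. */
#ifdef FEATURE_UNPACK
int blob_unpack(const void *buf, size_t size, size_t *out_len);
#else
/* Stub: same name and signature, explicit "capability absent" result. */
static inline int blob_unpack(const void *buf, size_t size, size_t *out_len)
{
	(void)buf;
	(void)size;
	(void)out_len;
	return -EOPNOTSUPP;
}
#endif

/* The call site is identical in both configurations. */
static int load_blob(const void *buf, size_t size, int compressed,
		     size_t *out_len)
{
	assert(out_len != NULL);

	if (compressed)
		return blob_unpack(buf, size, out_len);

	*out_len = size;	/* uncompressed input is used as-is */
	return 0;
}
```

With the flag undefined, a compressed input fails with -EOPNOTSUPP while uncompressed input still loads, so the caller can report "not supported" instead of a generic failure.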

When stubs blur semantics

Not all stubs return explicit “feature off” errors. Some are designed to look like success, which keeps control flow simple but can hide important semantics if they’re not clearly documented.

#ifdef CONFIG_MODULE_SIG
int module_sig_check(struct load_info *info, int flags);
#else
static inline int module_sig_check(struct load_info *info, int flags)
{
	return 0;
}
#endif

With signature checking enabled, 0 means the check did not reject the module, and the sig_ok field in load_info records whether a signature was actually verified. With it disabled, 0 means only that no signature check was performed. The call site sees the same return value in both cases. That’s convenient for branching, but dangerous if a future maintainer assumes “0 means verified” instead of “0 means not rejected.”

One way to address this is at the contract level: keep the stub behavior for compatibility, but add a clear comment next to it stating that security-sensitive call sites must gate behavior on IS_ENABLED(CONFIG_MODULE_SIG), not just on module_sig_check() == 0. The code stays trivial, but the semantics become explicit.

Lesson: when a function’s meaning changes across configurations, document that difference right where the stub lives. Otherwise, your stable API surface hides unstable semantics.

Taming global state

internal.h also exposes real global state: the module list, the address lookup structure, and statistics structures. In most codebases, “globals in a header” is a red flag. Here it’s a necessity, but the header constrains how they’re touched by wrapping them in narrow helpers with clear expectations.

Module address lookup with clear concurrency contracts

When the kernel wants to answer “which module owns this instruction pointer?”, it uses mod_find(). Depending on configuration, this may be backed by a tree for O(log M) lookup or by a linear list scan, but the observable behavior at the call site is the same.

struct mod_tree_root {
#ifdef CONFIG_MODULES_TREE_LOOKUP
	struct latch_tree_root root;
#endif
	unsigned long addr_min;
	unsigned long addr_max;
#ifdef CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
	unsigned long data_addr_min;
	unsigned long data_addr_max;
#endif
};

extern struct mod_tree_root mod_tree;

When tree lookup is disabled, the header provides a fallback implementation that walks the global module list:

static inline struct module *mod_find(unsigned long addr, struct mod_tree_root *tree)
{
	struct module *mod;

	list_for_each_entry_rcu(mod, &modules, list,
				lockdep_is_held(&module_mutex)) {
		if (within_module(addr, mod))
			return mod;
	}

	return NULL;
}

Even in this small helper, the contract is layered:

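  • Data contract: callers pass a mod_tree_root whose addr_min/addr_max describe the address range covered by modules; the list-walking fallback shown here ignores that argument and keeps it only so both variants share one signature.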
  • Concurrency contract: list_for_each_entry_rcu with lockdep_is_held(&module_mutex) encodes that &modules may be traversed either inside an RCU read-side critical section or with module_mutex held; lockdep complains only when neither is true.
  • Behavioral contract: enabling CONFIG_MODULES_TREE_LOOKUP changes the lookup algorithm but not the function signature or its basic semantics.

An explicit comment above this fallback describing those locking requirements would be a worthwhile addition. It would turn what lockdep currently checks only at runtime into clear documentation for anyone adding new callers.
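
The data side of this contract can be sketched in plain C. The example below is a single-threaded illustration with hypothetical names (region, region_index, region_add, region_find); it deliberately leaves out the RCU and mutex rules and only shows a registry that tracks overall bounds and answers "who owns this address?" through one helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a loaded module's address range. */
struct region {
	uintptr_t base;
	size_t size;
	const char *name;
	struct region *next;
};

/* Hypothetical registry: list head plus the overall [min, max) bounds. */
struct region_index {
	struct region *head;
	uintptr_t addr_min;
	uintptr_t addr_max;
};

static void region_add(struct region_index *idx, struct region *r)
{
	uintptr_t end = r->base + r->size;

	assert(r->size > 0 && end > r->base);	/* no empty or wrapping ranges */
	if (!idx->head || r->base < idx->addr_min)
		idx->addr_min = r->base;
	if (!idx->head || end > idx->addr_max)
		idx->addr_max = end;
	r->next = idx->head;
	idx->head = r;
}

/* Linear O(M) lookup, with a cheap bounds check to reject most misses. */
static struct region *region_find(const struct region_index *idx,
				  uintptr_t addr)
{
	if (!idx->head || addr < idx->addr_min || addr >= idx->addr_max)
		return NULL;

	for (struct region *r = idx->head; r; r = r->next)
		if (addr >= r->base && addr - r->base < r->size)
			return r;
	return NULL;
}
```

Because every caller goes through region_find(), the list could later be swapped for a tree, or wrapped in a lock, without touching any call site.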

Tracking failures with minimal surface area

internal.h also defines small data structures for tracking nuanced behavior like duplicate module load attempts, which can waste vmalloc space and point to user‑space races.

enum fail_dup_mod_reason {
	FAIL_DUP_MOD_BECOMING = 0,
	FAIL_DUP_MOD_LOAD,
};

#ifdef CONFIG_MODULE_STATS
struct mod_fail_load {
	struct list_head list;
	char name[MODULE_NAME_LEN];
	atomic_long_t count;
	unsigned long dup_fail_mask;
};

int try_add_failed_module(const char *name, enum fail_dup_mod_reason reason);
#else
static inline int try_add_failed_module(const char *name,
					enum fail_dup_mod_reason reason)
{
	return 0;
}
#endif

The enum documents the conceptual states (FAIL_DUP_MOD_BECOMING versus FAIL_DUP_MOD_LOAD), and all mutation goes through one helper, try_add_failed_module(). With CONFIG_MODULE_STATS disabled, this compiles down to a no‑op that still satisfies the function contract. The public interface stays small while allowing configuration‑specific implementations behind it.

This pattern repeats for other tracking structures (for example, unload taints). Even when the data models overlap, the helpers ensure there is exactly one place to update the global statistic, which simplifies reasoning about side effects and observability.
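
As a rough illustration of "one helper owns the statistic," here is a user-space sketch. The names (fail_reason, fail_entry, fail_record, fail_lookup) and the fixed-size table are assumptions made for brevity. It is not thread-safe and does not mirror the kernel's list-based implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum fail_reason {
	FAIL_BECOMING = 0,
	FAIL_LOAD,
};

#define FAIL_NAME_LEN 32
#define FAIL_MAX 16

struct fail_entry {
	char name[FAIL_NAME_LEN];
	long count;
	unsigned long reason_mask;	/* one bit per enum fail_reason */
};

static struct fail_entry fail_table[FAIL_MAX];
static size_t fail_used;

static struct fail_entry *fail_lookup(const char *name)
{
	for (size_t i = 0; i < fail_used; i++)
		if (strcmp(fail_table[i].name, name) == 0)
			return &fail_table[i];
	return NULL;
}

/* The only function that mutates the table. */
static int fail_record(const char *name, enum fail_reason reason)
{
	struct fail_entry *e;

	assert(name != NULL);
	if (strlen(name) >= FAIL_NAME_LEN)
		return -ENAMETOOLONG;

	e = fail_lookup(name);
	if (!e) {
		if (fail_used == FAIL_MAX)
			return -ENOMEM;
		e = &fail_table[fail_used++];
		strcpy(e->name, name);	/* length checked above */
		e->count = 0;
		e->reason_mask = 0;
	}
	e->count++;
	e->reason_mask |= 1UL << reason;
	return 0;
}
```

Repeated failures for the same name bump one counter and accumulate reason bits, so a reader of the table can tell how often a name failed and in which states.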

Performance and operational implications

The contract‑first approach in internal.h is not just about cleanliness; it shapes how module behavior scales and how operators can see what’s happening at runtime. Once you understand the helpers and stubs, it becomes obvious where performance and observability hooks belong.

Lookup strategies and complexity

The most visible scalability dial here is CONFIG_MODULES_TREE_LOOKUP. With it enabled, mod_find() uses a tree under the hood and has O(log M) complexity in the number of modules M. With it disabled, the fallback linear scan is O(M).

| Configuration | mod_find() complexity | Reasonable when… | Be cautious when… |
| --- | --- | --- | --- |
| CONFIG_MODULES_TREE_LOOKUP=y | O(log M) | You have many modules, or frequent stack traces and probes. | You run tiny embedded kernels where tree maintenance overhead matters. |
| CONFIG_MODULES_TREE_LOOKUP=n | O(M) | You have a small, mostly static module set. | Module counts are large and address lookups are frequent. |
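
To get a feel for the two complexity classes, the sketch below compares a linear scan with a logarithmic lookup. Note the substitution: the kernel's tree path uses a latch tree, while this example uses a plain binary search over a sorted array of non-overlapping ranges, which has the same O(log M) shape but none of the concurrency properties. The names span, span_find_linear, and span_find_sorted are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct span {
	uintptr_t base;
	size_t size;
};

/* O(M): check every range in turn. */
static const struct span *span_find_linear(const struct span *spans,
					    size_t n, uintptr_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= spans[i].base && addr - spans[i].base < spans[i].size)
			return &spans[i];
	return NULL;
}

/* O(log M): requires spans sorted by base and non-overlapping. */
static const struct span *span_find_sorted(const struct span *spans,
					    size_t n, uintptr_t addr)
{
	size_t lo = 0, hi = n;

	assert(spans != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < spans[mid].base)
			hi = mid;		/* target lies to the left */
		else if (addr - spans[mid].base >= spans[mid].size)
			lo = mid + 1;		/* target lies to the right */
		else
			return &spans[mid];
	}
	return NULL;
}
```

Both functions share a signature and return the same answers on valid input, so a build-time switch between them changes cost but not behavior, which is the property the table describes.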

One way to ground this choice in data is to expose a metric such as a fallback-path counter for mod_find(), so operators can see how often the linear scan is used. Combined with metrics like module load duration and symbol lookup counts, this would base configuration choices around tree lookup and stats on real workload data.

Graceful degradation through explicit failure

Optional features like decompression also show how internal.h encourages graceful degradation instead of configuration‑driven control‑flow explosion. When module decompression support is compiled out, module_decompress() always returns -EOPNOTSUPP. Higher‑level code has a single, predictable way to recognize “feature not available” and return a clear error to user space, without open‑coded #ifdefs in the loader itself.

This keeps control flow stable across builds: the same functions are called in the same order, but their behavior is cheap and explicit when a feature is off. That predictability is important both for performance profiling and for reasoning about security properties.

Patterns you can reuse

kernel/module/internal.h is only a few hundred lines of C, but it coordinates security checks, livepatching, decompression, symbol resolution, and statistics across a large configuration matrix. The techniques it uses are widely applicable outside the kernel.

1. Centralize your internal API

Linux treats internal.h as the facade for the module subsystem: one place where internal data structures, helpers, and expectations are declared. main.c, sysfs.c, signing.c, decompress.c, and livepatch.c all depend on that shared contract.

In your own systems, this might be a single internal package or module that defines:

  • core data structures (your equivalent of load_info),
  • invariants and concurrency expectations as comments,
  • narrow helper functions that hide representation and configuration details.

2. Design stubs as adapters, with explicit semantics

Conditional stubs are powerful, but they only help if their semantics are obvious:

  • Use explicit error codes or no‑ops (-EOPNOTSUPP, empty functions) when you want to say “this feature is off.”
  • Use “pretend‑success” stubs (returning 0 or true) only when you also document how their meaning differs by configuration, as with module_sig_check().

When you add a feature flag in your own code, decide deliberately whether the stub means “capability absent” or “assume success,” and state that near the stub so future readers don’t have to guess.

3. Hide global state behind narrow helpers

The module subsystem can’t avoid global registries, but it avoids global free‑for‑all access. Helpers like mod_find() and try_add_failed_module() concentrate access to shared structures and encode what must be true while you touch them (locks held, RCU critical section, error handling expectations).

Even in application code, wrapping a shared map or registry in a single module with documented helpers makes it much easier to change the underlying representation, add synchronization, or attach metrics later.

4. Tie configuration switches to observability

The metrics suggested above for module loading and lookup show a useful habit: every major configuration choice should have a way to observe its impact. Whether it’s tree-based lookups, decompression, signature checking, or stats, the internal contract defines natural points to hang counters and latency measurements.

In your systems, whenever you introduce a new mode or configuration that affects behavior or performance, also decide which 1–2 metrics would tell you if that choice is paying off or hurting you.
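
A fallback-path counter of the kind mentioned earlier can be very small. The sketch below assumes C11 atomics and uses hypothetical names (slow_path_hits, lookup_linear, slow_path_hits_read). It counts how often the slow path runs and leaves exporting the number to whatever metrics system you already have.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical metric: how many times the linear path was taken. */
static atomic_ulong slow_path_hits;

/* Slow path: linear search that records each use. */
static int lookup_linear(const int *keys, size_t n, int key)
{
	assert(keys != NULL || n == 0);

	/* Relaxed ordering is enough for a statistics counter. */
	atomic_fetch_add_explicit(&slow_path_hits, 1, memory_order_relaxed);

	for (size_t i = 0; i < n; i++)
		if (keys[i] == key)
			return (int)i;
	return -1;
}

/* Read side, for a metrics exporter or a debug dump. */
static unsigned long slow_path_hits_read(void)
{
	return atomic_load_explicit(&slow_path_hits, memory_order_relaxed);
}
```

Because the increment lives inside the helper, every caller is counted automatically, and the counter sits exactly where the configuration choice takes effect.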


kernel/module/internal.h doesn’t execute any module code itself, but it determines how modules are represented, how optional features plug in, how globals are accessed, and how the subsystem behaves under different configurations. Its main achievement is not a novel algorithm; it’s the discipline of shaping a clear internal contract and forcing complexity to adapt to that contract.

If you give your own internal interfaces the same treatment — a central contract, well‑designed stubs, disciplined access to shared state, and metrics aligned with configuration — you can add features and options without letting them leak across your entire codebase.

The next time you add a feature flag, a new struct field, or a global registry, ask: “Am I tightening the contract around this subsystem, or making it fuzzier?” That question is what turns a tangle of conditionals into a coherent module world.

Thanks for reading! I hope this was useful. If you have questions or thoughts, feel free to reach out.

Content Creation Process: This article was generated via a semi-automated workflow using AI tools. I prepared the strategic framework, including specific prompts and data sources. From there, the automation system conducted the research, analysis, and writing. The content passed through automated verification steps before being finalized and published without manual intervention.
