We’re examining how the Linux kernel makes module loading feel boring—in the best possible way. Kernel modules appear in lsmod, resolve symbols, plug into sysfs and debugfs, and mostly just work. That reliability is engineered, not accidental, and a big part of it is a single internal header that quietly coordinates the whole subsystem.
This header, kernel/module/internal.h, is the private contract for the Linux module subsystem. It doesn’t implement module loading; it defines the structures and functions that all the module code relies on. I’m Mahmoud Zalt, an AI software engineer, and we’ll use this file as a case study in how to design internal APIs that stay stable while features and configurations vary wildly.
We’ll see how this header:
- Keeps the same API across very different kernel builds.
- Hides optional features behind disciplined stubs instead of scattered #ifdefs.
- Provides a unified map from addresses to modules.
- Defines the contracts that enforce security checks.
- Balances the benefits and costs of heavy compile-time configuration.
Underneath all of this sits one core lesson: stabilize your internal interfaces and push variability behind them. Everything that follows is how the kernel makes that principle real for modules.
1. One Header, Many Responsibilities
Before we dive into patterns, we need to know what internal.h actually holds and how it fits into the module subsystem.
kernel/
  module/
    internal.h      <-- internal interfaces & structs for module subsystem
    main.c          (implements core loading logic, uses internal.h)
    sysfs.c         (implements mod_sysfs_setup/teardown)
    signing.c       (implements module_sig_check, mod_verify_sig)
    decompress.c    (implements module_decompress when enabled)
    tree_lookup.c   (implements mod_tree_* and mod_find tree variant)
internal.h as the shared contract between core module logic and its satellites.

Think of this header as the contract every module-related subsystem signs: it defines what must exist, not how it’s done. Its main responsibilities are:
- A module dossier via struct load_info, which describes the ELF image, sections, symbol tables, and decompression pages.
- A symbol directory via struct kernel_symbol and helpers like kernel_symbol_value.
- A global registry of modules and their address ranges (the modules list and mod_tree_root).
- Hooks for optional subsystems: livepatch, decompression, sysfs, signatures, versioning, debugfs, stats, taint tracking.
Crucially, the header is designed so that most users of the module subsystem barely notice how many optional subsystems exist. Feature variability is pushed to compile time and hidden behind a stable API surface. That’s the through-line we’ll follow.
2. Facades Over Optional Subsystems
The Linux kernel is built in many configurations: with or without livepatch, decompression, signatures, versioning, stats, and more. If each caller had to sprinkle #ifdef CONFIG_... around every use, the codebase would be unreadable.
The answer in internal.h is a consistent pattern: present the same functions in every build, and hide differences behind small inline stubs when features are disabled. Call sites stay clean; configuration complexity moves into the header.
2.1 Livepatch: uniform API, configuration-specific behavior
Livepatch is a good example. From a caller’s perspective, you can always call copy_module_elf, free_module_elf, and set_livepatch_module. What they do depends on whether livepatch is compiled in.
#ifdef CONFIG_LIVEPATCH
int copy_module_elf(struct module *mod, struct load_info *info);
void free_module_elf(struct module *mod);
#else /* !CONFIG_LIVEPATCH */
/* Stub: nothing to copy when livepatch is compiled out; report success. */
static inline int copy_module_elf(struct module *mod, struct load_info *info)
{
	return 0;
}

/* Stub: same signature as the real prototype above, does nothing. */
static inline void free_module_elf(struct module *mod) { }
#endif /* CONFIG_LIVEPATCH */

/* Returns true only if the kernel was built with livepatch support. */
static inline bool set_livepatch_module(struct module *mod)
{
#ifdef CONFIG_LIVEPATCH
	mod->klp = true;
	return true;
#else
	return false;
#endif
}
Livepatch declarations and stubs, selected by CONFIG_LIVEPATCH.

When livepatch is enabled, the prototypes bind to real implementations. When it’s disabled, the header still exposes the same functions, but compiles them as no-ops (for copy_module_elf/free_module_elf) or as a simple boolean capability probe (set_livepatch_module).
Callers don’t need to know the feature matrix. They simply:
- Call the function unconditionally.
- Interpret the result (true/false, success/error).
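To make the call-site side concrete, here is a minimal user-space sketch of the same shape. Everything in it is hypothetical: `demo_module`, `demo_copy_elf`, `demo_set_patchable`, `demo_prepare`, and the `DEMO_FEATURE_PATCHING` build flag are illustration names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module; not the kernel's definition. */
struct demo_module {
	const char *name;
	bool patchable;
	int elf_copies;	/* counts copies kept for the optional feature */
};

/* Define DEMO_FEATURE_PATCHING at build time to enable the feature. */
#ifdef DEMO_FEATURE_PATCHING
static inline int demo_copy_elf(struct demo_module *mod)
{
	assert(mod != NULL);
	mod->elf_copies++;	/* real work happens only in this build */
	return 0;
}
#else
/* Stub: feature compiled out, succeed without doing anything. */
static inline int demo_copy_elf(struct demo_module *mod)
{
	(void)mod;
	return 0;
}
#endif

/* Capability probe: returns true only when the feature is compiled in. */
static inline bool demo_set_patchable(struct demo_module *mod)
{
	assert(mod != NULL);
#ifdef DEMO_FEATURE_PATCHING
	mod->patchable = true;
	return true;
#else
	return false;
#endif
}

/* Call site: no #ifdef here, just a call and a check of the result. */
static inline int demo_prepare(struct demo_module *mod, bool wants_patching)
{
	if (wants_patching && !demo_set_patchable(mod))
		return -1;	/* requested, but not available in this build */
	return demo_copy_elf(mod);
}
```

Built without the flag, `demo_prepare(&mod, true)` fails cleanly and `demo_prepare(&mod, false)` succeeds as a no-op. The caller's source is identical in both builds.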
2.2 Decompression: unsupported as a first-class outcome
Module decompression follows the same pattern, but with explicit signaling when the feature is absent.
#ifdef CONFIG_MODULE_DECOMPRESS
int module_decompress(struct load_info *info, const void *buf, size_t size);
void module_decompress_cleanup(struct load_info *info);
#else
static inline int module_decompress(struct load_info *info,
				    const void *buf, size_t size)
{
	return -EOPNOTSUPP;
}

static inline void module_decompress_cleanup(struct load_info *info)
{
}
#endif
Here, the disabled stub semantics are intentional:
- Always returns -EOPNOTSUPP: clearly “operation not supported”.
- Does not modify load_info, preserving its invariant state.
This forces callers into an “all or nothing” mindset: either decompression exists and runs, or the kernel tells you directly that it can’t handle compressed modules. There is no partial-progress state for callers to unwind.
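A small sketch shows why this is convenient for the caller. The names `demo_load_info`, `demo_decompress`, `demo_load`, and the `DEMO_DECOMPRESS` flag are assumptions made for illustration. The block is meant to be built without that flag, since no real decompressor is supplied.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down dossier; not the kernel's struct load_info. */
struct demo_load_info {
	const void *image;
	size_t len;
};

#ifdef DEMO_DECOMPRESS
/* A real implementation would live in its own .c file. */
int demo_decompress(struct demo_load_info *info, const void *buf, size_t size);
#else
/* Stub: report "unsupported" and leave *info exactly as it was. */
static inline int demo_decompress(struct demo_load_info *info,
				  const void *buf, size_t size)
{
	(void)info;
	(void)buf;
	(void)size;
	return -EOPNOTSUPP;
}
#endif

/* Caller: a compressed image either loads fully or fails cleanly. */
static inline int demo_load(struct demo_load_info *info, const void *buf,
			    size_t size, int compressed)
{
	assert(info != NULL);
	if (compressed) {
		/* On failure there is nothing to unwind: info is untouched. */
		return demo_decompress(info, buf, size);
	}
	info->image = buf;
	info->len = size;
	return 0;
}
```

Because the stub never writes to the dossier, the error path in `demo_load` is a plain return with no cleanup branch.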
2.3 Versioning: fixed semantics, surprising constants
Versioning (via CONFIG_MODVERSIONS) reuses the same façade idea, but the chosen constants are less obvious at first glance.
#ifdef CONFIG_MODVERSIONS
int check_version(const struct load_info *info,
		  const char *symname, struct module *mod, const u32 *crc);
...
#else /* !CONFIG_MODVERSIONS */
static inline int check_version(const struct load_info *info,
				const char *symname,
				struct module *mod,
				const u32 *crc)
{
	return 1;
}

static inline int check_modstruct_version(const struct load_info *info,
					  struct module *mod)
{
	return 1;
}

static inline int same_magic(const char *amagic, const char *bmagic, bool has_crcs)
{
	return strcmp(amagic, bmagic) == 0;
}
#endif /* CONFIG_MODVERSIONS */
Here, 1 means “acceptable” to callers that treat this as a boolean-like helper. With versioning disabled, the semantics are deliberately:
- All version checks succeed (return non-zero).
- Magic comparison is a direct string compare.
The report recommends documenting these magic values explicitly. The behavior is correct for the kernel’s conventions; the issue is human readability. The important idea is that the header locks in the semantics: if versioning is off, the system behaves as if everything is compatible.
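The two conventions in play (non-zero means "OK" for these helpers, zero means "OK" for errno-style functions) are easy to mix up, so a sketch of a caller that bridges them may help. All names here (`demo_check_version`, `demo_same_magic`, `demo_resolve`) are hypothetical, and the error codes chosen are assumptions rather than what the kernel returns.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Boolean-like helper: non-zero means "compatible", zero means "mismatch".
 * This stub models a build with version checking compiled out.
 */
static inline int demo_check_version(const char *symname, unsigned int crc)
{
	(void)symname;
	(void)crc;
	return 1;	/* always compatible when checking is disabled */
}

/* Without CRCs, the two magic strings must match exactly. */
static inline int demo_same_magic(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}

/* Caller converts the boolean-like results into an errno-style result. */
static inline int demo_resolve(const char *symname, unsigned int crc,
			       const char *mod_magic, const char *kern_magic)
{
	assert(symname && mod_magic && kern_magic);
	if (!demo_same_magic(mod_magic, kern_magic))
		return -ENOEXEC;
	if (!demo_check_version(symname, crc))
		return -EINVAL;
	return 0;	/* errno style: zero means success */
}
```

The comments on the two helpers are the kind of declaration-site documentation that removes the guesswork around a bare `return 1`.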
| Feature | Enabled Build | Disabled Build (Stub) | Caller’s Perspective |
|---|---|---|---|
| Livepatch (CONFIG_LIVEPATCH) | Real copy/free of ELF, module flagged as livepatchable | Copy/free are no-ops; set_livepatch_module returns false | Functions always exist; capability indicated by return value |
| Decompression (CONFIG_MODULE_DECOMPRESS) | Decompresses into load_info | Always returns -EOPNOTSUPP; no state change | Call always compiles; must handle “unsupported” explicitly |
| Modversions (CONFIG_MODVERSIONS) | Real CRC checks and layout validation | All checks “OK” via return 1 | Higher-level logic can treat version checks as always-success |
3. One Map from Addresses to Modules
Feature toggling is only half the story. internal.h also defines a consistent way to answer a fundamental question: “Given this kernel address, which module owns it?” That’s performance-critical for stack traces, fault handling, and diagnostics.
The answer revolves around struct mod_tree_root and mod_find.
3.1 Two data structures, one lookup API
The kernel can implement the address-to-module map in two ways:
- A tree-based index (CONFIG_MODULES_TREE_LOOKUP) with ~O(log n) lookups.
- A simple RCU-protected list scan when the tree is disabled, with O(n) complexity.
Callers, however, always use the same API:
struct mod_tree_root {
#ifdef CONFIG_MODULES_TREE_LOOKUP
	struct latch_tree_root root;
#endif
	unsigned long addr_min;
	unsigned long addr_max;
#ifdef CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
	unsigned long data_addr_min;
	unsigned long data_addr_max;
#endif
};

extern struct mod_tree_root mod_tree;

#ifdef CONFIG_MODULES_TREE_LOOKUP
void mod_tree_insert(struct module *mod);
void mod_tree_remove_init(struct module *mod);
void mod_tree_remove(struct module *mod);
struct module *mod_find(unsigned long addr, struct mod_tree_root *tree);
#else /* !CONFIG_MODULES_TREE_LOOKUP */
static inline void mod_tree_insert(struct module *mod) { }
static inline void mod_tree_remove_init(struct module *mod) { }
static inline void mod_tree_remove(struct module *mod) { }

static inline struct module *mod_find(unsigned long addr, struct mod_tree_root *tree)
{
	struct module *mod;

	list_for_each_entry_rcu(mod, &modules, list,
				lockdep_is_held(&module_mutex)) {
		if (within_module(addr, mod))
			return mod;
	}

	return NULL;
}
#endif /* CONFIG_MODULES_TREE_LOOKUP */
mod_find always exists; its performance changes with configuration.

mod_tree_root is the map itself; mod_find is the query. With the tree enabled, insert/remove and lookups are backed by a latch tree. Without it, insert/remove are no-ops and mod_find scans the global modules list under RCU.
This encapsulation has direct scalability consequences:
- O(n) lookup is fine with a handful of modules; painful when you have many.
- Switching to CONFIG_MODULES_TREE_LOOKUP upgrades performance without touching callers.
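The same idea can be sketched outside the kernel. The example below is an assumption-laden stand-in: it swaps the latch tree for a binary search over a sorted array and drops RCU entirely, and the names `demo_range`, `demo_within`, `demo_find`, and `DEMO_SORTED_LOOKUP` are hypothetical. What it keeps is the point of the section: one query function, two complexity profiles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: owns the half-open range [base, base + size). */
struct demo_range {
	const char *name;
	unsigned long base;
	unsigned long size;
};

static inline int demo_within(unsigned long addr, const struct demo_range *r)
{
	return addr >= r->base && addr - r->base < r->size;
}

#ifdef DEMO_SORTED_LOOKUP
/* O(log n): binary search; needs ranges[] sorted by base, non-overlapping. */
static inline const struct demo_range *
demo_find(unsigned long addr, const struct demo_range *ranges, size_t n)
{
	size_t lo = 0, hi = n;

	assert(ranges != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (demo_within(addr, &ranges[mid]))
			return &ranges[mid];
		if (addr < ranges[mid].base)
			hi = mid;
		else
			lo = mid + 1;
	}
	return NULL;
}
#else
/* O(n): plain scan; works for any ordering of ranges[]. */
static inline const struct demo_range *
demo_find(unsigned long addr, const struct demo_range *ranges, size_t n)
{
	assert(ranges != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (demo_within(addr, &ranges[i]))
			return &ranges[i];
	return NULL;
}
#endif
```

A caller writes `demo_find(addr, ranges, n)` either way. Defining `DEMO_SORTED_LOOKUP` changes the cost of the call, not its meaning, provided the array is kept sorted.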
4. Security as a Contract, Not an Afterthought
Modules are a security boundary: they may be signed, version-checked, and subject to memory protection rules. internal.h doesn’t implement these checks, but it defines where they plug into the load pipeline and what they must see. That contract is a key part of the kernel’s defense in depth.
4.1 The module dossier: struct load_info
struct load_info is the central “dossier” the report keeps referring to. It travels through the loading pipeline and carries everything the various reviewers need.
What struct load_info conceptually contains
- Identity: module name, pointer to struct module.
- ELF metadata: main header (hdr), section headers (sechdrs), section string table, symbol string table, and offsets.
- Kallsyms offsets when CONFIG_KALLSYMS is enabled.
- Decompression pages and lengths under CONFIG_MODULE_DECOMPRESS.
- Indexed section numbers (symbols, strings, versions, etc.) in a nested index struct.
Multiple subsystems read from this dossier:
- Decompression (module_decompress) uses the raw buffer to populate pages in load_info.
- Signatures (module_sig_check, mod_verify_sig) parse the image to verify cryptographic signatures when enabled.
- Versioning (check_version, module_layout) validates symbol and struct versions.
- Memory protection (module_enforce_rwx_sections) inspects ELF section flags to avoid writable+executable regions.
- Sysfs/debugfs exposure (mod_sysfs_setup) uses the identity and layout to create user-visible entries.
        load_module()
             |
             v
    +------------------+
    | struct load_info |
    +------------------+
      |   |   |   |
      |   |   |   +--> module_decompress()
      |   |   +------> module_sig_check()/mod_verify_sig()
      |   +----------> check_version()/module_layout()
      +--------------> mod_sysfs_setup()
load_info is the single source of truth for every checker in the load pipeline.

Because this contract is centralized, the kernel can evolve individual checks without destabilizing the rest of the pipeline, which is again the same pattern of a stable interface with evolving internals.
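As a rough illustration of the dossier idea, here is a user-space sketch in which every stage receives the same struct. It is not how load_module() is actually organized: the stage table, the struct fields, and all `demo_*` names are assumptions, and the "signature" stage is a placeholder that verifies nothing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical dossier; far smaller than the kernel's struct load_info. */
struct demo_load_info {
	const char *name;
	const unsigned char *image;
	size_t len;
	int sig_ok;	/* filled in by the signature stage */
};

typedef int (*demo_stage_fn)(struct demo_load_info *info);

/* Stage 1: the image must at least start with the 4-byte ELF magic. */
static int demo_check_elf(struct demo_load_info *info)
{
	if (info->len < 4 || memcmp(info->image, "\177ELF", 4) != 0)
		return -ENOEXEC;
	return 0;
}

/* Stage 2: placeholder policy; only records a verdict in the dossier. */
static int demo_check_sig(struct demo_load_info *info)
{
	info->sig_ok = 1;	/* a real check would verify a signature here */
	return 0;
}

/* Stage 3: publishing requires a non-empty name. */
static int demo_publish(struct demo_load_info *info)
{
	return (info->name && info->name[0]) ? 0 : -EINVAL;
}

/* Run every stage against the same dossier; stop at the first failure. */
static int demo_load_module(struct demo_load_info *info)
{
	static const demo_stage_fn stages[] = {
		demo_check_elf, demo_check_sig, demo_publish,
	};

	assert(info != NULL);
	for (size_t i = 0; i < sizeof(stages) / sizeof(stages[0]); i++) {
		int err = stages[i](info);

		if (err)
			return err;
	}
	return 0;
}
```

Adding or reordering a stage touches only the table. Every stage keeps the same signature because the dossier carries all of the state.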
4.2 Signatures: the function exists, but does it check anything?
Module signatures (CONFIG_MODULE_SIG) demonstrate another subtle dimension of this design: the separation between code-level APIs and configuration-level guarantees.
#ifdef CONFIG_MODULE_SIG
int module_sig_check(struct load_info *info, int flags);
#else /* !CONFIG_MODULE_SIG */
static inline int module_sig_check(struct load_info *info, int flags)
{
	return 0;
}
#endif /* !CONFIG_MODULE_SIG */
To the caller, the convention is simple:
- 0 means “signature accepted”.
- Negative errno values mean “signature rejected” (when signatures are enabled).
However, if the kernel is built without CONFIG_MODULE_SIG, signatures are not enforced at all. The function silently returns success for every module. The API exists independently of whether the security property is active.
The report highlights this as a reminder: in systems with compile-time feature flags, you have to consult both the code and the configuration to understand your actual security posture. The header makes the difference explicit in code; operations teams need metrics and policies on top of that.
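One way to keep "accepted" from being mistaken for "verified" in your own code is to expose the compiled-in posture next to the check itself. The sketch below is a suggestion, not something internal.h does: `demo_image`, `demo_sig_check`, `demo_sig_enforced`, `demo_image_verified`, and the `DEMO_SIG` flag are all hypothetical, and `-EACCES` is an arbitrary choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical image descriptor for the sketch. */
struct demo_image {
	const char *name;
	bool signed_ok;	/* result a real verifier would compute */
};

#ifdef DEMO_SIG
#define DEMO_SIG_ENFORCED 1
static inline int demo_sig_check(const struct demo_image *img)
{
	assert(img != NULL);
	return img->signed_ok ? 0 : -EACCES;	/* reject unsigned images */
}
#else
#define DEMO_SIG_ENFORCED 0
/* Stub: no enforcement in this build, every image is accepted. */
static inline int demo_sig_check(const struct demo_image *img)
{
	(void)img;
	return 0;
}
#endif

/* Lets startup code, logs, or metrics report the compiled-in posture. */
static inline bool demo_sig_enforced(void)
{
	return DEMO_SIG_ENFORCED;
}

/* "Accepted" only counts as "verified" when enforcement is compiled in. */
static inline bool demo_image_verified(const struct demo_image *img)
{
	assert(img != NULL);
	return demo_sig_enforced() && demo_sig_check(img) == 0;
}
```

In a build without `DEMO_SIG`, `demo_sig_check` still returns 0 for every image, but `demo_image_verified` returns false. That gives dashboards and audits a way to tell the two situations apart.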
5. The Real Cost of Conditional Compilation
So far we’ve seen the upside of this design: stable APIs, simple call sites, and configuration flexibility. The report also surfaces the costs of this approach, which matter if you want to apply similar patterns in your own codebase.
5.1 A heavy header
internal.h carries a lot of conceptual weight. In one file you find:
- Core loading structures (struct load_info and related helpers).
- Symbol handling (struct kernel_symbol, kernel_symbol_value).
- Feature facades: livepatch, decompression, versioning, signature checking.
- Subsystem hooks: sysfs, debugfs, taint tracking, statistics.
- Address lookup infrastructure: mod_tree_root, mod_find.
- Memory protection helpers for RX/WX enforcement.
That scope creep makes the file harder to learn and maintain, especially for new contributors who must understand Kconfig options, RCU, ELF details, and security conventions all at once. The maintainability assessment in the report reflects this: overall solid design, but with high cognitive load.
A practical refinement the report suggests is to carve out narrowly focused areas—like module statistics under CONFIG_MODULE_STATS—into their own internal headers, and have internal.h include them. That keeps the “one contract” idea while reducing wall-of-text fatigue.
5.2 #ifdef scatter
Even with the façade pattern, the file contains a lot of #ifdef CONFIG_... blocks. Each one is defensible, but in aggregate they make it harder to reason about:
- You must mentally simulate multiple configurations to understand all code paths.
- Unusual Kconfig combinations are difficult to evaluate and test in your head.
- Cross-cutting concerns like security and performance become configuration-dependent.
The report’s guidance here is to keep conditional complexity as localized as possible. Use headers to stabilize prototypes and a few key stubs, but push the bulk of configuration-specific logic into the corresponding .c files.
5.3 Magic return values
Finally, there are the “magic” constants in stub implementations: check_version returning 1, decompression using -EOPNOTSUPP, and so on. Each choice is individually reasonable, but their meaning is not obvious at the declaration site.
The report’s top refactor suggestion is purely explanatory: add concise comments documenting stub semantics at the point of declaration. No behavior change, just a lower mental tax for future readers who don’t live and breathe kernel conventions.
6. What to Steal for Your Own Systems
The kernel module subsystem looks complex because it is, but the core design move in internal.h is simple and broadly applicable: keep your internal interfaces boringly stable and push variability behind them.
Here are concrete patterns you can apply today:
- Stabilize APIs across configurations.
  If you have feature flags, keep the public surface area constant. Provide stubs for disabled features instead of spraying #ifdefs across call sites. Decide whether the stub should signal “unsupported”, “no-op”, or “always OK”, and document that choice.
- Treat stubs as real implementations.
  Stubs run in production whenever a feature is off. Choose return values intentionally (like -EOPNOTSUPP for unsupported operations) and make their behavior explicit in comments and tests.
- Separate the question from the data structure.
  Design functions around domain questions (“find module by address”) instead of concrete structures. That allowed the kernel to switch from list scans to tree lookups without changing callers; you can do the same for caches, indices, or routing layers.
- Use a single dossier for complex lifecycles.
  For multi-step flows (module loading, tenant onboarding, job scheduling), build a struct like struct load_info that carries all necessary state through the pipeline. It becomes the shared truth every stage reads from and updates.
- Make configuration part of the contract.
  Functions like module_sig_check show that security properties depend on build-time flags. If your system behaves differently under different configs, surface that clearly in code and, ideally, in metrics and documentation.
If you design your internal headers with this level of discipline—stable facades, carefully chosen stub semantics, and well-defined contracts—your infrastructure becomes easier to reason about and change. Most importantly, it becomes pleasantly boring in production. And for the layer everything else depends on, boring is exactly what you want.



