
The Invisible Arguments Powering LangChain Tools

Most LangChain examples focus on the visible tool inputs. This article dives into the invisible arguments that actually drive LangChain tools at runtime.



We’re dissecting how LangChain’s tooling core keeps its APIs simple for developers while still wiring in rich runtime context. The key idea is a quiet one: injected arguments—parameters that don’t appear in the LLM-facing schema but still arrive reliably at execution time.

LangChain is a framework for building LLM-powered applications. At the center of its tools system is BaseTool, which turns plain Python functions into safe, traceable operations that agents and runtimes can orchestrate. I’m Mahmoud Zalt, an AI solutions architect, and we’ll use BaseTool and its helpers to understand how to keep schemas clean while your runtime stays powerful.

By the end, you’ll have a concrete pattern you can reuse: separate user-facing schemas from framework wiring with injected arguments, validate and enrich inputs in one place, and centralize orchestration in a template method so your tools still feel like simple Python functions.

Where BaseTool Sits in LangChain

To understand injected arguments, we first need the stage they operate on: the BaseTool abstraction and its schema helpers.

langchain_core/
  tools/
    base.py   <-- BaseTool, BaseToolkit, schema & injection utilities

Call graph (simplified):

  invoke / ainvoke
        |
        v
   _prep_run_args
        |
        v
     run / arun
        |
        +--> _filter_injected_args --> callbacks.on_tool_start
        |
        +--> _to_args_and_kwargs
        |         |
        |         v
        |      _parse_input --(Pydantic & injection)--> validated_input
        |
        +--> _run / _arun (implemented by concrete tool)
        |
        v
   _format_output --> ToolMessage (if tool_call_id present)
Figure 1 – From agent call to ToolMessage: where validation, injection, and callbacks plug in.

BaseTool is a classic Template Method implementation: the public run/arun methods handle configuration, callbacks, validation, and output formatting, while subclasses only implement the core business logic in _run/_arun.

The other major pieces in this file are:

  • create_schema_from_function – builds a Pydantic model from a plain Python function signature and docstring.
  • InjectedToolArg and InjectedToolCallId – markers for arguments that the framework fills in at runtime instead of the LLM.
  • _filter_injected_args and get_all_basemodel_annotations – utilities that hide injected arguments from the LLM-facing schema but still let them participate in validation and execution.
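To make the schema-building step concrete, here is a stdlib-only sketch of the idea behind create_schema_from_function. It is not LangChain's actual implementation (the real helper builds a full Pydantic model and honors docstring argument descriptions); it only shows the core move of deriving a field list from a plain signature. The names schema_from_function and search are illustrative:

```python
import inspect
from typing import Any


def schema_from_function(fn) -> dict[str, Any]:
    """Derive a minimal JSON-schema-like dict from a function signature.

    Simplified stand-in for create_schema_from_function: the real helper
    builds a Pydantic model and parses docstrings for field descriptions.
    """
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        ann = param.annotation
        properties[name] = {"type": getattr(ann, "__name__", str(ann))}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default => the caller must supply it
    return {
        "title": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "properties": properties,
        "required": required,
    }


def search(query: str, limit: int = 5) -> str:
    """Search a knowledge base."""
    return f"top {limit} results for {query!r}"


schema = schema_from_function(search)
```

Running this marks query as required and limit as optional, which is the shape an LLM-facing tool schema starts from before injection filtering is applied.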

The Secret Life of Injected Arguments

With the context in place, we can zoom in on injected arguments. An injected argument is a parameter that the framework provides automatically at runtime but that should not appear in the schema the LLM sees. It’s a backstage pass: invisible to the audience, essential behind the curtain.

The file defines two marker types:

class InjectedToolArg:
    """Annotation for tool arguments that are injected at runtime.

    Tool arguments annotated with this class are not included in the tool
    schema sent to language models and are instead injected during execution.
    """


class InjectedToolCallId(InjectedToolArg):
    """Annotation for injecting the tool call ID.

    This annotation is used to mark a tool parameter that should receive the tool call
    ID at runtime.
    """
Listing 1 – Marker types for runtime-only parameters.
  • You can annotate a parameter with Annotated[YourType, InjectedToolArg] (or use a type that itself subclasses InjectedToolArg), and BaseTool will treat it as a framework-provided value.
  • For InjectedToolCallId, the framework injects the LLM tool call’s ID into this parameter when the tool is invoked with a ToolCall envelope.
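As a self-contained illustration (using stand-in marker classes rather than importing langchain_core), a tool function mixing visible and injected parameters might look like this. The names update_ticket and user_token are hypothetical:

```python
from typing import Annotated


class InjectedToolArg:
    """Stand-in for langchain_core.tools.InjectedToolArg."""


class InjectedToolCallId(InjectedToolArg):
    """Stand-in for langchain_core.tools.InjectedToolCallId."""


def update_ticket(
    ticket_id: str,                                    # visible to the LLM
    note: str,                                         # visible to the LLM
    user_token: Annotated[str, InjectedToolArg],       # injected by your app
    tool_call_id: Annotated[str, InjectedToolCallId],  # injected per ToolCall
) -> str:
    # note and user_token would feed real business logic; elided here.
    return f"{ticket_id} updated by call {tool_call_id}"
```

The LLM-facing schema for this tool would list only ticket_id and note; the other two parameters arrive at execution time.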

For this pattern to work, two constraints must hold:

  1. Injected parameters must be hidden from the LLM schema so the model never tries to set them.
  2. They must still be present during validation and execution so your tool logic can rely on them.

Hiding them from the schema happens in BaseTool.tool_call_schema. After building a full Pydantic model, the code walks the annotations and drops anything that looks injected:

@property
def tool_call_schema(self) -> ArgsSchema:
    if isinstance(self.args_schema, dict):
        ...

    full_schema = self.get_input_schema()
    fields = []
    for name, type_ in get_all_basemodel_annotations(full_schema).items():
        if not _is_injected_arg_type(type_):
            fields.append(name)
    return _create_subset_model(
        self.name, full_schema, fields, fn_description=self.description
    )
Listing 2 – Building an LLM-facing schema that excludes injected fields.

The deciding logic lives in _is_injected_arg_type, which inspects Annotated metadata and directly injected marker types to decide whether a field should be treated as injected.
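The actual helper is private to LangChain, but the mechanics are plain typing introspection. Here is a hedged, stdlib-only approximation of that check (again with a stand-in marker class; the function name mirrors the private helper but is not its real code):

```python
from typing import Annotated, get_args, get_origin, get_type_hints


class InjectedToolArg:
    """Stand-in marker for the real langchain_core type."""


def is_injected_arg_type(type_: object, marker: type = InjectedToolArg) -> bool:
    """Approximate the check: the marker may be the annotation itself,
    or appear as a class/instance inside Annotated[...] metadata."""
    if isinstance(type_, type) and issubclass(type_, marker):
        return True
    if get_origin(type_) is Annotated:
        return any(
            (isinstance(m, type) and issubclass(m, marker)) or isinstance(m, marker)
            for m in get_args(type_)[1:]  # skip the underlying type
        )
    return False


def tool_fn(query: str, session: Annotated[dict, InjectedToolArg]) -> str:
    return query


hints = get_type_hints(tool_fn, include_extras=True)
visible = [n for n, t in hints.items()
           if n != "return" and not is_injected_arg_type(t)]
```

Here visible ends up containing only query, which is exactly the subset that tool_call_schema would expose to the model.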

Validation as an Airport Customs Checkpoint

Hiding injected fields from the public schema is only half the work. We also need to validate real inputs, apply defaults, and merge in injected values in a predictable way. That all happens in _parse_input.

Think of _parse_input as an airport customs checkpoint: it takes a messy stream of passengers (raw input), checks passports and visas (schemas and injected markers), and only lets through people with the right stamps (validated data plus injected context).

def _parse_input(
    self, tool_input: str | dict, tool_call_id: str | None
) -> str | dict[str, Any]:
    input_args = self.args_schema

    if isinstance(tool_input, str):
        if input_args is not None:
            if isinstance(input_args, dict):
                raise ValueError(
                    "String tool inputs are not allowed when "
                    "using tools with JSON schema args_schema."
                )
            key_ = next(iter(get_fields(input_args).keys()))
            if issubclass(input_args, BaseModel):
                input_args.model_validate({key_: tool_input})
            elif issubclass(input_args, BaseModelV1):
                input_args.parse_obj({key_: tool_input})
            else:
                raise TypeError(...)
        return tool_input

    if input_args is not None:
        if isinstance(input_args, dict):
            return tool_input
        if issubclass(input_args, BaseModel):
            # Inject tool_call_id when schema declares InjectedToolCallId
            for k, v in get_all_basemodel_annotations(input_args).items():
                if _is_injected_arg_type(v, injected_type=InjectedToolCallId):
                    if tool_call_id is None:
                        raise ValueError(
                            "When tool includes an InjectedToolCallId ..."
                        )
                    tool_input[k] = tool_call_id
            result = input_args.model_validate(tool_input)
            result_dict = result.model_dump()
        elif issubclass(input_args, BaseModelV1):
            ...  # Similar logic for Pydantic v1
        else:
            raise NotImplementedError(...)

        # Apply defaults but avoid synthetic args/kwargs
        field_info = get_fields(input_args)
        validated_input = {}
        for k in result_dict:
            if k in tool_input:
                validated_input[k] = getattr(result, k)
            elif k in field_info and k not in {"args", "kwargs"}:
                fi = field_info[k]
                has_default = (
                    not fi.is_required()
                    if hasattr(fi, "is_required")
                    else not getattr(fi, "required", True)
                )
                if has_default:
                    validated_input[k] = getattr(result, k)

        # Re-inject runtime-only keys like tool_call_id into validated_input
        for k in self._injected_args_keys:
            if k in tool_input:
                validated_input[k] = tool_input[k]
            elif k == "tool_call_id":
                if tool_call_id is None:
                    raise ValueError(...)
                validated_input[k] = tool_call_id

        return validated_input

    return tool_input
Listing 3 – Customs checkpoint: merging user input, schema validation, and injected IDs.

A few behaviors are worth calling out:

  • Different input styles are normalized. If you pass a simple string and your schema has a single field, the string is mapped to that field and validated. If you pass a dict, it’s validated field by field.
  • Pydantic v1 and v2 are both supported. BaseModel and BaseModelV1 are handled explicitly so tools can migrate gradually.
  • InjectedToolCallId is enforced as a contract. If your schema declares an InjectedToolCallId but the tool wasn’t called with a ToolCall containing an ID, a ValueError explains the expected structure.
  • Defaults are applied carefully. The code avoids synthetic fields that Pydantic adds for *args/**kwargs and only carries through explicitly defined fields with defaults.
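To see these behaviors in isolation, here is a stdlib-only sketch of the normalize-then-merge flow. There is no Pydantic here, so "validation" is just a field check, and parse_input with its parameters is illustrative rather than LangChain's API:

```python
def parse_input(tool_input, schema_fields: list[str], injected: dict) -> dict:
    """Sketch of _parse_input's normalization: map a bare string onto the
    single visible field, pass dicts through a field check, then merge
    runtime-injected values last so they always win."""
    visible = [f for f in schema_fields if f not in injected]
    if isinstance(tool_input, str):
        if len(visible) != 1:
            raise ValueError("string input needs exactly one visible field")
        validated = {visible[0]: tool_input}
    elif isinstance(tool_input, dict):
        unknown = set(tool_input) - set(schema_fields)
        if unknown:
            raise ValueError(f"unexpected fields: {sorted(unknown)}")
        validated = dict(tool_input)
    else:
        raise TypeError(f"unsupported input type: {type(tool_input)}")
    validated.update(injected)  # runtime context re-injected after checks
    return validated
```

Calling parse_input("laptops", ["query", "tool_call_id"], {"tool_call_id": "call_1"}) produces {"query": "laptops", "tool_call_id": "call_1"}: the bare string is mapped to the one visible field and the injected ID rides along.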

Orchestrating Tool Runs

Once inputs are validated and enriched, BaseTool still has to set up callbacks, thread configuration through to child calls, choose sync vs. async execution, and normalize outputs into ToolMessage objects. That orchestration lives in the run/arun methods.

Both methods are long and multi-responsibility, but the high-level pattern is consistent:

def run(
    ...,
    config: RunnableConfig | None = None,
    tool_call_id: str | None = None,
    **kwargs: Any,
) -> Any:
    callback_manager = CallbackManager.configure(...)

    # 1) Hide injected args from observability inputs
    filtered_tool_input = (
        self._filter_injected_args(tool_input)
        if isinstance(tool_input, dict)
        else None
    )
    tool_input_str = (
        tool_input
        if isinstance(tool_input, str)
        else str(filtered_tool_input if filtered_tool_input is not None else tool_input)
    )

    # 2) Emit on_tool_start event
    run_manager = callback_manager.on_tool_start(
        {"name": self.name, "description": self.description},
        tool_input_str,
        inputs=filtered_tool_input,
        tool_call_id=tool_call_id,
        ...,
    )

    content = None
    artifact = None
    status = "success"
    error_to_raise: Exception | KeyboardInterrupt | None = None
    try:
        # 3) Thread config and callbacks into Runnable context
        child_config = patch_config(config, callbacks=run_manager.get_child())
        with set_config_context(child_config) as context:
            tool_args, tool_kwargs = self._to_args_and_kwargs(tool_input, tool_call_id)
            if signature(self._run).parameters.get("run_manager"):
                tool_kwargs |= {"run_manager": run_manager}
            if config_param := _get_runnable_config_param(self._run):
                tool_kwargs |= {config_param: config}
            response = context.run(self._run, *tool_args, **tool_kwargs)

        # 4) Handle response format contract
        if self.response_format == "content_and_artifact":
            msg = (...)
            if not isinstance(response, tuple):
                error_to_raise = ValueError(msg)
            else:
                try:
                    content, artifact = response
                except ValueError:
                    error_to_raise = ValueError(msg)
        else:
            content = response
    except (ValidationError, ValidationErrorV1) as e:
        ...  # map to content via _handle_validation_error if configured
    except ToolException as e:
        ...  # map to content via _handle_tool_error if configured
    except (Exception, KeyboardInterrupt) as e:
        error_to_raise = e

    if error_to_raise:
        run_manager.on_tool_error(error_to_raise, tool_call_id=tool_call_id)
        raise error_to_raise

    output = _format_output(content, artifact, tool_call_id, self.name, status)
    run_manager.on_tool_end(output, ...)
    return output
Listing 4 – High-level orchestration of a synchronous tool run.
  • Observability is schema-aware. Before logging or emitting events, the tool input is passed through _filter_injected_args so runtime-only pieces like callbacks or injected IDs don’t appear as user inputs in logs or traces.
  • Callbacks are threaded consistently. patch_config and set_config_context ensure that the same RunnableConfig stack is visible to anything the tool calls downstream. In the async variant, coro_with_context plays the same role.
  • Error handling is policy-driven. The handle_validation_error and handle_tool_error fields let you decide whether validation failures and ToolExceptions bubble up as exceptions or become safe, user-visible strings.
  • Outputs are normalized to ToolMessage. The final call to _format_output wraps content, artifact, and tool_call_id into a ToolMessage when an ID is present, so agents can treat tool results uniformly.
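The same template-method shape is easy to reuse outside LangChain. A minimal sketch, assuming a stand-in MiniTool base class and a plain list of events in place of real callback managers:

```python
from abc import ABC, abstractmethod
from typing import Any


class MiniTool(ABC):
    """Template method: run() owns callbacks, error policy, and output
    shaping; subclasses implement only _run()."""

    name: str = "tool"
    handle_tool_error: bool = False  # policy: raise, or return error text

    def run(self, tool_input: dict) -> tuple[Any, list]:
        events = [("on_tool_start", self.name)]  # stand-in for callbacks
        try:
            result = self._run(**tool_input)
        except Exception as e:
            events.append(("on_tool_error", str(e)))
            if not self.handle_tool_error:
                raise
            result = f"error: {e}"  # error surfaced as content, not raised
        events.append(("on_tool_end", self.name))
        return result, events

    @abstractmethod
    def _run(self, **kwargs: Any) -> Any: ...


class Adder(MiniTool):
    name = "adder"

    def _run(self, a: int, b: int) -> int:
        return a + b
```

Adder().run({"a": 2, "b": 3}) returns the result bracketed by start and end events, and a failing _run either raises or becomes content depending on handle_tool_error, mirroring BaseTool's error policy.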

Practical Patterns to Reuse

We’ve walked from schemas to injected arguments, through validation and into orchestration. The unifying lesson is simple: separate what the user controls from what the runtime controls, and make that separation explicit in your types and schemas.

  1. Separate public schemas from runtime wiring.
    Use marker types (like InjectedToolArg) or equivalent metadata to distinguish user-facing parameters from framework wiring. Build your JSON schema or OpenAPI spec from only the user-facing fields; keep runtime-only fields injected at execution time.
  2. Treat validation as a customs checkpoint.
    Normalize inputs early (_parse_input), apply defaults, and inject runtime context there. After that, business logic should only see a clean, well-typed dict instead of raw, heterogeneous user input.
  3. Centralize cross-cutting concerns with a template method.
    The combination of run/arun calling abstract _run/_arun lets tool authors focus on core logic while the framework handles callbacks, configs, output shaping, and error policy. Use a similar pattern wherever every endpoint repeats the same logging, metrics, and error-handling boilerplate.
  4. Be explicit about contracts like InjectedToolCallId.
    When a tool depends on a particular invocation shape (for example, always needing a tool_call_id), encode that as a schema constraint and fail fast with precise errors when the contract is violated. Don’t rely on documentation alone.
  5. Measure around the same boundaries.
    Even though this module doesn’t emit metrics itself, it defines natural measurement points: per-tool execution duration around run/arun, validation failures in _parse_input, tool errors, and payload sizes at _format_output. Instrumenting those gives you enough signal to catch most scaling and reliability issues.

LangChain’s tool core shows how to balance developer ergonomics (functions that look simple), interoperability (Pydantic v1/v2), and production concerns (callbacks, schemas, observability) using one central idea: invisible arguments that keep runtime power off the public surface area.

If you’re designing tools or APIs that must talk to LLMs—or any external caller—it’s worth asking: which of my parameters are real user input, and which are secret backstage passes? Making that distinction explicit, as BaseTool does, keeps your schemas honest while your runtime stays flexible.

Full Source Code

Here's the full source code of the file that inspired this article.
Read on GitHub

Thanks for reading! I hope this was useful. If you have questions or thoughts, feel free to reach out.

Content Creation Process: This article was generated via a semi-automated workflow using AI tools. I prepared the strategic framework, including specific prompts and data sources. From there, the automation system conducted the research, analysis, and writing. The content passed through automated verification steps before being finalized and published without manual intervention.

Mahmoud Zalt

About the Author

I’m Zalt, a technologist with 16+ years of experience, passionate about designing and building AI systems that move us closer to a world where machines handle everything and humans reclaim wonder.

Let's connect if you're working on interesting AI projects, looking for technical advice or want to discuss anything.
