# Environment simulation for evaluations

<div class="language-support-tag">
    <span class="lst-supported">Supported in ADK</span><span class="lst-python">Python v1.24.0</span>
</div>

When evaluating agents that rely on external dependencies — such as APIs,
databases, or third-party services — running those tools live during testing can
be slow, costly, or unreliable. The **Environment Simulator** lets you safely
intercept these tool calls during agent execution and replace them with
controlled, deterministic responses, without modifying the agent itself. This
approach can fill a critical gap in the agent improvement loop, allowing you to
create hermetic, offline test runs that isolate your agent logic for reliable
scoring.

Overall, this feature lets you:

*   Test how an agent handles API errors or edge-case responses.
*   Run evaluations offline, without access to live backends.
*   Generate realistic mock responses automatically using an LLM.
*   Produce reproducible test runs by seeding probabilistic injections.

Environment Simulation integrates with ADK's tool execution pipeline through the
[`before_tool_callback`](/callbacks/types-of-callbacks/#tool-execution-callbacks)
hook or the [plugin system](/plugins/), so no
changes to your agent code are required.

**Note:** Environment Simulation is an experimental feature. Its API may change
in future releases.

## How it works

While [User Simulation](/evaluate/user-sim/)
drives the conversation forward, Environment Simulation provides the stable
backend. At a high level, the Environment Simulator sits between your agent and
its tools. When the agent calls a tool, the simulator intercepts the call and
decides whether to return a synthetic response — either a predefined injection
or an LLM-generated mock — or to let the real tool execute.

The decision logic follows this order for each configured tool:

1.  **Injection configs** are checked first, in order. If a matching injection
    is found (based on argument matching and probability), its error or response
    is returned immediately.
2.  **Mock strategy** is used as a fallback if no injection config applies. The
    simulator calls an LLM to generate a realistic response based on the tool's
    schema and any stateful context.
3.  **No-op** is returned (`None`) if the tool is not in the simulator config,
    allowing the real tool to execute normally.
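
The order above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ADK implementation: `Injection`, `ToolSim`, `decide`, and the `mock` callable are hypothetical names, and the argument and probability checks are simplified stand-ins.

```python
import random
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Injection:
    """Hypothetical stand-in for an injection config."""
    result: dict[str, Any]
    match_args: Optional[dict[str, Any]] = None
    probability: float = 1.0


@dataclass
class ToolSim:
    """Hypothetical stand-in for a per-tool simulation config."""
    injections: list[Injection] = field(default_factory=list)
    mock: Optional[Callable[[dict[str, Any]], dict[str, Any]]] = None


def decide(
    tool_name: str,
    args: dict[str, Any],
    sims: dict[str, ToolSim],
    rng: random.Random,
) -> Optional[dict[str, Any]]:
    """Return a synthetic response, or None to let the real tool run."""
    sim = sims.get(tool_name)
    if sim is None:
        return None  # Step 3: tool not configured, so the real tool executes.
    for inj in sim.injections:  # Step 1: injections, in list order.
        wanted = inj.match_args or {}
        args_ok = all(k in args and args[k] == v for k, v in wanted.items())
        if args_ok and rng.random() < inj.probability:
            return inj.result
    if sim.mock is not None:  # Step 2: fall back to the mock strategy.
        return sim.mock(args)
    # Assumption: with no mock configured, fall through to the real tool.
    return None


sims = {
    "get_item": ToolSim(
        injections=[Injection({"error_code": 404}, match_args={"item_id": "X"})],
        mock=lambda args: {"item_id": args["item_id"], "mocked": True},
    )
}
rng = random.Random(0)
injected = decide("get_item", {"item_id": "X"}, sims, rng)
mocked = decide("get_item", {"item_id": "A"}, sims, rng)
passthrough = decide("other_tool", {}, sims, rng)
```

In this sketch, `injected` comes from the matching injection, `mocked` comes from the fallback callable, and `passthrough` is `None` because `other_tool` has no entry.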

## Integration

The `EnvironmentSimulationFactory` class provides two integration points:

*   `create_callback()` — Returns an async callable suitable for use as a
    `before_tool_callback` on any `LlmAgent`.
*   `create_plugin()` — Returns an `EnvironmentSimulationPlugin` instance that
    integrates with the ADK plugin system.

### Using as a callback

The following example registers environment simulation as an ADK agent's `before_tool_callback`.

```python
from google.adk.agents import LlmAgent
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    InjectedError,
    InjectionConfig,
    ToolSimulationConfig,
)

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="get_user_profile",
            injection_configs=[
                InjectionConfig(
                    injected_error=InjectedError(
                        injected_http_error_code=503,
                        error_message="Service temporarily unavailable.",
                    )
                )
            ],
        )
    ]
)

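
# Placeholder tool for illustration. With the config above, every call to it
# is intercepted and answered with the injected 503 error.
def get_user_profile(user_id: str) -> dict:
    return {"user_id": user_id}


agent = LlmAgent(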
    name="my_agent",
    model="gemini-flash-latest",
    tools=[get_user_profile],
    before_tool_callback=EnvironmentSimulationFactory.create_callback(config),
)
```

### Using as a plugin

The following example registers environment simulation as an ADK plugin.

```python
from google.adk.apps import App
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        )
    ]
)

# `my_agent` is an agent defined elsewhere in your project.
app = App(
    name="my_app",
    root_agent=my_agent,
    plugins=[EnvironmentSimulationFactory.create_plugin(config)],
)
```

## Configuration reference

You can configure the Environment Simulator with a set of dataclasses. The
following sections provide a detailed reference for each configuration object.

### `EnvironmentSimulationConfig`

The top-level configuration object.

Field                            | Type                         | Default              | Description
:------------------------------- | :--------------------------- | :------------------- | :----------
`tool_simulation_configs`        | `List[ToolSimulationConfig]` | required             | One entry per tool to simulate. Must not be empty, and tool names must be unique.
`simulation_model`               | `str`                        | `"gemini-flash-latest"` | The LLM used for tool connection analysis and mock response generation.
`simulation_model_configuration` | `GenerateContentConfig`      | thinking enabled     | LLM generation config for internal simulator calls.
`environment_data`               | `str \| None`                | `None`               | Optional environment context (e.g., a JSON database snapshot) passed to mock strategies to generate more realistic responses.
`tracing`                        | `str \| None`                | `None`               | Tracing data (e.g., a prior agent run trace in JSON string format) to provide historical context.

### `ToolSimulationConfig`

Defines how a single named tool should be simulated.

Field                | Type                    | Default                     | Description
:------------------- | :---------------------- | :-------------------------- | :----------
`tool_name`          | `str`                   | required                    | Must match the tool's registered name exactly.
`injection_configs`  | `List[InjectionConfig]` | `[]`                        | Zero or more injection configs, checked in order before the mock strategy.
`mock_strategy_type` | `MockStrategy`          | `MOCK_STRATEGY_UNSPECIFIED` | Fallback strategy when no injection is triggered.

### `InjectionConfig`

Controls a single synthetic response that can be injected into a tool call.
Exactly one of `injected_error` or `injected_response` must be set.

Field                      | Type                     | Default | Description
:------------------------- | :----------------------- | :------ | :----------
`injected_error`           | `InjectedError \| None`  | `None`  | Error to return (mutually exclusive with `injected_response`).
`injected_response`        | `Dict[str, Any] \| None` | `None`  | Fixed response dict to return (mutually exclusive with `injected_error`).
`injection_probability`    | `float`                  | `1.0`   | Probability `[0.0, 1.0]` that this injection fires.
`match_args`               | `Dict[str, Any] \| None` | `None`  | If set, the injection only fires when the tool's arguments contain all key-value pairs in `match_args`.
`injected_latency_seconds` | `float`                  | `0.0`   | Artificial delay (≤ 120 s) added before returning the injection result.
`random_seed`              | `int \| None`            | `None`  | Seed for the probability check, enabling deterministic injection behavior.
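
The constraints in this table can be checked before building a config. The following is a minimal sketch: `check_injection_fields` and `violates` are hypothetical helpers, not part of ADK, and the library may enforce these rules itself. The lower bound of `0` on latency is an assumption; the table only states the upper bound.

```python
from typing import Any, Optional


def check_injection_fields(
    injected_error: Optional[dict[str, Any]] = None,
    injected_response: Optional[dict[str, Any]] = None,
    injection_probability: float = 1.0,
    injected_latency_seconds: float = 0.0,
) -> None:
    """Raise ValueError if the values break a documented constraint."""
    # Exactly one of the two payload fields must be set.
    if (injected_error is None) == (injected_response is None):
        raise ValueError("Set exactly one of injected_error or injected_response.")
    if not 0.0 <= injection_probability <= 1.0:
        raise ValueError("injection_probability must be within [0.0, 1.0].")
    # Assumption: negative latency is rejected; the table only gives the 120 s cap.
    if not 0.0 <= injected_latency_seconds <= 120.0:
        raise ValueError("injected_latency_seconds must be within [0, 120].")


def violates(**kwargs: Any) -> bool:
    """Return True if the given field values break a constraint."""
    try:
        check_injection_fields(**kwargs)
    except ValueError:
        return True
    return False
```

For example, `violates(injected_response={"status": "ok"})` is `False`, while passing both payload fields, or neither, makes it `True`.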

### `InjectedError`

Defines an HTTP-style error response.

| Field                      | Type  | Description |
| :------------------------- | :---- | :---------- |
| `injected_http_error_code` | `int` | HTTP status code, surfaced as `"error_code"` in the tool response. |
| `error_message`            | `str` | Human-readable message, surfaced as `"error_message"` in the tool response. |

### `MockStrategy`

Enum controlling how the simulator generates responses when no injection fires.

| Value                     | Description |
| :------------------------ | :---------- |
| `MOCK_STRATEGY_TOOL_SPEC` | Uses the tool's schema and stateful context to prompt an LLM to generate a realistic response. |
| `MOCK_STRATEGY_TRACING`   | *(Deprecated)* Use `MOCK_STRATEGY_TOOL_SPEC` with tracing input instead. |

## Injection mode

Use injection configs to test specific failure or edge-case scenarios.
Injections are evaluated in list order; the first one whose `match_args`
criteria are met (and whose probability check passes) is applied.

### Injecting errors

The following example injects an error with a specific HTTP status code and message in place of the tool's result.

```python
from google.adk.tools.environment_simulation.environment_simulation_config import (
    InjectedError,
    InjectionConfig,
    ToolSimulationConfig,
)

ToolSimulationConfig(
    tool_name="charge_payment",
    injection_configs=[
        InjectionConfig(
            injected_error=InjectedError(
                injected_http_error_code=402,
                error_message="Payment declined.",
            )
        )
    ],
)
```

The agent will receive `{"error_code": 402, "error_message": "Payment
declined."}` instead of a real tool result, allowing you to evaluate how the
agent handles payment failures.
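
When asserting on tool results in an evaluation, it can help to detect this error shape explicitly. The helper below is a hypothetical sketch, not an ADK function; it relies only on the two keys shown above.

```python
from typing import Any


def is_error_response(tool_response: Any) -> bool:
    """Return True if the response has the injected-error shape shown above."""
    return (
        isinstance(tool_response, dict)
        and "error_code" in tool_response
        and "error_message" in tool_response
    )


declined = {"error_code": 402, "error_message": "Payment declined."}
success = {"status": "ok", "order_id": "ORD-9999"}
```

Here `is_error_response(declined)` is `True` and `is_error_response(success)` is `False`.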

### Injecting fixed responses

Use `injected_response` on an `InjectionConfig` to return a fixed success payload.

```python
from google.adk.tools.environment_simulation.environment_simulation_config import InjectionConfig

InjectionConfig(
    injected_response={"status": "ok", "order_id": "ORD-9999"}
)
```

### Conditional injection with argument matching

Use `match_args` to inject only when specific arguments are passed.

```python
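from google.adk.tools.environment_simulation.environment_simulation_config import (
    InjectedError,
    InjectionConfig,
)

InjectionConfig(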
    match_args={"item_id": "ITEM-404"},
    injected_error=InjectedError(
        injected_http_error_code=404,
        error_message="Item not found.",
    ),
)
```

Here, the error is injected only when the tool is called with
`item_id="ITEM-404"`. All other calls pass through to the next injection config
or to the mock strategy.
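
The matching rule can be pictured as a subset check over the tool's arguments. This is a minimal sketch assuming plain equality on top-level keys; `args_match` is a hypothetical helper, and the simulator's handling of nested values is not described here.

```python
from typing import Any, Optional


def args_match(match_args: Optional[dict[str, Any]], tool_args: dict[str, Any]) -> bool:
    """True when every key-value pair in match_args appears in tool_args."""
    if not match_args:
        return True  # No criteria: the injection applies to every call.
    return all(k in tool_args and tool_args[k] == v for k, v in match_args.items())


hit = args_match({"item_id": "ITEM-404"}, {"item_id": "ITEM-404", "qty": 2})
miss = args_match({"item_id": "ITEM-404"}, {"item_id": "ITEM-1", "qty": 2})
```

Extra arguments such as `qty` do not prevent a match; only the keys listed in `match_args` are compared.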

### Probabilistic injection

Set `injection_probability` to a value between `0.0` and `1.0` to simulate flaky
behavior. For reproducible test runs, pin the random outcome with `random_seed`.

```python
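from google.adk.tools.environment_simulation.environment_simulation_config import (
    InjectedError,
    InjectionConfig,
)

InjectionConfig(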
    injection_probability=0.3,
    random_seed=42,
    injected_error=InjectedError(
        injected_http_error_code=500,
        error_message="Internal server error.",
    ),
)
```
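
To see why a seed makes runs reproducible, consider this standalone sketch using Python's `random.Random`. It only illustrates the general idea of seeded probability checks; `injection_decisions` is a hypothetical helper, and how the simulator consumes `random_seed` internally is not specified here.

```python
import random


def injection_decisions(probability: float, seed: int, calls: int) -> list[bool]:
    """Simulate `calls` probability checks drawn from one seeded generator."""
    rng = random.Random(seed)
    return [rng.random() < probability for _ in range(calls)]


# Two runs with the same seed produce the same fire / no-fire sequence.
first_run = injection_decisions(0.3, seed=42, calls=10)
second_run = injection_decisions(0.3, seed=42, calls=10)
```

Because both runs start from the same seed, `first_run == second_run` holds, which is the property that makes a flaky-backend scenario repeatable across evaluations.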

### Injecting latency

Use `injected_latency_seconds` to simulate slow backend responses, useful for
testing timeout handling or user experience under degraded conditions.

```python
from google.adk.tools.environment_simulation.environment_simulation_config import InjectionConfig

InjectionConfig(
    injected_latency_seconds=5.0,
    injected_response={"result": "slow but successful"},
)
```
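
The effect of injected latency on a caller with a timeout can be sketched with `asyncio`. This is an illustration only: `delayed_response` and `call_with_timeout` are hypothetical helpers, the delays are shortened so the example runs quickly, and the simulator's own delay mechanism may differ.

```python
import asyncio
from typing import Any

MAX_LATENCY_SECONDS = 120.0  # Upper bound taken from the configuration reference.


async def delayed_response(latency_seconds: float, response: dict[str, Any]) -> dict[str, Any]:
    """Wait for latency_seconds, then return the canned response."""
    if not 0.0 <= latency_seconds <= MAX_LATENCY_SECONDS:
        raise ValueError("latency_seconds must be within [0, 120].")
    await asyncio.sleep(latency_seconds)
    return response


async def call_with_timeout(latency_seconds: float, timeout_seconds: float) -> str:
    """Report whether a delayed call finishes before the caller's timeout."""
    try:
        await asyncio.wait_for(
            delayed_response(latency_seconds, {"result": "slow but successful"}),
            timeout=timeout_seconds,
        )
        return "completed"
    except asyncio.TimeoutError:
        return "timed out"


fast = asyncio.run(call_with_timeout(latency_seconds=0.01, timeout_seconds=1.0))
slow = asyncio.run(call_with_timeout(latency_seconds=0.5, timeout_seconds=0.05))
```

The first call finishes within its timeout, while the second exceeds it, which is the kind of branch an agent's timeout handling needs to cover.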

### Combining multiple injection configs

Multiple injection configs on a single tool are checked in order. You can
combine them to test multiple scenarios:

```python
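from google.adk.tools.environment_simulation.environment_simulation_config import (
    InjectedError,
    InjectionConfig,
    ToolSimulationConfig,
)

ToolSimulationConfig(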
    tool_name="get_inventory",
    injection_configs=[
        # Always report zero stock for a specific out-of-stock item
        InjectionConfig(
            match_args={"sku": "OOS-001"},
            injected_response={"quantity": 0, "available": False},
        ),
        # Randomly fail 20% of the time for all other items
        InjectionConfig(
            injection_probability=0.2,
            random_seed=7,
            injected_error=InjectedError(
                injected_http_error_code=503,
                error_message="Inventory service unavailable.",
            ),
        ),
    ],
)
```

## Mock strategy mode

When you want the simulator to generate plausible responses automatically —
rather than returning hand-crafted values — use `MOCK_STRATEGY_TOOL_SPEC`.

The simulator uses an LLM to:

1.  Analyze the schemas of all tools the agent has access to, and identify
    *stateful dependencies* between them (e.g., a `create_order` tool produces
    an `order_id` that `get_order` consumes).
2.  Track a **state store** of IDs and resources created during the session.
3.  Generate a response that is consistent with the tool's schema and the
    current state — returning a 404-style error if a consuming tool requests a
    resource that was never created.

```python
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="create_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
        ToolSimulationConfig(
            tool_name="get_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
        ToolSimulationConfig(
            tool_name="cancel_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ]
)
```

With this config, the simulator automatically generates an `order_id` when
`create_order` is mocked, and uses it to return consistent results (or a
not-found error) when `get_order` or `cancel_order` is subsequently called.
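
The consistency property described here can be illustrated with a small deterministic stand-in. The real simulator relies on an LLM; `OrderStateStore` below is a hypothetical toy class that only shows how a state store lets a producing tool and its consuming tools agree on IDs.

```python
import itertools
from typing import Any


class OrderStateStore:
    """Toy state store: remembers IDs produced by earlier mocked calls."""

    def __init__(self) -> None:
        self._orders: dict[str, dict[str, Any]] = {}
        self._counter = itertools.count(1)

    def _not_found(self, order_id: str) -> dict[str, Any]:
        return {"error_code": 404, "error_message": f"Order {order_id} not found."}

    def create_order(self, item: str) -> dict[str, Any]:
        # Producing tool: mint a new ID and record the resource.
        order_id = f"ORD-{next(self._counter):04d}"
        self._orders[order_id] = {"order_id": order_id, "item": item, "status": "open"}
        return dict(self._orders[order_id])

    def get_order(self, order_id: str) -> dict[str, Any]:
        # Consuming tool: only IDs created earlier in the session resolve.
        order = self._orders.get(order_id)
        return dict(order) if order is not None else self._not_found(order_id)

    def cancel_order(self, order_id: str) -> dict[str, Any]:
        order = self._orders.get(order_id)
        if order is None:
            return self._not_found(order_id)
        order["status"] = "cancelled"
        return dict(order)


store = OrderStateStore()
created = store.create_order("USB-C Hub")
fetched = store.get_order(created["order_id"])
missing = store.get_order("ORD-9999")
cancelled = store.cancel_order(created["order_id"])
```

Looking up the ID returned by `create_order` succeeds, while an ID that was never created yields a 404-style error, mirroring the behavior described above.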

### Providing environment data

Pass domain-specific context through `environment_data` to make mock responses
more realistic. This can be a JSON string representing a snapshot of your
database or any structured context the LLM should use when generating responses.

```python
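import json

from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)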

db_snapshot = {
    "products": [
        {"id": "P-001", "name": "Wireless Headphones", "price": 79.99, "stock": 12},
        {"id": "P-002", "name": "USB-C Hub", "price": 34.99, "stock": 0},
    ],
    "warehouse_location": "US-WEST-2",
}

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ],
    environment_data=json.dumps(db_snapshot),
)
```

The LLM will use this data to return product names, prices, and stock levels
that match your domain, rather than generating arbitrary placeholder values.

### Providing tracing data

Pass traces from prior runs of the agent through `tracing`, as a JSON string, to
give the simulator historical context and make mock responses more realistic.

```python
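import json

from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)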

agent_traces = [
    {
        "invocation_id": "inv-001",
        "user_content": {"role": "user", "parts": [{"text": "Search for high-end headphones"}]},
        "intermediate_data": {
            "tool_uses": [
                {
                    "name": "search_products",
                    "args": {"query": "high-end headphones"},
                    "response": {"products": [{"id": "P-123", "name": "Premium Wireless ANC Headphones"}]}
                }
            ]
        }
    }
]

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ],
    tracing=json.dumps(agent_traces),
)
```

The LLM uses these traces as historical context, so mock responses can follow
the tool inputs and outputs recorded in earlier runs rather than arbitrary
placeholder values.

## Mixing injections and mock strategy

Injection configs and a mock strategy can be combined on the same tool.
Injections are always checked first; the mock strategy fires only when no
injection applies.

```python
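from google.adk.tools.environment_simulation.environment_simulation_config import (
    InjectedError,
    InjectionConfig,
    MockStrategy,
    ToolSimulationConfig,
)

ToolSimulationConfig(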
    tool_name="send_notification",
    injection_configs=[
        # Always fail for a known-bad recipient
        InjectionConfig(
            match_args={"recipient_id": "INVALID"},
            injected_error=InjectedError(
                injected_http_error_code=400,
                error_message="Invalid recipient.",
            ),
        ),
    ],
    # For all other recipients, generate a plausible success response
    mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
)
```