# Environment simulation for evaluations
Supported in ADKPython v1.24.0
When evaluating agents that rely on external dependencies — such as APIs,
databases, or third-party services — running those tools live during testing can
be slow, costly, or unreliable. The **Environment Simulator** lets you safely
intercept these tool calls during agent execution and replace them with
controlled, deterministic responses, without modifying the agent itself. This
approach can fill a critical gap in the agent improvement loop, allowing you to
create hermetic, offline test runs that isolate your agent logic for reliable
scoring.
Overall, this feature lets you:
* Test how an agent handles API errors or edge-case responses.
* Run evaluations offline, without access to live backends.
* Generate realistic mock responses automatically using an LLM.
* Produce reproducible test runs by seeding probabilistic injections.
The Environment Simulation integrates with ADK's tool execution pipeline via the
[`before_tool_callback`](/callbacks/types-of-callbacks/#tool-execution-callbacks)
hook or the [plugin system](/plugins/), so no
changes to your agent code are required.
```
The Environment Simulation is an experimental feature. Its API may change in future
releases.
```
## How it works
While [User Simulation](/evaluate/user-sim/)
drives the conversation forward, Environment Simulation provides the stable
backend. At a high level, the Environment Simulator sits between your agent and
its tools. When the agent calls a tool, the simulator intercepts the call and
decides whether to return a synthetic response — either a predefined injection
or an LLM-generated mock — or to let the real tool execute.
The decision logic follows this order for each configured tool:
1. **Injection configs** are checked first, in order. If a matching injection
is found (based on argument matching and probability), its error or response
is returned immediately.
2. **Mock strategy** is used as a fallback if no injection config applies. The
simulator calls an LLM to generate a realistic response based on the tool's
schema and any stateful context.
3. **No-op** is returned (`None`) if the tool is not in the simulator config,
allowing the real tool to execute normally.
## Integration
The `EnvironmentSimulationFactory` class provides two integration points:
* `create_callback()` — Returns an async callable suitable for use as a
`before_tool_callback` on any `LlmAgent`.
* `create_plugin()` — Returns an `EnvironmentSimulationPlugin` instance that
integrates with the ADK plugin system.
### Using as a callback
The following example shows how to create an environment simulation as one of the adk agent callbacks.
```python
from google.adk.agents import LlmAgent
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
EnvironmentSimulationConfig,
InjectedError,
InjectionConfig,
ToolSimulationConfig,
)
config = EnvironmentSimulationConfig(
tool_simulation_configs=[
ToolSimulationConfig(
tool_name="get_user_profile",
injection_configs=[
InjectionConfig(
injected_error=InjectedError(
injected_http_error_code=503,
error_message="Service temporarily unavailable.",
)
)
],
)
]
)
agent = LlmAgent(
name="my_agent",
model="gemini-flash-latest",
tools=[get_user_profile],
before_tool_callback=EnvironmentSimulationFactory.create_callback(config),
)
```
### Using as a plugin
The following example shows how to create environment simulation as an ADK agent plugin.
```python
from google.adk.apps import App
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
EnvironmentSimulationConfig,
MockStrategy,
ToolSimulationConfig,
)
config = EnvironmentSimulationConfig(
tool_simulation_configs=[
ToolSimulationConfig(
tool_name="search_products",
mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
)
]
)
app = App(
agent=my_agent,
plugins=[EnvironmentSimulationFactory.create_plugin(config)],
)
```
## Configuration reference
You can configure the Environment Simulator with a set of dataclasses. The
following sections provide a detailed reference for each configuration object.
### `EnvironmentSimulationConfig`
The top-level configuration object.
Field | Type | Default | Description
:------------------------------- | :--------------------------- | :------------------- | :----------
`tool_simulation_configs` | `List[ToolSimulationConfig]` | required | One entry per tool to simulate. Must not be empty, and tool names must be unique.
`simulation_model` | `str` | `"gemini-flash-latest"` | The LLM used for tool connection analysis and mock response generation.
`simulation_model_configuration` | `GenerateContentConfig` | thinking enabled | LLM generation config for internal simulator calls.
`environment_data` | `str \| None` | `None` | Optional environment context (e.g., a JSON database snapshot) passed to mock strategies to generate more realistic responses.
`tracing` | `str \| None` | `None` | Tracing data (e.g., a prior agent run trace in JSON string format) to provide historical context.
### `ToolSimulationConfig`
Defines how a single named tool should be simulated.
Field | Type | Default | Description
:------------------- | :---------------------- | :-------------------------- | :----------
`tool_name` | `str` | required | Must match the tool's registered name exactly.
`injection_configs` | `List[InjectionConfig]` | `[]` | Zero or more injection configs, checked in order before the mock strategy.
`mock_strategy_type` | `MockStrategy` | `MOCK_STRATEGY_UNSPECIFIED` | Fallback strategy when no injection is triggered.
### `InjectionConfig`
Controls a single synthetic response that can be injected into a tool call.
Exactly one of `injected_error` or `injected_response` must be set.
Field | Type | Default | Description
:------------------------- | :----------------------- | :------ | :----------
`injected_error` | `InjectedError \| None` | `None` | Error to return (mutually exclusive with `injected_response`).
`injected_response` | `Dict[str, Any] \| None` | `None` | Fixed response dict to return (mutually exclusive with `injected_error`).
`injection_probability` | `float` | `1.0` | Probability `[0.0, 1.0]` that this injection fires.
`match_args` | `Dict[str, Any] \| None` | `None` | If set, the injection only fires when the tool's arguments contain all key-value pairs in `match_args`.
`injected_latency_seconds` | `float` | `0.0` | Artificial delay (≤ 120 s) added before returning the injection result.
`random_seed` | `int \| None` | `None` | Seed for the probability check, enabling deterministic injection behavior.
### `InjectedError`
Defines an HTTP-style error response.
| Field | Type | Description |
| :------------------------- | :---- | :-------------------------------------- |
| `injected_http_error_code` | `int` | HTTP status code to surface as |
: : : `"error_code"` in the tool response. :
| `error_message` | `str` | Human-readable message surfaced as |
: : : `"error_message"` in the tool response. :
### `MockStrategy`
Enum controlling how the simulator generates responses when no injection fires.
| Value | Description |
| :------------------------ | :---------------------------------------------- |
| `MOCK_STRATEGY_TOOL_SPEC` | Uses the tool's schema and stateful context to |
: : prompt an LLM to generate a realistic response. :
| `MOCK_STRATEGY_TRACING` | *(Deprecated)* Please use |
: : `MOCK_STRATEGY_TOOL_SPEC` with tracing input. :
## Injection mode
Use injection configs to test specific failure or edge-case scenarios.
Injections are evaluated in list order; the first one whose `match_args`
criteria are met (and whose probability check passes) is applied.
### Injecting errors
The following example shows how to inject errors with specific error code and error message to the agent.
```python
from google.adk.tools.environment_simulation.environment_simulation_config import (
InjectedError,
InjectionConfig,
ToolSimulationConfig,
)
ToolSimulationConfig(
tool_name="charge_payment",
injection_configs=[
InjectionConfig(
injected_error=InjectedError(
injected_http_error_code=402,
error_message="Payment declined.",
)
)
],
)
```
The agent will receive `{"error_code": 402, "error_message": "Payment
declined."}` instead of a real tool result, allowing you to evaluate how the
agent handles payment failures.
### Injecting fixed responses
Use the following InjectionConfig to specify a success response with fixed response payload.
```python
InjectionConfig(
injected_response={"status": "ok", "order_id": "ORD-9999"}
)
```
### Conditional injection with argument matching
Use `match_args` to inject only when specific arguments are passed.
```python
InjectionConfig(
match_args={"item_id": "ITEM-404"},
injected_error=InjectedError(
injected_http_error_code=404,
error_message="Item not found.",
),
)
```
Here, the error is injected only when the tool is called with
`item_id="ITEM-404"`. All other calls pass through to the next injection config
or to the mock strategy.
### Probabilistic injection
Set `injection_probability` to a value between `0.0` and `1.0` to simulate flaky
behavior. For reproducible test runs, pin the random outcome with `random_seed`.
```python
InjectionConfig(
injection_probability=0.3,
random_seed=42,
injected_error=InjectedError(
injected_http_error_code=500,
error_message="Internal server error.",
),
)
```
### Injecting latency
Use `injected_latency_seconds` to simulate slow backend responses, useful for
testing timeout handling or user experience under degraded conditions.
```python
InjectionConfig(
injected_latency_seconds=5.0,
injected_response={"result": "slow but successful"},
)
```
### Combining multiple injection configs
Multiple injection configs on a single tool are checked in order. You can
combine them to test multiple scenarios:
```python
ToolSimulationConfig(
tool_name="get_inventory",
injection_configs=[
# Always fail for a specific out-of-stock item
InjectionConfig(
match_args={"sku": "OOS-001"},
injected_response={"quantity": 0, "available": False},
),
# Randomly fail 20% of the time for all other items
InjectionConfig(
injection_probability=0.2,
random_seed=7,
injected_error=InjectedError(
injected_http_error_code=503,
error_message="Inventory service unavailable.",
),
),
],
)
```
## Mock strategy mode
When you want the simulator to generate plausible responses automatically —
rather than returning hand-crafted values — use `MOCK_STRATEGY_TOOL_SPEC`.
The simulator uses an LLM to:
1. Analyze the schemas of all tools the agent has access to, and identify
*stateful dependencies* between them (e.g., a `create_order` tool produces
an `order_id` that `get_order` consumes).
2. Track a **state store** of IDs and resources created during the session.
3. Generate a response that is consistent with the tool's schema and the
current state — returning a 404-style error if a consuming tool requests a
resource that was never created.
```python
from google.adk.tools.environment_simulation.environment_simulation_config import (
EnvironmentSimulationConfig,
MockStrategy,
ToolSimulationConfig,
)
config = EnvironmentSimulationConfig(
tool_simulation_configs=[
ToolSimulationConfig(
tool_name="create_order",
mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
),
ToolSimulationConfig(
tool_name="get_order",
mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
),
ToolSimulationConfig(
tool_name="cancel_order",
mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
),
]
)
```
With this config, the simulator will automatically generate an `order_id` when
`create_order` is mocked, and use it to return consistent results (or a
not-found error) when `get_order` or `cancel_order` are subsequently called.
### Providing environment data
Pass domain-specific context through `environment_data` to make mock responses
more realistic. This can be a JSON string representing a snapshot of your
database or any structured context the LLM should use when generating responses.
```python
import json
db_snapshot = {
"products": [
{"id": "P-001", "name": "Wireless Headphones", "price": 79.99, "stock": 12},
{"id": "P-002", "name": "USB-C Hub", "price": 34.99, "stock": 0},
],
"warehouse_location": "US-WEST-2",
}
config = EnvironmentSimulationConfig(
tool_simulation_configs=[
ToolSimulationConfig(
tool_name="search_products",
mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
),
],
environment_data=json.dumps(db_snapshot),
)
```
The LLM will use this data to return product names, prices, and stock levels
that match your domain, rather than generating arbitrary placeholder values.
### Providing tracing data
Feed traces generated in the agent to be mocked through `tracing` to make mock
responses more realistic.
```python
import json
agent_traces = [
{
"invocation_id": "inv-001",
"user_content": {"role": "user", "parts": [{"text": "Search for high-end headphones"}]},
"intermediate_data": {
"tool_uses": [
{
"name": "search_products",
"args": {"query": "high-end headphones"},
"response": {"products": [{"id": "P-123", "name": "Premium Wireless ANC Headphones"}]}
}
]
}
}
]
config = EnvironmentSimulationConfig(
tool_simulation_configs=[
ToolSimulationConfig(
tool_name="search_products",
mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
),
],
tracing=json.dumps(agent_traces),
)
```
The LLM will use this data to return product names, prices, and stock levels
that match your domain, rather than generating arbitrary placeholder values.
## Mixing injections and mock strategy
Injection configs and a mock strategy can be combined on the same tool.
Injections are always checked first; the mock strategy fires only when no
injection applies.
```python
ToolSimulationConfig(
tool_name="send_notification",
injection_configs=[
# Always fail for a known-bad recipient
InjectionConfig(
match_args={"recipient_id": "INVALID"},
injected_error=InjectedError(
injected_http_error_code=400,
error_message="Invalid recipient.",
),
),
],
# For all other recipients, generate a plausible success response
mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
)
```