Skip to content

fix(migration): restrict unpickling of v0 actions blobs#5866

Open
White-Mouse wants to merge 8 commits into
google:mainfrom
White-Mouse:codex/restrict-migration-unpickler
Open

fix(migration): restrict unpickling of v0 actions blobs#5866
White-Mouse wants to merge 8 commits into
google:mainfrom
White-Mouse:codex/restrict-migration-unpickler

Conversation

@White-Mouse
Copy link
Copy Markdown

@White-Mouse White-Mouse commented May 27, 2026

What

The v0 session schema stored event actions as pickled blobs. The migration helper reads raw bytes via SELECT * FROM events and previously used pickle.loads(...) directly.

This PR replaces the default load path with a restricted unpickler allowlist for builtin containers/primitives, standard ADK EventActions payloads, nested ADK core action types (AuthConfig, ToolConfirmation, EventCompaction), and the google.genai.types.Content / Part dependency classes that normal compaction payloads require.

It also adds an explicit trust toggle for legacy databases that contain custom Python objects in state_delta or other Any fields:

  • Python API: migrate(..., allow_unsafe_unpickling=True)
  • Migration runner: upgrade(..., allow_unsafe_unpickling=True)
  • CLI: adk migrate session --allow_unsafe_unpickling ...
  • Direct script: --allow_unsafe_unpickling / --allow-unsafe-unpickling

Why

pickle is not safe for untrusted inputs. Migration tooling often runs against restored/backed-up DB files or shared storage; failing closed by default reduces the blast radius if the source DB contents are compromised.

The opt-in flag keeps compatibility for users who trust their source database and need the original unsafe pickle behavior for custom legacy objects.

Associated Issue / Background

No existing GitHub issue is linked. This was found while reviewing the v0-to-v1 migration path for unsafe deserialization risks in legacy session data.

Compatibility / fail-closed boundary

Normal v0 EventActions payloads made from primitive/container fields continue to migrate. The allowlist now also covers common nested ADK action models requested during review, including requested auth configs, requested tool confirmations, and event compaction content.

Payloads that require globals outside the explicit allowlist still fail closed by default: the migration logs a warning and falls back to empty EventActions() for that event. Users can opt into the previous unsafe pickle behavior only when they trust the source database.

Verification

  • uv run pytest tests/unittests/sessions/migration/test_migration.py
    • 22 passed, 4 warnings
  • uv run mypy src/google/adk/sessions/migration/migrate_from_sqlalchemy_pickle.py src/google/adk/sessions/migration/migration_runner.py
    • Success: no issues found in 2 source files
  • uv run pre-commit run --files src/google/adk/sessions/migration/migrate_from_sqlalchemy_pickle.py src/google/adk/sessions/migration/migration_runner.py src/google/adk/cli/cli_tools_click.py tests/unittests/sessions/migration/test_migration.py
    • passed
  • git diff --check
    • passed

@google-cla
Copy link
Copy Markdown

google-cla Bot commented May 27, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@adk-bot adk-bot added the services [Component] This issue is related to runtime services, e.g. sessions, memory, artifacts, etc label May 27, 2026
@adk-bot
Copy link
Copy Markdown
Collaborator

adk-bot commented May 27, 2026

Response from ADK Triaging Agent

Hello @White-Mouse, thank you for submitting this pull request!

To help us review and process your contribution more efficiently, please address the following items in accordance with our contribution guidelines:

  1. Contributor License Agreement (CLA): It looks like the CLA check has failed. Please make sure you (or your employer) have signed the Google CLA so we can proceed with the review.
  2. Testing / Verification Plan: The Verification section in your PR description is currently empty. Please update this section with your testing plan or a summary of your test results (e.g., confirming unit tests under tests/unittests/sessions/migration/test_migration.py passed).
  3. Associated Issue: If there is an existing GitHub issue related to this bug/fix, please link it in your PR description. If not, consider creating one or briefly describing the background context.

Thank you for your cooperation!

@White-Mouse White-Mouse force-pushed the codex/restrict-migration-unpickler branch from d09e22e to 71c4053 Compare May 27, 2026 14:45
@rohityan rohityan self-assigned this May 27, 2026
@wyf7107 wyf7107 requested a review from DeanChensj May 27, 2026 19:00
@rohityan
Copy link
Copy Markdown
Collaborator

Hi @White-Mouse , Thank you for your contribution! We appreciate you taking the time to submit this pull request. Can you please fix the failing mypy-diff tests before we can proceed with the review.

@rohityan rohityan added the request clarification [Status] The maintainer need clarification or more information from the author label May 27, 2026
@DeanChensj
Copy link
Copy Markdown
Collaborator

DeanChensj commented May 28, 2026

Hi @White-Mouse, thanks for the update! I appreciate the effort to move towards a "Secure by Default" approach using a restricted unpickler.

To ensure this doesn't negatively impact existing users, I'd like to suggest the following improvements to the PR:

  1. Prevent Data Loss by Expanding the Allowlist:
    The current restricted unpickler only allows EventActions. However, EventActions frequently contains nested ADK core types such as AuthConfig , ToolConfirmation, EventCompaction. Since EventCompaction also relies on google.genai.types.Content, we should include these classes (and their dependencies) in the allowlist. This ensures standard sessions migrate fully without falling back to an empty EventActions().

  2. Add an Optional "Trust" Toggle:
    Since the state_delta field can store arbitrary Python objects (Any), some users may have custom classes in their legacy data. We should provide an optional configuration (e.g., allow_unsafe_unpickling: bool = False) that allows users to explicitly opt-in to the original unpickling behavior if they trust their source database.

With these two changes, we can achieve strong security for the majority of users while maintaining full functionality and flexibility. What do you think?

@White-Mouse
Copy link
Copy Markdown
Author

White-Mouse commented May 28, 2026

Thanks for the review. I updated the PR to address both points: the restricted allowlist now covers standard nested ADK action types (AuthConfig, ToolConfirmation, EventCompaction) plus the google.genai.types.Content / Part dependency classes used by compaction payloads, and migration now has an explicit allow_unsafe_unpickling=False trust toggle for users who need to migrate custom legacy objects from a trusted source database.

I also fixed the mypy-diff blocker by adding the missing return annotation and ran the updated migration tests, targeted mypy, pre-commit, and git diff --check locally. The new pull_request workflow runs for mypy/unit/pre-commit/check-file are currently in GitHub's action_required state for this fork PR, so they need maintainer approval before they will execute on the new commit.

@arian-gogani
Copy link
Copy Markdown

the unpickling restriction is a good security fix. the complementary control: a signed receipt for every action deserialized from the v0 schema.

when migrating v0 actions, each deserialized action should generate a receipt proving what was loaded, from which session, at what time. if a malicious pickle payload was injected, the receipt chain shows exactly what was deserialized.

github.com/arian-gogani/nobulex

@DeanChensj
Copy link
Copy Markdown
Collaborator

@gemini-cli /review

@github-actions
Copy link
Copy Markdown

🤖 Hi @DeanChensj, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this security improvement! The restricted unpickler is a great addition to the migration path.

I've added a few minor comments regarding:

  1. CLI flag naming consistency (using hyphens).
  2. Potentially missing datetime types in the whitelist.
  3. A small type hinting refinement for _safe_json_load.

Otherwise, the changes look solid and the tests are very comprehensive.

Comment thread src/google/adk/cli/cli_tools_click.py Outdated
default="INFO",
help="Optional. Set the logging level",
)
@click.option(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a hyphenated version of the flag here as well for consistency with other ADK CLI tools (e.g., --allow-unsafe-unpickling). Click will handle the conversion to the underscore variable name automatically if you provide both or just the hyphenated one.

actions = EventActions()

def _safe_json_load(val):
def _safe_json_load(val: Any) -> dict[str, Any] | None:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type hint suggests it returns a dict[str, Any], but json.loads could return a list. While it's unlikely for these specific columns, it might be safer to use dict[str, Any] | list[Any] | None or verify it's a dict. Also, the cast below might hide issues if it's actually a list.

logger = logging.getLogger("google_adk." + __name__)

_ALLOWED_PICKLE_GLOBALS: set[tuple[str, str]] = {
# Builtin containers/primitives.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also allow datetime.datetime and datetime.timezone? It's quite common for legacy state_delta or other Any fields in EventActions to contain timestamp objects.

DeanChensj
DeanChensj previously approved these changes May 29, 2026
@DeanChensj DeanChensj dismissed their stale review May 29, 2026 18:30

approved by mistake

@rohityan
Copy link
Copy Markdown
Collaborator

Hi @White-Mouse , could you please fix the mypy-diff tests.

@White-Mouse
Copy link
Copy Markdown
Author

White-Mouse commented May 30, 2026

Thanks for the review. I pushed 9494da4 with the follow-ups:

  • added the hyphenated CLI alias --allow-unsafe-unpickling, while keeping the existing underscore form
  • kept the Click typing workaround scoped to the newly added option
  • allowed standard datetime.datetime, datetime.timezone, and datetime.timedelta values in restricted action payloads
  • tightened JSON handling so only object payloads are decoded into model fields
  • added coverage for the CLI aliases, datetime state deltas, and non-object JSON fields

Local verification passed:

  • uv run pytest tests/unittests/sessions/migration/test_migration.py -q -> 24 passed
  • uv run pytest tests/unittests/cli/utils/test_cli_tools_click.py::test_cli_migrate_session_allows_unsafe_unpickling_flag -q -> 2 passed
  • uv run mypy src/google/adk/sessions/migration/migrate_from_sqlalchemy_pickle.py src/google/adk/sessions/migration/migration_runner.py -> success
  • uv run pre-commit run --files ... -> passed
  • local merge-commit mypy-diff simulation against current upstream/main -> 0 new errors

The pull_request_target scan and CLA/header checks are green. The fork pull_request workflows are currently waiting for approval before they can run on the new commit.

@White-Mouse
Copy link
Copy Markdown
Author

Could you please approve the pending pull_request workflow runs (mypy-diff, unit tests, pre-commit) when you get a chance? They need maintainer approval since this PR is from a fork. Happy to address any further feedback — thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

request clarification [Status] The maintainer need clarification or more information from the author services [Component] This issue is related to runtime services, e.g. sessions, memory, artifacts, etc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants