[RFC] Taxonomy-Driven Skill Routing & Dynamic Context Mutation #5891

ViktorVeselov · 2026-05-29T01:56:43Z


Design doc + feature

I am proposing the integration of an enterprise-grade Pluggable Policy & Taxonomy Security Engine directly into Google's Agent Development Kit (ADK).

By classifying prompt contexts using multi-step pipelines and evaluating dynamic business rules (entitlements, roles, feature flags) on the fly, we keep the model strictly in a safe, compliant sandbox while giving enterprises absolute control over skill discovery, skill execution, and dynamic instruction shaping.

Reusing Existing SDK Guards

To keep the library clean and avoid duplicate logic, my architecture reuses the existing validation systems in the ADK. We could still write new validators, but keeping the code DRY makes it easier to maintain:

SDK Standard Error: Any validation or security exception will raise InputValidationError from google.adk.errors.input_validation_error for complete parity.
Unified Path Segment Check: I reuse the existing _validate_path_segment function from google.adk.artifacts.file_artifact_service to validate skill names and path configurations, ensuring zero traversal, null-byte injection, or slash escapes.
Unicode Normalization & Regex Whitelist: This architecture reuses the core NFKC normalization and snake/kebab regex patterns defined in models.py for skill tags and taxonomy identifiers.

Public API Library Exports (DX)

Because this is a library (google-adk), developers should not have to dig into deep internal submodules to import the new classes. To ensure a sound Developer Experience (DX), we can expose the interfaces and presets directly at the package root level:

# Sleek, clean public imports directly from the SDK root namespace!
from google.adk import SkillPolicy, TaxonomyResolver, TaxonomyPipeline

To enable this, we will expose the new entities inside the module entry points with the following files modified and/or created:

policy.py: Our core interfaces and standard implementations. It will be a new file.
__init__.py (inside the skills package): Exports SkillPolicy, TaxonomyResolver, and TaxonomyPipeline.
__init__.py (inside the root package): Surfaces the classes at the highest level of the SDK.

Dynamic Prompt/Instruction Shaping & Multi-Step Taxonomy

Multi-Step Taxonomy Pipeline

Taxonomy classification can be complex and multi-stage. This technical architecture handles this by defining an abstract TaxonomyResolver and providing a composite TaxonomyPipeline that runs sequential resolvers:

1. Semantic Classification & Dialogue Audit: Analyze past agent interactions inside the LlmRequest or scan the model's own thought-stream block to classify the active security/regulatory domain (e.g., urn:adk:domain:compliance).
2. Entitlements Verification: Gate access using global feature flags (e.g., verifying if the user has beta entitlements).
3. DB-backed RBAC: Query database records to verify custom user permissions.

I am thinking of the following class design:

from abc import ABC, abstractmethod
from ..agents.readonly_context import ReadonlyContext
from ..models.llm_request import LlmRequest

class TaxonomyResolver(ABC):
    """Abstract base class for taxonomy resolution. Resolvers can be chained to form multi-step pipelines."""
    
    @abstractmethod
    async def resolve_taxonomies(self, context: ReadonlyContext, llm_request: LlmRequest) -> list[str]:
        """Resolves active taxonomy domain URIs from the runtime context and LLM history.
        
        Args:
            context: The session runtime context.
            llm_request: Outgoing LLM request containing conversation history and agent messages.
        """
        pass


class TaxonomyPipeline(TaxonomyResolver):
    """Executes a sequence of taxonomy resolvers in order (multi-step pipeline)."""
    
    def __init__(self, resolvers: list[TaxonomyResolver]):
        self.resolvers = resolvers
        
    async def resolve_taxonomies(self, context: ReadonlyContext, llm_request: LlmRequest) -> list[str]:
        active_domains = set()
        for resolver in self.resolvers:
            domains = await resolver.resolve_taxonomies(context, llm_request)
            if domains:
                active_domains.update(domains)
        return list(active_domains)
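
To make the chaining concrete, here is a minimal, self-contained sketch of a two-step pipeline. It does not import the ADK: context and llm_request are plain dicts standing in for ReadonlyContext and LlmRequest, and KeywordResolver, FeatureFlagResolver, and the urn:adk:domain:beta tag are hypothetical names invented for illustration. Unlike the set-based merge above, this variant keeps first-seen order so the result is deterministic.

```python
import asyncio
from abc import ABC, abstractmethod


class TaxonomyResolver(ABC):
    """Stand-in for the proposed interface; context and request are plain dicts."""

    @abstractmethod
    async def resolve_taxonomies(self, context: dict, llm_request: dict) -> list[str]:
        ...


class KeywordResolver(TaxonomyResolver):
    """Hypothetical step 1: tag the compliance domain if the history mentions an audit."""

    async def resolve_taxonomies(self, context, llm_request):
        text = " ".join(llm_request.get("history", [])).lower()
        return ["urn:adk:domain:compliance"] if "audit" in text else []


class FeatureFlagResolver(TaxonomyResolver):
    """Hypothetical step 2: expose a beta domain only to entitled users."""

    async def resolve_taxonomies(self, context, llm_request):
        return ["urn:adk:domain:beta"] if context.get("beta") else []


class TaxonomyPipeline(TaxonomyResolver):
    """Runs resolvers in order and merges results, dropping duplicates."""

    def __init__(self, resolvers):
        self.resolvers = resolvers

    async def resolve_taxonomies(self, context, llm_request):
        merged = {}  # dict keys keep insertion order
        for resolver in self.resolvers:
            for domain in await resolver.resolve_taxonomies(context, llm_request):
                merged.setdefault(domain, None)
        return list(merged)


pipeline = TaxonomyPipeline([KeywordResolver(), FeatureFlagResolver()])
domains = asyncio.run(
    pipeline.resolve_taxonomies({"beta": True}, {"history": ["Please run the audit."]})
)
print(domains)  # ['urn:adk:domain:compliance', 'urn:adk:domain:beta']
```

Because TaxonomyPipeline is itself a TaxonomyResolver, pipelines can be nested inside other pipelines without any extra plumbing.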

Security & Sanitization

Because taxonomy tags come from skill metadata, we should apply at least the existing sanitization guards to them. When loading YAML metadata through the standard models, we can enforce strict schema parsing and string sanitization. This will not stop every prompt-injection or exploit attempt, but it should block the most blatant ones.

# Validation logic inside models.py:
import re
import unicodedata
from pydantic import Field, field_validator, BaseModel

class Frontmatter(BaseModel):
    taxonomy_binds: list[str] = Field(
        default_factory=list,
        alias="taxonomy-binds",
        serialization_alias="taxonomy-binds",
    )

    @field_validator("taxonomy_binds")
    @classmethod
    def _validate_taxonomy_binds(cls, v: list[str]) -> list[str]:
        sanitized = []
        for item in v:
            if not isinstance(item, str):
                raise ValueError("Taxonomy tags must be strings")
            # Normalize NFKC unicode using standard patterns
            normalized = unicodedata.normalize("NFKC", item).strip()
            # Verify strict alphanumeric and safe symbol whitelist (prevent shell/HTML execution)
            if not re.match(r"^[a-zA-Z0-9:\-_/.]+$", normalized):
                raise ValueError(f"Invalid characters in taxonomy bind tag: {normalized}")
            sanitized.append(normalized)
        return sanitized
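
As a rough illustration of what the NFKC step buys, the sketch below applies the same normalization and whitelist using only the standard library, without Pydantic. normalize_tag is a hypothetical helper name, not part of the ADK; the pattern is the one from the validator above.

```python
import re
import unicodedata

# Same character whitelist as the validator above.
_TAG_PATTERN = re.compile(r"[a-zA-Z0-9:\-_/.]+")


def normalize_tag(raw: str) -> str:
    """Normalize a taxonomy tag to NFKC and reject anything outside the whitelist."""
    normalized = unicodedata.normalize("NFKC", raw).strip()
    if not _TAG_PATTERN.fullmatch(normalized):
        raise ValueError(f"Invalid characters in taxonomy bind tag: {normalized!r}")
    return normalized


# Fullwidth "urn" (U+FF55, U+FF52, U+FF4E) folds to plain ASCII under NFKC,
# so visually similar tags compare equal after normalization.
print(normalize_tag("\uff55\uff52\uff4e:adk:domain:compliance"))

# Shell or markup characters are rejected outright.
for bad in ["urn:adk;rm -rf", "<script>"]:
    try:
        normalize_tag(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

Note that the whitelist permits "/" and ".", so a tag that passes this check is still not safe to use as a filesystem path; path arguments need their own validation.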

To guard against directory traversal and path tampering inside the skill tools, we can hook directly into tool execution (LoadSkillTool, LoadSkillResourceTool, RunSkillScriptTool) using native SDK validation:

from ..errors.input_validation_error import InputValidationError
from ..artifacts.file_artifact_service import _validate_path_segment

# Reuses the standard SDK validator to ensure skill names cannot escape context:
try:
    _validate_path_segment(skill_name, "skill_name")
except InputValidationError as e:
    return {
        "error": f"Invalid skill_name parameter: {e}",
        "error_code": "INVALID_ARGUMENTS"
    }

# Directory traversal assertions on custom file paths:
if "file_path" in args:
    file_path = args["file_path"]
    if ".." in file_path or file_path.startswith(("/", "\\")):
        raise InputValidationError(f"Path traversal attempt blocked: {file_path}")
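
The substring test `".." in file_path` also rejects harmless names such as notes..md, and a leading-separator test does not cover Windows drive prefixes. One possible refinement, sketched here with only the standard library, is to inspect path components instead. is_safe_relative_path is a hypothetical helper, not an existing ADK function.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_safe_relative_path(file_path: str) -> bool:
    """Return True only for relative paths with no '..' component."""
    if not file_path or "\x00" in file_path:
        return False
    # Treat backslashes as separators too, so Windows-style input is covered.
    normalized = file_path.replace("\\", "/")
    # Reject absolute paths, UNC paths, and drive-prefixed paths such as C:\x.
    if normalized.startswith("/") or PureWindowsPath(file_path).drive:
        return False
    # Pure paths never collapse '..', so it stays visible as its own component.
    return ".." not in PurePosixPath(normalized).parts


print(is_safe_relative_path("references/guide.md"))  # True
print(is_safe_relative_path("notes..md"))            # True
print(is_safe_relative_path("a/../../secrets"))      # False
print(is_safe_relative_path("/etc/passwd"))          # False
```

A check like this is purely lexical. It does not follow symlinks, so resolving the final path and confirming it stays under the skill directory would still be a sensible second layer.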

Potential Codebase Changes We Would Have to Make

1. ADK Skills Policy Module

src/google/adk/skills/policy.py (going to be a new file)
This module defines the abstract bases and default pipelines: SkillPolicy, TaxonomyResolver, TaxonomyPipeline, and DefaultSkillPolicy (which combines list-based taxonomy-bind matching with user-defined policies).

2. ADK Skills Registry & Parser

src/google/adk/skills/models.py
Updates the Frontmatter Pydantic model to support list validation, sanitization mapping, and alias matching for taxonomy-binds as shown in the validation snippet above.

src/google/adk/skills/_utils.py
Expands the whitelist of allowed YAML keys so the lightweight directory validator _validate_skill_dir does not flag the new tags as invalid:

_ALLOWED_FRONTMATTER_KEYS = frozenset({
    "name",
    "description",
    "license",
    "allowed-tools",
    "allowed_tools",
    "metadata",
    "compatibility",
    "taxonomy-binds",
    "taxonomy_binds",
})
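
How _validate_skill_dir consumes this set is not shown here, so the following is only a guess at the shape of the check: report any frontmatter key that falls outside the whitelist. unknown_frontmatter_keys is a hypothetical helper name.

```python
_ALLOWED_FRONTMATTER_KEYS = frozenset({
    "name",
    "description",
    "license",
    "allowed-tools",
    "allowed_tools",
    "metadata",
    "compatibility",
    "taxonomy-binds",
    "taxonomy_binds",
})


def unknown_frontmatter_keys(frontmatter: dict) -> list[str]:
    """Return the keys that are not on the whitelist, sorted for stable messages."""
    return sorted(set(frontmatter) - _ALLOWED_FRONTMATTER_KEYS)


frontmatter = {
    "name": "audit-report",
    "description": "Drafts an audit report.",
    "taxonomy-binds": ["urn:adk:domain:compliance"],
    "owner": "someone",  # not on the whitelist
}
print(unknown_frontmatter_keys(frontmatter))  # ['owner']
```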

3. ADK Toolset & Core Execution

src/google/adk/skills/skill_toolset.py
Modifies SkillToolset, ListSkillsTool, and LoadSkillTool to enable dynamic intercept checks.

SkillToolset Constructor: Accepts taxonomy_resolver: Optional[TaxonomyResolver] = None and policy_engine: Optional[SkillPolicy] = None to enforce dynamic entitlements and roles.
SkillToolset._list_allowed_skills(self, context, llm_request=None): Evaluates active taxonomies via the resolver, passing the llm_request context for conversation-history scanning. It maps permissions using policy_engine.is_skill_allowed(skill, context) and returns only permitted skills.
SkillToolset.process_llm_request: Utilizes _list_allowed_skills with the active llm_request to determine visible skills, injecting filtered skills inside the prompt's XML block.
LoadSkillTool Dynamic Shaping & Safety Guard: Enforces hard assertions using _validate_path_segment and traversal guards. It re-evaluates policy checks on direct loading (run_async) to block hallucinated tool calls with a strict SKILL_NOT_PERMITTED safety message, running policy_engine.shape_instructions(skill, tool_context, instructions) to apply shaped instruction guardrails before returning them to the model.
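
The listing and loading flow described above could look roughly like the sketch below. It is self-contained and does not use the ADK: Skill is a stand-in dataclass, context is a plain dict, and the matching rule (an unbound skill is always visible, a bound skill needs at least one active domain) is an assumption about DefaultSkillPolicy, as are the urn:adk:domain:finance tag and the appended guardrail sentence.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Minimal stand-in for a skill: name, bound taxonomies, instructions."""
    name: str
    taxonomy_binds: list[str] = field(default_factory=list)
    instructions: str = ""


class DefaultSkillPolicy:
    """Assumed rule: allow unbound skills, or skills bound to an active domain."""

    def __init__(self, active_domains: list[str]):
        self.active_domains = set(active_domains)

    def is_skill_allowed(self, skill: Skill, context: dict) -> bool:
        if not skill.taxonomy_binds:
            return True
        return bool(self.active_domains & set(skill.taxonomy_binds))

    def shape_instructions(self, skill: Skill, context: dict, instructions: str) -> str:
        # Hypothetical shaping step: append a guardrail for compliance skills.
        if "urn:adk:domain:compliance" in skill.taxonomy_binds:
            return instructions + "\nDo not disclose personal data."
        return instructions


def list_allowed_skills(skills, policy, context):
    """Only permitted skills are ever shown to the model."""
    return [s for s in skills if policy.is_skill_allowed(s, context)]


def load_skill(name, skills, policy, context):
    """Re-check the policy on direct load, so a guessed skill name is still blocked."""
    skill = {s.name: s for s in skills}.get(name)
    if skill is None or not policy.is_skill_allowed(skill, context):
        return {
            "error": f"Skill '{name}' is not permitted.",
            "error_code": "SKILL_NOT_PERMITTED",
        }
    return {"instructions": policy.shape_instructions(skill, context, skill.instructions)}


skills = [
    Skill("summarize", [], "Summarize the text."),
    Skill("audit-report", ["urn:adk:domain:compliance"], "Draft the audit report."),
    Skill("trade-desk", ["urn:adk:domain:finance"], "Place trades."),
]
policy = DefaultSkillPolicy(["urn:adk:domain:compliance"])

visible = [s.name for s in list_allowed_skills(skills, policy, {})]
print(visible)  # ['summarize', 'audit-report']
print(load_skill("trade-desk", skills, policy, {})["error_code"])  # SKILL_NOT_PERMITTED
print(load_skill("audit-report", skills, policy, {})["instructions"])
```

Returning the same error for an unknown skill and a forbidden one avoids leaking which hidden skills exist.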

Future-Proofing

Because this design relies entirely on generic string-based namespaces (urn:adk:*), the ADK remains completely agnostic to the underlying taxonomy dictionary definitions. When central working groups like NIST or AAIF release their finalized taxonomy classifications, developers simply update their YAML frontmatter tags. The core plumbing outlined here requires zero modifications.

Note: this write-up was filtered through Gemini to cut the noise, so feel free to ask if more details are needed. To understand the logic, please read the code blocks.

zeroasterisk · 2026-05-29T14:37:32Z


Thanks Viktor for joining the community call and talking this out with me for a while. I "lean positive" on what I see here, specifically the pluggable taxonomy.

Note that a taxonomy should be in some standard format which is flattened and cleaned up for your "validation of terms", and in my opinion it ought to include definitions which could be used for LLM disambiguation or lookups.

JSON-LD with SKOS

{
  "@context": "http://w3.org",
  "@type": "Concept",
  "@id": "https://example.com",
  "prefLabel": { "@value": "Machine Learning", "@language": "en" },
  "altLabel": [
    { "@value": "ML", "@language": "en" },
    { "@value": "Automated Learning", "@language": "en" }
  ],
  "definition": { 
    "@value": "A branch of artificial intelligence focused on building systems that learn from data.", 
    "@language": "en" 
  },
  "broader": "https://example.com"
}

Flat Key-Value JSON

A much simpler approach

[
  {
    "id": "100",
    "parentId": null,
    "name": "Artificial Intelligence",
    "definition": "The simulation of human intelligence by machines."
  },
  {
    "id": "101",
    "parentId": "100",
    "name": "Machine Learning",
    "definition": "Systems that learn from data to improve performance."
  }
]

Assuming we adopt a specific taxonomy configuration spec, we parse and extract from it what we need for validation and we can validate skills and map more effectively to our domain.
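
As a sketch of that parse-and-extract step, assuming the flat key-value format above is the one adopted: load the entries, check that every parentId resolves, then use the result both to validate skill tags and to look up definitions for LLM disambiguation. load_taxonomy, unknown_terms, and definition_path are hypothetical helper names.

```python
import json

FLAT_TAXONOMY = """
[
  {"id": "100", "parentId": null, "name": "Artificial Intelligence",
   "definition": "The simulation of human intelligence by machines."},
  {"id": "101", "parentId": "100", "name": "Machine Learning",
   "definition": "Systems that learn from data to improve performance."}
]
"""


def load_taxonomy(raw: str) -> dict:
    """Index entries by id and fail fast on a dangling parentId."""
    entries = json.loads(raw)
    by_id = {entry["id"]: entry for entry in entries}
    for entry in entries:
        parent = entry["parentId"]
        if parent is not None and parent not in by_id:
            raise ValueError(f"Entry {entry['id']} has unknown parent {parent}")
    return by_id


def unknown_terms(tags: list[str], taxonomy: dict) -> list[str]:
    """Return the tags that do not name a term in the taxonomy."""
    return [tag for tag in tags if tag not in taxonomy]


def definition_path(term_id: str, taxonomy: dict) -> list[str]:
    """Collect 'name: definition' lines from the root down to the given term."""
    lines, seen, current = [], set(), term_id
    while current is not None and current not in seen:  # seen guards against cycles
        seen.add(current)
        entry = taxonomy[current]
        lines.append(f"{entry['name']}: {entry['definition']}")
        current = entry["parentId"]
    return list(reversed(lines))


taxonomy = load_taxonomy(FLAT_TAXONOMY)
print(unknown_terms(["101", "999"], taxonomy))  # ['999']
print("\n".join(definition_path("101", taxonomy)))
```

The definition path is the part that could be handed to the model when two terms are easy to confuse.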

I can see it.

Next question: could this start out as an ADK plugin, which you can just do yourself today?

3 replies

ViktorVeselov May 29, 2026
Author

Yes.

ViktorVeselov May 29, 2026
Author

Update: getting closer to the finish line.

ViktorVeselov May 29, 2026
Author

Opened PR and ready to address concerns, inconsistencies, etc.

#5898

ViktorVeselov · 2026-05-29T20:13:15Z


It was pointed out that this feature belongs in a different repo, adk-python-community, so I will push it there until further notice.

1 reply

ViktorVeselov May 29, 2026
Author

Per reviewer notes, moved it to adk-python-community as a plugin: google/adk-python-community#151
