feat(search-optimization): cache tool catalog and parallelize per-account MCP fetches #173

shashi-stackone merged 4 commits into `main`
Conversation
Pull request overview
Improves `tool_search` and related paths by memoizing the fetched tool catalog, parallelizing per-account MCP catalog fetches, and caching the local-search `ToolIndex` to avoid repeated rebuilds.
Changes:
- Added an instance `_catalog_cache` for `fetch_tools()` results, with explicit invalidation via `clear_catalog_cache()` and automatic invalidation on `set_accounts()`.
- Parallelized multi-account MCP catalog fetches using `ThreadPoolExecutor` (single-account path remains sequential).
- Cached the local-search `ToolIndex` per tool catalog identity; updated/added tests covering cache behavior and parallel fetch semantics.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| stackone_ai/toolset.py | Adds catalog + ToolIndex memoization, parallel per-account MCP fetch, and cache invalidation hooks. |
| tests/test_fetch_tools.py | Adds tests for catalog cache keys/invalidation, parallel fetch behavior, and ToolIndex reuse/rebuild. |
| tests/test_agent_tools.py | Updates expectations to reflect removal of _ExecuteTool’s local cache and delegation to toolset caching. |
```python
futures = [pool.submit(_fetch_for_account, acc) for acc in account_scope]
for future in concurrent.futures.as_completed(futures):
    all_tools.extend(future.result())
```
The parallel fetch aggregates results via `as_completed`, which makes `all_tools` order nondeterministic across runs. Since `Tools` builds `_tool_map = {tool.name: tool for tool in tools}`, any duplicate tool names across accounts will be resolved by "last one wins" in completion order, potentially changing which account's tool `get_tool()`/`tool_execute` selects compared to the previous deterministic per-account loop. Consider preserving a deterministic account ordering while still fetching in parallel (e.g., use `executor.map` over an ordered account list, or collect results keyed by account and extend in that order; ideally use the same ordering as the cache key).
Suggested change:

```diff
-futures = [pool.submit(_fetch_for_account, acc) for acc in account_scope]
-for future in concurrent.futures.as_completed(futures):
-    all_tools.extend(future.result())
+for account_tools in pool.map(_fetch_for_account, account_scope):
+    all_tools.extend(account_tools)
```
```python
cache_key = (
    tuple(sorted(account_scope, key=lambda a: (a is None, a))),
    tuple(sorted(providers)) if providers else None,
```
`providers` filtering is documented/implemented as case-insensitive (`_filter_by_provider` lowercases providers), but the catalog `cache_key` uses the raw `providers` values. Calls like `fetch_tools(providers=["Hibob"])` vs `fetch_tools(providers=["hibob"])` will create separate cache entries even though they return identical results. Normalize providers (e.g., lowercase + sort) when building the cache key to avoid redundant cache misses and growth.
Suggested change:

```diff
-cache_key = (
-    tuple(sorted(account_scope, key=lambda a: (a is None, a))),
-    tuple(sorted(providers)) if providers else None,
+normalized_providers = (
+    tuple(sorted(dict.fromkeys(provider.lower() for provider in providers)))
+    if providers
+    else None
+)
+cache_key = (
+    tuple(sorted(account_scope, key=lambda a: (a is None, a))),
+    normalized_providers,
```
```python
cached = self._catalog_cache.get(cache_key)
if cached is not None:
    return cached
```
On cache hits this returns the same cached `Tools` instance. `Tools` and `StackOneTool` are mutable (e.g., `Tools.set_account_id()` mutates the contained tools), so a consumer mutating the returned object will implicitly mutate the toolset's cache and affect subsequent `fetch_tools()`/`search_tools()` calls. If you want memoization to be internal-only with no behavioral change, consider returning a defensive copy (or caching an immutable representation and constructing new tool objects per call) so external mutations can't leak back into the cache.
cubic analysis
2 issues found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
```xml
<file name="stackone_ai/toolset.py">
<violation number="1" location="stackone_ai/toolset.py:1188">
P2: Cache key includes `providers` without case normalization, but provider filtering is case-insensitive. Calls like `fetch_tools(providers=['HiBob'])` vs `fetch_tools(providers=['hibob'])` will miss the cache despite producing identical results.</violation>
<violation number="2" location="stackone_ai/toolset.py:1209">
P2: `as_completed` yields futures in non-deterministic finish order, so the tool list order now varies between runs for multi-account fetches. Iterating `futures` directly (instead of `as_completed`) still parallelizes the HTTP calls but collects results in the original account order.</violation>
</file>
```
Linked issue analysis
Linked issue: ENG-12639: Investigate Performance of Search on SDK
| Status | Acceptance criteria | Notes |
|---|---|---|
| ✅ | Add a shared catalog cache so search_tools/tool_execute/openai/langchain/search_action_names reuse fetch_tools results | Added _catalog_cache and fetch_tools returns cached Tools |
| ✅ | Parallelize per-account MCP fetches (fan-out) capped at 10 workers; preserve single-account path | Replaced loop with ThreadPoolExecutor, single-account branch preserved |
| ✅ | Memoize ToolIndex on toolset instance and rebuild only when underlying catalog changes | Added _tool_index_cache and reuses ToolIndex when id(all_tools) unchanged |
| ✅ | Provide a new public method clear_catalog_cache() for explicit invalidation | clear_catalog_cache() was added to invalidate caches |
| ✅ | Ensure set_accounts() calls clear_catalog_cache() automatically | set_accounts now calls self.clear_catalog_cache() |
| ✅ | Drop redundant _ExecuteTool._cached_tools and delegate to toolset cache | Removed _cached_tools and _ExecuteTool.execute calls toolset.fetch_tools() |
| ✅ | Cache key must include sorted account_scope (order-invariant), providers, and actions | Cache key built from sorted account_scope, sorted providers/actions tuples |
| ✅ | Include tests verifying catalog memoization, account-order invariance, provider/action key participation, clear_catalog_cache, and set_accounts invalidation | Comprehensive tests added to tests/test_fetch_tools.py |
| ✅ | Include tests verifying parallel per-account fetch behavior, bounding, and correctness | Parallel fetch tests check elapsed time and tool aggregation |
| ✅ | Preserve single-account fast path (no threading overhead) | Single-account branch calls _fetch_for_account directly without executor |
| ✅ | No breaking public API signature changes; internal memoization only | Existing method signatures unchanged; clear_catalog_cache added non-breaking |
| | Achieve the expected latency improvements (warm: ~3s→~200ms; cold: N×→max+overhead) | Parallelism tests measure faster fetches but exact latency targets not validated |
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
0 issues found across 1 file (changes from recent commits).
Auto-approved: Performance optimization adding internal catalog and index caching and parallelized fetches. Safe as it uses standard patterns, adds no public API changes, and includes extensive tests.
982165d
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
```xml
<file name="examples/benchmark_search.py">
<violation number="1" location="examples/benchmark_search.py:61">
P2: Validate `--iterations` to require a positive integer; `0` or negative values currently crash at `times[0]` in `bench()`.</violation>
</file>
```
```python
parser = argparse.ArgumentParser(description="Benchmark SDK search latency")
parser.add_argument("--iterations", "-n", type=int, default=100, help="iterations per benchmark (default 100)")
args = parser.parse_args()
n = args.iterations
```
P2: Validate `--iterations` to require a positive integer; `0` or negative values currently crash at `times[0]` in `bench()`.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At examples/benchmark_search.py, line 61:
<comment>Validate `--iterations` to require a positive integer; `0` or negative values currently crash at `times[0]` in `bench()`.</comment>
<file context>

```diff
@@ -0,0 +1,131 @@
+    parser = argparse.ArgumentParser(description="Benchmark SDK search latency")
+    parser.add_argument("--iterations", "-n", type=int, default=100, help="iterations per benchmark (default 100)")
+    args = parser.parse_args()
+    n = args.iterations
+
+    api_key = os.getenv("STACKONE_API_KEY")
```

</file context>
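One way to address this review comment, as a sketch (the `positive_int` helper is an assumption, not code from the PR): argparse's `type=` callback can reject non-positive values at parse time with a clear error, instead of crashing later in `bench()`.

```python
# Sketch: validate --iterations at parse time via a custom argparse type.
# positive_int is an illustrative helper, not part of the PR.
import argparse

def positive_int(value: str) -> int:
    n = int(value)
    if n <= 0:
        raise argparse.ArgumentTypeError(f"must be a positive integer, got {value!r}")
    return n

parser = argparse.ArgumentParser(description="Benchmark SDK search latency")
parser.add_argument("--iterations", "-n", type=positive_int, default=100,
                    help="iterations per benchmark (default 100)")

print(parser.parse_args(["-n", "5"]).iterations)  # 5
```

With this, `-n 0` exits with a usage error naming the offending value rather than raising an `IndexError` deep inside the benchmark loop.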
```python
def next_query() -> str:
    nonlocal query_idx
    q = QUERIES[query_idx % len(QUERIES)]
    query_idx += 1
    return q
```
Feels like we don't need a nonlocal counter for a simple iteration in a function?
Agree, it's trivial to fix, but I think we can clean this up in the Examples cleanup. Or I can do it now?
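One way to drop the counter entirely, as a sketch (the `QUERIES` values here are placeholders): `itertools.cycle` round-robins over the list with no mutable index at all.

```python
# Sketch: round-robin over queries without a nonlocal counter.
# QUERIES values are placeholders for the benchmark's real query list.
from itertools import cycle

QUERIES = ["list employees", "create ticket", "send message"]
_query_iter = cycle(QUERIES)

def next_query() -> str:
    # cycle() keeps its own position and wraps around automatically.
    return next(_query_iter)

print([next_query() for _ in range(4)])
# ['list employees', 'create ticket', 'send message', 'list employees']
```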
Problem

`tool_search` latency in agent loops was ~3 seconds per call in setups with multiple linked accounts, while a direct hit to `/actions/search` via Postman returns in milliseconds. Investigation showed the SDK was doing the same catalog-fetch and index-build work on every single call:

- `search_tools()` unconditionally called `fetch_tools()` with no cache on the search path (the existing `_tools_cache` only covered the meta-tool execute path).
- `fetch_tools()` opened a fresh MCP session per linked account, sequentially; in a demo with 7 accounts, that's 7 serial round-trips before any search logic ran.
- `ToolIndex` (BM25 + TF-IDF over ~662 tools) was reconstructed on every local-search fallback call, with no caching.
Fix

Internal memoization only: no public API signature changes, no behavioral changes in the happy path.

- `StackOneToolSet._catalog_cache` memoizes `fetch_tools()` results keyed by `(sorted(account_scope), providers, actions)`. Shared across `search_tools`, `tool_execute`, `openai()`, `langchain()`, and `search_action_names()`, so they all benefit transparently.
- Replaced the sequential `for account in account_scope:` loop with a `ThreadPoolExecutor` fan-out (capped at 10 workers), mirroring the existing semantic-search parallelism pattern in the same file (`toolset.py:920-927`). The single-account path is preserved as-is: no threading overhead for the common case.
- `ToolIndex` is memoized per `Tools` identity and rebuilt only when the underlying catalog changes.
- New public `clear_catalog_cache()` for users rotating accounts mid-session; `set_accounts()` now calls `clear_catalog_cache()` automatically.
- Removed the `_ExecuteTool._cached_tools` local cache; it now delegates to the shared toolset cache.
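The caching scheme described above can be sketched roughly as follows. This is a simplified stand-in, assuming class shape, key construction, and a `_fetch_remote_catalog` stub that are not the SDK's real internals: an order-insensitive, case-normalized key, a hit-before-fetch path, and invalidation wired into `set_accounts()`.

```python
# Minimal sketch of the catalog memoization; all names besides the public
# methods mentioned in the PR are illustrative stand-ins.
class StackOneToolSet:
    def __init__(self) -> None:
        self._catalog_cache: dict[tuple, list[str]] = {}
        self.fetch_count = 0  # instrumentation for this sketch only

    def _fetch_remote_catalog(self, account_scope, providers) -> list[str]:
        self.fetch_count += 1  # stands in for the expensive MCP round-trips
        return [f"{acc}:{p}" for acc in account_scope
                for p in (providers or ["all"])]

    def fetch_tools(self, account_scope, providers=None) -> list[str]:
        key = (
            tuple(sorted(account_scope, key=lambda a: (a is None, a))),
            tuple(sorted(p.lower() for p in providers)) if providers else None,
        )
        cached = self._catalog_cache.get(key)
        if cached is not None:
            return cached
        result = self._fetch_remote_catalog(account_scope, providers)
        self._catalog_cache[key] = result
        return result

    def clear_catalog_cache(self) -> None:
        self._catalog_cache.clear()

    def set_accounts(self, accounts) -> None:
        self.accounts = accounts
        self.clear_catalog_cache()  # rotating accounts invalidates the cache

ts = StackOneToolSet()
ts.fetch_tools(["a", "b"])
ts.fetch_tools(["b", "a"])   # order-insensitive key: cache hit
print(ts.fetch_count)        # 1
ts.set_accounts(["a"])       # invalidates
ts.fetch_tools(["a", "b"])
print(ts.fetch_count)        # 2
```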
Expected latency change

- Warm `tool_search`: ~3s → ~200ms (cache hit; roughly the latency of `/actions/search` alone).
- Cold `tool_search` with N accounts: drops from ~N × per-account latency to ~max(per-account) plus small overhead.
- Repeated `fetch_tools`/`openai()`/`langchain()` calls on the same toolset are near-instant.

Summary by cubic
Cache the MCP tool catalog and the local search index, and fetch per-account catalogs in parallel to cut `tool_search` latency in multi-account setups. Addresses ENG-12639.

Performance

- `StackOneToolSet._catalog_cache` keyed by accounts/providers/actions (accounts order-insensitive; providers lowercased); used by `search_tools`, `tool_execute`, `openai()`, `langchain()`, and `search_action_names()`.
- `ToolIndex` is cached by tools identity and rebuilt only when the catalog changes; invalidated via `clear_catalog_cache()`.
- New `clear_catalog_cache()`; `set_accounts()` now clears the cache and local index.
- `_ExecuteTool` always defers to `toolset.fetch_tools` to use the shared cache.
- Warm `tool_search` ~3s → ~200ms; cold calls across N accounts are now bounded by the slowest account, not N.

New Features

- `examples/benchmark_search.py` to measure cold vs warm latency for `fetch_tools`, local search, and semantic search.

Written for commit 1c834b1. Summary will update on new commits.