Cloud Blog

Cloud CISO Perspectives: How to build an AI-ready security program for the public sector

Fri, 29 May 2026 16:00:00 +0000

Welcome to the second Cloud CISO Perspectives for May 2026. Today, Usman Chaudhary, Field CISO, Google Public Sector, offers a guide for CISOs protecting government agencies and critical infrastructure on how to get started with, and get the most out of, defending with AI.

As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.


How to build an AI-ready security program for the public sector

By Usman Chaudhary, Field CISO, Google Public Sector


Deciphering actionable signals from deafening noise can be hard for CISOs, even with AI, and especially for those guiding government agencies, critical manufacturing plants, or other foundational industries.

From industrial control systems to decades-old municipal databases, you’re securing complex, deeply entrenched systems, and the sudden mandate to adopt AI can feel less like an evolution and more like a breaking point.

While it’s true that you face a monumental challenge, we know from our conversations with CISOs and customers that we can offer concrete, actionable steps for building an adaptable, AI-augmented defense while managing the operational load on your staff.

The urgency created by machine-speed exploits means you cannot rely solely on reactive measures. Once the immediate administrative toil has been reduced, you should aggressively shift your focus toward posture elevation, proactive hunting, and structural integration in the next six to 12 months.

Importantly, executing this vision does not mean developing everything from scratch. This roadmap relies on a strategic combination of building custom internal workflows (like Gemini Gems), buying established commercial AI capabilities, and integrating them into your existing security stack.

Google's Gemini for Government delivers agentic AI for more than three million federal civilian and military personnel on a platform accredited at FedRAMP High and DOW Impact Level 5.

To help you prioritize resources, we have structured the necessary AI initiatives across five core CISO workload domains, highlighting your team's immediate quick wins in the first 90 days alongside tactical goals in the first six months, and strategic goals in the six-to-12-month horizon.

Your tactical execution plan: Months zero to six

Building an AI-ready security program is a journey. We’re focusing strictly on high-value use cases you can deploy immediately and in the next six months.

1. Executive alignment and business justification: The goal is to stop defending your budget with technical jargon and start explaining resilience in terms of financial risk and operational efficiency.

AI-driven board reporting (Immediate): Translate complex technical data into clear business impact. Pipe your metrics into a secure enterprise workspace (like Gemini for Workspace). Prompt the model to synthesize the raw data into a concise, two-page risk narrative that includes highlights such as containment metrics, potential impact on citizen services, and production uptime for critical assembly lines.
Vendor and spend optimization (Immediate): Upload vendor capability matrices and contracts to an isolated AI agent (like NotebookLM). Have it identify feature redundancies across your stack, suggesting clear paths for tool consolidation and budget optimization. Be sure to ground these insights with third-party validation from reputable sources like Gartner or Forrester.

2. Process optimization and toil reduction: The goal is to treat AI as a muse, not an oracle. Do not trust it to make final administrative decisions, but do use it to drastically reduce cognitive fatigue.

Automated context gathering and SOC triage (Immediate): Level 1 analysts spend a lot of time manually gathering context across logs, correlating IP reputations, and triaging ambiguous alerts. Integrate a specialized large language model (LLM) workflow or use built-in capabilities in your SIEM and SOAR (like Google Security Operations) to consolidate this data automatically and provide an instant, clear triage verdict: investigate further or ignore.
Threat intelligence analysis (within six months): Automate a daily pipeline where an LLM ingests industry advisories and distills the noise into prioritized summaries relevant to your sector. Translating that raw text into functional detection rules is a complex engineering challenge. Instead of building this pipeline internally, use security platforms that natively automate indicators of compromise (IOC) extraction and rule engineering.
SOP mapping and agent creation (within six months): Churn and burnout are significant operational risks. Ingest your historical incident resolution notes and standard operating procedures (SOPs) into an AI to build a knowledge-base agent. Identify the top five most frequent manual processes, and task an analyst with using a coding agent to document and automate them.
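To make the IOC extraction step in the threat intelligence item concrete, here is a minimal sketch of pulling indicators out of free-text advisories. It uses only regular expressions from the Python standard library; the patterns and the extract_iocs name are illustrative assumptions, and a commercial platform would handle far more indicator types and edge cases.

```python
import re

# Illustrative patterns only; real advisories need more robust parsing.
IPV4 = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
SHA256 = re.compile(r"\b[a-fA-F0-9]{64}\b")
CVE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def extract_iocs(advisory_text: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted indicators found in free text."""
    # Undo a common "defanging" convention (203.0.113[.]7) before matching.
    text = advisory_text.replace("[.]", ".")
    return {
        "ipv4": sorted(set(IPV4.findall(text))),
        "sha256": sorted({h.lower() for h in SHA256.findall(text)}),
        "cve": sorted({c.upper() for c in CVE.findall(text)}),
    }


sample = "Beaconing to 203.0.113[.]7 observed after exploitation of cve-2025-12345."
iocs = extract_iocs(sample)
```

The output of a step like this is what a detection engineer, or a platform feature, would then turn into rules; the sketch stops short of rule generation on purpose.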

3. Talent upleveling and augmentation: The goal is to empower your practitioners to become AI builders rather than viewing technology as a threat to their expertise.

Natural language to query generation (within six months): Bridge the skills gap inside your SOC. Provide analysts with a secure conversational AI assistant or chatbot to translate plain English hypotheses into executing SIEM queries.
AI-driven security training (within six months): As manual processes are increasingly automated, use that reclaimed time to run capture the flag (CTF) exercises and community contests for your security team. Use an LLM to generate unique, one-shot red team test cases and training scripts that map specifically to your environment's architecture, helping train analysts through hyper-realistic, hands-on learning in simulated environments.

Your strategic horizon: Months six to 12

4. Posture elevation and threat hunting: The goal is to transition your team from a purely reactive posture into a state of continuous defense.

Contextual vulnerability prioritization: Deploy an AI agent to correlate scanner output with your internal architecture context and active threat intelligence, scoring vulnerabilities against actual environment exposure.
AI-assisted architectural threat modeling: Paste proposed system architecture diagrams into an AI assistant during the design phase — before your developers write a single line of application code — to generate a prioritized risk backlog, highlighting business logic flaws and data egress risks early.
Proactive threat hunting: Use AI as a hunting advisor. Have it generate hypotheses aligned with MITRE ATT&CK, suggest the necessary log sources to prove or disprove the hypothesis, and help pivot investigations when a human analyst hits a dead end. Eventually, you want to move to a fully automated hunting agent that initiates a hunt upon detecting a new IOC and proactively selects the appropriate data, searches through it, and provides findings.
Continuous red team agents: Deploy autonomous or semi-autonomous red team agents to continuously probe your defenses. The active findings and attack paths generated by these agents create a continuous feedback loop — feeding directly into your threat intelligence analysis, SOC playbooks, and contextual vulnerability prioritization.
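The contextual vulnerability prioritization item above can be sketched as a scoring function that adjusts a scanner's base score using environment exposure and threat intelligence. This is a deterministic stand-in for the kind of logic an AI agent might apply, not a description of any product; the Finding fields and the multipliers are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve: str
    cvss: float               # base score from the scanner, 0-10
    internet_facing: bool     # from internal architecture context
    actively_exploited: bool  # from threat intelligence


def contextual_score(f: Finding) -> float:
    """Scale the base score by exposure and threat activity (illustrative weights)."""
    score = f.cvss
    score *= 1.5 if f.internet_facing else 0.5
    score *= 2.0 if f.actively_exploited else 1.0
    return round(min(score, 10.0), 1)  # cap at the top of the 0-10 scale


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Order findings from most to least urgent by contextual score."""
    return sorted(findings, key=contextual_score, reverse=True)


backlog = prioritize([
    Finding("CVE-A", 9.8, internet_facing=False, actively_exploited=False),
    Finding("CVE-B", 6.5, internet_facing=True, actively_exploited=True),
])
```

In this example the medium-severity but exposed and exploited CVE-B outranks the critical but isolated CVE-A, which is the point of scoring against actual exposure.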

5. Advanced governance and incident response: The goal is to build structural guardrails for an environment where AI generates code, while preparing for high-stress incidents.

Policy and compliance gap analysis: Rapidly check if new operational proposals or cloud architectures conflict with internal policies or strict regulatory frameworks (like FedRAMP and NIST guidelines). Use an isolated agent preloaded with your governance documentation to review new project proposals and highlight violations.
Interactive incident response (IR) playbooks: Standard tabletops and static PDF playbooks often fail during a real breach. Train an internal agent on your organization’s historical IR tickets and SOPs. During a live crisis, this agent can act as an interactive guide, providing step-by-step containment instructions that actively adapt to the specific details and telemetry of the ongoing incident.
Secure code review at the pull request: The proliferation of AI coding assistants means your developers are generating code — and potential vulnerabilities — faster than ever. Manual security reviews can no longer keep up. You must turn AI inward on your own pipelines. Integrate advanced LLM-powered auditors directly into your CI/CD pipeline as a mandatory security gate to catch AI-generated vulnerabilities and automatically block insecure commits before they merge into production.
Autonomous defense for collapsed exploit windows: The rapid advancement of AI capabilities has effectively collapsed the time-to-exploit window, and to be faster than the adversary you should use AI to actively find and patch vulnerabilities. This approach requires a continuous, multi-step workflow to map and prioritize your codebase, deploy AI to deeply scan the highest-risk code, autonomously verify and implement patches, and continuously monitor the runtime environment.

Because these sophisticated workflows are incredibly difficult to build and maintain internally, it is highly practical to use leading solutions — such as Google AI Threat Defense — to help you predict attack paths and deploy fixes at machine speed.

Moving forward with confidence

The transition to an AI-augmented security program can feel intimidating, but the technological barrier to entry is lower than it has ever been. By shifting your focus from reactive alert management to internal context, structured automation, and rapid governance, you can effectively outpace modern threats while also alleviating the operational burden on your workforce.

Start small. Pick one quick win from the roadmap this week — such as automating your alert triage or mapping your top five SOPs — and begin building the muscle memory your team needs to stay resilient for the era ahead.

To learn more, check out our Security Talks online event on June 10.


In case you missed it

Here are the latest updates, products, services, and resources from our security teams so far this month:

Introducing Google AI Threat Defense to help you outpace the adversary: AI Threat Defense is a comprehensive AI-powered cybersecurity solution, an always-on security platform to outpace AI-driven attacks. Read more.
State of SDLC Security 2026: How risk scales in modern development: Wiz researchers share their latest insights from real-world environments into how code, developer tooling, automation, and AI are reshaping application security. Read more.
Claude Enterprise meets the Wiz Security Graph: Security and compliance teams can now monitor Claude activity directly in Wiz, extending to AI the workflows they already rely on. Read more.
How Fraud Defense uses AI to protect the internet: Google Cloud Fraud Defense (formerly reCAPTCHA) now supports agents as first-class users in the browser, has extensively revamped our detection stack with advanced predictive machine learning to model user and bot behavior, and can adapt continuously to new bots and threat vectors. Read more.
What’s new in Android security and privacy in 2026: Android elevates mobile security with new AI-powered protections and advanced safeguards to help keep you safe. Read more.
Defending at machine-speed: Building AI threat readiness with Wiz: Learn how Wiz can help organizations adopt an AI-driven operating model for AI threat readiness. Read more.
Introducing Runtime Threat Detection for Google Cloud Run: Wiz Runtime Sensor support for Google Cloud Run Containers is now generally available, giving teams real-time threat detection and response for their serverless container workloads. Read more.

Please visit the Google Cloud blog for more security stories published this month.


Threat Intelligence news

Welcome to BlackFile: Inside a vishing extortion operation: Google Threat Intelligence Group (GTIG) has continued to track an expansive extortion campaign by UNC6671, a threat actor operating under the "BlackFile" brand, that targets organizations via sophisticated voice phishing (vishing) and single sign-on (SSO) compromise. Read more.
2 PhaaS 2 Furious: The evolution of Chinese-language phishing services: While Russian-speaking threat actors have historically dominated the phishing-as-a-service (PhaaS) landscape, a rival ecosystem is rapidly growing within the Chinese-language underground. Within this ecosystem, GTIG has observed a fundamental move away from static password harvesting towards real-time interception and tokenization. Read more.
Exploitation of KnowledgeDeliver via ViewState deserialization vulnerability: In late 2025, Mandiant responded to a security incident involving a compromised web server running KnowledgeDeliver, a learning management system (LMS) developed by Digital Knowledge commonly used in Japan. Mandiant identified a critical vulnerability that allowed unauthenticated remote code execution (RCE), stemming from the use of identical pre-shared ASP.NET machine keys across customer deployments. Read more.

Please visit the Google Cloud blog for more threat intelligence stories published this month.

Now hear this: Podcasts from Google Cloud

Cloud Security Podcast: Is ‘good enough’ the same as winning: Gal Ordo, co-founder and chief product officer, Native, debates native controls and what happens when a customer needs a feature that a cloud provider hasn't built yet. Listen here.
Cloud Security Podcast: What agentic SOCs should measure: So far this year, what are we measuring for success in agentic SOCs? Matt Gregson, principal, PwC Cyber Security, talks about the state of the agentic SOC. Listen here.
Cloud Security Podcast: CISO as CFO: From Citi to celery, it's all about the cabbage: Most people do not associate grocery wholesale and retail with cutting edge technology and threat models. Arvin Bansal, CISO, C&S Wholesale Grocers, explains why there’s more here than just dry goods. Listen here.
Cyber-Savvy Boardroom: From CISO checklists to CEO strategy: Dom Cussatt discusses the importance of mapping security and risk directly to business objectives. Listen here.

To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.

Cool stuff Google Cloud customers built, May edition: Agentic algorithms for supply chains; virtual try-on APIs; robotic camera operators & more

Fri, 29 May 2026 16:00:00 +0000

AI and cloud technology are reshaping every corner of every industry around the world. Without our customers, who are building the future on our platform, there would be no Google Cloud. In this regular round-up, we dive into some of the exciting projects redefining businesses, shaping industries, and creating new categories.

For our latest edition, we learn how Urban Outfitters sped up its order management; how BASF uses AlphaEvolve algorithms to map global supply chains; how UKG unified its workforce intelligence; how WPP trains humanoid robot camera operators; how Breuninger piloted Virtual Try-On APIs; how Glance creates automated video clips; and how Movix improves the production of dental aligners.

Be sure to check back next month to see how more industry leaders and exciting startups are putting Google Cloud technologies to use. And if you haven’t already, please peruse our list of 1,302 real-world gen AI use cases from our customers.

Urban Outfitters saves big by migrating order management

Who: Urban Outfitters, Inc. (URBN), the popular clothing and home goods retailer, relies on IBM Sterling OMS as the nerve center of its global ecommerce operations. However, the foundation of this critical system — a massive 11TB Oracle database — was increasingly becoming a bottleneck.

What they did: URBN completed a major infrastructure upgrade, migrating its IBM Sterling OMS from an Oracle database to Google Cloud's AlloyDB for PostgreSQL. To enhance performance and provide high availability and scalability, the AlloyDB deployment architecture includes two read replicas, providing low-latency access to data for reporting and analytics. Google Cloud and IBM teams also assisted URBN in a rigorous, iterative switchover testing strategy.

Why it matters: The migration to AlloyDB has fundamentally reshaped URBN’s data strategy, delivering a more favorable total cost of ownership through an optimized storage and compute architecture, without sacrificing performance or reliability. Furthermore, the shift to a PostgreSQL-compatible database gave URBN the flexibility of an open-source ecosystem, providing freedom from vendor lock-in, as well as significant speed improvements that enhanced responsiveness.

Learn from us: "URBN’s successful migration serves as a blueprint for organizations looking to modernize their mission-critical infrastructure and future-proof their environment for AI expansion. This journey proves that even the most complex, mission-critical migrations can be achieved through deep cross-organizational partnership and a phased, risk-mitigated approach." – Rob Frieman, CIO, Urban Outfitters & Raj Pai, VP, Product Management, Databases, Google Cloud

BASF manages supply chain decisions with AlphaEvolve

Who: BASF Agricultural Solutions manages a complex network of 180 production sites with more than 5,000 distinct value chains. Currently, human planners make thousands of local decisions every day on what to produce, when to produce it, and how much safety stock to hold.

What they did: To understand how local decisions ripple across their entire global network, BASF turned to AlphaEvolve on Google Cloud to build a digital twin of their supply chain. In collaboration with Google Cloud and prognostica GmbH, BASF fed the model three years of historical data and then generated variations of the code, mutating the logic to see if it could simulate a supply chain that matched the real-world historical data.

Why it matters: By running thousands of experiments, AlphaEvolve developed a clear, human-readable algorithm that explains how the BASF network truly operates. The final algorithm successfully mirrored the actual historical performance of the supply chain, significantly reducing the error rates compared to the initial seed model. It automatically discovered factually correct, domain-specific supply chain rules, providing a clear foundation for optimizing asset utilization globally.

Learn from us: “We had several attempts to build a digital twin. … By using AlphaEvolve, we cannot only map the complex network based on system data, but at the same time understand and copy the human decisions that drive our daily operations.” – Dr. Goetz Krabbe, vice president for global supply chain at BASF

UKG unlocks real-time workforce intelligence at scale

Who: UKG is one of the leading providers of human capital management (HCM) and workforce management (WFM) solutions, but years of growth led to backend sprawl. They have 126 application teams, dozens of tech stacks, and more than 12,000 database instances.

What they did: To bring the full UKG suite onto one real-time foundation, the company built People Fabric, a new data and intelligence platform powered by AlloyDB for PostgreSQL and the just-announced Agentic Data Cloud. They created a custom change data capture (CDC) framework to extract changes from existing operational databases, and for larger analytical workloads, the same data flows into BigQuery, while Cloud SQL holds the metadata and tenancy context.

Why it matters: People Fabric gives UKG a complete and consistent view of people, work, pay, and culture data that’s updated continuously and ready for AI to use in real time. For engineering teams, People Fabric acts as a database-as-a-service that accelerates development and supports modernization without customer disruption. Additionally, migrating core person and employment data off their on-prem monolith has generated cost savings significant enough to fund half of People Fabric.

Learn from us: “As we continue expanding People Fabric, we’re laying the groundwork for deeper agentic automation, more responsive analytics, and a growing set of AI-driven capabilities — all on a trusted, scalable foundation built for what’s next.” – Radhi Chagarlamudi, Group Vice President, Product Engineering, UKG & Heather White, Cloud Data Architect, Google Cloud

WPP accelerates humanoid robot training 10x with G4 VMs

Who: WPP is one of the world’s largest marketing organizations, handling $70 billion of media for enterprise clients. They work on some of the most complex commercial film shoots and were eager to test the viability of robotic cameras to capture more footage, but this required complex training of physical AI models.

What they did: WPP used the new G4 VM instance powered by NVIDIA RTX PRO 6000 Blackwell on Google Cloud to tackle the unique challenges of training physical AI for robotics in videography settings. After capturing human motion with the OptiTrack mocap system, they undertook reinforcement learning using the AI Hypercomputer together with the NVIDIA Isaac Sim image. MuJoCo, an open source physics engine by Google DeepMind, was a critical piece of simulation software that validated accuracy continuously, in real-time.

Why it matters: WPP was able to utilize a P2P topology that moves data directly between GPUs without the bottleneck of central processing. They saw speed increases in excess of 10x, taking training times down to less than one hour. Through high-volume simulation, the humanoid robots learned how to respond to small changes and bridge the tough "sim-to-real" gap, helping ensure the robot's simulated adaptability translated to safety and stability in the real world.

Learn from us: "Our process for mastering complex, natural movement on a film set can be replicated across industries to overcome the massive computational complexity of training robots." – Perry Nightingale, SVP of Creative AI, WPP

Breuninger boosted sales with its "be your own model" AI

Who: Breuninger, a fashion and lifestyle company based in Germany, thought emerging generative media models could be a good fit to answer the question every online fashion shopper asks: "How will this look on me?"

What they did: Working with Google Cloud, they built a virtual try-on experience that lets shoppers see high-end fashion on their own bodies using a simple selfie. Using the Virtual Try-On (VTO) API, Breuninger’s data team worked directly with Google’s engineers to test and refine the technology in three stages, ultimately moving from pre-selected models to a user-first, selfie-based approach. The project was also part of Breuninger’s move to a Flutter-based platform, which helped the team move from its vision to a live launch in only three months.

Why it matters: During a six-week A/B test over Black Week and the holiday season, the team found that shoppers who used the virtual try-on converted purchases at a higher rate than those who didn't. Customer surveys reinforced the numbers: shoppers responded well to the high image quality and the personalized experience.

Learn from us: “Breuninger continues to refine the experience based on how customers actually use virtual try-on in everyday shopping — the same user-first approach that shaped the project from the start.” – Daniel Rascher, Senior Product Owner, Breuninger & Dr. Michael Menzel, Customer AI Specialist, Google Cloud

Glance turns hours of video into mobile-ready clips

Who: Glance, a mobile-first content platform, processes 1-2 hour videos from sources like podcasts, news reports, movies, and web series, and transforms them into 30 to 180-second vertical clips optimized for mobile lock screens.

What they did: The goal was to create a complete pipeline that takes a long-form landscape video (16:9) and outputs multiple ready-to-publish short-form portrait videos (9:16). The final technical solution uses Google Cloud Speech-to-Text v2, Gemini, and the Google Vision API, combined with custom video manipulation using Samurai (an open-source object tracking tool), OpenCV and MoviePy. The process involves audio extraction, speech-to-text transcription, and using Gemini 2.5 Flash to analyze transcript text and identify optimal start and end timestamps for short video clips.
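The landscape-to-portrait step comes down to simple arithmetic: a full-height 9:16 window cut from a 16:9 frame, positioned around the tracked subject. The sketch below shows only that calculation; the portrait_crop function and the idea of passing in a subject x-coordinate from an object tracker are assumptions for illustration, not Glance's actual implementation.

```python
def portrait_crop(frame_w: int, frame_h: int, subject_x: float) -> tuple[int, int]:
    """Return (left, width) of a full-height 9:16 crop centered on subject_x."""
    # Width of a 9:16 window that uses the full frame height.
    crop_w = min(frame_w, int(frame_h * 9 / 16))
    # Center the window on the subject, then keep it inside the frame.
    left = int(subject_x - crop_w / 2)
    left = max(0, min(left, frame_w - crop_w))
    return left, crop_w


# A 1920x1080 frame yields a 607-pixel-wide window at full height.
centered = portrait_crop(1920, 1080, subject_x=960)    # (656, 607)
near_edge = portrait_crop(1920, 1080, subject_x=1900)  # (1313, 607)
```

Clamping the window at the frame edges avoids black bars when the subject moves toward the side of the shot.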

Why it matters: With daily volume projected to grow from 3,500 to over 10,000 videos per day, manual editing wasn’t a realistic path forward. Glance’s video pipeline demonstrates what becomes possible when AI handles the repetitive, judgement-intensive work of video editing. The system transforms thousands of long-form videos into mobile-ready clips each day, preserving narrative context while optimizing for vertical viewing. Rather than choosing between scale and quality, automated pipelines can deliver both.

Learn from us: “Glance’s video pipeline demonstrates what becomes possible when AI handles the repetitive, judgement-intensive work of video editing. … The approach offers a template for any organization sitting on long-form video archives. Rather than choosing between scale and quality, automated pipelines can deliver both.” – Himanshu Aggarwal, Machine Learning Engineer, Glance & Sharmila Devi, AI Consulting Lead, Google Cloud

Movix fills a gap in dental skills with specialized agentic AI

Who: Movix is building one of the first agentic AI solutions for dental appliance manufacturers and dental labs, to help solve a serious shortage of skilled dental technicians in aligner manufacturing.

What they did: Movix developed custom models for deep learning, computer vision, and 3D mesh analysis over a five-month period, using Google Cloud infrastructure. Once defects are detected, they use the Gemini Enterprise Agent Platform to generate client-facing feedback that reads as if it came directly from a human technician. Their 3D models use Cloud Run with L4 GPUs for the massive compute power required, and they use Compute Engine VMs to run experiments and train models.

Why it matters: Movix’s agentic solutions automate data entry and quality control, which are traditionally manual, time-consuming, and error-prone tasks. The automation and higher level of accuracy the QC agent delivers can save $300 per remake for an aligner manufacturer, and speed up the appliance manufacturing process with quicker turnaround times.

Learn from us: “We plan to build hybrid solutions … designing an architecture that connects our cloud-based AI agents with older, on-premises software that many conservative labs still use — through lightweight local connectors and standardized APIs. This will allow us to access a large market segment that has not yet migrated to the cloud.” – Marina Domracheva, CEO, Movix & Bakit Dzhumagulov, CTO, Movix

Developer's guide to Gemini Enterprise and A2UI integration

Fri, 29 May 2026 16:00:00 +0000

If you've built a chatbot, you know this conversation:

User: "Book a table for two tomorrow at 7pm."
Agent: "Okay, for what day?"
User: "Tomorrow."
Agent: "What time?"

A date picker would have ended this in one tap. But until recently, agents had no standard way to render a date picker (or a map, or a multi-select list) inside the chat surface they live in. They could only return plain text or markdown.

Today, we're walking through how to fix that with A2UI, an open protocol for agent-driven user interfaces, and how to integrate an A2UI-enabled agent with Gemini Enterprise (GE) so your agent renders rich and interactive UI natively in the GE chat surface — and in your own custom frontend if you want one. We'll use a working restaurant-finder agent — built with the Google Agent Development Kit (ADK), the A2A protocol, and Gemini — as the reference. The full source is on GitHub and there's a 2-minute demo video.

The problem: agents speak text, but users want UI

Most agent frameworks today return strings. That's fine for short answers, but it breaks down quickly:

Multi-turn slot filling (date, time, party size) burns turns and patience.
Choices among options (which restaurant? which insurance plan?) become long bulleted lists the user has to copy-paste back.
Spatial information (locations, routes, floor plans) is reduced to addresses.

Developers have tried to patch this by sending HTML or JavaScript fragments, but that introduces real risks: cross-site scripting, UI injection from a remote agent you don't fully control, and visual drift from the host app's design system. What's needed is a way to transmit UI that's safe like data and expressive like code.

What A2UI is

A2UI is an open protocol, introduced by Google and co-developed with the Flutter team and product teams behind Gemini Enterprise. Instead of returning text or HTML, an agent returns a JSON payload that describes a UI: a tree of components (Card, Text, Button, ChoicePicker, Image, …) and a separate data model holding the values those components display.

Three properties make this useful in practice:

Declarative, not executable. The payload is data. The client only renders components from a pre-approved catalog, so a remote agent can't inject arbitrary code or steal credentials through a UI widget.
Streaming-friendly. The format is a flat list of small JSON messages, so the LLM can emit them incrementally and the client can paint as they arrive.
Framework-agnostic. The same agent response renders through Lit, Angular, Flutter, or native mobile. The agent doesn't know — or care — what's on the other end.
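The "declarative, not executable" property can be illustrated with a client that refuses anything outside its catalog. The sketch below uses a deliberately simplified component shape (dicts with "id" and "type" keys), which is an assumption for illustration and not the actual A2UI schema; the catalog entries are the component names mentioned earlier.

```python
# Component names taken from the examples above; a real catalog is defined by the client.
CATALOG = {"Card", "Text", "Button", "ChoicePicker", "Image"}


def unknown_components(components: list[dict]) -> list[str]:
    """Return the ids of components whose type is not in the client's catalog."""
    return [c["id"] for c in components if c.get("type") not in CATALOG]


def render_plan(components: list[dict]) -> list[str]:
    """Describe what would be rendered, rejecting any payload with unknown types."""
    bad = unknown_components(components)
    if bad:
        raise ValueError(f"components not in catalog: {bad}")
    return [f'{c["type"]}#{c["id"]}' for c in components]


payload = [
    {"id": "root", "type": "Card"},
    {"id": "title", "type": "Text"},
]
plan = render_plan(payload)  # ['Card#root', 'Text#title']
```

Because the client maps type names to its own trusted widgets, a payload that names something like a script component is rejected rather than executed.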

A2UI is also transport-agnostic. The messages ride inside whatever pipe you already use: A2A JSON-RPC, AG-UI, WebSockets, SSE. In our reference implementation, A2UI rides inside the A2A protocol as DataPart objects with the MIME type application/json+a2ui.
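As a rough sketch of how an A2UI message might travel inside an A2A data part, the snippet below wraps a payload and tags it with the MIME type named above. It uses plain dictionaries rather than any SDK, and the field names ("kind", "data", "metadata", "mimeType") are assumptions to be checked against the A2A and A2UI specifications.

```python
import json

A2UI_MIME_TYPE = "application/json+a2ui"  # MIME type named in the article


def to_data_part(a2ui_message: dict) -> dict:
    """Agent side: wrap one A2UI message as a data part (field names assumed)."""
    return {
        "kind": "data",
        "data": a2ui_message,
        "metadata": {"mimeType": A2UI_MIME_TYPE},
    }


def a2ui_messages(parts: list[dict]) -> list[dict]:
    """Client side: keep only the parts tagged as A2UI, ignoring text and other data."""
    return [
        p["data"]
        for p in parts
        if p.get("kind") == "data"
        and p.get("metadata", {}).get("mimeType") == A2UI_MIME_TYPE
    ]


parts = [
    {"kind": "text", "text": "Here are some options."},
    to_data_part({"hypothetical": "ui-message"}),
]
# Round-trip through JSON to mimic the wire format.
received = a2ui_messages(json.loads(json.dumps(parts)))
```

Tagging by MIME type lets a client that does not understand A2UI skip the data part and fall back to the accompanying text.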

Where A2UI sits in the stack

A2UI is one piece of a four-layer stack. Confusion usually comes from conflating these layers — they're each doing a different job:

Layer	Owns	Examples
App experience	Client shell and conversation state — chat window, input box, message history	CopilotKit, AG-UI
Pixel drawing	Turning component descriptions into actual rendered UI	Lit, Flutter, Angular
Conversation pipeline	Client–server transport — sending messages, receiving responses	A2A Protocol
Cargo (data format)	The thing flowing through the pipeline that describes the UI	A2UI

Read top to bottom: CopilotKit/AG-UI owns the app experience. Lit/Flutter/Angular own the rendering. While CopilotKit and AG-UI provide valuable abstractions, they remain strictly optional for implementing A2UI. In this architecture, A2A serves as the underlying conversation pipeline, while A2UI represents the structured cargo that actually traverses that pipe.

That separation is why the same A2UI payload renders identically in three very different deployment shapes:

Bespoke web app — a custom client shell (like the reference repo's Lit frontend/) plus a custom A2UI renderer.
CopilotKit / AG-UI app — CopilotKit owns the chat shell, an A2UI renderer is registered inside it for rich cards.
Gemini Enterprise — GE is the shell, the renderer, and the transport client. You only build the agent.

So for the GE path, the stack collapses to two layers you control: the A2A endpoint (your agent) and the A2UI cargo it emits. The other two layers are GE's responsibility. CopilotKit and AG-UI are great if you're building a standalone product UI elsewhere — they're just out of scope for embedding an agent inside Gemini Enterprise.

Pattern revisions

The protocol evolves quickly, and different clients support different revisions. Two patterns are common today:

Inline pattern — the agent sends a component tree with the data baked into each component (the pattern Gemini Enterprise renders today).
Decoupled pattern — the agent sends the component tree and the data model as separate messages, so subsequent turns can update one without re-sending the other. This reduces tokens and latency for long-running conversations and is the direction the protocol is heading.

The reference repo serves both patterns from one backend, picking which to emit per request based on the client's X-A2A-Extensions header. As new revisions ship, you add another catalog and the same negotiation pattern keeps working.
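A negotiation step like this could be sketched in Python as follows. The extension identifiers are placeholders and the header is assumed to carry a comma-separated list. The real values and parsing rules come from the A2UI spec and the reference repo.

```python
# Placeholder identifiers; the real extension URIs are defined elsewhere.
DECOUPLED_EXT = "https://example.com/a2ui/decoupled"
INLINE_EXT = "https://example.com/a2ui/inline"


def pick_pattern(headers: dict) -> str:
    """Return 'decoupled', 'inline', or 'text' based on what the client advertises."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-a2a-extensions", "")
    requested = {item.strip() for item in raw.split(",") if item.strip()}
    if DECOUPLED_EXT in requested:
        return "decoupled"
    if INLINE_EXT in requested:
        return "inline"
    # Client advertised no known A2UI revision: fall back to plain text.
    return "text"
```

Supporting a new revision then means adding one more identifier and one more branch, while existing clients keep getting the pattern they asked for.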

How A2UI works inside Gemini Enterprise

Gemini Enterprise ships with a built-in A2UI renderer. For the developer, that means the integration story is short:

Build your A2A agent, embedding an A2UI catalog and example payloads alongside the regular tool definitions.
Register the agent with Gemini Enterprise as an A2A endpoint. (Use make register-gemini-enterprise in the reference repo.)
A GE admin shares the agent with employees, just like any other agent in the GE catalog.

At runtime, the flow looks like this:

The user types a request in the GE chat. GE calls your agent's A2A endpoint and sends along GE's own A2UI catalog — the list of UI components GE knows how to render.
Your agent decides whether a UI widget is the right response. If yes, it emits an A2UI JSON message (e.g., a ChoicePicker of restaurant options). If no, it falls back to text. Both can coexist in the same response.
GE receives the JSON, validates it against its catalog, and renders the widget natively in GE's own design language — so it visually matches the rest of the chat surface.
When the user interacts with the widget (selects three options, picks a date), GE serializes the interaction back into JSON and sends it to your agent as the next turn. Your agent processes structured input, not free-form text.
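The agent side of this flow can be sketched as a single turn handler in Python. The turn and response shapes (`text`, `interaction`, `selected`, `widget`) and the keyword check are hypothetical simplifications for illustration. A real agent would let the model decide and would emit messages that follow the A2UI schema.

```python
def handle_turn(turn: dict, client_catalog: set) -> dict:
    """Handle one turn: structured widget input, or free-form text from the user."""
    # Step 4: the user interacted with a widget, so the input is already structured.
    if "interaction" in turn:
        selected = turn["interaction"]["selected"]
        return {"text": "You picked: " + ", ".join(selected)}

    options = ["Example Bistro", "Sample Diner", "Placeholder Cafe"]
    wants_options = "restaurant" in turn.get("text", "").lower()
    if not wants_options:
        return {"text": "Could you tell me more about what you need?"}

    # Step 2: emit a widget only if the client's catalog says it can render it.
    if "ChoicePicker" in client_catalog:
        return {
            "text": "Here are some options:",
            "widget": {"type": "ChoicePicker", "options": options},
        }
    # Otherwise fall back to plain text.
    return {"text": "Options: " + ", ".join(options)}
```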

One thing worth flagging: because your agent doesn't ship its own renderer for GE, you don't need to choose a frontend framework to start. Your A2A endpoint can run anywhere — Cloud Run, GKE, on-prem — and GE handles the rendering.

High-level architecture example

The reference implementation is an ADK backend on Cloud Run designed to plug seamlessly into Gemini Enterprise.

Gemini Enterprise connects directly to your agent using standard A2A JSON-RPC calls.
The agent serves the inline message pattern expected by the Gemini Enterprise managed UI.
Custom components like GoogleMap render via Google Maps Embed iframes, with the API key injected server-side so the LLM never sees it.
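One way to keep the key out of the model's context is to have the agent emit only a place query and let the server build the iframe URL. The Python sketch below assumes a hypothetical `MAPS_API_KEY` environment variable and a hypothetical `build_embed_url` helper. It is not taken from the reference repo.

```python
import os
from typing import Optional
from urllib.parse import urlencode

# Maps Embed API endpoint for place mode.
EMBED_BASE = "https://www.google.com/maps/embed/v1/place"


def build_embed_url(query: str, api_key: Optional[str] = None) -> str:
    """Build the iframe src on the server; the LLM only ever supplies `query`."""
    # MAPS_API_KEY is an assumed variable name for this sketch.
    key = api_key if api_key is not None else os.environ["MAPS_API_KEY"]
    return f"{EMBED_BASE}?{urlencode({'key': key, 'q': query})}"
```

The model's output contains only the query string, so a prompt-injection attempt has no key to exfiltrate.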

The following demonstration illustrates how Google Maps functions as a live, interactive component within Gemini Enterprise rather than a static image. Leveraging A2UI's streaming-friendly architecture, the agent updates the map view in real time, dropping pins and adjusting coordinates incrementally as results arrive from the Maps API.

See it running, then build your own

Detailed implementation guide here.
Demo video (2 minutes, end-to-end with both the Lit shell and Gemini Enterprise): https://youtu.be/_5AaYwyqVio
A2UI spec and component reference: a2ui.org
Gemini Enterprise updates, including the A2UI renderer: What's new in Gemini Enterprise
A2UI generative UI announcement: Introducing A2UI generative UI

If you're already building agents on Google Cloud, the fastest path is to clone the reference repo, run make local-backend for a local smoke test, and then make register-gemini-enterprise to wire it into GE. From there, swap in your own catalog, your own tools, and your own domain. The next time a user asks your agent for "a table for two tomorrow at 7pm," the answer can be a date picker instead of another question.

From petabytes to predictions: Easy BigQuery insights in Google Sheets

Fri, 29 May 2026 16:00:00 +0000

Many organizations’ single source of truth is data that resides in BigQuery, Google’s governed, secure and petabyte-scale data platform. However, the "last mile" of ad-hoc analysis, modeling, and reporting often happens where business users are most comfortable: Google Sheets.

Bridging this gap usually involves exporting data as CSVs. But this is inefficient, creating data silos, version control problems, and security and governance risks. Connected Sheets helps to eliminate this trade-off, turning the familiar Google Sheets interface into a direct, live window into your BigQuery data platform, letting you analyze petabytes of data quickly, securely, and easily.

In this post, we’ll do a quick overview of Connected Sheets, walk through real-world use cases, and show you how to perform enterprise-grade data analysis using BigQuery directly in Google Sheets.

A live window into the single source of truth

Business users often wait days or weeks for simple reports. Connected Sheets solves this by letting you analyze your critical data via a secure, direct connection to billions of rows of live data, with no SQL required.

For data admins, this architecture is appealing because it maintains a strong security and governance posture. They can provision access to specific tables or views, confident that the underlying data cannot be altered from a Connected Sheet. Admins can also take advantage of Google Workspace’s enterprise data protections to control reading, sharing, and copying data throughout its lifecycle.

For end users, the benefit is immediate agility and ease of use. They can use familiar tools like pivot tables, charts, calculated columns, and formulas to analyze billions of rows of live data as if it were a local file, balancing centralized control with the business's demand for speed. End users don’t have to learn technical concepts like databases, schemas, tables, and query languages like SQL to access, analyze, and visualize the data.

Key use cases and core journeys

We consistently hear about three primary use cases for Connected Sheets from customers across industries.

1. Self-service exploratory analysis: Data teams provide access to curated tables and datasets in BigQuery. Business Analysts in sales, operations, finance, or marketing can then build their own pivot tables or charts that run over the entire live data source directly from Sheets, then filter data to answer day-to-day questions, freeing the data team from a constant backlog of ad-hoc requests.

Example: Deep-dive investigation

Scenario: A sales manager analyzes millions of global transactions to review quarterly performance.
Action: Using Connected Sheets, they quickly create a pivot table to summarize revenue by region and product line. When they spot an anomaly, such as an unexpected revenue spike in EMEA, they double-click the summarized value to drill down into exactly what led to that value.
Outcome: Connected Sheets instantly queries and retrieves the precise, granular transaction rows behind that summary value, making it easy and fast to find the root cause.

2. Operational reporting: Business users can create live, refreshable, and easy-to-understand dashboard-like views of their data that their partner teams can rely on and share with executives and leads.

Example: Automated executive summary

Scenario: An operations lead provides weekly updates on sales invoices to their leadership, based on a BigQuery dataset with millions of rows.
Action: The operations lead creates their Connected Sheet and builds a series of charts to visualize invoice trends over time. They then configure the sheet to automatically refresh on a schedule every Monday morning, so it’s always ready ahead of their executive review.
Outcome: The manual routine of exporting data and pasting it into workbooks is completely eliminated. Leadership gets a reliable report and analysis powered by the latest warehouse data.

3. Hybrid data modeling: Data practitioners often need to blend governed warehouse data with real-time manual inputs and annotations. For example, a finance team might pull revenue data from BigQuery and combine it with manual procurement entries from their ERP system in a separate tab, using VLOOKUP to create a consolidated view for month-end reporting.

Example: Custom business metrics

Scenario: A financial analyst calculates custom commission payouts based on live sales data from their CRM system. The commission tier logic changes frequently and isn't modeled in the central data warehouse.
Action: Instead of requesting a new data pipeline from their data team, the analyst can add a calculated column directly within the Connected Sheet. They use standard spreadsheet formulas (like IF or IFS) to apply custom business logic directly against the BigQuery data.
Outcome: The analyst retains the flexibility to model scenarios and calculate metrics quickly, while maintaining governed BigQuery data as their single source of truth.
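For readers who think in code rather than spreadsheet formulas, the tiered logic that an IFS formula expresses can be sketched in Python as below. The tier thresholds and rates are invented for illustration. In the spreadsheet, the same first-match-wins ordering applies to the conditions passed to IFS.

```python
from decimal import Decimal

# Hypothetical tiers: (minimum sale amount, commission rate), highest tier first.
TIERS = [
    (Decimal("100000"), Decimal("0.10")),
    (Decimal("50000"), Decimal("0.07")),
    (Decimal("0"), Decimal("0.05")),
]


def commission(sale_amount: Decimal) -> Decimal:
    """Apply the first tier whose minimum the sale meets, like IFS conditions in order."""
    for minimum, rate in TIERS:
        if sale_amount >= minimum:
            return sale_amount * rate
    raise ValueError("sale amount must be non-negative")
```

When the tier logic changes, only the `TIERS` table (or the formula's arguments) needs editing, which is why this kind of rule is a good fit for a calculated column rather than a new pipeline.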

Getting started

Connecting Google Sheets to BigQuery is straightforward and requires only a Google Workspace account and a billing-enabled Google Cloud project. There are two primary ways to establish a connection and create a Connected Sheet.

Path 1: Starting from Sheets
This is the typical workflow for users who work primarily within spreadsheets.

Open a new Google Sheet.
Navigate to Data > Data Connectors > Connect to BigQuery.
Select your billing-enabled Google Cloud project.
Browse available datasets, select a Saved Query to connect right away, or input a custom SQL query.
Click Connect.

Path 2: Starting from BigQuery
This workflow is common for data analysts starting from the Google Cloud console.

Navigate to the BigQuery UI in the console.
In the Explorer pane, locate the table or query result you wish to analyze.
Click the Export menu (or the three-dot action menu) next to the asset.
Select Open in > Connected Sheets.

From petabytes to predictions with Connected Sheets

We designed Connected Sheets to help you bridge the gap between the scalability of the cloud and the flexibility of the spreadsheet. With Connected Sheets, we’re making it easier than ever for organizations to put data into the hands of the people who need it.

To explore these features, connect your BigQuery data to Google Sheets today. For more technical details, visit the Connected Sheets documentation.

AlloyDB Hot Standby: Faster failovers, consistent performance

Fri, 29 May 2026 16:00:00 +0000

AlloyDB for PostgreSQL is a fully managed, PostgreSQL-compatible database service designed for the most demanding enterprise workloads. It combines the best of PostgreSQL with the power of Google, delivering exceptional performance, scalability, and availability. We are continuously innovating to make AlloyDB even more resilient, and today, we're excited to announce a significant upgrade to our High Availability (HA) architecture: Hot Standby.

Understanding AlloyDB HA Architecture

An AlloyDB primary instance configured for high availability consists of an active node and a standby node, located in different zones within a region for resilience. AlloyDB's cloud-native architecture separates compute and storage to allow for individual scaling of each resource. Database write-ahead logs (WAL) are synchronously written to a regional log persistor, ensuring durability, while data blocks reside in AlloyDB's regional storage service. A load balancer directs traffic to the current active node using a stable IP address.

In the traditional HA model, if the active node became unavailable, AlloyDB would automatically initiate a failover. The standby node, previously idle from a PostgreSQL perspective, would start the database, process any remaining logs, and then take over. While this ensures high availability, the database startup time and the subsequent cache warming period could impact application recovery time and performance.

Introducing AlloyDB Hot Standby: The New Architecture

With the new Hot Standby capability, we've transformed the role of the standby node. Instead of being a passive node, the standby node now continuously applies WAL records streamed from the primary. This architectural shift brings two massive advantages:

Dramatically Reduced Failover Times: Because PostgreSQL is already running, initialized, and actively replicating on the standby, the time required to promote it to primary in the event of a failure is significantly shorter. The system detects the failure (typically within 30 seconds), promotes the standby, and redirects connections. The database startup phase on the standby is eliminated, reducing overall downtime and improving your Recovery Time Objective (RTO).
Consistent Performance After Failover: Since the Hot Standby node is actively replaying logs, its memory caches (like the PostgreSQL buffer cache) are kept "warm." They contain much of the same frequently accessed data as the primary node's caches. When a failover occurs, the new primary can serve requests at optimal speed almost immediately. This avoids the performance "brownout" typically seen while caches warm up from disk, ensuring application performance remains stable.
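Even with a faster failover, connections to the failed node are interrupted, so applications generally benefit from retrying briefly rather than surfacing the first error. The Python sketch below shows a generic retry loop with exponential backoff. The function name, defaults, and the use of `ConnectionError` are assumptions for illustration and are not part of AlloyDB or any specific database driver.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on ConnectionError with capped exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # Out of retries: let the caller see the failure.
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")
```

With these defaults the waits are 1, 2, 4, 8, and 8 seconds, which comfortably spans a failover on the order of tens of seconds. Tune the numbers to your own recovery targets.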

And the best part? This substantial enhancement to availability and resilience comes at no additional cost to you.

See Hot Standby in Action

We've prepared a short demonstration to illustrate the difference between the new Hot Standby HA and the legacy HA setup. In the video, we run a benchmark load on two AlloyDB instances and trigger a failover on both simultaneously.

As you can see in the demo:

The instance with Hot Standby completes the failover in approximately 15 seconds. Crucially, its transactions-per-second (TPS) rate returns to pre-failover levels almost immediately.
The instance with Legacy HA takes noticeably longer to complete the failover. Even when it comes back online, the TPS is significantly lower and takes several minutes to ramp back up to the original performance levels as its caches warm up.

This side-by-side comparison clearly shows the benefits of Hot Standby in minimizing downtime and eliminating the post-failover performance impact.

Get Started with Enhanced HA

Hot Standby is being rolled out to newly created AlloyDB instances in PostgreSQL 18, providing an upgraded HA experience automatically, and will be rolling out to the earlier major versions in the coming months. You can continue to rely on AlloyDB's 99.99% SLA, now backed by even faster failovers and more predictable post-failover performance.

This enhancement underscores our commitment to providing a best-in-class, enterprise-grade managed PostgreSQL experience.

To learn more about AlloyDB's High Availability features, please refer to the official documentation. New to AlloyDB? Try it out today!

What’s new with Google Cloud

Fri, 29 May 2026 16:00:00 +0000

Want to know the latest from Google Cloud? Find it here in one handy location. Check back regularly for our newest updates, announcements, resources, events, learning opportunities, and more.

Tip: Not sure where to find what you’re looking for on the Google Cloud blog? Start here: Google Cloud blog 101: Full list of topics, links, and resources.


May 25 - May 29

Anthropic’s Claude Opus 4.8 is now available on Gemini Enterprise Agent Platform. As we continue to expand our platform's model offerings, this addition gives organizations more options for handling complex, multi-stage enterprise workflows. Claude Opus 4.8 brings strong capabilities in agentic coding, allowing developers to manage extensive refactors and track dependencies over extended sessions.
API Horizon Munich July 6, 2026: Orchestrating the Next Era of AI and APIs
Master the orchestration of next-gen AI and digital ecosystems. Join Google Cloud experts and DACH tech leaders on July 6 for an exclusive look at the Apigee roadmap, Agent Management, and Model Context Protocol (MCP). Gain real-world insights and connect with the regional integration community.

Register now
Securing AI Agents: The Extended Agent Gateway Pattern
Learn how to prevent autonomous AI agents from invoking unauthorized APIs. Join Apigee Specialist Joel Gauci on June 4 for a technical deep dive into the Extended Agent Gateway pattern. This session covers enforcing Fine-Grained Authorization (FGA), implementing secure token exchange, and establishing Model Context Protocol (MCP) governance at the API gateway layer to protect enterprise backend services.

Register for the June 4 Community TechTalk
API-to-Agent Security: Exposing REST APIs to Gemini Enterprise via MCP
Connect Gemini Enterprise agents to core data without creating security hazards. Join Google Cloud Specialist Nigel Walters on June 11 to learn how to instantly transform legacy REST APIs into secure Model Context Protocol (MCP) servers. We’ll cover how to safely register tools with Gemini while enforcing gateway-level guardrails like rate limiting and access control policies.

Register for the June 11 Community TechTalk

May 18 - May 22

Chinese Webinar | June 4: AI Command and Control
As AI agents move from experimental pilots to core enterprise functions, governance has become a critical next step. Join Google Cloud on June 4th at 10:00 AM (Beijing Time) to learn how to build a secure AI management layer architecture. We'll explore how to develop governed MCP (Model Context Protocol) endpoints, manage tool access to enterprise data, and leverage robust audit logs to operationalize AI. This session also includes a practical demonstration of these governance frameworks on Google Cloud.

Register here
GCP Announces New Features to Benchmark and Optimize LLMs for On-Device Use Cases
Deploying fine-tuned LLMs from GCP to edge devices like smartphones is complex due to fragmented hardware. Google AI Edge Portal bridges this gap, giving GCP developers the ability to test AI performance on 120+ Android devices, representing the full diversity of high-, medium-, and low-tier smartphones on the market today. This week at I/O, we announced brand new capabilities to benchmark and debug LLM performance across these devices. Sign up to use these new features in private preview today.

May 11 - May 15

Build Your AI & MCP Control Tower for Universal Governance
Master the future of agentic security with Apigee. Join our Community TechTalk on May 21 to discover how Apigee serves as a central "Control Tower" for the Model Context Protocol (MCP). We will explore how new JSON-RPC tool authorization enables fine-grained access policies across your organization, ensuring secure and scalable AI deployments. Whether managing internal tools or external users, learn to govern your agentic ecosystem with absolute precision. This session is designed for global coverage across EMEA and AMER regions.

Register for the May 21 Community TechTalk

Apr 27 - May 1

Master Your Launch: The Apigee Production Go-Live Checklist
Ensure a secure launch with the Apigee production guide. Join Nicola Cardace on May 28 to explore security guardrails, including IAM roles, mTLS configurations, and encrypted KVM migrations. Scheduled at 11 AM EDT / 5 PM CEST to support EMEA and AMER teams, this TechTalk provides the technical roadmap you need to flip the switch with absolute confidence.

Register for the May 28 Community TechTalk
Transforming APIs into Governed Agentic Tools on the Google Cloud Agentic Platform
Turn your APIs into secure, governed agentic tools on the Google Cloud Agentic Platform. Join Specialist Christophe Lalevée on May 7 for a technical deep dive into AI productization. Scheduled at 5 PM CEST / 11 AM EDT to maximize coverage for developers across EMEA and AMER, this session explores the integration and governance frameworks required to scale enterprise-ready AI with confidence.

Register for the May 7 Community TechTalk
Fractional G4 VMs are Generally Available, providing a highly efficient and cost-effective entry point for AI and graphics workloads. These new configurations, using NVIDIA virtual GPU (vGPU) technology, allow you to leverage the power of the NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs in flexible, smaller increments, so you can right-size your infrastructure to match the specific demands of your applications. By providing more granular access to advanced hardware, fractional G4 VMs let you optimize resource allocation and reduce overhead without sacrificing performance. You can now select from additional GPU slice sizes for your specific needs:
- 1/2 GPU: Ideal for more intensive tasks such as LLM inference, robotics sensor simulation, and high-fidelity 3D rendering.
- 1/4 GPU: Optimized for mainstream workloads, including mid-range creative design, video transcoding, and real-time data visualization.
- 1/8 GPU: Great for lightweight applications such as remote desktops, productivity tools, and entry-level streaming services.
Transitioning AI from a sandbox prototype to an enterprise-grade system is a major hurdle. A monolithic script won't suffice for widespread deployment. To achieve true scale and reliability with Gemini, organizations must adopt service-oriented micro-agent architectures, establish Zero-Trust security, and implement rigorous EvalOps. Master the "Agentic Maturity Ladder" to ensure your AI & Agentic solutions are robust, secure, and ready for the real world.

Watch the deep dive and read the developer blog to learn more.
ML Development in VS Code with Google Cloud Power: Workbench Extension Now Available
Data scientists and developers can now combine the local productivity of VS Code with the scalable infrastructure of Google Cloud. The new Google Cloud Workbench Notebooks extension allows you to connect to and run notebooks on managed cloud environments directly within your local IDE. This integration streamlines the ML lifecycle by eliminating context switching and providing high-performance compute for complex workloads in a familiar interface. As part of our commitment to the developer ecosystem, the extension is fully open-sourced to support community-driven innovation.
- Install from Marketplace: GoogleCloudTools.workbench-notebooks
- Contribute on GitHub: colab-enterprise-vscode

Apr 20 - Apr 24

Announcing the 2026 Google Cloud Partners of the Year
Google Cloud is honored to celebrate the winners of the 2026 Partner of the Year awards! These awards recognize an exceptional group of partners across AI, Security, Infrastructure, and more, who have demonstrated a commitment to customer success. From global system integrators to specialized startups, these winners are leveraging the power of Google Cloud to solve complex challenges and drive digital transformation worldwide. Join us in congratulating these organizations for their innovation, collaboration, and impactful results over the past year.

See the 2026 Partner Award winners

Apr 13 - Apr 17

We're excited to announce the Public Preview of Datastream’s metadata integration with Knowledge Catalog. This is the first step in our vision to provide a centralized, "single pane of glass" for all Datastream assets. The enhancement automatically synchronizes Streams, Connection Profiles, and Private Connections, eliminating data silos. It enhances discoverability, allowing you to search for Datastream assets using the same interface as BigQuery tables. Centralized governance is also provided, making your real-time data estate more transparent and easier to manage.
Upgrading Apigee OPDK to 4.53 with OS Modernization
Modernize your infrastructure using Google’s official, sequential upgrade path. Our technical expert, Rakesh Talanki, outlines how to upgrade Apigee OPDK to v4.53 while migrating to a supported OS (RHEL 8.x/9.x). This guide covers the "build-out" methodology, including multi-data center syncing, to ensure a stable, zero-downtime transition.

Read the guide
Cloud Run Worker Pools and CREMA: Powering Serverless AI at Scale
Google Cloud has announced the General Availability of Cloud Run worker pools, a new resource type designed specifically for pull-based, non-HTTP workloads. Unlike traditional Cloud Run services that scale based on request traffic, worker pools provide an "always-on" environment for background tasks like processing message queues or running large-scale AI inference. To support this, Google Cloud also open-sourced the Cloud Run External Metrics Autoscaler (CREMA). Built on KEDA, CREMA enables queue-aware autoscaling for worker pools, allowing them to dynamically scale based on external signals like Pub/Sub backlog or Kafka lag.
Apigee Model Context Protocol (MCP) now Generally Available
Expose enterprise APIs as MCP tools for agentic AI applications with the General Availability of MCP in Apigee. This update allows developers to transform APIs into AI-ready tools using OpenAPI Specifications, removing the need for local MCP servers or additional infrastructure. With managed endpoints and semantic search in API hub, you can now provide AI agents with secure, governed access to enterprise data at scale.

Explore the MCP overview

Apr 6 - Apr 10

Community TechTalk: Powering Retail Agents with ADK, UCP & Apigee X
Move beyond basic chatbots to secure, transactional AI experiences. Join our Community TechTalk on April 16 to learn how Apigee X and Gemini build a "Trust Layer" for AI shopping assistants using UCP standards. We’ll demonstrate how to block prompt injections with Model Armor and implement cost governance via token limits to secure the path from discovery to purchase.

Register for the TechTalk
Implement multimodal capabilities in your AI agents
Explore three new reference architectures for building sophisticated multi-agent AI systems that can process and analyze multimodal data. To analyze disparate multimodal data and produce a high-confidence classification, see Classify multimodal data. To create a fluid conversational AI that processes audio and video streams in real time, see Enable live bidirectional multimodal streaming. To consolidate fragmented multimodal data into a searchable knowledge graph, see Multimodal GraphRAG resource orchestration.
Automate SecOps workflows with an agentic AI system
To accelerate incident response and reduce manual toil for your security team, you need a system that can automate remediation playbooks. Our new reference architecture helps you build an AI agent that orchestrates complex triage and investigation workflows across disparate security tools, such as SIEM, CSPM, and EDR, from a single interface. See the full guide to orchestrate security operations workflows.

Mar 30 - Apr 3

ASEAN Webinar | April 30: Mastering Agentic Governance at Scale with GCP
As AI agents move from experimental pilots to core enterprise functions, governance is the critical next step. Join Google Cloud experts Shilpi Puri & Wely Lau for a webinar on April 30th at 11:00 AM SGT to learn how to architect a secure AI Management layer. We’ll explore developing governed MCP endpoints, managing tool access to enterprise data, and operationalizing AI with robust audit logs. The session includes a live demo of these frameworks in action on Google Cloud.

RSVP here.

Mar 23 - Mar 27

Turn your API sprawl into an agent-ready catalog
As organizations scale, APIs often become scattered across multiple gateways, creating "blind spots" that hinder AI adoption. To solve this, we’ve introduced two new capabilities for Apigee API hub: a new integration with API Gateway to automatically centralize API metadata into a single control plane, and a specification boost add-on (now in public preview). This add-on uses AI to enhance your API documentation with the precise examples and error codes that AI agents need to function reliably.

Read the full blog post to get started.
Webinar | April 16: AI Command & Control
As AI agents move from experimental pilots to core enterprise functions, governance is the critical next step. Join Google Cloud expert Satyam Maloo for a webinar on April 16th at 11:00 AM IST to learn how to architect a secure AI Management layer. We’ll explore developing governed MCP endpoints, managing tool access to enterprise data, and operationalizing AI with robust audit logs. The session includes a live demo of these frameworks in action on Google Cloud.

RSVP here.
Modernizing and Decoupling Event Ingestion with Apigee
In modern cloud-native architectures, decoupling producers from consumers is critical for building resilient systems. While Google Cloud Pub/Sub provides a scalable backbone, exposing it directly to external clients can introduce security and management overhead. This new guide explores how to leverage Apigee as an intelligent HTTP ingestion point. Learn how to handle security, mediation, and traffic control before messages reach your internal bus using the PublishMessage policy or Pub/Sub API.

Read the full guide.

Mar 16 - Mar 20

Gemini-powered Assistant in BigQuery Studio Gets Context-Aware Upgrades
The Gemini-powered assistant in BigQuery Studio has been transformed into a fully context-aware analytics partner, supporting your entire data lifecycle. The new capabilities include intelligent resource discovery, which uses Dataplex Universal Catalog search to find resources across projects and deep dive into metadata using natural language. You can now automate tasks, such as scheduling production-grade queries directly through the chat interface, and instantly troubleshoot long-running or failed jobs with root cause analysis and cost control auditing.

Explore the full range of what the assistant can do.

Mar 9 - Mar 13

Want to use Gemini to develop code and don't know where to start?
This article walks through a couple of examples of developing code with Gemini prompts and identifies the changes that were needed to get the code working. It also points to other examples available on GitHub.

Mar 2 - Mar 6

Introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model. Built for high-volume developer workloads at scale, 3.1 Flash-Lite delivers high quality for its price and model tier. Gemini 3.1 Flash-Lite can tackle tasks at scale, like high-volume translation and content moderation, where cost is a priority. And it can also handle more complex workloads where more in-depth reasoning is needed, like generating user interfaces and dashboards, creating simulations or following instructions.

Starting today, 3.1 Flash-Lite is rolling out in preview to enterprises via Vertex AI and developers via the Gemini API in Google AI Studio.
TechTalk: Implementing Device Authorization Grant (RFC 8628) for Apigee
Learn how to authorize "headless" devices like Smart TVs or AI agents that lack keyboards and browsers. Join our Community TechTalk on March 19 (5PM CET / 12PM EDT) to go under the hood of Apigee X/Hybrid. We’ll cover the real-world mechanics of state management, polling, and human-in-the-loop security patterns for devices and autonomous agents.

Register for the TechTalk

Feb 23 - Feb 27

Pro-level image generation gets faster and more accessible with Nano Banana 2
Nano Banana 2 is our state-of-the-art image generation and editing model. It delivers Pro-level image generation and editing at the speed you expect from Flash — making the quality, reasoning, and world knowledge you loved about Nano Banana Pro more accessible. Learn more about the model here.

The Intelligent Path to Compliance: Transforming Regulatory QC with Google Cloud
Reducing "Refuse to File" (RTF) risks and submission cycle times is critical for life sciences leaders. Google Cloud’s Regulatory Submission Semantic QC Auditor leverages Gemini and RAG architecture to transform Quality Control from a manual burden into an active, intelligent workflow.

By automating semantic cross-referencing, narrative coherence checks, and dynamic guidance-based auditing, this solution ensures rigorous accuracy and auditability. Operating within a secure GxP-ready environment, it empowers teams to detect subtle inconsistencies and generate remediation plans without sacrificing data privacy.

Learn more.
Stop typing, start interacting! The Gemini Live Agent Challenge is here. Build immersive agents that can help you see, hear, and speak using Gemini and Google Cloud. Compete for your share of $80,000+ in prizes and a trip to Google Cloud Next '26!

Submissions are open from February 16, 2026 to March 16, 2026. Learn more and register at geminiliveagentchallenge.devpost.com

Feb 9 - Feb 13

Introducing Gemini 3.1 Pro on Google Cloud.
3.1 Pro is a noticeably smarter, more capable baseline for complex problem-solving. We’re shipping 3.1 Pro at scale, building upon our goal to help you transform your business for the agentic future. Learn more about the model’s capabilities here. Gemini 3.1 Pro is available starting today in preview in Vertex AI and Gemini Enterprise. Developers can access the model in preview via the Gemini API in Google AI Studio, Android Studio, Google Antigravity, and Gemini CLI.
Automate Storage Compatibility with GKE Dynamic Default Storage Classes
Managing storage across mixed-generation VM clusters in GKE just got easier. With the new Dynamic Default Storage Class, Google Kubernetes Engine automatically selects between Persistent Disk (PD) and Hyperdisk based on a node's specific hardware compatibility. This abstraction eliminates the need for complex scheduling rules and manual pairing, ensuring your volumes "just work" regardless of the underlying infrastructure. By defining both variants in a single class, you reduce operational overhead while maintaining peak performance and cost-efficiency across your entire cluster.

Explore automated disk type selection
Community TechTalk: AI-Powered Apigee Development with strofa.io
Join the Apigee community on February 26 for a deep dive into strofa.io. Guest speaker Denis Kalitviansky will demonstrate how this new AI-powered tool automates and orchestrates Apigee development, from local emulators to large-scale hybrid environments. Discover how to scale your API management and streamline team collaboration using the latest in AI-driven automation.

Register now to reserve your spot.

Jan 26 - Jan 30

Simplify API Governance with Native OpenAPI v3 Support
Eliminate integration debt and accelerate deployment velocity with the General Availability of OpenAPI v3 (OASv3) support for API Gateway and Cloud Endpoints. You no longer need to downgrade modern specifications to OASv2. Instead, you can now define API contracts and enforce critical policies—including telemetry, quotas, and security—using native Google-specific extensions directly within your OASv3 files. This update ensures your APIs are secure by design while remaining fully compatible with the modern developer ecosystem and Google Cloud’s AI services.

Get started with OpenAPI v3 on API Gateway and Cloud Endpoints.

Accelerate API Testing with the New Open Source API Tester
Start validating your APIs with API Tester, a simple, YAML-based Test Driven Development (TDD) framework. Designed for the Apigee community, this tool allows you to write human-readable tests, run them instantly via a web client or CLI, and perform deep unit testing on Apigee proxies. With native support for JSONPath assertions and Apigee shared flows, you can verify everything from payload data to internal variables like proxy.basepath without leaving your terminal.

Explore the API Tester guide and start testing your proxies today.
Secure Sensitive Data with Kubernetes Secrets in Apigee hybrid
Enhance security in Apigee hybrid by accessing Kubernetes Secrets directly within your API proxies. This hybrid-exclusive feature keeps sensitive credentials within your cluster boundary and prevents replication to the management plane. It supports strict separation of duties: operators manage secrets via kubectl, while developers reference them as secure flow variables—ideal for high-compliance and GitOps workflows.

Implement Kubernetes Secrets in your hybrid proxies.
See the Console in a Whole New Light: Dark Mode is Now Generally Available in Google Cloud
Elevate your cloud management workflow with Dark Mode, now generally available in the Google Cloud console. We have delivered a modern, cohesive, and accessible experience reimagined for maximum comfort and productivity—especially during extended working hours and low-light environments. Dark Mode can be enabled automatically based on your operating system's preference, or manually through the Settings -> Appearance menu.

Switch to Dark Mode today to enjoy a modern, comfortable, and productive environment!
Apigee X Networking: PSC or VPC Peering?
Deciding how to connect Apigee X? Watch this video to compare Private Service Connect and VPC Peering. We break down northbound and southbound routing, IP consumption, and how to reach targets on-prem or in the cloud. Learn to simplify your architecture and avoid common networking "gotchas" for a smoother deployment.

Watch the video.

Jan 19 - Jan 23

Bridge the Gap: Excel-to-API Conversion in Apigee Portals
Give your customers more ways to connect! This new article by Tyler Ayers explores how to extend the Apigee Integrated Portal to support direct Excel file uploads. By leveraging SheetJS and custom portal scripts, you can enable users to upload spreadsheets, preview data, and submit it directly to your APIs, all without writing a single line of integration code themselves. It’s a powerful way to simplify onboarding for those who aren't yet API-ready.

Learn how to build it.
Elevate your applications with Firestore’s new advanced query engine
We have fundamentally reimagined Firestore with pipeline operations for Enterprise edition. Experience a powerful new engine featuring over a hundred new query features, index-less queries, new index types, and observability tooling to improve query performance. Seamlessly migrate using built-in tools and leverage Firestore’s existing differentiated serverless foundation, virtually unlimited scale, and industry-leading SLA. Join a community of 600K developers to craft expressive applications that maximize the benefits of rich queryability, real-time listen queries, robust offline caching, and cutting-edge AI-assistive coding integrations.

Learn more about Firestore pipeline operations.

How the University of Central Oklahoma is using AI to streamline analysis of complex criminal cases

Thu, 28 May 2026 19:00:00 +0000

In the high-stakes world of forensic science, time is the enemy of justice. The University of Central Oklahoma (UCO) Forensic Science Institute (FSI) was looking for an innovative AI solution that could help reduce the time required to analyze complex criminal case documents and construct clear timelines. In early tests, this solution has been shown to significantly reduce the time required for this analysis, which typically takes criminal investigators months.

This collaboration between Google Public Sector and UCO’s technology and forensic experts establishes a path for a new standard for accelerating the pursuit of justice across the country and provides powerful efficiency gains.

A new vision for the pursuit of justice

This project, which originated in an AI hackathon sponsored by the university’s CIO Sonya Watkins, centers on Google's NotebookLM, an AI research tool used by the university as a thinking partner for the review of complex criminal cases. Sonya Watkins and I co-led the hackathon with the goal of rapidly identifying and prototyping high-impact solutions for the university, providing the technical framework and bringing expertise in rapid prototyping to quickly translate complex criminal case requirements into a functional, evidence-backed digital application. During the hackathon, teams generated ideas, which were then stack-ranked using Gemini based on potential impact and feasibility, quickly identifying case timeline analysis as a high-priority use case.

"The enthusiasm is absolutely warranted. We’ve seen a process that was a months-long burden reduced significantly in our trials. The human expert remains the critical validator, but we're now equipping them with an exponential speed advantage."
Sonya Watkins, CIO, University of Central Oklahoma

Building a scalable framework on a trusted foundation

The UCO team, including FSI Instructors Meagan Raddatz and Amber Fortney, have been meticulously working to ensure that the AI-generated timelines are forensically sound and meet stringent standards required for the federal and state justice systems.

Their ongoing work focuses on developing a repeatable case analysis framework using Google’s NotebookLM intended for potential future national adoption. This framework is engineered to ensure that every AI-generated conclusion is directly cited back to the original source document, a critical requirement for maintaining evidence integrity. By automating document analysis and creating a measurable process, UCO is setting a new national standard for how forensic institutes and law enforcement agencies can operate.

"Our goal wasn’t just to solve our internal problem; it was to build a framework that other forensic institutes and law enforcement agencies could adopt seamlessly. By focusing on standardized processes and reliable citation, we believe we are creating a scalable solution that has the potential to significantly accelerate how evidence is processed nationwide, aiding the delivery of justice."
Amanda Keesee, IT Director for Academic and Research Technology Center, University of Central Oklahoma

Bring AI-powered research to your institution at no cost

Bring AI-powered research to your campus with Gemini for Education to transform how your teams learn, work, and innovate. By exploring NotebookLM within the Gemini ecosystem, you can equip your researchers with the same tools driving UCO’s success in reducing complex data analysis from months to days, right now.

AI in SRE: Where and how Google is deploying agentic AI to improve operations

Thu, 28 May 2026 16:00:00 +0000

Since its inception over 20 years ago, Google has used Site Reliability Engineering (SRE) to keep services like Search, Gmail, Maps, YouTube and Google Cloud reliable and highly available, adhering to the principles and practices of the reliability-first mindset.

Recently though, the emergence of AI has driven multiple step-changes in system complexity. Interactions between components are now more complicated due to a variety of factors:

With microservice architectures, systems are distributed across wider geographical locations and data centers that have greater hardware diversity.
Enterprise cloud offerings provide an extensive array of capabilities across an incredibly complex set of products.
Google services now cover more unique business and regulatory requirements, making the overall topology and taxonomy much more complex and difficult to understand, a challenge amplified by the constant stream of system changes resulting from continuous deployment pipelines.
AI code generation capabilities have enabled software developers to deliver orders of magnitude more code, resulting in more opportunities to introduce reliability issues.

While AI is in some ways making the SRE team’s work more challenging, it also provides new ways to understand and improve software development lifecycles, including production operations. Google SRE is on the path to fully adopt AI and agentic technologies, leveraging AI as a force multiplier while also maintaining control. We call this SRE AI.

Read on for a summary of considerations when thinking about this topic, or you can dive straight into our comprehensive whitepaper, AI in SRE Practice: Moving Beyond Automation at Google, for an in-depth look at how Google SRE is navigating the transition from deterministic automation to agentic AI.

The SRE AI opportunity landscape

To help define our SRE AI strategy, we considered the overall software development lifecycle (SDLC) for areas of opportunity.

The above diagram shows each of the phases where SRE is involved, and that could be improved with SRE AI.

Perhaps the most obvious SRE area that could benefit from agentic AI is investigation and mitigation, sometimes referred to as root cause analysis (RCA), a cornerstone of the traditional SRE discipline. But RCA is by no means the whole SRE AI. Our plans for SRE AI go far beyond RCA and troubleshooting, and address the entire SDLC. Here are a few areas we are working on:

Reliability design

SRE has been working on the policies, tooling and procedures you need to ensure reliability is an integral part of system design through the design, launch, and deployment phases. An agentic approach does not necessarily imply removing people from the process, specifically for higher-risk services and features, but it does significantly reduce the time people need to spend, as a number of issues can be detected and auto-addressed before they need to be reviewed by a person.

Runbooks (playbooks) and other documentation to be used during incidents are important production artifacts. Google SRE has developed AI agents to continuously monitor and improve playbooks and production documentation based on their usage during incidents. AI agents can also generate new playbooks from incidents.

Anomaly detection and alerting

A core SRE practice is to define service level indicators (SLIs) and service level objectives (SLOs), and to configure alerts for them. This approach tends to work well if service use cases are fairly uniform, and if it is possible to define objectives that align to customers' expectations.

However, for products that support a range of customer use cases and workloads, like many in Google Cloud, it can be difficult to define a static threshold that works across a variety of workloads. With AI, Google SRE is augmenting our more traditional approaches with anomaly detection, with alerts based on detecting anomalies in regular behavior rather than statically predefined thresholds. This approach relies on agents to collect signals and feed them to a model (e.g., TimesFM) to perform anomaly detection. Historical signals from prior customer cases help the AI agent to predict customer-oriented SLOs. Further, AI-based anomaly detection can consult sources beyond signals produced by the service itself, such as customer feedback.
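To make the contrast between a static threshold and behavior-based alerting concrete, here is a small Python sketch. It uses a rolling z-score as a deliberately simple stand-in for the forecasting model; it is not TimesFM and not Google SRE's implementation. The function names, window size, z-score limit, and latency values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import List


def static_alerts(series: List[float], threshold: float) -> List[int]:
    """Indices where the signal exceeds one fixed threshold."""
    return [i for i, value in enumerate(series) if value > threshold]


def anomaly_alerts(
    series: List[float], window: int = 5, z_limit: float = 3.0
) -> List[int]:
    """Indices that deviate strongly from the signal's own recent behavior."""
    alerts = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = max(stdev(history), 1e-9)  # avoid dividing by zero
        if abs(series[i] - mu) / sigma > z_limit:
            alerts.append(i)
    return alerts


# Two hypothetical workloads with very different normal latencies (ms).
low_latency = [10, 11, 9, 10, 11, 10, 40, 10, 11]  # one real spike
high_latency = [200, 205, 198, 202, 201, 199, 203, 200, 204]  # steady

# A single static threshold of 100 ms misses the spike in the first
# workload and fires on every point of the second one.
static_low = static_alerts(low_latency, 100)
static_high = static_alerts(high_latency, 100)

# The behavior-based check flags only the spike.
anomalies_low = anomaly_alerts(low_latency)
anomalies_high = anomaly_alerts(high_latency)
```

In this toy data the per-workload baseline is what makes the difference: each series is judged against its own history rather than against a number that has to fit every customer.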

In this model, when the SRE AI agent detects an anomaly, it triggers an alert. Then, the SRE AI alerting agent groups, pre-processes, and enriches the alerts with the necessary context and information. These alerts in turn are run through autonomous AI alert handlers, which can address or mitigate a multitude of issues. The outcome of this system is faster issue resolution and a likely significant reduction in the number of alerts that SREs need to review.

What's key in this ecosystem of agents is to be consistently transparent about what data the agents are evaluating, and how, and to have consistent controls to prevent unwanted mutations of production state.

Incident management

Within Google SRE, incident management, or IMAG, is a well-established process with clear roles and responsibilities, as well as tooling. SRE AI includes an agentic orchestration layer on top of the current IMAG process, which consists of agents that:

Monitor the communication surfaces used during the incident (incident response tools, chat spaces, videos, tracking documents), and consolidate/summarize data to improve communication and information sharing during the incident
Support handoff between SREs participating in the incident, by creating handoff documents with necessary context
Automatically create drafts of incident postmortems, improving their quality, reducing SRE effort, and ensuring that relevant information is included
Manage internal and external incident communications

Incident investigation

The Google SRE team has also created agents to investigate incidents, and in some cases to autonomously mitigate issues.

Before they can proceed to form hypotheses and propose mitigation steps, these agents use observability data (logging, monitoring, tracing), as well as system topology, taxonomy, and dependency data to establish domain and intent. A few other building blocks that these agents use are distinct agents the team has created for navigating and executing playbooks, accessing alerting, performing anomaly detection, and deriving incident insights.

Insights and risk management

SRE requires an understanding of the end-to-end system and effective mitigation solutions, experience and lessons learned from past incidents, and the ability to perform risk management. Autonomous AI agents need similar skills to be able to manage production environments.

While a common topology or taxonomy system can teach agents about the end-to-end system, and well-documented and described production Model Context Protocol (MCP) tools and skills can teach them about available tooling, there needs to be a way to continuously teach agents about historical issues and their associated risks. To solve that problem, the Google SRE team created AI Insights, a system that continuously reviews known incidents and extracts meaningful information from them, then makes it available to agents to drive better investigations and mitigation steps. Gemini embedding models and vector-enabled databases power this system.
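The retrieval idea behind such a system can be sketched in a few lines of Python. This is only an illustration of similarity search over past incidents: the word-count "embedding" below is a toy replacement for a real embedding model, the in-memory dictionary stands in for a vector-enabled database, and the incident IDs and summaries are invented for the example.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical incident summaries; a real system would store model
# embeddings of these in a vector database.
INCIDENTS: Dict[str, str] = {
    "inc-1": "database connection pool exhausted after config rollout",
    "inc-2": "certificate expired on load balancer frontend",
    "inc-3": "disk full on logging workers",
}


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def nearest_incidents(query: str, k: int = 1) -> List[str]:
    """Return the IDs of the k past incidents most similar to the query."""
    q = embed(query)
    ranked = sorted(
        INCIDENTS,
        key=lambda incident_id: cosine(q, embed(INCIDENTS[incident_id])),
        reverse=True,
    )
    return ranked[:k]


# An agent investigating a new alert could look up related history first.
related = nearest_incidents("connection pool exhausted on database")
```

The point of the sketch is the shape of the lookup (embed, compare, rank), which stays the same when the toy pieces are swapped for a production embedding model and vector store.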

The other part of the system is risk insights. The AI system marks each incident with appropriate risk categories that can be used both by agents before applying mitigations, and by SREs to determine critical areas to address.

Design considerations

Before building out these agents, Google SRE defined a few high level principles for their adoption:

Processes and operations that are already successfully automated, or that can be easily automated with classic non-AI based systems, do not need to be replaced (as long as they meet business needs).
Any new AI-based system must comply with existing and upcoming policies and procedures to keep the strong promises we have to our customers.
An SRE AI agent needs to meet security, safety, and privacy requirements the same way as current systems and humans.
SRE AI agents must have a strong identity (agents have roles and permissions assigned).
SRE AI agents need to meet high reliability SLOs and have well-defined backup options (automated or manual).
SRE AI agents must be able to explain and reason about why and how they performed an action, as well as what options were considered and rejected. In other words, we favor transparency over black-box automation.
Business continuity plans must include contingencies for potential AI failures.
AI-based systems need continuous access to production data to make correct decisions.
AI systems need to be continuously evaluated against a quality framework, as well as to support auditing and reporting to enable security tooling like detection and response.
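Two of the principles above, strong agent identity and explainable actions, can be sketched together in Python. This is a hypothetical illustration and not Google's internal design: the role names, permission sets, and the `request_action` helper are assumptions made for the example. Each request is checked against the agent's role and written to an audit log along with the agent's stated reason and the options it rejected.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Set

# Hypothetical mapping from agent roles to the actions they may perform.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "investigator": {"read_logs"},
    "mitigator": {"read_logs", "rollback_release"},
}


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    role: str


@dataclass
class ActionRecord:
    agent: str
    action: str
    allowed: bool
    reason: str
    rejected_options: List[str] = field(default_factory=list)


# Every request is recorded, whether or not it was permitted.
AUDIT_LOG: List[ActionRecord] = []


def request_action(
    agent: AgentIdentity,
    action: str,
    reason: str,
    rejected_options: Sequence[str] = (),
) -> ActionRecord:
    """Check the agent's role, then log the decision with its explanation."""
    allowed = action in ROLE_PERMISSIONS.get(agent.role, set())
    record = ActionRecord(
        agent.name, action, allowed, reason, list(rejected_options)
    )
    AUDIT_LOG.append(record)
    return record


# Demo: the same action requested by two agents with different roles.
denied = request_action(
    AgentIdentity("triage-agent", "investigator"),
    "rollback_release",
    reason="error rate rose right after the latest release",
)
granted = request_action(
    AgentIdentity("mitigation-agent", "mitigator"),
    "rollback_release",
    reason="error rate rose right after the latest release",
    rejected_options=["restart_tasks", "drain_region"],
)
```

Logging denied requests as well as granted ones is a design choice that supports the auditing and reporting principle in the same list.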

In addition, we stipulated that SRE AI systems should make Google services even better for users and customers by accomplishing at least one of the following:

Relieve engineers from laborious and repetitive operations
Help engineers improve the quality and speed of decision making and execution
Allow SREs to better prevent, detect, and/or mitigate problems than they could address before
Enable autonomous agentic feedback loops that drive toward service reliability improvements
Reduce overall operational costs

Built on proven infrastructure

Google SRE AI is built on proven Google infrastructure:

Gemini: The base foundational model behind Google SRE AI. The SRE team also depends heavily on custom fine-tuned Gemini models based on internal Google data and knowledge.
Gemini Enterprise Agent Platform (formerly Vertex AI): A full AI stack for developing solutions.
Agent Development Kit (ADK): The development platform.
MCP servers: Running on top of standard Google API infrastructure, this is the same infrastructure used to provide external customers with MCP support.
Standard internal observability infrastructure (monitoring, logging, tracing).
AI and ML capabilities built into Google BigQuery, and Google vector databases.

We group these infrastructure components together into autonomous systems. At Google, we’ve been developing and using autonomous systems to manage production for a long time. However, today’s AI-based autonomous systems are very powerful and not always deterministic. To help us understand how autonomous the systems truly are, we developed a way to track autonomous levels.

Dive deeper: Read the white paper

For engineers and leaders looking to explore the technical architecture and rigorous governance models behind these innovations, we invite you to read our comprehensive whitepaper, “AI in SRE Practice: Moving Beyond Automation at Google,” which provides an in-depth look at how Google SRE is navigating the transition from deterministic automation to agentic AI. Download the whitepaper here.

Evolving Dataflow to process massive datasets for machine learning

Thu, 28 May 2026 16:00:00 +0000

Google created MapReduce more than 20 years ago to solve the scaling problems in data processing that the then young company was running into. The AI era that we are in now demands efficient, large-scale data processing for everything from training frontier models like Gemini by Google DeepMind to powering fully autonomous vehicles like Waymo.

Many aspects of machine learning, including data ingestion, transformation, and feature extraction, rely heavily on processing massive datasets. To meet the astronomical scale required by efforts across Google, we evolved our data platform, Flume, the successor to the original MapReduce, with innovations focused on scalability, efficiency, and a better developer experience. And many of those innovations are available as part of Dataflow, our fully managed batch and streaming platform built on the same core technology Google uses to power its most demanding internal workloads. In this blog, we provide an overview of the many innovations in the Flume platform, and a glimpse into how Google Cloud customers are putting those features into action with Dataflow.

Addressing massive scalability

The scale of data processing at Google has exploded over the last 20 years and continues to drive innovation. To tackle the challenges of immense scale, we introduced several features within Google's data processing platform, which are also available in Dataflow:

Liquid sharding dynamically splits work units (shards) during execution for on-the-fly rebalancing. This helps pipelines with uneven data distribution and stragglers to maximize worker efficiency as data grows.
Global compute enables enormous scaling by dynamically scheduling workloads across Google's global infrastructure. The system automatically determines the optimal location based on factors like data locality and resource availability.
Automatic pipeline optimization fuses consecutive operations into a single stage. This reduces I/O and stage-transition overhead, allowing large-scale execution to scale more gracefully.
Rate-limiting external API calls manages load on external services. This is essential for modern ML pipelines that frequently call external APIs for tasks like model evaluation, preventing high data volumes from overloading systems.
Tandem pools facilitate serverless remote inference. This feature helps overcome scalability limitations often found in remote inference systems by efficiently hosting, sharing, managing, and autoscaling external model servers.
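The first item in the list above, splitting work units during execution, can be illustrated with a short Python sketch. It shows only the general idea of handing the unprocessed tail of a slow shard to another worker; it is not how Flume or Dataflow implement it. The `Shard` class, `split_shard` function, and `min_size` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shard:
    """A half-open range of record offsets [start, end) owned by one worker."""

    start: int
    end: int
    position: Optional[int] = None  # next record this worker will process

    def __post_init__(self) -> None:
        if self.position is None:
            self.position = self.start

    def remaining(self) -> int:
        return self.end - self.position


def split_shard(shard: Shard, min_size: int = 2) -> Optional[Shard]:
    """Give the second half of the unprocessed range to a new shard.

    Returns None when too little work remains to be worth splitting.
    """
    if shard.remaining() < 2 * min_size:
        return None
    midpoint = shard.position + shard.remaining() // 2
    new_shard = Shard(midpoint, shard.end)
    shard.end = midpoint  # the original worker keeps the first half
    return new_shard


# Demo: a straggler has processed 20 of 100 records while other workers
# sit idle, so the remaining 80 records are split in two.
straggler = Shard(0, 100)
straggler.position = 20
stolen = split_shard(straggler)
```

Because only the unprocessed range is divided, no record is handled twice and the original worker does not need to restart.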

Boosting efficiency with accelerators

Doing more with less isn't just a constraint; it fuels our progress. By finding ways to run more efficiently, we create the space and capacity needed for rapid innovation. This is particularly evident for teams that use accelerators like TPUs for their workloads. To improve utilization and cost efficiency, our engineers devised several novel features for our platform, now part of Dataflow:

Heterogeneous worker pools allow developers to specify custom resource requirements for different pipeline stages. For example, TPU-intensive work runs on TPU-equipped workers, while other stages use standard CPU workers. This ensures optimal resource allocation.
TPU-aware autoscaling prevents excessive initial assignment of TPU workers and improves efficiency during subsequent autoscaling events.
Duty-cycle policy enforcement automatically scales down TPU workloads when the accelerator's duty cycle (the fraction of time it is active) is low, scaling back up only when utilization improves.
TPU fungibility: By working with other infrastructure teams, we developed optimizations to encourage scheduling jobs to the most suitable TPU version and cell location based on quota and resource availability.
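The duty-cycle policy in the list above can be approximated with a simple decision function. This Python sketch is an assumption-laden illustration, not Dataflow's actual policy: the `low` and `high` thresholds, the halving and doubling steps, and the worker limits are all invented for the example.

```python
from typing import Sequence


def desired_workers(
    current: int,
    duty_cycles: Sequence[float],
    low: float = 0.3,
    high: float = 0.8,
    min_workers: int = 1,
    max_workers: int = 16,
) -> int:
    """Pick a worker count from recent accelerator duty-cycle samples.

    Each sample is the fraction of time (0.0 to 1.0) the accelerator
    was active during one measurement interval.
    """
    if not duty_cycles:
        return current  # no data yet, so leave the pool unchanged
    average = sum(duty_cycles) / len(duty_cycles)
    if average < low:
        return max(min_workers, current // 2)  # mostly idle: scale down
    if average > high:
        return min(max_workers, current * 2)  # saturated: scale back up
    return current


# Demo: an underused pool shrinks, then grows again once it is busy.
after_idle = desired_workers(8, [0.10, 0.20, 0.15])
after_busy = desired_workers(after_idle, [0.90, 0.85])
```

Keeping a gap between the two thresholds avoids flapping, since a pool that was just scaled down will not immediately scale up again unless utilization has clearly improved.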

Enhancing the developer experience

Considering the wide mix of backgrounds and tools across Google, rapid prototyping, iteration, and reliable production operations are extremely important. Google has invested in significant capabilities to enhance the overall user experience:

Language flexibility is provided through a versatile SDK with a simple API in C++ (internal to Google), Java, Python, and Go (with SQL support). This allows users to build batch, ML, and streaming pipelines.
Integration with ML frameworks like JAX is available, along with native support for LLM-specific optimizations. The underlying platform also provides building blocks for robust agentic inference pipelines and supports simple transitions between bulk and streaming paradigms.
Unified batch and streaming enables users to use the same code for both historical batch and live streaming data. This simplifies the architecture, which traditionally would have required separate pipelines for batch and streaming data processing.
Observability for production pipelines is available through the monitoring UI, which offers comprehensive control and essential diagnostic data. Detailed performance metrics, such as stage-level TPU utilization graphs, provide transparency for troubleshooting and optimization tasks.
Advanced developer workflows for quicker day 0 and day 2 operations include features like sampling and dry-run to help ensure code accuracy. Users can also test pipelines on small in-memory collections, and even pause and resume production pipelines.

Dataflow brings innovation from Google's internal platform to Google Cloud

Dataflow is built upon Google's internal platform, sharing many core components, including the execution engine and the Apache Beam SDK (which originated from Flume’s APIs). This close relationship means that the cutting-edge solutions we build to handle Google’s internal data processing challenges, like pipelines that process hundreds of billions of documents, directly benefit Dataflow users. In fact, unique Dataflow features like vertical scaling, right fitting, dynamic sharding, and straggler detection all resulted from solutions developed for Google’s internal workloads.

This is one of the reasons many Google Cloud customers rely on Dataflow for critical ML applications: Spotify uses Dataflow for large-scale generation of ML podcast previews. Etsy leverages Dataflow for data preparation and ETL for its ML workloads. And Moloco uses Dataflow to process terabytes of data a day to update its prediction model for real-time ad bidding.

The momentum continues: Last quarter we launched support for TPU in Dataflow in addition to supporting GPUs. Looking ahead, we are working on an advanced reliability feature called speculative execution and are enhancing the developer experience with features like failure isolation and replay and pause/resume, which are coming soon. To learn more or get started with Dataflow visit https://docs.cloud.google.com/dataflow/docs/get-started.

Nano Banana 2 and Nano Banana Pro are generally available, and already powering creative workflows

Thu, 28 May 2026 16:00:00 +0000

Organizations are unlocking entirely new ways to use image generation and editing across their industries. To drive next-generation experiences, businesses are embedding AI directly into creative, agentic workflows. But next-gen workflows require enterprise-grade AI you can trust.

What’s new: To help customers continue their creative journey securely, we are announcing that Nano Banana 2 (Gemini 3.1 Flash Image) and Nano Banana Pro (Gemini 3 Pro Image) are generally available (GA) today via Gemini Enterprise Agent Platform. Backed by enterprise-grade infrastructure and security, these models empower you to integrate high-quality image generation and editing capabilities directly into your applications and workflows.

Alongside this milestone, we are introducing a powerful new capability in preview that significantly expands how models process multimodal inputs: Nano Banana 2 now supports video files as an input prompt. In addition to text, PDF, or image input references, the model now uses deep video understanding to analyze the visual context, specific subjects, and actions within video footage to generate context-aware images, including thumbnails, rich infographics, and more. Try this feature here.

Note: The 1K and 2K output capabilities are generally available for both models, while the 4K capability remains in preview.

How our customers are innovating with Nano Banana models

We are continually inspired by how our partners and customers are pushing the boundaries of what is possible. By bringing these advanced image generation capabilities into their operations, organizations are innovating across their creative pipelines and empowering their users to build the next generation of visual experiences.

Enabling creative and marketing innovation

By embedding our generative image models directly into industry-leading creative tools and workflows, organizations are driving unprecedented creative innovation, scaling tailored campaigns, and fundamentally changing how brands engage with their audiences.

"Marketing and creative teams are under pressure to produce higher-quality enterprise-grade content faster, all while keeping brand integrity front and center," said Aaron Mitchell Finegold, Head of Product Marketing, Adobe Firefly Enterprise. "Nano Banana models are already powering that reality for enterprise teams working in Adobe Firefly and Adobe GenStudio, where customers can access the industry's leading AI models alongside Adobe's best-in-class creative tools. By pairing the power of advanced generative models with trusted creative and marketing workflows, organizations can move from experimentation to execution at enterprise scale."

"Through our expanded partnership with Google, WPP received early access to Nano Banana 2 and Pro, which have been integrated into WPP Open, our agentic marketing platform. These models provide increased consistency and controls and have quickly become foundational for scaled content production systems, implemented for clients such as Verizon, L’Oreal and Unilever. Using Google’s Image models in WPP Open, teams are able to quickly optimize assets in media and adapt creative. We are thrilled to partner with Google Cloud to continually push the boundaries of creativity leveraging Generative Media." — Elav Horwitz, Chief Innovation Officer, WPP

Transforming retail and customer interactions

Shopping platforms use our image models to offer immersive experiences like virtual try-ons and dynamic catalog enrichment, giving shoppers a highly interactive, personalized feel for products before they make a purchase.

"Nano Banana and Nano Banana Pro are a step forward in quality and speed that can help us unlock even better image generation for merchants. Merchants leverage image generation capabilities to expand their existing product photography and to generate compelling, high-fidelity social and lifestyle imagery that highlights their catalog for buyers." — Matthew Koenig, Senior Staff Product Manager, Shopify

Similarly, URBN (Urban Outfitters) uses Google's generative media capabilities to accelerate early-stage product development.

"URBN is leveraging Google’s image generation and editing capabilities to accelerate early-stage product development. In an initial pilot, the team has demonstrated the potential to significantly compress its trend-to-market pipeline” - Demo Lymberopoulos, Global Executive Director, URBN (Urban Outfitters)

Building next-generation media production workflows

Media and entertainment companies adopt these models to build next-generation applications that manage complex production pipelines, allowing studios to innovate their workflows while maintaining directorial control.

"Drawing on experience tackling some of the world's most complex AI creative challenges, Nodey was built to fix the fragmented interfaces and manual workflows that hold creators back. The integration of Nodey into OKO - our spatial intelligence platform - bridges the gap between AI experimentation and professional production. We've replaced trial-and-error prompting with a workflow anchored in a spatial environment, giving creators a way to use Google's generative models - like Nano Banana and Veo within a controllable and secure 3D pipeline, ensuring that every generated element stays perfectly aligned with the creative intent." — Ben Grossmann, CEO, Magnopus

Gemini 3 Pro Image and Veo 3.1 workflow

Get started with building enterprise-grade multimodal experiences

Whether you're building immersive retail applications, interactive commerce tools, or accelerating media production workflows, Google Cloud provides the models and tools to build the next generation of agentic creative and multimodal experiences. Access the technical and commercial frameworks you need to deploy Nano Banana 2 and Nano Banana Pro at enterprise-scale, fully supported by our enterprise SLA.

Resources:

Nano Banana 2 (Gemini 3.1 Flash Image) documentation
Nano Banana Pro (Gemini 3 Pro Image) documentation
Developers can also access both models via Gemini API (not backed by our enterprise SLA)
Ultimate prompting guide to Nano Banana

Go from resource-level to business-level maintenance in Google Cloud

Thu, 28 May 2026 16:00:00 +0000

Managing planned maintenance is a critical part of running a reliable business. But as your cloud footprint grows into hundreds or even thousands of projects, keeping track of every individual update can feel like a full-time job. For many platform teams, the reality is a disjointed experience: jumping between dashboards to figure out which maintenance event affects which business service.

At Google Cloud, we believe you shouldn’t have to think like an infrastructure manager when you’re trying to solve a business problem. That’s why we are excited to announce the launch of App-centric maintenance visibility within Unified Maintenance.

Shifting the focus to your business

Until now, maintenance visibility was primarily resource-focused. You could see when a specific Compute Engine VM or Cloud SQL instance was due for an update, but you had to manually map those resources to the applications they powered.

With App-centric visibility, we are shifting the focus from infrastructure-level resources to a business-oriented view. By integrating directly with App Hub, Unified Maintenance now allows you to see maintenance events in the context of your applications.

How it works

This new capability uses the "application" as the primary unit of management. When you register your resources in App Hub (whether they are GKE clusters, Compute Engine VMs, or AlloyDB instances), Unified Maintenance automatically aggregates their maintenance schedules into a single, application-aware dashboard.

For platform engineers, this means:

Reduced toil: No more manual mapping of maintenance alerts to application owners.
Faster triage: Instantly see if a performance dip in an app coincides with a planned infrastructure update.
Predictable operations: Gain a business-oriented view of maintenance impacts across your entire landscape.

Get started today

Enable the Maintenance API now. If you already have Applications defined in Google Cloud, you can explore the new App-centric visibility features directly in the Google Cloud Console. To learn more about setting up your application boundaries and mapping resources, check out our Get Started Guide.

Finally, don’t forget to check out our supported services page as we are onboarding new services.

Announcing the newest cohort of the Google for Startups Accelerator: Middle East, North Africa & Turkey

Thu, 28 May 2026 07:00:00 +0000

Google’s mission is to organize the world’s information and make it universally accessible. In high-growth, technically ambitious markets like the Middle East, North Africa, and Türkiye (MENA-T), we fulfill this mission by supporting AI-First startups building the next generation of information-driven services on a global scale. In a region known for its resilience, we want to help founders flourish in any conditions.

The newest cohort of 15 companies in the Google for Startups Accelerator: MENA-T program starts on June 1. They follow on the success of our sixth group, which concluded in November 2025 and set a new benchmark for the region.

Over the course of the fall 2025 program, 14 AI-first startups from 8 different countries received more than 230 hours of specialized 1:1 mentorship from Google experts. This support allowed them to achieve measurable technical and business milestones, including refining their business strategies, accelerating AI/ML initiatives with Google Cloud, and enhancing overall product design.

We’re supplementing the 2026 program with additional resources, focus, and training to help these startups navigate the uncertain geopolitics that can affect the region and the world at any time.

Introducing the newest Google for Startups Accelerator: Middle East, North Africa, Turkey cohort

With a record-breaking volume of applications, we are seeing more and more startups using AI technology to address meaningful challenges with their business. Please join us in welcoming the 15 companies selected to participate in this cohort:

BioTwin creates virtual twins from health data to detect risks and recommend preventative actions.
Coral replaces manual sustainability processes with real-time enterprise overviews.
Each::labs builds the next generation of AI-native tools to streamline complex developer workflows.
Hakeem translates clinical studies into real-time, patient-specific guidance for clinicians.
inveon.ai deploys agentic AI to provide autonomous digital employees for e-commerce.
Jusoor Labs uses AI to analyze science experiment interactions and improve learning outcomes.
Openfarming automates distributor workflows to reduce waste and protect margins.
Plusfinity builds AI-native learning infrastructure for scalable, interactive education.
Promake empowers the manufacturing sector with AI-driven design and production optimization tools.
Qanooni transforms manual legal work into structured, searchable workflows.
Repzo uses AI to turn complex field data into natural language reports for field teams.
RFxAI streamlines procurement and sales through AI-driven response evaluation.
Tapper applies machine learning to detect anomalies and block invalid traffic.
TruBuild analyzes unstructured construction data for faster, objective tender evaluation.
Woliz uses voice AI to make digital ordering accessible for nanostore owners.

A curriculum designed for impact

Starting June 1st, founders will participate in a three-month program specifically tailored to help startups navigate their unique challenges. The curriculum provides intensive technical support, including comprehensive stack audits and one-on-one mentorship from global experts.

By balancing advanced technical training — focused on AI security and generative design — with strategic business modeling and go-to-market planning, we empower founders to scale their innovations securely. This holistic approach is designed to help startups maintain momentum and drive the region’s sustained digital growth and long-term resilience.

The program has already demonstrated significant impact for the fall cohort, with a number of startups accelerating their growth and development.

COGNNA, a provider of an agentic security operations center (SOC) suite, is among those seeing sustained growth. With improvements made during the accelerator, its platform now allows analysts to work 80% faster, and the company has since closed a $9.2 million Series A funding round.

By using BigQuery to ingest petabytes of data and Google Kubernetes Engine to scale investigations, the startup has transformed its security operations and dramatically improved efficiency. "Google is shaping the future of COGNNA by enabling us to scale with global markets," said Ziyad Alshehri, co-founder and CTO of COGNNA.

Smart Bricks, a UAE-based startup for AI-powered real estate investing, recently closed a $5 million pre-seed round led by a16z Speedrun. Smart Bricks uses Google’s machine learning pipelines to automate 99% of manual real estate investment workflows across Dubai, London, and New York.

“The Google for Startups Accelerator played a key role in accelerating our technical development,” Mohamed Mohamed, founder and CEO of Smart Bricks, said. “Access to Google’s AI and cloud stack has been instrumental in building and scaling our agentic AI models, particularly given the scale and complexity of the data we’re working with. And infrastructure like Gemini Enterprise Agent Platform and BigQuery allowed us to significantly speed up our development cycles, improve model performance, and bring a much more robust, data-driven platform to market faster.”

Google’s commitment to MENA-T growth

We continue to support founders across the region, providing the specialized resources and cloud infrastructure needed to ensure that innovation continues to scale. Our goal is to ensure that the region’s digital economy continues its acceleration toward a more secure and innovative future.

We are excited to see how this new cohort will shape the future of the MENA-T ecosystem.

A Guide to AI Cold Starts on Cloud Run

Wed, 27 May 2026 17:23:00 +0000

I saw a developer asking on Reddit if there was any “sane way” to manage Cloud Run cold starts for AI across multiple regions. They were experiencing startup latencies of up to 20 seconds, a frustrating gap where the infrastructure is spinning up while the user waits for a response.

The discussion was full of developers who had almost given up on serverless GPUs, with some even migrating back to GKE just to escape the latency. I decided it was time to dive deep into the mechanics of AI cold starts and see if we could find that "sane way."

During my research into hosting models like Gemma 4 on Cloud Run, I had the privilege of co-presenting at Google Cloud Next '26 with Oded Shahar (Senior Engineering Manager for Cloud Run) and our guest speaker Ajay Nair (Global VP of Platform at Elastic).

In our session, "Build AI architectures with custom models on Cloud Run," Ajay shared the production-hardened strategies that allow Elastic to serve millions of daily requests across 17+ model variants, all while maintaining the 'scale-to-zero' efficiency of Cloud Run.

Build AI architectures with custom models on Cloud Run

Ajay showed us that the secret isn't just in the model, but in treating GPUs as fungible compute rather than infrastructure to manage.

I realized then that minimizing cold start latency isn't just about the model, it's about the infrastructure patterns and architectural decisions that keep it fast, scalable, and secure.

The anatomy of an AI cold start

As the official Google Cloud GPU best practices explain, an AI cold start differs from a cold start in a standard web microservice. You aren't just booting code; you're moving gigabytes of weights into a specialized physical accelerator.

Think of it as a four-phase race. If you don't optimize each step, you're going to lose your users.

Phase 1: Infrastructure Provisioning (~5s)

Cloud Run allocates the physical GPU and injects pre-installed NVIDIA drivers. Since Google manages the drivers for you, you don't have to bloat your Dockerfile.

Phase 2: Block-Level Container Image Streaming (1-2s)

Cloud Run uses "image streaming," meaning it pulls only the blocks needed to boot. Your 15GB CUDA image can actually start as fast as a tiny Node.js app!

Phase 3: Engine Initialization (5-15s)

This is where your inference engine (vLLM, Ollama) warms up. This is a massive CPU-heavy task, and it's where most people get throttled without realizing it.

Phase 4: Model Loading & VRAM Transfer

This is the final hurdle - moving those model weights from storage into the GPU memory. Unlike standard web apps where CPU is king, GPU memory is your primary constraint here. If your model’s weights don’t fit entirely within the GPU memory, performance degrades significantly as it swaps to slower system RAM.
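To see how the four phases add up, here is a minimal sketch that sums the phase ranges quoted above and approximates Phase 4 as model size divided by load throughput. The throughput value and the function name `estimate_cold_start` are assumptions for illustration, not measured Cloud Run figures.

```python
# Phase ranges in seconds, taken from the phase headings above.
PHASE_SECONDS = {
    "provisioning": (5.0, 5.0),
    "image_streaming": (1.0, 2.0),
    "engine_init": (5.0, 15.0),
}


def estimate_cold_start(model_gb: float, load_gb_per_s: float) -> tuple[float, float]:
    """Return a (best, worst) cold-start estimate in seconds.

    load_gb_per_s is a hypothetical storage-to-VRAM throughput;
    measure your own before relying on the result.
    """
    if load_gb_per_s <= 0:
        raise ValueError("load_gb_per_s must be positive")
    phase4 = model_gb / load_gb_per_s
    best = sum(low for low, _ in PHASE_SECONDS.values()) + phase4
    worst = sum(high for _, high in PHASE_SECONDS.values()) + phase4
    return best, worst


if __name__ == "__main__":
    best, worst = estimate_cold_start(model_gb=8.0, load_gb_per_s=1.0)
    print(f"estimated cold start: {best:.0f}s to {worst:.0f}s")
```

Even with rough inputs, the estimate makes it clear which phase dominates for a given model size, which tells you where to spend optimization effort first.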

Best practices for handling AI cold starts

To build a "sane" production environment, here are a few crucial levers you can pull, informed by the official Google Cloud documentation on AI inference with GPUs.

Optimize Phase 4

Pick the Right Deployment Option

Phase 4 is the "final hurdle" where you move gigabytes of weights from storage into GPU memory. Your choice of storage determines how fast this transfer happens:

Cloud Storage (Concurrent Download) - Fastest: Using the Google Cloud CLI (gcloud storage cp) allows you to download model files in parallel. This is the recommended method for massive weights because it maximizes network throughput and drastically reduces transfer time.
Cloud Storage (FUSE) - Easiest: This provides "zero-code" changes by mounting a bucket as a local file system. However, because it does not parallelize the initial download, it is significantly slower for large model weights.
Container Image - Best for <10GB: Baking weights into your image is efficient for smaller models thanks to Cloud Run's Image Streaming. For models over 10GB, however, the import and streaming overhead can become a bottleneck.
Internet: Avoid this. It is the slowest and least predictable path for production inference.
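The concurrent-download idea can be sketched with a standard-library thread pool. This is not how `gcloud storage cp` is implemented; it only illustrates the pattern of fetching several weight shards at once. The `fetch_object` callable is a hypothetical placeholder for whatever storage client you use.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable


def download_concurrently(
    object_names: Iterable[str],
    dest_dir: Path,
    fetch_object: Callable[[str], bytes],
    max_workers: int = 8,
) -> list[Path]:
    """Fetch every object in parallel and write it under dest_dir.

    fetch_object is a hypothetical callable (object name -> bytes).
    For multi-gigabyte shards, stream to disk instead of holding
    the whole object in memory as this sketch does.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    names = list(object_names)

    def _download(name: str) -> Path:
        target = dest_dir / Path(name).name
        target.write_bytes(fetch_object(name))
        return target

    # Threads suit this I/O-bound work; map() preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_download, names))
```

In a container entrypoint you would run a step like this before launching the inference engine, so the weights are already on local disk when Phase 4 begins.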

Model Format & Size

Optimizing your model's format and size is a direct "hack" to shorten Phase 4 (Model Loading & VRAM Transfer). Because this phase is constrained by how fast you can move gigabytes of data into VRAM, smaller and more efficient files are critical.

4-bit Quantization: This is the ultimate cold start hack. Smaller weights mean fewer gigabytes to pull from storage, which directly accelerates the download and transfer portion of Phase 4.
Fast Formats: Pick a model format with fast load times like GGUF to minimize startup time. For the fastest performance, move away from Python "pickle" files and use Safetensors for zero-copy loading.
Ensure VRAM Fit: Use quantized models to ensure the weights fit entirely within the GPU memory. If the model exceeds VRAM, Phase 4 will stall as the system swaps to significantly slower RAM.
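The effect of quantization on size is simple arithmetic: weight bytes are roughly parameter count times bits per parameter divided by 8. The sketch below applies that to a VRAM-fit check. The 12-billion-parameter model, the 24 GB GPU, and the `headroom_gb` allowance are assumed values for illustration; real quantized files also carry some metadata, so treat the result as an approximation.

```python
def weight_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def fits_in_vram(
    num_params: float,
    bits_per_param: int,
    vram_gb: float,
    headroom_gb: float = 2.0,
) -> bool:
    """True if the weights plus a headroom allowance fit in GPU memory.

    headroom_gb is an assumed allowance for KV cache and runtime overhead.
    """
    return weight_size_gb(num_params, bits_per_param) + headroom_gb <= vram_gb


if __name__ == "__main__":
    params = 12e9  # hypothetical 12B-parameter model
    for bits in (16, 4):
        size = weight_size_gb(params, bits)
        print(f"{bits}-bit: {size:.1f} GB, fits in 24 GB: {fits_in_vram(params, bits, 24.0)}")
```

Under these assumptions the 16-bit weights alone fill the GPU, while the 4-bit version leaves most of the memory free and has a quarter as many bytes to transfer.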

Optimize Phases 3 & 4: Infrastructure & Network Levers

These infrastructure settings provide the necessary resources to accelerate the most demanding parts of the startup process.

Startup CPU Boost (Accelerates Phase 3)

This feature temporarily doubles your CPU power during startup. A 1 vCPU instance boosts to 2 vCPUs for the duration of startup and the first 10 seconds of serving. It is essential for Phase 3, as engine initialization is a massive CPU-heavy task.

Direct VPC Egress & PGA (Accelerates Phase 4)

Utilizing Direct VPC Egress with Private Google Access (PGA) ensures your model weight traffic stays on Google’s internal high-speed backbone. This optimizes the network path to shorten the time spent moving gigabytes of weights into VRAM.

Concurrency Tuning (Cold Start Avoidance)

In Cloud Run, "concurrency" refers to the maximum number of requests a single instance can handle before the platform scales out to start a new one. For AI workloads, you must tune this setting in tandem with your model engine's internal parallelism flags (e.g., --max-num-seqs for vLLM or OLLAMA_NUM_PARALLEL for Ollama).

Use the official Google Cloud formula to find your ideal Cloud Run concurrency:

(number of model instances × parallel queries per model) + (number of model instances × ideal batch size)

Example: If your instance loads 3 model instances onto the GPU, and each model instance can handle 4 parallel queries with an ideal batch size of 4, you would set your Cloud Run maximum concurrent requests to 24: (3×4)+(3×4)

How the math works: The goal is to keep the GPU fully saturated while ensuring users aren't stuck in a long queue. In this example, the total of 24 concurrent requests is split into two functional groups:

Active Processing (12 requests): Calculated as (3 instances×4 queries), this represents the total number of requests the GPU can actively process at any given moment.
The "Next Batch" Buffer (12 requests): Calculated as (3 instances×4 batch size), these are the requests waiting "on deck" inside the container. As soon as the GPU finishes the first batch, it immediately picks up these waiting requests.

By tuning this value as high as your VRAM allows (usually 10-20 users), one warm instance can serve many requests without triggering a new scale-out event and the cold start that comes with it.
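The concurrency formula above is easy to encode so you can recompute it whenever the engine settings change. This is a minimal sketch; the function and parameter names are illustrative.

```python
def cloud_run_concurrency(model_instances: int, parallel_queries: int, batch_size: int) -> int:
    """Maximum concurrent requests per Cloud Run instance, per the formula above."""
    active = model_instances * parallel_queries  # requests the GPU is processing
    buffered = model_instances * batch_size      # requests waiting "on deck"
    return active + buffered


if __name__ == "__main__":
    # The worked example from the text: 3 model instances, 4 parallel queries, batch size 4.
    print(cloud_run_concurrency(3, 4, 4))
```

Keeping the calculation next to the engine flags (for example `--max-num-seqs`) helps the two values stay in step.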

Scaling Controls (Tuning the Threshold)

While the formula above defines your maximum capacity, you can also tune when Cloud Run decides to start the next instance. Cloud Run's autoscaler typically targets 60% utilization, but for long-running AI cold starts, you can increase this threshold to 80% or 90% via Scaling Controls.

Concurrency Target: Increasing this allows you to "pack" more requests into a single warm instance before triggering a scale-out.
CPU Target: Increasing the CPU target prevents the platform from starting a new instance just because initialization or high-intensity inference spiked the CPU utilization.

Scaling & Reliability Strategies

Sometimes the best way to handle a cold start is to avoid it entirely or manage it proactively.

The Single-Region "Always-On" Tradeoff

If you are deploying globally, the cost of keeping minimum instances set to 1 in every region adds up. Instead, consider an 'always-on' service in just one region. A 100ms global network delay is a much better user experience than a 20s local cold start.

The 15-Minute Grace Period: A common question is 'How long will my instance stay warm after a request?' Cloud Run can keep instances alive for up to 15 minutes after they become idle (processing zero requests). If your traffic is predictable and comes in every 10–12 minutes, you might not even need an 'always-on' service, because the platform’s default shutdown policy can keep a warm instance ready for your next user.

Note: While this idle time is "free" for standard request-based services, remember that GPU services require instance-based billing, so you will be billed for the duration the instance remains warm between requests.

The "Wake-Up Call" Strategy

Sometimes the best way to handle a cold start is to proactively mask it. If your UI can predict an upcoming request, for example, when a user clicks "New Chat" or begins hovering over a text area, you can send a lightweight health check to your service immediately. By the time the user finishes typing their prompt, the first two phases of the cold start (Infrastructure Provisioning and Container Image Streaming) are already finished in the background.

Pro-Tip: Use Non-Inference Endpoints

To make this "wake-up call" as fast as possible, always use a non-inference endpoint rather than sending a dummy prompt like "hi".

Why it’s faster: Non-inference endpoints (like /v1/models for vLLM or /api/tags for Ollama) are handled by the container’s web server the moment it starts. They don’t have to wait for the slow "Phase 4" model loading and VRAM transfer to complete before sending a success response.
No Chat Pollution: Because these endpoints don't trigger the model's completion logic, they won't interfere with the user's actual chat history or accidentally trigger session creation in your backend.

Recommended Endpoints:

vLLM: GET /health or GET /v1/models
Ollama: GET /api/tags or GET /api/version
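A wake-up call can be as small as a fire-and-forget GET against one of the endpoints above. The sketch below uses only the Python standard library; the function name `wake_up`, the short timeout, and the example URL are assumptions. A browser UI would do the equivalent with its own HTTP client.

```python
import http.client
import urllib.request


def wake_up(base_url: str, path: str = "/health", timeout_s: float = 2.0) -> bool:
    """Send a lightweight GET to nudge the service awake.

    Returns True on a 2xx response. Errors and timeouts are swallowed
    because the goal is to trigger instance startup, not to get an answer.
    """
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    # Hypothetical service URL; replace with your own Cloud Run endpoint.
    print(wake_up("https://my-inference-service.example.com", "/v1/models"))
```

A `False` result during a cold start is expected: the request has still reached the platform, which is what starts the instance.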

Tune Startup Probes for VRAM

AI models take significant time to move gigabytes of weights from storage into GPU memory (Phase 4). If your startup check fails too many times, Cloud Run will assume your container is broken and kill it.

To prevent this:

Increase the Failure Threshold: Use a high failureThreshold (e.g., 60 or more). Since the total allowed startup time is the product failureThreshold × periodSeconds, a threshold of 60 with a 5-second period gives your model a healthy 5-minute window to load.
Utilize the 30-Minute Maximum: While standard services are limited to 4 minutes, Cloud Run supports a total startup time of up to 30 minutes (1,800 seconds) for intensive workloads.
Avoid False Positives (The Ollama Fix): Be careful with engines like Ollama, which may open a TCP port as soon as the service starts, but before the model is actually in VRAM. Always ensure you are preloading models during the container's entrypoint script to ensure the startup probe only passes once the model is truly ready for inference.
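The probe arithmetic above can be turned into a small helper that picks a failureThreshold from a measured load time. This is a sketch under stated assumptions: the function names and the `safety_factor` margin are illustrative, and the 1,800-second ceiling is the figure quoted in the list above.

```python
import math

MAX_STARTUP_SECONDS = 1800  # 30-minute ceiling quoted above


def startup_window_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Total time the startup probe allows before the container is killed."""
    return failure_threshold * period_seconds


def failure_threshold_for(
    load_seconds: float,
    period_seconds: int,
    safety_factor: float = 1.5,
) -> int:
    """Smallest failureThreshold whose window covers load time plus a margin.

    safety_factor is an assumed margin for slow starts, not a documented value.
    """
    needed = load_seconds * safety_factor
    if needed > MAX_STARTUP_SECONDS:
        raise ValueError("required window exceeds the 30-minute maximum")
    return math.ceil(needed / period_seconds)


if __name__ == "__main__":
    print(startup_window_seconds(60, 5))   # the example from the text
    print(failure_threshold_for(120, 5))   # model that loads in about 2 minutes
```

Measuring the real load time in a test deployment first, then deriving the threshold, avoids both premature kills and needlessly long waits on a container that is truly broken.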

Lessons from Elastic’s strategy

In our NEXT ‘26 session, Ajay Nair highlighted three architectural decisions that allowed Elastic to treat GPUs as fungible compute, rather than infrastructure to manage:

Bypass the Compilation Tax: By setting enforce_eager=True in vLLM, they traded a tiny bit of throughput for cold starts that finish in less than a minute rather than multiple minutes.
Standalone Checkpoints: They avoided the latency of runtime adapter-switching by pre-merging each LoRA variant into a standalone checkpoint.
One Workload, One Service: Each independently-scalable workload — defined by model, task adapter, and traffic shape — is deployed as its own Cloud Run service. This produces 30+ services across ~15 model families, with some models split by task (e.g., v5 retrieval vs. clustering) or by query/passage role.
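As a rough illustration of the first lesson, the sketch below builds a `vllm serve` command line with eager mode enabled and an explicit sequence limit. It is not Elastic's actual configuration: the helper name and the model path are hypothetical, and only the `--enforce-eager` and `--max-num-seqs` flags come from vLLM itself.

```python
def vllm_serve_args(model_path: str, max_num_seqs: int, enforce_eager: bool = True) -> list[str]:
    """Build an argv list for launching vLLM from a container entrypoint."""
    args = ["vllm", "serve", model_path, "--max-num-seqs", str(max_num_seqs)]
    if enforce_eager:
        # Skip the graph-capture step at startup, trading some
        # throughput for a faster cold start.
        args.append("--enforce-eager")
    return args


if __name__ == "__main__":
    # Hypothetical path to a pre-merged, standalone checkpoint.
    print(" ".join(vllm_serve_args("/models/merged-variant", max_num_seqs=4)))
```

Pointing each service at one pre-merged checkpoint keeps the launch command this simple, which fits the "one workload, one service" approach.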

Ready to get started?

Optimizing the cold start process is the difference between a hobby project and a production-ready application. The best part? Cloud Run handles the NVIDIA driver and CUDA installation for you, starting the instance in about 5 seconds.

For a deeper dive, the official documentation is your best friend.

For the full technical breakdown, I highly recommend watching the recording of the session from Google Cloud Next '26. It provides the most comprehensive blueprint for hosting high-performance open models on serverless infrastructure.

Happy building!

Special thanks to Sara Ford and Shane Ouchi from the Cloud Run team and to Zac Li from Elastic for the helpful review and feedback on this article.

Introducing Google AI Threat Defense to help you outpace the adversary

Wed, 27 May 2026 12:00:00 +0000

AI-powered cyber threats have been receiving a lot of attention lately. AI has changed the threat landscape; cybercriminals are using it to find security cracks faster than cybersecurity teams can manually fix them. Attacks that used to take weeks to carry out can now happen in mere hours or days. Organizations need to be able to keep pace and protect themselves against AI agent-driven, high-speed attacks — but they can no longer rely on legacy, manual methods.

To defend against this range of threats, organizations need more than one model or agent. No single model will catch everything, so you want to use a collection of models across multiple passes. And you need a solution that can analyze your systems, prioritize the most significant threats, patch vulnerabilities quickly, and continuously monitor for new attacks.

That’s why we’re launching Google AI Threat Defense — an automated security system designed to help you continuously monitor for and stop AI-powered threats before they can impact your business.

Built on a decade of security leadership

Security isn’t just a layer of Google’s tech stack; it’s part of the foundation. Our secure-by-default architecture automatically blocks 10 million spam emails every minute and protects billions of users and customers across our broad portfolio.

But protecting the modern enterprise requires constant evolution. When we needed an architecture built on trust, we pioneered Zero Trust. To secure hardware, we built Titan chips. And to help enterprises manage an avalanche of threat data, we created Google Security Operations.

Now, AI is rewriting the rules of cybersecurity. By combining the expertise of Mandiant and Wiz with the advanced reasoning and code-generation capabilities of Gemini, we’re automating defense at scale for customers. We’re deploying LLM-powered analysis to help autonomously discover software flaws, and AI agents across Wiz and CodeMender to validate risk, generate fixes, and support remediation workflows before vulnerabilities can be exploited. Unlike other model providers that simply hand security teams a massive, unprioritized list of AI-generated alerts, we deliver prioritized fixes to accelerate remediation and secure the Defender’s Advantage.

Introducing Google AI Threat Defense

Google AI Threat Defense fuses the reasoning power of Gemini and other frontier models, the contextual risk prioritization of Wiz, the code remediation capabilities of Gemini and CodeMender, and the frontline expertise of Mandiant.

By connecting real-world exposure directly to autonomous patch creation and prioritization, AI Threat Defense helps organizations actively predict attack paths, prioritize the most significant threats, and deploy verified fixes faster than adversaries can exploit them.

AI Threat Defense is based on Google’s own approach to combating today’s threats and transforming vulnerability management across a four-step framework:

Prepare: Harden your foundation, and operationalize your framework for machine-speed prioritization and response.
Scan and prioritize: Conduct deep-dive analysis and AI-driven posture validation.
Remediate: Implement a workflow to autonomously verify and accelerate the patching of vulnerabilities.
Monitor: Transition to continuous detection and rehearsed, active response playbooks.

Google AI Threat Defense can help transform vulnerability identification and remediation.

Prepare: Harden the foundation for machine-speed response

As more vulnerabilities are discovered and exploitation accelerates, the first priority is to reduce unnecessary exposure. Sensitive assets should not be reachable from the internet or exposed through untrusted paths, regardless of patch status. The goal is not only to fix known critical issues, but to reduce what is reachable, validate what can actually be exploited, and make sure new risk does not depend on manual triage.

From there, organizations need to understand how quickly they can patch and respond across exposed technologies. As the volume of Common Vulnerabilities and Exposures (CVEs) grows and exploitation windows shrink, teams need clear ownership, prioritization, and execution paths before the next urgent vulnerability appears. Any exposed application, service, or technology should be prioritized based on reachability, exploitability, and business impact, with a fast process to route the issue to the right owner and drive remediation.

Finally, organizations need to scan every exposure with AI. This cannot be limited to code scanning, because not every vulnerability lives in code. Many real attack paths emerge from how applications, APIs, identities, configurations, permissions, and business logic interact in a live environment. Traditional attack surface management helps identify what is exposed, but organizations now need an AI penetration tester that can continuously analyze every exposure, determine whether it can actually be exploited, and understand what it would enable an attacker to do before attackers do the same.

AI Threat Defense operationalizes this process through Wiz. Wiz continuously discovers exposed applications, infrastructure, APIs, identities, and runtime environments, creating a live exposure map so teams can reduce unnecessary reachability. Wiz’s context-aware AI pen-testing agent simulates attacks to identify and validate complex exploitable paths, including application-layer and identity-driven risks that traditional testing often misses.

Learn how Wiz continuously scans code repositories, CI/CD pipelines, AI platforms and models, hybrid clouds, and more to surface AI-native risks.

Scan and prioritize: Conduct deep-dive analysis, AI-driven adversarial testing and exploitability validation

Strategic defense requires multiple levels of environmental scanning — moving from superficial checks to deep, AI-driven code analysis.

Frontier models can uncover complex logic flaws, risky trust boundaries, vulnerable dependencies, exposed APIs, and chains of lower-severity issues that combine into exploitable paths. But these deeper scans are more expensive, slower, and harder to run continuously across every asset.

That’s why organizations need to prioritize deep scanning for internet-facing applications, customer-facing services, sensitive data flows, authentication and authorization logic, privileged services, and other business-critical systems.

Using multiple models and multiple passes can improve coverage, because model performance varies by cybersecurity task. Some models may be stronger at application logic, others at cloud configuration, binary analysis, exploitability validation, or remediation guidance. No single model finds the superset of vulnerabilities that other models find — organizations need to use a collection of models to find a broad range of vulnerabilities with optimal cost per token.

Our multi-AI strategy creates a more cost-effective scanning strategy: Use lighter-weight, faster models for broad, continuous coverage, and reserve frontier models for the highest-risk applications and findings. With Wiz, those priorities are guided by real risk context — exposure, vulnerabilities, identity, sensitive data access, and runtime signals — so the highest-risk assets are scanned deeply not just once, but continuously as risk changes.
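The tiering idea can be sketched as a simple routing rule: every asset gets the broad, lightweight pass, and only high-risk assets also get the deep pass. This is an illustration only, not how Wiz or AI Threat Defense prioritizes; the `Asset` fields, tier names, and risk rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical asset record; field names are illustrative only."""
    name: str
    internet_facing: bool = False
    handles_sensitive_data: bool = False
    privileged: bool = False


def scan_tier(asset: Asset) -> str:
    """Return "deep" (frontier model) for high-risk assets, else "broad"."""
    high_risk = asset.internet_facing or asset.handles_sensitive_data or asset.privileged
    return "deep" if high_risk else "broad"


def plan_scans(assets: list[Asset]) -> dict[str, list[str]]:
    """Assign every asset to the broad pass and high-risk ones to the deep pass."""
    plan: dict[str, list[str]] = {"broad": [], "deep": []}
    for asset in assets:
        plan["broad"].append(asset.name)
        if scan_tier(asset) == "deep":
            plan["deep"].append(asset.name)
    return plan


if __name__ == "__main__":
    assets = [Asset("checkout-api", internet_facing=True), Asset("batch-report")]
    print(plan_scans(assets))
```

Because risk context changes over time, a plan like this would be recomputed as exposure or data-access signals change, rather than decided once.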

AI Threat Defense operationalizes this process by deploying AI security agents to help you actively hunt for deep vulnerabilities. These agents draw on multiple industry-leading frontier models via the Gemini Enterprise Agent Platform — where customers will be testing CodeMender — helping organizations choose the best model for the job, without sacrificing strict enterprise privacy, security, or data governance.

This demo showcases how developers can easily secure their applications using CodeMender's command-line interface (CLI).

Once a code flaw is discovered, AI Threat Defense instantly enriches and validates findings with live architectural and runtime context from Wiz. This capability transforms a raw list of model findings into a prioritized map of real business risk, filtering out the noise to focus exclusively on what is reachable. This visibility enables developers to look at the dependencies across source code libraries and binaries to understand the changes that may need to be made in concert — for example, if the signature or behavior of specific libraries needs to be altered.

Translating deep analysis into effective action, AI Threat Defense incorporates Mandiant’s expertise to create actionable response plans. This strategic guidance helps organizations manage sudden surges in critical issues, create strategies for safely retiring legacy products, and assist with rolling out AI-generated patches without overwhelming engineering teams.

Remediate: Accelerate resolution with immediate fixes

After identifying vulnerabilities, the goal is to shrink the time to remediate from weeks to minutes. AI Threat Defense achieves that velocity by driving a high-speed, autonomous workflow that provides and prioritizes fixes without placing a heavy implementation burden on your development teams.

To ensure your security keeps pace with deployment, the platform proactively generates vulnerability fixes directly in a developer’s IDE or CLI as they build. Harnessing the full reasoning power of Gemini, CodeMender works with Antigravity and Wiz to help engineering teams replace vulnerable code, rewrite older code in modern, memory-safe languages, and analyze library dependencies to coordinate rollouts. In parallel, it automates triage and prioritizes remediation across applications and cloud infrastructure.

Before any patch goes live, the platform automatically generates tests to verify every fix. Once remediated, libraries are tagged across both source control and production environments, providing end-to-end tracking so the organization can see which model generated which patch, and when.

As part of your overall risk posture, you need to understand where vulnerable systems can access sensitive data, since these paths increase exfiltration risk. By consolidating visibility across your data estate, you can identify sensitive data services that are reachable from risky workloads, and prioritize encryption, identity, network controls, exfiltration monitoring, and more.

In addition, consolidating visibility over your software development lifecycle gives you control over how software and configuration changes are being deployed.

Ultimately, our approach delivers autonomy under human supervision — empowering teams to burn down security backlogs and harden the software development lifecycle without sacrificing speed or strategic control.

CodeMender can find and fix deep vulnerabilities in your codebase.

Monitor: Establish machine-speed detection and rehearsed, active response

Even with a hardened foundation, true resilience requires constant vigilance in runtime. While code-level scanning pipelines are excellent at catching flaws before deployment, they cannot block an active exploit. AI Threat Defense shifts operations from manual oversight to machine-speed detection and real-time defense.

As exposure cycles accelerate, AI Threat Defense builds resilience by establishing a consistent operational framework — informed by Mandiant’s frontline expertise — where ownership is defined and outcomes are tracked.

To support active defense against automated adversaries, AI Threat Defense leverages autonomous agents, enabling teams to rapidly hunt for hidden threats, investigate suspicious activity, and respond to live attacks in real time. Together with AI Threat Defense, agentic security operations center (SOC) capabilities from Google Security Operations further enable automated detections, triage and investigation, and hunting of emerging anomalies across your network, identity, and application telemetry. This provides an ongoing monitoring capability to help you discover vulnerabilities before your adversaries do.

Finally, the platform secures the environment from the ground up, minimizing the attack surface right from the start using hardened container images built, signed, and verified daily.

How our partners use AI Threat Defense

To realize the full potential of autonomous defense, our customers are increasingly teaming up with trusted strategic advisors to guide their cloud security journey. Our ecosystem partners, including Accenture, Deloitte, Netenrich, PwC, and TENEX.AI, bring the critical expertise needed to assess your unique cloud architecture and embed AI-driven security capabilities into your existing development pipelines.

Beyond initial deployment of AI Threat Defense, these partners will deliver continuous management, custom harness building, and tailored security workflows. Together, we will help ensure that threats are identified at machine speed and remediated automatically, in line with your organization's specific operational and compliance requirements.

The path forward: Outpacing the adversary with AI

The collapse of the exploit window has made one thing clear: Human-speed vulnerability management is no longer a viable strategy for enterprise risk. The era of machine-speed attacks demands an autonomous, continuous defense.

By combining the contextual risk prioritization of Wiz, the code remediation capabilities of CodeMender, the intelligence of Gemini, and the frontline expertise of Mandiant, we provide the architecture needed to match the speed of the adversary. AI Threat Defense also uses a variety of models so that organizations can find the broadest set of vulnerabilities while managing costs, enabling you to scan, remediate, and maintain your software assets on an ongoing basis.

A key part of our approach is the Google Cloud CISO Community, our close partnership with an important, growing community of industry leaders. This group includes executives from companies including Morgan Stanley, MSCI, TELUS, and Thales. Together, we are building real-time ideas into solutions and shaping the future of AI defense.

To ensure that your enterprise doesn't just keep pace with automated adversaries, but consistently outpaces them, learn more about how Google AI Threat Defense can help you fight AI with AI.

How we evolved Google’s global and data center networks for the AI era

Tue, 26 May 2026 16:00:00 +0000

Over the last 25 years of building Google’s global network, we’ve navigated major architectural eras — from the Internet, to streaming, and the cloud. Today, we are squarely in the midst of a fourth: the AI era. The applications in the AI era are fundamentally different from the consumer and enterprise applications of the previous eras and impose a set of novel and demanding requirements — on compute resources, of course, but also on the network.

Consider the fundamental physical challenge, which is that it is far more difficult to move electrons (electrical power) than it is to move photons (data over fiber). Because the demand for AI compute frequently outpaces the space and power capacities of individual facilities, we strategically locate data centers near sustainable energy sources, or in locations with pathways to add clean energy sources to the local grid. Then, by utilizing the network to distribute AI workloads across campuses, we create a massive-scale, pooled hypercomputing resource that overcomes the power limitations of any single site.

To deliver this, we created an end-to-end, vertically integrated AI technology stack that comprises everything from chips to systems, to platforms and application and agentic ecosystems. This stack includes a portfolio of pre-built agents and applications; our Gemini Enterprise Agent Platform for you to build, scale, govern, and optimize your AI-enabled applications; world-class AI models; as well as our unified data platform. All this is anchored by our AI Hypercomputer, a unified infrastructure that combines purpose-built hardware and open software, and that comes with flexible consumption options. Our network, forged through decades of innovation, is the essential fabric of the AI Hypercomputer.

The network supporting this stack must meet the stringent bandwidth, scale, and performance needs of AI workloads. This applies not only within the campus, where the network must scale up and out, but also across the wide area network (WAN) along with high-bandwidth interconnects, to bring AI training data from its source to AI compute resources.

To address these challenges, we’ve reimagined three key pillars of our network infrastructure: the fabric inside the AI Hypercomputer, the fabric across the AI Hypercomputer, and our global network. Let’s take a closer look at each of these.

1. The fabric inside AI Hypercomputer

The massive scale of today’s AI models, fueled by the explosive growth of foundational AI model parameters, makes AI training very compute- and network-intensive.

This necessitates an exponential increase in required network bandwidth, with strict bounds on delay (e.g., tail latency) to accommodate AI workloads’ peculiar traffic patterns, which are characterized by sensitivity to performance variation and synchronized bursts, i.e., intense, coordinated, millisecond-level traffic spikes. Furthermore, since large-scale training jobs are uniquely vulnerable to failures and performance stragglers, maintaining high reliability and predictable performance is absolutely essential.

To address the scale, low latency, and high predictability that modern AI workloads require — as well as protection from extreme bursts — we’ve adopted a "campus as a computer" philosophy, decoupling our network into three distinct domains:

a scale-up domain for intra-pod connectivity
a dedicated east-west scale-out accelerator fabric
the Jupiter frontend network for north-south compute and storage access

This decoupled architecture provides three strategic advantages: it allows domains to evolve independently for faster innovation; provides a non-blocking scale-out network with massive training bandwidth; and helps ensure the network can be co-designed in lockstep with new ML accelerators, for superior hardware support.

Recently, we unveiled Virgo Network, our scale-out data center fabric specifically engineered for modern AI. Virgo utilizes high-radix switches and a flat, two-layer non-blocking topology to provide massive bisection bandwidth, while minimizing latency by reducing network tiers. Its multi-planar design, featuring independent control domains for each plane, provides hardware-level resilience and fault isolation. Furthermore, Virgo can expand across multiple data centers, removing physical building limitations and enabling flexible AI compute scaling.
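To see why high-radix switches matter in a flat, two-layer non-blocking design, consider the textbook leaf-spine sizing rule. This is a generic sketch, not a description of Virgo Network itself: it assumes identical switches, half of each leaf's ports facing endpoints and half facing spines, and a single link from every leaf to every spine. The switch radix used by Virgo Network is not stated in the text, and the function name is hypothetical.

```python
def two_tier_capacity(radix: int) -> dict:
    """Size a non-blocking two-tier leaf-spine fabric built from radix-port switches."""
    if radix < 2 or radix % 2:
        raise ValueError("radix must be a positive even number")
    downlinks_per_leaf = radix // 2   # ports facing endpoints
    spines = radix // 2               # one uplink from each leaf to each spine
    leaves = radix                    # each spine port connects one leaf
    return {
        "leaves": leaves,
        "spines": spines,
        "endpoints": leaves * downlinks_per_leaf,  # radix^2 / 2
    }
```

Under these assumptions the endpoint count grows with the square of the radix, so doubling the radix quadruples the number of endpoints reachable without adding a third tier.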

The effectiveness of our network and accelerator codesign is perfectly illustrated by the recently debuted eighth-generation TPUs. Within this architecture, Virgo Network can link 134,000 TPU 8t chips with up to 47 petabits/sec of non-blocking bisection bandwidth in a single fabric. Virgo Network delivers up to 4x the bandwidth per TPU 8t accelerator over the previous generation, and 40% lower unloaded fabric latency for TPU 8t compared to the previous generation network for TPUs. In this setup, Virgo Network manages the raw accelerator traffic, while Jupiter provides reliable and rapid access to the global WAN and storage. When integrated with Pathways and JAX, this AI Hypercomputer networking engine facilitates near-linear scaling for up to a million TPU 8t chips in a single logical cluster.

Autonomous reliability: protecting workload goodput

Building a resilient megascale fabric represents only part of the challenge. In a cluster of hundreds of thousands of chips, hardware failures are a statistical certainty. A single stalled instance can stop an entire synchronous training job, wasting valuable compute cycles. As such, efficient fault localization is critical.
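The "statistical certainty" claim follows from basic probability: if each chip fails independently with probability p over some period, the chance that at least one of n chips fails is 1 - (1 - p)^n. The sketch below works this out. The per-chip failure probability is a purely illustrative assumption (no failure rates are given in the text), and the independence assumption is a simplification.

```python
import math


def p_any_failure(n_chips: int, p_chip: float) -> float:
    """Probability that at least one of n independent chips fails.

    Computes 1 - (1 - p)^n using log1p/expm1 for numerical stability
    when p is very small and n is very large.
    """
    return -math.expm1(n_chips * math.log1p(-p_chip))


# 134,000 chips (the fabric size quoted above) with an assumed,
# illustrative per-chip failure probability of 1 in 100,000 per period.
risk = p_any_failure(134_000, 1e-5)
print(f"P(at least one failure) = {risk:.2%}")
```

Even with a very small per-chip probability, the cluster-wide probability quickly approaches 1 as the chip count or the time window grows, which is why fast fault localization matters for synchronous jobs.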

We engineered Virgo Network with autonomous reliability capabilities to maximize workload efficiency at scale, also known as goodput. Expanding on our existing straggler detection, Virgo Network now also features automated hang detection. The moment a fail-stop event occurs, our specialized agents immediately localize the fault, isolate the faulty instance, and enable you to restore the training job from a checkpoint, getting your training timeline back on track with minimal manual intervention.

To complement these capabilities, we also use high-resolution, sub-millisecond telemetry to identify elusive network micro-bursts that are usually missed by conventional 30-second monitoring intervals. These high-resolution telemetry advancements enable more efficient network operations, better provisioning, and a lower mean time to recovery.
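A small simulation shows why coarse monitoring intervals hide micro-bursts. The numbers here are synthetic assumptions chosen for illustration: a link idling at 10% utilization, a single 5 ms burst at line rate, and 1 ms samples for simplicity (the text describes sub-millisecond resolution).

```python
def summarize(samples: list[float], burst_threshold: float = 0.9) -> dict:
    """Compare the window average with what fine-grained samples reveal."""
    return {
        "mean": sum(samples) / len(samples),
        "peak": max(samples),
        "burst_samples": sum(1 for s in samples if s >= burst_threshold),
    }


# One 30-second window of 1 ms utilization samples (synthetic data).
samples = [0.10] * 30_000
samples[12_000:12_005] = [1.0] * 5  # a 5 ms burst at full line rate

stats = summarize(samples)
print(stats)
```

A monitor that reports only the 30-second mean would see utilization barely above 10%, while the per-sample view exposes a peak at line rate and the exact samples where it occurred.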

2. The fabric across AI Hypercomputer

The exponential growth of modern AI workloads requires us to scale and distribute AI workloads across multiple campuses over a WAN. At the same time, traditional networks weren’t built for the high bandwidth and extreme burstiness of AI traffic, and often fail to detect microbursts that can lead to severe performance degradation. We have developed a suite of innovations to optimize WAN performance for cross-site AI deployments, including:

A multi-shard global network that enables horizontal scaling. Our global network sustained a 10X WAN traffic growth from 2020 to 2025.
Tuning the fabric for essential availability, latency, and quality of service (QoS) attributes. Real-time microburst management helps ensure fair bandwidth allocation and infrastructure isolation across our multi-tenant infrastructure.
Multi-shard isolation to ensure each network shard operates with its own control, data, and management planes.

Combined with regional isolation and Protective Reroute, this architecture minimizes failure impact and shortens user-visible outages — delivering the beyond-nines reliability essential for AI workloads.

Providing high-speed, flexible, and cost-effective interconnectivity is also a priority. AI training relies on vast datasets that are often located on-premises or across various clouds. Given the high cost of AI compute, minimizing idle time is essential; for instance, upgrading from a 100 Gbps link to a 3.2 Tbps connection reduces the time to transfer a petabyte of data from 22.2 hours to just 0.7 hours — a 97% reduction in AI compute idle time spent waiting for data. Our AI-native Cloud Interconnect is purpose-built for the high-bandwidth and low-latency needs of AI workloads, featuring an optimized data path with 400 Gbps links that scale in 3.2 Tbps increments to reach petabit-per-second capacity. It also offers traffic differentiation and flexible connection options, including direct fiber peering and colocation facilities. AI-native Cloud Interconnect supports petabit-scale data transfer with reliable, private connectivity necessary for your cross-cloud AI training and serving.
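The transfer-time figures above can be reproduced with simple arithmetic. This sketch assumes a decimal petabyte (10^15 bytes) and an ideal link running at full rate with no protocol overhead; real transfers would take somewhat longer.

```python
PETABYTE_BITS = 8 * 10**15  # 1 PB = 10^15 bytes, 8 bits per byte


def transfer_hours(bits: int, link_bps: float) -> float:
    """Hours needed to move `bits` over an ideal link of `link_bps` bits/second."""
    return bits / link_bps / 3600


slow = transfer_hours(PETABYTE_BITS, 100e9)   # 100 Gbps
fast = transfer_hours(PETABYTE_BITS, 3.2e12)  # 3.2 Tbps
reduction = 1 - fast / slow

print(f"100 Gbps: {slow:.1f} h, 3.2 Tbps: {fast:.1f} h, reduction: {reduction:.0%}")
```

The reduction depends only on the ratio of the two link speeds (32x), so it holds for any dataset size under the same idealized assumptions.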

3. A resilient global network for the age of inference

Applications serving AI inference to a global user population or supporting an agentic enterprise are far more demanding than conventional web apps. The need for opportunistic use of expensive AI compute available at distant locations, distributed service dependencies, and the burstiness of the traffic demand a high-bandwidth network with a global footprint, as well as deep peering with SaaS providers, ISPs, and hyperscalers. To maintain responsiveness and "always-on" availability, applications need low latency and a highly resilient network.

With its connectivity, scale, and resilience, Google’s global network is well-equipped to handle the demands of the age of AI inference. Our network spans more than 10 million kilometers of terrestrial and subsea fiber, connects our 43 cloud regions, and features 200+ edge locations, providing the essential footprint for serving AI inference. Our Premium Tier network delivers the low latency and reliability needed for consistent, high-quality global user experience. By optimizing traffic entry and exit points, the network significantly boosts application performance, with resilience at the core of this "always-on" infrastructure.

Building the future, together

As a Google Cloud customer, you get these network innovations built directly into your environment. Google’s network delivers the massive scale, capacity, reliability, and performance essential for your AI workloads.

The AI era demands more than just raw compute; it necessitates a robust network fabric to scale. Our vertically integrated AI technology stack — from silicon to software ecosystems — is powered by the AI Hypercomputer to accelerate your transformation and make AI helpful for everyone. Whether through our megascale fabric, resilient global network for inference, or AI-native Cloud Interconnect, we ensure your AI journey is efficient and reliable. We look forward to building this future with you.

New study: Securing AI in the browser is a top priority for IT Leaders

Tue, 26 May 2026 11:20:00 +0000

The way we work has fundamentally changed. From automated agents to sophisticated AI services, Generative AI (GenAI) has become a daily tool for a vast majority of employees. But with this rapid adoption comes a new, critical challenge for IT leaders: the need to protect corporate information in an era where the browser acts as the priority workspace and data often traverses AI-driven workflows.

To understand how organizations are navigating this new landscape, we commissioned a report from industry analyst firm Omdia. They surveyed 400 IT and cybersecurity professionals in North America, and the findings show that the browser is the frontline for modern enterprise security, especially when it comes to securing AI usage.

The importance of the browser continues to grow

New and emerging threats continue to act as a catalyst for browser management. Nine out of ten individuals responded that browser security is a top five priority, with an emphasis on how browsers play an important role in their organizations.

Gen AI usage rises, and so does prioritization around securing it

The survey indicates that an overwhelming 92% of organizations now permit employees to use public GenAI applications, and the majority of this activity occurs directly within the browser. This creates new avenues for data to escape enterprise oversight.

Browser security ranks as a major priority, and offers an opportunity to effectively secure AI usage

With this shift, browser-related threats are a major concern. The report found that 55% of organizations have experienced a browser-based security attack in the last 12 months. When asked about emerging threats, IT leaders rated AI-powered phishing (75%) and data leakage from GenAI tools (71%) as their top concerns.

When evaluating new security solutions, 59% of IT leaders reported "GenAI application security" as an important use case for a secure browsing solution at their organization.

The Chrome Enterprise Solution

The report's findings highlight the limitations of traditional, network-based security tools in a cloud-first, AI-powered world. These legacy systems may lack the visibility to inspect encrypted traffic to SaaS and GenAI applications, leaving a significant gap in data protection.

Instead of relying on network proxies or additional agents, Chrome Enterprise builds AI security directly into the browser. With Chrome Enterprise, IT and security teams can:

Discover and Govern AI Usage: Gain visibility into how employees are using GenAI tools and identify potential data exfiltration risks with enterprise controls.
Enforce Data Protection Controls: Prevent data loss by using granular controls to manage uploads, downloads, copy-and-paste, printing, and screen captures within any web-based application, including GenAI tools.
Set Context-Aware Access Policies: Go beyond simple blocking. Chrome Enterprise allows you to create policies that grant access to sanctioned AI tools based on user, device, and location context, ensuring that productivity and security go hand-in-hand.

As work becomes more browser-centric and AI-driven, securing the browser is no longer optional—it's essential.

Read the research spotlight “Securing AI Usage Starts in the Browser” today to learn more about emerging browser security strategies and how Chrome Enterprise can help you protect your organization.

Exploitation of KnowledgeDeliver via ViewState Deserialization Vulnerability

Mon, 25 May 2026 14:00:00 +0000

Written by: Takahiro Sugiyama, Peter Revelant, Mathew Potaczek

Introduction

In late 2025, Mandiant responded to a security incident involving a compromised web server running KnowledgeDeliver. KnowledgeDeliver is a Learning Management System (LMS) developed by Digital Knowledge commonly used in Japan. Mandiant identified a critical vulnerability that allowed unauthenticated Remote Code Execution (RCE). An unknown threat actor leveraged this access to inject malicious code into the LMS platform, with the goal of infecting users visiting the site.

This vulnerability stems from the use of identical pre-shared ASP.NET machine keys across multiple customer deployments. The vulnerability was initially exploited as a zero-day, now tracked as CVE-2026-5426.

The Vulnerability

KnowledgeDeliver installations deployed before Feb. 24, 2026 relied on a standardized web.config file provided by the vendor. This configuration file contained hardcoded machineKey values used by the ASP.NET framework to encrypt and sign data, including ViewState payloads.

Because these keys were identical across independent customer environments, a threat actor who obtained the keys from one deployment could compromise any other internet-facing KnowledgeDeliver instance.

The following is an example of the relevant configuration line found in the web.config file:

<machineKey decryptionKey="<REDACTED>" validationKey="<REDACTED>" />

The ASP.NET ViewState persists page state across postbacks. When the machineKey is known, a threat actor can craft a malicious ViewState payload. By sending this payload in an HTTP request (via the __VIEWSTATE parameter), the threat actor can make the server deserialize it.

This technique follows the pattern of the ViewState Deserialization Zero-Day Vulnerability affecting Sitecore (previously reported by Mandiant), and Code injection attacks using publicly disclosed ASP.NET machine keys reported by Microsoft. This highlights how critical it is to keep each machine key unique and secure.

Post-Exploitation Activity

Once access was established, the threat actors focused on maintaining their presence and expanding the impact of the compromise.

BLUEBEAM Web Shell Deployment

The threat actor deployed a .NET-based in-memory web shell called BLUEBEAM (also known as Godzilla). The use of BLUEBEAM is consistent with the Microsoft reporting. This malware operates entirely in memory within the IIS worker process (w3wp.exe), making it difficult to detect through traditional file-based scanning. It allows threat actors to execute further commands and payloads by sending encrypted data via HTTP POST request bodies.

File Tampering

The threat actor was observed executing commands to escalate their control over the web server's file system:

Permission Modification: The threat actor used icacls to grant "Everyone" full access to the web application directory.
JavaScript Tampering: The threat actor modified an application JavaScript file, adding code to perform the following:

Display a fake security alert, prompting users to install a "security authentication plugin".
Silently load a remote malicious script hosted on a threat actor-controlled domain.

Cobalt Strike Infection

The remote script convinced users to download a fake installer, which led to workstations being infected with a Cobalt Strike BEACON backdoor. The payload was encrypted using a key that used the name of the compromised organization, which indicated that the threat actor prepared this payload specifically for the targeted organization.

How to Hunt for This Activity

Organizations should monitor for the following indicators to identify potential ViewState exploitation and post-exploitation activity.

1. Application Event Logs (Event ID 1316)

Monitor the Windows Application log for Event ID 1316 from the source ASP.NET 4.0.30319.0 (or similar).

Failed Attempt (Integrity Failure): Event code: 4009-++-Viewstate verification failed. Reason: The viewstate supplied failed integrity check. May indicate an attack attempt with an incorrect key.
Successful Execution (Invalid ViewState): Event code: 4009-++-Viewstate verification failed. Reason: Viewstate was invalid. Confirms integrity checks were passed. Deserialization of the payload was attempted and may have succeeded. The payload may or may not have been executed.
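The two message patterns above can be told apart with simple string matching once Event ID 1316 messages have been exported. The following is a minimal triage sketch, assuming the message text is already available as a string; the function name and the returned labels are hypothetical.

```python
def classify_viewstate_event(message: str) -> str:
    """Label an Event ID 1316 message using the two patterns described above."""
    lowered = message.lower()
    if "4009" not in lowered or "viewstate verification failed" not in lowered:
        return "not_relevant"
    if "failed integrity check" in lowered:
        # Integrity check failed: likely an attempt made with an incorrect key.
        return "failed_attempt"
    if "viewstate was invalid" in lowered:
        # Integrity check passed and deserialization was attempted.
        return "possible_successful_deserialization"
    return "other_4009"
```

Events labeled `possible_successful_deserialization` deserve the most urgent review, since they indicate the sender held a valid machine key.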

Mandiant decrypted payload strings recorded in the event log messages with the server’s machine keys and recovered a payload related to a BLUEBEAM web shell.

2. Suspicious Process Activity

Monitor for unusual child processes spawned by w3wp.exe. Commands observed include:

cmd.exe /c ...
whoami
powershell.exe
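For teams triaging exported process telemetry outside a SIEM, the same check can be sketched in a few lines. The event field names (`parent_image`, `command_line`) are assumptions about the export format, and the pattern simply mirrors the commands listed above plus the icacls usage described earlier in this post.

```python
import re

# Commands observed as children of the IIS worker process in this incident.
SUSPICIOUS_COMMAND = re.compile(
    r"cmd(\.exe)?\s+/c\s|whoami|powershell|icacls", re.IGNORECASE
)


def is_suspicious(event: dict) -> bool:
    """Flag process launches where w3wp.exe spawns a shell or recon command."""
    parent = event.get("parent_image", "").lower()
    command = event.get("command_line", "")
    return parent.endswith("w3wp.exe") and bool(SUSPICIOUS_COMMAND.search(command))
```

This is a coarse filter: some applications legitimately spawn child processes from w3wp.exe, so hits should be reviewed in context rather than treated as confirmed compromise.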

3. File Integrity Monitoring

Monitor for unauthorized changes to .js, .aspx, or .config files within the web root. Specifically, look for the addition of remote script loaders or unusual logic in commonly used libraries.
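Where no dedicated file integrity monitoring product is in place, a basic hash baseline can approximate this check. The sketch below uses only the Python standard library; the function names are hypothetical, and the web root path would need to match your deployment.

```python
import hashlib
from pathlib import Path

WATCHED_SUFFIXES = {".js", ".aspx", ".config"}


def snapshot(web_root: Path) -> dict[str, str]:
    """Map each watched file (relative path) to its SHA-256 digest."""
    hashes = {}
    for path in sorted(web_root.rglob("*")):
        if path.is_file() and path.suffix.lower() in WATCHED_SUFFIXES:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            hashes[str(path.relative_to(web_root))] = digest
    return hashes


def diff(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or modified since the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(k for k in shared if baseline[k] != current[k]),
    }
```

A baseline taken from a known-good deployment and stored off the web server makes the comparison harder for an intruder to tamper with.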

4. Anomalous User-Agent Strings

Mandiant identified User-Agent strings consisting of two distinct identifiers concatenated together, consistent with those reported for the ViewState Deserialization Zero-Day Vulnerability affecting Sitecore. Monitor web request logs for such anomalous User-Agent strings. The following are examples of identified User-Agent strings:

Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101213 Opera/9.80 (Windows NT 6.1; U; zh-tw) Presto/2.7.62 Version/11.01 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0) chromeframe/10.0.648.205 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36
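A simple heuristic for spotting this pattern in web logs is to count how many times the leading `Mozilla/x.y` product token appears in a single User-Agent value. This is a sketch of one possible check, not Mandiant's detection logic, and the function name is hypothetical.

```python
import re

MOZILLA_TOKEN = re.compile(r"Mozilla/\d+\.\d+")


def is_concatenated_user_agent(user_agent: str) -> bool:
    """Return True when a User-Agent contains more than one Mozilla/x.y token."""
    return len(MOZILLA_TOKEN.findall(user_agent)) > 1
```

All three example strings above contain two `Mozilla/5.0` tokens and would be flagged, while an ordinary browser User-Agent contains only one.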

Remediation and Mitigation

Rotate Machine Keys: Immediately generate a unique, cryptographically strong machine key for each KnowledgeDeliver instance. This is the only way to invalidate the shared secret.
Restrict Access: If possible, limit access to the LMS to known organizational IP address ranges.
Investigation: Hunt for this activity, and conduct a thorough investigation if any signs of exploitation are identified.
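As an illustration of the key rotation step, the sketch below generates random key material with a cryptographically secure generator and renders it as a `machineKey` element. The key lengths (64-byte validation key, 32-byte decryption key) and the `HMACSHA256`/`AES` algorithm attributes are assumptions for this example; follow vendor and Microsoft guidance for the values appropriate to your deployment, and never reuse the output across instances.

```python
import secrets


def generate_keys() -> tuple[str, str]:
    """Return (validation_key, decryption_key) as uppercase hex strings."""
    validation_key = secrets.token_hex(64).upper()  # 64 random bytes, 128 hex chars
    decryption_key = secrets.token_hex(32).upper()  # 32 random bytes, 64 hex chars
    return validation_key, decryption_key


def render_machine_key(validation_key: str, decryption_key: str) -> str:
    """Render a web.config machineKey element (algorithm choices are assumptions)."""
    return (
        f'<machineKey validationKey="{validation_key}" '
        f'decryptionKey="{decryption_key}" '
        'validation="HMACSHA256" decryption="AES" />'
    )
```

Generating the keys per instance at deployment time, rather than shipping them in a template, avoids recreating the shared secret that made this vulnerability possible.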

Outlook and Implications

The exploitation of KnowledgeDeliver highlights the severe risks of using shared secrets in deployment templates. A single leaked key can compromise an entire ecosystem of installations. By implementing unique secrets and robust endpoint monitoring, organizations can defend against these deserialization attacks.

Indicators of Compromise (IOCs)

To assist the wider community in hunting and identifying activity outlined in this blog post, we have included indicators of compromise (IOCs) in a free GTI Collection for registered users.

File Name	Type	SHA-256
`LoadLibrary.dll`	BLUEBEAM	`7c1f99dca8e5a7897892f9d224a6495023a2cfd2671697d229d355978c415ed2`

Google Security Operations (SecOps)

The following SecOps searches can be used to hunt for this activity.

(metadata.log_type = "WINEVTLOG" or metadata.log_type = "WINEVTLOG_XML") 
metadata.product_event_type = "1316"
additional.fields["Message"] = /Event code: 4009\b/ nocase

(metadata.event_type = "PROCESS_LAUNCH" or metadata.event_type = "PROCESS_OPEN") AND
principal.process.command_line = /w3wp.exe/ nocase AND
target.process.command_line = /cmd.+ \/c |whoami|powershell/ nocase

SecOps customers have access to the following rules and more under the Mandiant Hunting Rules, Mandiant Frontline Threats, and Mandiant Intel Emerging Threats rule packs:

ASP.NET ViewState Deserialization Attempt
W3wp Launching Cmd With Recon Commands
W3wp Launching Encoded Powershell
W3wp Launching Icacls
Web Server Process Launching Whoami
IIS ViewState Exploitation Success
IIS ViewState Exploitation Followed by Web Root File Tampering
Possible Windows Exchange Server Spawning Shell

Acknowledgements

Mandiant would like to extend our thanks to the Digital Knowledge team for their collaboration regarding this disclosure.

2 PhaaS 2 Furious: The Evolution of Chinese-Language Phishing Services

Mon, 25 May 2026 14:00:00 +0000

While Russian-speaking threat actors have historically dominated the phishing-as-a-service (PhaaS) landscape, a rival ecosystem is rapidly growing within the Chinese-language underground. Google Threat Intelligence Group (GTIG) analyzed a dozen current PhaaS offerings in the Chinese underground, all of them mature services and many likely closely tied to the broader criminal ecosystem in that region. These services not only lower the barrier to entry for Chinese cyber criminals, but also reveal broader patterns in the evolution of social engineering and credential theft. Late last year, Google took legal action against one PhaaS provider and has worked since then to endorse legislation and enact technical safeguards against these types of scams.

Within this ecosystem, GTIG has observed a fundamental move away from static password harvesting towards real-time interception and tokenization. By utilizing live administration panels, attackers can interact with victims in real-time to capture one-time passcodes (OTPs), allowing them to bypass multifactor authentication (MFA) instantly.

Instead of simply gaining account access, these operations focus on exploiting digital wallet provisioning to transform stolen payment data into tokenized assets within ecosystems. This shift—combined with the use of encrypted delivery channels like RCS and iMessage to bypass traditional carrier security filters on SMS messages—represents an emerging development where the goal is no longer just a login, but securing direct, unauthorized control over a victim's financial accounts.

Figure 1: Example phishing site chain

The Chinese-Language PhaaS Ecosystem

The Chinese-language PhaaS ecosystem is not merely a regional mirror of Russian operations – it is a distinct market shaped by a unique professional culture. Nearly all the legitimate organizations mimicked by these phishing services are non-Chinese entities, suggesting they rarely target China.

Public impact: Unlike the major Russia-based PhaaS offerings that are typically used to target customers of large organizations, phishing services advertised in Chinese-language communities are often designed to target the general public more opportunistically.
Open Operations: In contrast to their Russian-speaking counterparts, providers of Chinese-language phishing services often operate openly with less regard for operational security. For instance, the threat actors running these services regularly post photos of their luxury lifestyles on Telegram.
Focus on Telegram: Advertisements for the phishing services are regularly posted to Telegram rather than channels such as WeChat (Weixin) or Tencent QQ, which are regionally more popular. This approach is consistent with the broader Chinese-language cyber crime ecosystem.
Extensive offering: While PhaaS is at the core of these operations, these developers also typically offer numerous ancillary services, forming a complete, mature, and extensive offering. These include the sale of personally identifiable information (PII), domain name registration and virtual private server (VPS) hosting services, server rentals, money laundering services, eavesdropping devices (International Mobile Subscriber Identity [IMSI] catchers), and message sending services (spamming assistance). Some platform vendors are also involved in trading stolen payment card information.

Notable Chinese-Language PhaaS TTPs

Delivery via RCS and iMessage: These attacks begin by exploiting trust in modern communication. Rather than traditional SMS, these Chinese-language PhaaS operators heavily leverage Rich Communication Services (RCS) and Apple’s iMessage. Protocols that use end-to-end encryption make it difficult for server-side delivery infrastructure to inspect or filter malicious links, which makes on-device protections critical. Messages also contain more extensive engagement features (including read receipts, typing indicators, group chat functionalities, as well as the ability to send high-resolution images, videos, and larger files). This makes them ideal for social engineering operations, as lures appear remarkably legitimate to the average user.
Real-time Interception: When a victim clicks a malicious link and enters their credentials, the data is displayed instantly on an administrative panel. This allows an adversary to interact with the victim in real-time. As the victim is prompted for an OTP, an attacker simultaneously triggers that same OTP request on their own device. The victim enters the code into the phishing page, and the attacker captures it seconds before it expires.
Leveraging Digital Wallets for Monetization: A defining characteristic of these operations is their exploitation of digital wallet provisioning to monetize stolen payment details. Attackers use captured credentials and OTPs to provision the victim’s card into a digital wallet on an attacker-controlled device. Once tokenized, the card can be used for high-value transactions, contactless payments, and ATM withdrawals. While payment card data theft is the focus, this ecosystem also develops brokerage-focused templates, which can be used to facilitate traditional account takeovers (ATO) for wire fraud and stock manipulation.
AI-Based Automation: Multiple Chinese-language PhaaS operators have adopted AI for their operations to enable scale and stealth. As one example, the Darcula PhaaS platform, which we link to UNC5814, has moved away from static templates, instead utilizing AI-powered page generators and browser automation tools like Puppeteer. This enables users to clone a legitimate website, replicating its HTML, CSS, JavaScript, and visual elements, simply by providing the target website's URL. Because each generated phishing page is unique rather than derived from a static template, signature-based detection methods are rendered increasingly ineffective.

Localization-as-a-Service

The Chinese-speaking PhaaS ecosystem has shifted towards a highly automated model capable of generating localized content for diverse international markets. Unlike traditional phishing kits that have historically relied on static and poorly translated templates, these operators provide the infrastructure for cultural fluency at scale. By offering everything from AI-powered page generators to region-specific delivery assistance, they enable low-skilled affiliates to launch high-fidelity campaigns.

YY Lai Yu (YY来鱼): A Case Study in Localization

YY Lai Yu (YY来鱼), first advertised in August 2024, is one example of a PhaaS offering that provides a local digital ecosystem. While the platform supports phishing across 119 countries, its largest focus has been on Japan. Managed by a core team including "YY Lai Yu," "Jeffrey Carrie," and "Very casual," the service provides Chinese-speaking threat actors with the localized infrastructure necessary to effectively target the Japanese consumer ecosystem.

Figure 2: A graph of countries targeted by YY Lai Yu (YY来鱼) phishing

Figure 3: A YY Lai Yu (YY来鱼) phishing page targeting a Japanese user’s Apple account

Figure 4: A YY Lai Yu (YY来鱼) phishing page targeting a Japanese user’s PayPay account, the largest Japanese mobile payment app

Since November 2025, YY Lai Yu has offered more than 400 phishing templates to its customers, moving beyond generic banking lures to also target the digital lifestyle of Japanese residents. These Japanese-language templates impersonated brands including Amazon, Apple, DMM, Epos Card, JA Bank, JCB Card, JR (Rail), Matsui Securities, Mercari, Monex, Nintendo, Nomura Securities, Orico Card, PayPay, Rakuten Securities, and Sagawa Express. However, instead of merely providing fake account pages, the threat actors tapped heavily into local consumer habits by developing "points" (积分) and rewards redemption lures, pressuring victims to redeem supposedly expiring loyalty points for cash or goods. Demonstrating a deep awareness of the local economic climate, the operators also exploited cost-of-living concerns by crafting lures around the Japan Winter Electricity Subsidy.

By deploying distinct domains that impersonate everything from local transit and payment apps to major e-commerce and gaming platforms, YY Lai Yu provides an example of how comprehensive these PhaaS offerings have become. To protect this highly localized infrastructure, the phishing sites featured a unique human verification anti-bot screen that appeared prior to the actual phishing page. By requiring a manual click to proceed, this mechanism successfully hindered automated analysis by security vendors, adding a layer of stealth to the localized campaign.

Like most other services, YY Lai Yu leverages RCS and iMessage to send encrypted messages in bulk and supports synchronized interactions with victims to harvest payment card and OTP data. The administration panel allows users to query their phished data and blocklist or highlight certain types of cards according to their bank identification number (BIN), blocklist individual countries or territories, and register and manage new domains for their phishing pages using Alibaba's domain registration service. Additionally, panel administrators can create new operator users and assign them permissions. The service also offers domains that can be purchased within the administration panel.

While YY Lai Yu showcases a focus on countries like Japan, the broader Chinese PhaaS ecosystem casts a wide global net. GTIG has observed other prominent services routinely deploying automated infrastructure to compromise users across the Americas, Europe, Australia, and the Middle East.

Outlook

The continued popularity of these services demonstrates a sustained interest in payment card fraud from China-based threat actors. The multitude of sophisticated PhaaS platforms available for purchase and the threat actors' focus on the exploitation of digital wallet tokenization and MFA bypass demonstrate that the China-based criminal ecosystem continues to evolve, enabling threat actors with limited technical skills to conduct phishing operations.

Standard phishing security measures (such as user awareness training) remain an important first line of defense. However, the proliferation of the Chinese-language PhaaS ecosystem underscores a need for technical security controls that go beyond user education. For example, transitioning to FIDO2/WebAuthn infrastructure represents an effective countermeasure against the real-time interception of account authentication OTPs. While security keys cannot prevent a user from entering payment details into a novel phishing site directly, increasing the difficulty of leveraging stolen credentials still radically shrinks an adversary's opportunities. These enterprise authentication upgrades should be paired with risk-based verification and device fingerprinting by issuing banks during the digital wallet provisioning process.
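To illustrate why WebAuthn resists the relay pattern described above: an OTP is just a string that a phishing page can forward, whereas a WebAuthn assertion covers client data in which the browser records the origin it was actually talking to. The following is a minimal sketch of the client-data checks a relying party performs, not a complete verifier (signature, authenticator data, and counter checks are omitted). The function names and the example domains are hypothetical.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used for the WebAuthn challenge."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def check_client_data(client_data_json: bytes,
                      expected_challenge: bytes,
                      expected_origin: str) -> bool:
    """Partial check of clientDataJSON for a login (assertion) ceremony."""
    data = json.loads(client_data_json)
    return (
        data.get("type") == "webauthn.get"
        and data.get("challenge") == b64url(expected_challenge)
        # The browser fills in the origin; a look-alike domain cannot fake it.
        and data.get("origin") == expected_origin
    )


challenge = b"\x01\x02\x03\x04"
genuine = json.dumps({"type": "webauthn.get",
                      "challenge": b64url(challenge),
                      "origin": "https://bank.example"}).encode()
relayed = json.dumps({"type": "webauthn.get",
                      "challenge": b64url(challenge),
                      "origin": "https://bank-login.example"}).encode()

print(check_client_data(genuine, challenge, "https://bank.example"))
print(check_client_data(relayed, challenge, "https://bank.example"))
```

In this sketch the assertion produced on the look-alike domain is rejected even though the challenge is correct, which is the property that a relayed OTP lacks.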

As these operators continue to refine their tooling, the goal for defenders must shift from simply "detecting" a phish to making the victim's credentials technically impossible to weaponize. Ongoing and frequent updates to these platforms indicate that Chinese-speaking PhaaS operators are continuing to refine their tooling to maximize global impact.

The Blueprint: How Movix fills a gap in dental skills with specialized agentic AI

Fri, 22 May 2026 16:00:00 +0000

Welcome to The Blueprint, a regular feature where we highlight how Google Cloud customers are tackling unique and common challenges across industries using the latest AI and cloud technologies. We hope to inspire others looking to innovate in their work.

The demand for dental appliances, like crowns and aligners, is booming, but it’s hard for manufacturers to keep up. At Movix, we’re building one of the first agentic AI solutions for dental appliance manufacturers and dental labs to help companies in the sector acquire digital technical expertise so they can scale clinical workflows cost-effectively and consistently.

The challenge:

Movix started in 2025 with a mission to solve a serious shortage of skilled dental technicians in aligner manufacturing through AI and agentic workflows. The need is significant: the global dental market is valued at nearly $400 billion and growing at double digits, yet many operations remain analog, creating enormous demand for co-pilot, agentic solutions.

Before founding Movix, we had previously started a vertically integrated dental aligner company that focused on very difficult dental situations, such as very crooked teeth. Yet even with highly skilled and trained technicians, there were often mistakes that would require remaking an aligner — a process that costs $300, roughly 25% of the retail price. Poor quality control took a real bite out of the company’s margins.

We saw an opportunity with Movix to address these mistakes by providing technicians with AI-powered quality control agents that automate aligner workflows and reduce errors. To achieve this, we needed to solve for a few key technical challenges:

Develop a custom AI model and end-to-end agentic workflow, since off-the-shelf solutions lacked domain expertise.
Build scalability into the platform to prevent outages or production delays.
Achieve broad interoperability through a complex hybrid integration strategy, since many dental practices are slow to adopt new technology and run on legacy systems.
Optimize security and compliance to meet medical record regulatory requirements and keep patient data safe.

The solution:

In order to deliver AI agents that can provide expert-level accuracy, we needed to custom build a lot of the tooling ourselves. We started by developing our custom models for deep learning, computer vision, and 3D mesh analysis over a five-month period, using Google Cloud infrastructure. This intensive, methodical time helped ensure the right level of accuracy and quality control.

We use Google Cloud infrastructure across the full pipeline — from dataset storage and model training to evaluation — to build and refine our defect detection models for intraoral scans. Once defects are detected, we use Gemini Enterprise Agent Platform to generate client-facing feedback that reads as if it came directly from a human technician — acting as a digital team member in the quality control workflow.

Our 3D models use Cloud Run with L4 GPUs for the massive compute power we require; notably, performing the 3D segment scans and detecting defects across the entire fabrication process are highly compute-intensive processes. We use Compute Engine VMs to run experiments, along with various other GPUs to train our models, and perform the heavy lifting of model development in this environment.

Cloud Run and other tools like Cloud Storage support our scalability goals as we target large customers who handle high case volumes — some large labs might produce up to 200,000 appliances per year. Google Cloud's global network of data centers also simplifies regulatory compliance across regions and ensures fast delivery of large 3D datasets to clients worldwide.

The architecture:

The outcome:

Our agentic solutions automate data entry and quality control, which are traditionally manual, time-consuming, and error-prone tasks. By automating the work of the best dental technicians, we’re ensuring a top quality product that will improve the fit of crowns, aligners, veneers, and implants for many, many patients. We estimate that our automation and the higher level of accuracy our QC agent delivers could save an aligner manufacturer $300 per remake, for example.

We also believe we’re helping to speed the appliance manufacturing process, leading to quicker turnaround times for dental appliances, which helps dental labs receive revenue faster and improve their cash flow. And we already know we’re meeting a critical need: After we launched the QC agent in October 2025, our first customer signed with us in December. That customer, Orthero, an aligner company serving more than 20 countries, has enjoyed significant results.

“Orthero benefits from this automation by making quality control faster, more consistent, and scalable,” Efer Turhan, a co-founder of Orthero, said. “With support from Movix’s QC AI Agent, we detect missing or inconsistent inputs early and flag unusual deviations before they cause delays.”

The details:

Even with the advantages of AI, our goals demand some serious work. Our architecture supports a solution that’s agentic and modular, integrates into existing on-premises dental systems, and ensures security and compliance.

Our agentic approach allows our system to run checks and balances, manage the complex, multi-step process of quality control for dental scans, and eliminate human errors that occur in data handling and quality review. Our goal is to develop five distinct AI agents by 2029 that cover the entire dental appliance workflow, from original patient dental scan to appliance manufacturing. While our first agents focus on data entry and dental scan quality control, our next agents will handle 3D file repair, clinical review, treatment planning, and manufacturing.

Our solution architecture also enables our system to integrate seamlessly with our customers’ existing lab management and manufacturing systems through API integrations. Because we are selling our solution into a conservative market, we decided to bear the burden of responsibility for successful adoption by doing as much of the integration work as possible.

Because we operate in the highly regulated healthcare industry, we built an environment that strictly follows compliance rules, anonymizing protected health information, or PHI, before it enters our machine learning pipeline to prevent health information from being exposed to the processing environment.
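As a rough illustration of keeping identifiers out of a processing environment, one common pattern is an allow-list of fields combined with a keyed pseudonym in place of the patient identifier. This is a sketch under stated assumptions, not Movix's implementation: the field names, the `pseudonymize_case` function, and the key handling are all hypothetical, and a keyed hash is pseudonymization rather than full anonymization.

```python
import hashlib
import hmac

# Hypothetical allow-list: only these fields may enter the ML pipeline.
ALLOWED_FIELDS = {"scan_file", "appliance_type", "arch"}


def pseudonymize_case(record: dict, secret_key: bytes) -> dict:
    """Drop every field not on the allow-list and add a keyed case token."""
    token = hmac.new(secret_key,
                     record["patient_id"].encode("utf-8"),
                     hashlib.sha256).hexdigest()[:16]
    cleaned = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    cleaned["case_token"] = token  # stable per patient, not reversible without the key
    return cleaned


raw = {
    "patient_id": "P-0001",
    "patient_name": "Jane Example",
    "date_of_birth": "1990-01-01",
    "scan_file": "upper_arch.stl",
    "appliance_type": "aligner",
}
print(sorted(pseudonymize_case(raw, b"demo-key")))
```

An allow-list fails closed: a new identifying field added upstream is dropped by default instead of leaking through.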

We plan to build hybrid solutions to capture a wider market as we move forward. We're designing an architecture that connects our cloud-based AI agents with older, on-premises software that many conservative labs still use — through lightweight local connectors and standardized APIs. This will allow us to access a large market segment that has not yet migrated to the cloud or begun to use new digital dental technologies.

Taken together, we are not just solving a skills gap, we are reimagining what is possible with co-pilot and agentic solutions across the entire dental industry.

How Glance turns hours of video into mobile-ready clips with AI

Thu, 21 May 2026 17:00:00 +0000

Every day, thousands of hours of new video content sits waiting to be discovered. Most of it lives in long-form, horizontal formats, while audiences are scrolling through vertical feeds on their phones.

Glance, a mobile-first content platform, knows this challenge well. The company processes 1-2 hour videos from sources like podcasts, news reports, movies, and web series, and transforms them into 30 to 180-second vertical clips optimized for mobile lock screens. With daily volume projected to grow from 3,500 to over 10,000 videos per day, manual editing wasn’t a realistic path forward.

The solution also needed to go beyond simple cropping. It required the intelligence to identify and center the primary speaker, or dynamically split the screen to stack speakers vertically during conversations, preserving the context that makes content worth watching.

Here’s how Glance’s video generation solution works.

Building for the lock screen era

The goal was to create a complete pipeline that takes a long-form landscape video (16:9) and outputs multiple ready-to-publish short-form portrait videos (9:16). The solution needed to handle:

Key Moment Identification: Finding the most engaging 60-second segments within hours of long-form footage
Active Speaker Detection: Identifying who’s talking in each frame and positioning them at the top of a split screen. This includes distinguishing between a static image and a live person to ensure the crop focuses on the actual speaker.
Split Screen Detection: Recognizing interview layouts (common in news broadcasts) and stacking the frames vertically to preserve conversation context
Intelligent Reframing: Converting a multi-speaker, wide-screen shot into a focused, vertical frame without losing context
Dynamic Caption Highlighting: Generating word-level timestamps for "Karaoke-style" captions that increase engagement on silent-by-default mobile screens
Automated Branding: Applying masks, logos, and overlays programmatically to maintain brand consistency across all videos

The final technical solution uses Google Cloud Speech-to-Text v2, Gemini, and the Google Vision API, combined with custom video manipulation using Samurai (an open-source object tracking tool), OpenCV and MoviePy.

Architecture overview

The pipeline is divided into three distinct modules.

Fig. 2: High-level architecture

Module 1: Video clipping

This module converts long videos to transcripts, identifies key segments, and clips the video. Accuracy matters here: precise word-level timestamps ensure clips start and end exactly where they should.

Fig. 3: Video Clipping Workflow

The process involves audio extraction, speech-to-text transcription, and timestamp identification using generative AI. The module performs the following key functions:

Audio extraction: Extracting the audio from the original video file.
Speech-to-text transcription: Converting audio into text with precise timestamps for each word
Segment identification: Using Gemini 2.5 Flash to analyze the transcript text and identify optimal start and end timestamps for short video clips
Video clipping: Clipping the video into short segments based on the identified timestamps
Transcript validation: Using Gemini to verify phrases and words are accurately captured (this step does not validate word timing)

The output is a set of short video clips, each paired with its time-aligned transcript, ready for the next stage: the Intelligent Reframing Engine.
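To show why word-level timestamps matter for clean cuts, here is a minimal sketch that snaps a proposed clip window to whole-word boundaries so a clip never starts or ends mid-word. The `Word` type and `snap_clip_to_words` function are hypothetical names for illustration; the actual pipeline's data structures are not described in this post.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the source video
    end: float


def snap_clip_to_words(words, clip_start, clip_end):
    """Return (start, end, transcript) covering every word that overlaps the window."""
    inside = [w for w in words if w.end > clip_start and w.start < clip_end]
    if not inside:
        return None
    return inside[0].start, inside[-1].end, " ".join(w.text for w in inside)


words = [Word("hello", 0.0, 0.4), Word("world", 0.5, 0.9), Word("again", 1.2, 1.6)]
print(snap_clip_to_words(words, 0.45, 1.0))  # (0.5, 0.9, 'world')
```

The snapped start and end can then be passed to whatever tool performs the actual cut, and the returned text doubles as the clip's time-aligned transcript.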

Module 2: Intelligent Reframing Engine

The core technical work here is converting a horizontal 16:9 frame into a compelling 9:16 vertical frame. A simple center crop often cuts out key speakers or action, so our solution uses a multi-stage scene analysis pipeline.

Fig. 4: Intelligent reframing engine

Active speaker detection

To know what to crop, we first need to know who’s talking. This happens on a frame-by-frame basis using the face detection capabilities of the Google Cloud Vision API.

Fig. 5: Active speaker detection

The liveness check: Differentiating a live speaker from a static image (like a photo on the wall or a graphic) is essential. This was achieved by tracking facial landmarks:

Mouth movement: Calculating the normalized distance between upper and lower lip landmarks
Head movement: Tracking changes in head pose angles (pan, roll, tilt)
A face must show consistent animation in these cues to be classified as a "live" participant
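The liveness cues above can be sketched as a variance test over a face's per-frame measurements. This is an illustrative approximation only: the dictionary keys, the thresholds, and the choice to accept either cue are assumptions, not the values or logic used in the production system.

```python
from statistics import pstdev


def mouth_openness(upper_lip_y, lower_lip_y, face_height):
    """Lip gap normalized by face height, so face size on screen does not matter."""
    return abs(lower_lip_y - upper_lip_y) / face_height


def is_live(frames, mouth_threshold=0.01, pose_threshold=1.0):
    """frames: per-frame dicts with lip landmarks, face height and pose angles (degrees)."""
    if len(frames) < 2:
        return False
    mouth = [mouth_openness(f["upper_lip_y"], f["lower_lip_y"], f["face_height"])
             for f in frames]
    mouth_moves = pstdev(mouth) > mouth_threshold
    head_moves = any(pstdev([f[angle] for f in frames]) > pose_threshold
                     for angle in ("pan", "roll", "tilt"))
    return mouth_moves or head_moves  # assumption: either cue is enough


def frame(lower_lip_y, pan=0.0):
    return {"upper_lip_y": 100, "lower_lip_y": lower_lip_y, "face_height": 200,
            "pan": pan, "roll": 0.0, "tilt": 0.0}


photo_on_wall = [frame(104)] * 4
talking_head = [frame(104), frame(120), frame(106), frame(118)]
print(is_live(photo_on_wall), is_live(talking_head))  # False True
```

A static image produces identical measurements in every frame, so both deviations are zero and the face is not counted as live.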

Quantifying engagement: Once confirmed as live, we calculate an activity score based on:

Mouth openness
Emotional fluctuation (changes in joy, surprise, etc., provided by Vision API)

Primary speaker identification: The final decision uses a liveness ratio: animated frames divided by total frames where the face appears. The person with the ratio closest to 1.0 (meaning they were consistently animated on screen) is designated as the primary speaker.

One edge case addressed during the development was a static background image appearing behind a live news anchor (as shown in Fig. 6). The liveness check handles this correctly because the static image shows no facial animation.
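The liveness ratio lends itself to a short sketch. The input shape here (a mapping from a face identifier to a list of per-frame "animated" flags) and the function name are hypothetical; only the ratio and the closest-to-1.0 rule come from the description above.

```python
def primary_speaker(animated_flags_by_face):
    """Pick the face whose animated-frames / total-frames ratio is closest to 1.0."""
    ratios = {
        face_id: sum(flags) / len(flags)
        for face_id, flags in animated_flags_by_face.items()
        if flags  # ignore faces that never appeared
    }
    if not ratios:
        return None
    return min(ratios, key=lambda face_id: abs(1.0 - ratios[face_id]))


# The edge case above: a live anchor in front of a static background portrait.
tracks = {
    "anchor": [True, True, True, False],               # ratio 0.75
    "background_photo": [False, False, False, False],  # ratio 0.0
}
print(primary_speaker(tracks))  # anchor
```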

Fig. 6: Scenario with one active speaker and one static background image

Split-screen detection

This step addresses interview scenarios where two subjects appear on opposite sides of the landscape frame. The system detects split-screen layouts and stacks the two halves vertically to maintain conversation context.

Fig. 7: Video reformatting

With active speaker detection complete, the system uses the primary speaker's location to identify split-screen segments. The goal is to find the precise dividing line between panels, enabling the video to be reformatted into a vertical, top-and-bottom layout. Two complementary approaches accomplish this:

Approach 1: Continuous face tracking with Samurai

This method uses Samurai, an open-source object tracking tool, to follow the primary speaker continuously. The trajectory is analyzed for split-screen layouts based on:

Consistent off-center positioning: The speaker remains on one side of the screen (e.g., left or right half), indicating a split panel rather than free movement across the frame.
Vertical dividing line detection: Image analysis identifies a persistent vertical line separating the two panels.
Background discontinuity analysis: Differences in color, texture, and scenery between the speaker’s background and the opposite side confirm two separate video feeds.

Fig. 8: Background discontinuity analysis
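One simple way to picture the dividing-line search is to look for the column where pixel intensity jumps the most between neighbors, and then require that the same column wins across most frames. The sketch below works on a grayscale frame represented as a list of pixel rows; the function names and thresholds are assumptions, and a production system would more likely use OpenCV edge or line detection on real image arrays.

```python
from collections import Counter


def find_dividing_column(gray, min_jump=30.0):
    """Return the column x with the largest mean intensity jump from x-1 to x, or None."""
    height, width = len(gray), len(gray[0])
    best_x, best_jump = None, 0.0
    for x in range(1, width):
        jump = sum(abs(row[x] - row[x - 1]) for row in gray) / height
        if jump > best_jump:
            best_x, best_jump = x, jump
    return best_x if best_jump >= min_jump else None


def persistent_divider(frames, min_share=0.8):
    """Accept a divider only if the same column is found in most frames."""
    if not frames:
        return None
    column, count = Counter(find_dividing_column(f) for f in frames).most_common(1)[0]
    return column if column is not None and count / len(frames) >= min_share else None


split_frame = [[10, 10, 10, 200, 200, 200] for _ in range(4)]
plain_frame = [[50] * 6 for _ in range(4)]
print(find_dividing_column(split_frame))      # 3
print(persistent_divider([split_frame] * 5))  # 3
print(persistent_divider([plain_frame] * 5))  # None
```

Requiring persistence across frames filters out vertical edges that only appear briefly, such as a door frame passing behind a moving subject.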

Approach 2: Frame-by-frame detection with Google Cloud Vision API

This approach uses Cloud Vision API's face detection to identify split-screen layouts based on the primary speaker's face location:

Off-center face: Consistent face detection in one region (such as the left 40% of the frame) flags a potential split screen.
Proximate dividing line: Vertical lines between the face and the screen center confirm a panel boundary.
Contrasting backgrounds: Inconsistent backgrounds between the speaker's side and the far side confirm the split-screen layout.
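The first signal, a face that stays in one side region, can be sketched as follows. The 40% region comes from the example above; the function name and the 90% consistency requirement are assumptions for illustration.

```python
def off_center_side(face_centers_x, frame_width, side_fraction=0.4, min_share=0.9):
    """Return 'left' or 'right' if the face stays in that side region, else None."""
    if not face_centers_x:
        return None
    total = len(face_centers_x)
    left = sum(1 for x in face_centers_x if x < side_fraction * frame_width)
    right = sum(1 for x in face_centers_x if x > (1 - side_fraction) * frame_width)
    if left / total >= min_share:
        return "left"
    if right / total >= min_share:
        return "right"
    return None


print(off_center_side([300, 310, 305, 320], frame_width=1920))   # left
print(off_center_side([900, 1000, 950, 300], frame_width=1920))  # None
```

On its own this only flags a potential split screen; the dividing-line and background checks are what confirm it.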

The output: Vertical stacking

Once the system recognizes a split-screen, it performs a digital cut-and-paste. This preserves both speakers and their reactions in a mobile-native format.
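The cut-and-paste step can be pictured with a frame represented as a list of pixel rows: slice every row at the dividing column, then place the right-hand panel underneath the left-hand one. This is a toy sketch with a hypothetical `stack_panels` function; it assumes both panels have the same width, whereas a real pipeline would operate on image arrays and resize each panel to the target width.

```python
def stack_panels(frame_rows, divider_x):
    """Split each row at divider_x and stack the left panel above the right panel."""
    left_panel = [row[:divider_x] for row in frame_rows]
    right_panel = [row[divider_x:] for row in frame_rows]
    if len(left_panel[0]) != len(right_panel[0]):
        raise ValueError("this sketch assumes panels of equal width")
    return left_panel + right_panel


landscape = [
    ["L", "L", "R", "R"],
    ["L", "L", "R", "R"],
]
for row in stack_panels(landscape, divider_x=2):
    print(row)
```

A 4-wide, 2-tall input becomes 2 wide and 4 tall, which is the landscape-to-portrait change in miniature.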

Automated reformatting

With the scene analysis complete, the OpenCV-based solution intelligently applies the appropriate reframing rule to each segment:

Single speaker crop: For scenes with one primary speaker, the system anchors the 9:16 frame to the speaker’s face, keeping them centered.
Split screen: When a split is detected, the system slices the frame along the dividing line and stacks the panels vertically (left panel on top, right panel on bottom).
Multi-speaker crop: For scenes with multiple people (not a formal split), the system focuses the crop on the most prominent speaker or the face closest to the center.
Fallback: If no faces are detected (e.g., graphics or wide shots), the system applies a center crop or horizontal padding (letterboxing).
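For the single-speaker and fallback rules, the crop window is simple geometry: take the full frame height, derive the 9:16 width, center it on the face (or on the frame when no face is found), and clamp it to the frame edges. The function name and return shape below are illustrative assumptions.

```python
def portrait_crop(frame_w, frame_h, face_cx=None):
    """Return (left, top, width, height) of a 9:16 crop inside a landscape frame."""
    crop_w = frame_h * 9 // 16
    if crop_w > frame_w:
        raise ValueError("frame is too narrow for a full-height 9:16 crop")
    center = frame_w / 2 if face_cx is None else face_cx  # fallback: center crop
    left = int(center - crop_w / 2)
    left = max(0, min(left, frame_w - crop_w))  # keep the window inside the frame
    return left, 0, crop_w, frame_h


print(portrait_crop(1920, 1080, face_cx=960))   # (656, 0, 607, 1080)
print(portrait_crop(1920, 1080, face_cx=100))   # (0, 0, 607, 1080)
print(portrait_crop(1920, 1080, face_cx=1900))  # (1313, 0, 607, 1080)
print(portrait_crop(1920, 1080))                # (656, 0, 607, 1080)
```

The clamp matters for speakers near the edge of the frame: without it, the crop would extend past the image and need padding.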

Two final techniques ensure a polished look:

Short scene merging: Segments shorter than a defined threshold merge with the preceding or following scene, eliminating flicker.
Camera smoothing: When focus shifts between speakers, a virtual camera effect creates a slow pan from one position to the next, rather than an abrupt cut.
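Both finishing techniques can be sketched in a few lines. Scene merging folds any too-short segment into its neighbor, and camera smoothing is shown here as an exponential moving average over the crop center, which is one common way to get a slow pan. The function names, the tuple layout, and the smoothing method are assumptions; the post does not specify how the virtual camera is implemented.

```python
def merge_short_scenes(scenes, min_len):
    """scenes: list of (start, end, label). Short scenes are absorbed by a neighbor."""
    merged = []
    for start, end, label in scenes:
        if merged and end - start < min_len:
            prev_start, _, prev_label = merged[-1]
            merged[-1] = (prev_start, end, prev_label)  # extend the preceding scene
        else:
            merged.append((start, end, label))
    # A short opening scene has no predecessor, so fold it into the next one.
    if len(merged) > 1 and merged[0][1] - merged[0][0] < min_len:
        first_start = merged[0][0]
        _, next_end, next_label = merged[1]
        merged[0:2] = [(first_start, next_end, next_label)]
    return merged


def smooth_positions(targets, alpha=0.2):
    """Move a fraction alpha toward the target crop center on every frame."""
    smoothed, current = [], None
    for target in targets:
        current = target if current is None else current + alpha * (target - current)
        smoothed.append(current)
    return smoothed


print(merge_short_scenes([(0.0, 5.0, "A"), (5.0, 5.3, "B"), (5.3, 10.0, "A")], 1.0))
print(smooth_positions([100, 100, 500, 500], alpha=0.5))  # [100, 100.0, 300.0, 400.0]
```

A smaller `alpha` gives a slower, smoother pan at the cost of the crop lagging further behind a speaker change.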

Module 3: Finishing and branding

The final stage ensures the clips are ready for immediate publication, focusing on viewer engagement and brand reinforcement.

Dynamic caption highlighting

Using the word-level timestamps from the speech-to-text module, the system overlays highlighted captions with MoviePy. This involves:

Fig. 9: Dynamic caption highlighting

Sentence reconstruction: Grouping individual words into readable lines that adhere to character limits
Highlighting: The currently spoken word is highlighted in a distinct color (mustard yellow) against a black background, a proven method for increasing engagement when videos play without sound.
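The two caption steps can be sketched independently of the rendering library: first group timed words into lines under a character limit, then look up which word in a line is being spoken at a given moment. The dictionary keys, function names, and the 12-character limit in the example are assumptions; the rendering itself (done with MoviePy in the pipeline) is not shown.

```python
def group_words_into_lines(words, max_chars=24):
    """Greedily pack timed words into caption lines no longer than max_chars."""
    lines, current = [], []
    for word in words:
        candidate = " ".join(w["text"] for w in current + [word])
        if current and len(candidate) > max_chars:
            lines.append(current)
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(current)
    return lines


def active_word_index(line, t):
    """Index of the word being spoken at time t (seconds), or None during a pause."""
    for i, word in enumerate(line):
        if word["start"] <= t < word["end"]:
            return i
    return None


texts = "AI turns long video into short clips".split()
words = [{"text": text, "start": i * 0.5, "end": i * 0.5 + 0.4}
         for i, text in enumerate(texts)]
lines = group_words_into_lines(words, max_chars=12)
print([" ".join(w["text"] for w in line) for line in lines])
# ['AI turns', 'long video', 'into short', 'clips']
print(active_word_index(lines[1], t=1.6))  # 1  (the word "video")
```

A renderer would draw each line once per frame and color only the word at the active index.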

Masking and logo placement

Two overlay techniques maintain consistent branding across all videos:

Mask placement: A PNG mask with an alpha channel resizes the video to fit precisely into the transparent area. The mask's opaque regions (such as colored bars) serve as a dedicated background for captions and persistent graphics.
Logo overlay: The brand logo is placed onto the video based on configurable parameters for position (top-right, bottom-left, and so on), size, and margin.

Fig. 10: Mask and logo placement
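The configurable logo placement reduces to computing a top-left pixel coordinate from a corner name, the logo size, and a margin. A minimal sketch, with a hypothetical function name and parameter defaults:

```python
def logo_position(frame_w, frame_h, logo_w, logo_h, corner="top-right", margin=20):
    """Return the (x, y) of the logo's top-left pixel for a named corner."""
    vertical, horizontal = corner.split("-")
    if vertical not in ("top", "bottom") or horizontal not in ("left", "right"):
        raise ValueError(f"unknown corner: {corner}")
    x = margin if horizontal == "left" else frame_w - logo_w - margin
    y = margin if vertical == "top" else frame_h - logo_h - margin
    return x, y


print(logo_position(1080, 1920, 200, 100, "top-right"))    # (860, 20)
print(logo_position(1080, 1920, 200, 100, "bottom-left"))  # (20, 1800)
```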

Conclusion

Glance’s video pipeline demonstrates what becomes possible when AI handles the repetitive, judgement-intensive work of video editing. By combining speech-to-text transcription, computer vision, and generative AI, the system transforms thousands of long-form videos into mobile-ready clips each day, preserving narrative context while optimizing for vertical viewing.

The approach offers a template for any organization sitting on long-form video archives. Rather than choosing between scale and quality, automated pipelines can deliver both.

If you’re exploring similar video processing, content transformation, or media AI projects, the Google Cloud consulting team is eager to connect and explore the possibilities. For more on the AI products used in solutions like Glance’s, visit our AI & ML Products page.

This solution was a collaborative effort between Glance (Pradeep Tiwari, Himanshu Aggarwal) and Google Cloud Consulting (Sharmila Devi, Jinyeong Yim, Rohit Sroch, Neeraj Shivhare, and Kinjal Singh).