A healthcare agent needs orchestration, policy, state, and tools
A production healthcare agent is usually a small system of cooperating parts rather than a model endpoint with a longer prompt. The planner decides the next step, the tool layer executes only approved actions, the state layer carries context across steps, and a policy gate decides whether the workflow can proceed or must stop for review.
Google Cloud's agentic AI architecture guidance makes the same boundary explicit in its single-agent reference architecture: the user-facing frontend, the agent runtime, and the connected systems stay separate. In healthcare, that separation matters because the chat surface should not become the security boundary for FHIR access, payer APIs, or scheduling updates.
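The division of labor among planner, tool layer, state layer, and policy gate can be illustrated with a minimal sketch. Everything named here (`APPROVED_TOOLS`, `REVIEW_REQUIRED`, `CaseState`, the stub planner) is a hypothetical placeholder rather than part of any vendor SDK, and the planner is a hard-coded stub where a real system would call a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical tool layer: only actions registered here can ever execute.
APPROVED_TOOLS: dict = {
    "get_demographics": lambda args: {"patient_id": args["patient_id"]},
    "create_draft_task": lambda args: {"task_id": "draft-1", "status": "draft"},
}

# Hypothetical policy: these tools must stop for human review first.
REVIEW_REQUIRED = {"create_draft_task"}


@dataclass
class CaseState:
    """State layer: carries context across steps."""
    request: str
    observations: list = field(default_factory=list)
    status: str = "running"  # running | needs_review | denied | done


def plan_next_step(state: CaseState) -> Optional[dict]:
    """Planner stub; a real planner would ask a model for the next step."""
    if not state.observations:
        return {"tool": "get_demographics", "args": {"patient_id": "p-123"}}
    if len(state.observations) == 1:
        return {"tool": "create_draft_task", "args": {"patient_id": "p-123"}}
    return None


def policy_gate(step: dict) -> str:
    """Policy gate: decide whether the step may proceed."""
    if step["tool"] not in APPROVED_TOOLS:
        return "deny"
    if step["tool"] in REVIEW_REQUIRED:
        return "review"
    return "allow"


def run(state: CaseState, max_steps: int = 5) -> CaseState:
    for _ in range(max_steps):
        step = plan_next_step(state)
        if step is None:
            state.status = "done"
            return state
        decision = policy_gate(step)
        if decision != "allow":
            state.status = "needs_review" if decision == "review" else "denied"
            return state
        handler: Callable = APPROVED_TOOLS[step["tool"]]
        state.observations.append({"tool": step["tool"], "result": handler(step["args"])})
    state.status = "needs_review"  # step budget exhausted; hand off to a person
    return state
```

Calling `run(CaseState(request="start referral"))` executes the read-only lookup, then halts with `status == "needs_review"` when the planner proposes the draft task. The point of the sketch is that the stop happens in the orchestration loop, not in the chat surface.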
[Diagram: Reference architecture for a governed healthcare agent]
Start with one agent when one workflow boundary is enough
If one planner plus a narrow tool gateway can finish the task safely, keep the design single-agent. Multi-agent orchestration adds trace, routing, and testing complexity that only pays off when the workflow genuinely splits into distinct specialties.
Agentic AI on Google Cloud
Google Cloud Architecture Center overview of single-agent systems, multi-agent systems, and the linked design-pattern guidance.
Introduction to the Model Context Protocol
Official MCP introduction covering the standard pattern for connecting models to tools, prompts, and resources.
Pattern choice should follow the healthcare workflow shape
Google Cloud's design-pattern guide is useful because it separates fixed workflow patterns from AI-routed patterns. In healthcare, that distinction maps directly to governance: a deterministic intake pipeline should not be over-engineered as a coordinator system, while a complex case-routing workflow should not be forced into a brittle linear chain.
How Google Cloud agentic AI patterns translate into healthcare workflows
| Pattern | Healthcare fit | Why it helps | Main control |
|---|---|---|---|
| Sequential | Fixed prior-auth packet assembly or referral-completion pipeline | The order stays stable across cases | Stop on missing documents or identity mismatches |
| Parallel | Concurrent retrieval of benefits, policy criteria, and recent clinical context | Independent sub-tasks can run at the same time | Normalize provenance before synthesis |
| Review and critique | Draft denial appeal, referral letter, or discharge summary review | A critic can check citations, formatting, and safety criteria before release | Make the critique rubric explicit and auditable |
| Coordinator | Case triage across eligibility, scheduling, payer policy, and escalation agents | Routing depends on the context of the incoming case | Cap the agent set and log dispatch decisions |
| Hierarchical decomposition | Enterprise access center workflows that branch by service line and task class | Complex cases need multi-level planning before execution | Preserve shared case state and owner handoffs |
| ReAct | Bounded investigations where each tool call depends on the previous observation | The agent can adjust its next step as evidence arrives | Limit iterations, tool budgets, and allowed actions |
| Human-in-the-loop | Chart writeback, outbound submission, or de-identified dataset release | A person signs off before an irreversible or sensitive action completes | Keep approval in orchestration rather than only in the UI |
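As one illustration of the table, the parallel row and its control ("normalize provenance before synthesis") might look like the sketch below. The three fetch functions are hypothetical stubs standing in for payer, policy, and clinical services, and the envelope fields are an assumed shape, not a standard.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone


# Hypothetical retrieval stubs; real ones would call external services.
def fetch_benefits(case_id: str) -> dict:
    return {"plan": "PPO", "active": True}


def fetch_policy_criteria(case_id: str) -> dict:
    return {"criteria": ["conservative therapy tried"]}


def fetch_clinical_context(case_id: str) -> dict:
    return {"recent_observations": 3}


SOURCES = {
    "benefits": fetch_benefits,
    "policy_criteria": fetch_policy_criteria,
    "clinical_context": fetch_clinical_context,
}


def with_provenance(source: str, fetch, case_id: str) -> dict:
    """Wrap each result in one envelope so synthesis can cite its origin."""
    return {
        "source": source,
        "case_id": case_id,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "payload": fetch(case_id),
    }


def gather_evidence(case_id: str) -> list:
    """Run the independent lookups concurrently; results keep submission order."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = [
            pool.submit(with_provenance, name, fn, case_id)
            for name, fn in SOURCES.items()
        ]
        return [f.result() for f in futures]
```

Because every result arrives in the same envelope, a downstream synthesis step can attribute each statement to a named source and retrieval time regardless of which service produced it.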
[Diagram: Healthcare pattern selection guide]
Choose an agent design pattern
Google Cloud Architecture Center guidance for sequential, parallel, review-and-critique, coordinator, hierarchical, ReAct, and human-in-the-loop patterns.
Tool contracts should be explicit enough to test and restrict
Healthcare teams often expose too much functionality to the model because the API surface already exists. The better pattern is to expose a minimal tool set aligned to the workflow boundary: retrieve demographics, pull recent observations, create a draft task, or send a case to a review queue. Each tool should have well-defined inputs, outputs, and allowed actors.
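A tool contract with well-defined inputs, outputs, and allowed actors can be sketched as a small data structure that enforces all three at call time. `ToolContract`, the actor names, and the `get_demographics` handler are hypothetical; a production system would more likely express the same constraints in an OpenAPI or MCP schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolContract:
    name: str
    required_inputs: dict      # field name -> expected Python type
    output_fields: frozenset   # fields the model is allowed to see
    allowed_actors: frozenset  # agents permitted to call this tool
    handler: Callable[[dict], dict]

    def invoke(self, actor: str, args: dict) -> dict:
        if actor not in self.allowed_actors:
            raise PermissionError(f"{actor} may not call {self.name}")
        if set(args) != set(self.required_inputs):
            raise ValueError(f"{self.name} expects exactly {sorted(self.required_inputs)}")
        for key, expected in self.required_inputs.items():
            if not isinstance(args[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        result = self.handler(args)
        # Return only declared output fields, even if the backend sends more.
        return {k: v for k, v in result.items() if k in self.output_fields}


# Hypothetical tool: the backend returns an extra field the contract hides.
get_demographics = ToolContract(
    name="get_demographics",
    required_inputs={"patient_id": str},
    output_fields=frozenset({"patient_id", "birth_year"}),
    allowed_actors=frozenset({"intake_agent"}),
    handler=lambda args: {
        "patient_id": args["patient_id"],
        "birth_year": 1980,
        "internal_note": "not for model consumption",
    },
)
```

With this shape, each restriction becomes a unit test: a disallowed actor raises `PermissionError`, a malformed argument raises `TypeError` or `ValueError`, and undeclared backend fields never reach the model.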
Tool contract patterns for healthcare agents
| Contract style | Best fit | Important control |
|---|---|---|
| FHIR REST operation | Patient, encounter, observation, or task retrieval | SMART scopes, resource filters, and audit logs |
| OpenAPI business tool | Scheduling, prior auth, or denial-management services | Narrow endpoints and parameter validation |
| MCP tool server | Unified access to approved tools and resources | Per-tool exposure policy and session isolation |
| Queue-backed workflow action | Operations that must stop for asynchronous human review | Explicit state transitions and reviewer ownership |
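For the queue-backed row, "explicit state transitions and reviewer ownership" could be modeled roughly as follows. The status names, the transition table, and the `ReviewItem` class are assumptions made for illustration; a real deployment would persist this state in a queue or workflow engine.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical transition table: anything not listed here is rejected.
ALLOWED_TRANSITIONS = {
    "drafted": {"queued_for_review"},
    "queued_for_review": {"approved", "rejected"},
    "approved": {"submitted"},
    "rejected": set(),
    "submitted": set(),
}


@dataclass
class ReviewItem:
    case_id: str
    status: str = "drafted"
    reviewer: Optional[str] = None
    history: list = field(default_factory=list)  # (from, to, actor) tuples

    def _move(self, new_status: str, actor: str) -> None:
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"{self.status} -> {new_status} is not allowed")
        self.history.append((self.status, new_status, actor))
        self.status = new_status

    def enqueue(self, reviewer: str) -> None:
        """The agent hands the item to a named owner and stops."""
        self._move("queued_for_review", actor="agent")
        self.reviewer = reviewer

    def decide(self, actor: str, approved: bool) -> None:
        """Only the assigned reviewer may approve or reject."""
        if actor != self.reviewer:
            raise PermissionError(f"{actor} does not own case {self.case_id}")
        self._move("approved" if approved else "rejected", actor)

    def submit(self) -> None:
        """Submission is reachable only from the approved state."""
        self._move("submitted", actor="agent")
```

An agent that tries to call `submit()` on a freshly drafted item gets a `ValueError`, so the asynchronous review step cannot be skipped by reordering calls.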
How Amazon Bedrock Agents work
Official Amazon Bedrock documentation for action groups, schemas, Lambda-backed tools, and knowledge-base wiring.
Amazon Bedrock AgentCore Gateway
AWS documentation for building a governed gateway layer that can expose MCP servers, APIs, Lambda tools, and OAuth-protected endpoints to agents.
Cloud API Registry overview
Google Cloud documentation for discovering, enabling, and monitoring MCP servers and tools through Cloud API Registry.
Use multi-agent reference architectures when the case really splits into specialties
Google Cloud's multi-agent reference architecture is built around a coordinator agent, specialized subagents, and interoperable runtimes. That model fits healthcare when each subagent owns a materially different job, such as intake normalization, payer-policy retrieval, clinical summarization, or queue writeback, and when the coordinator must keep the full case trace intact across those handoffs.
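A coordinator that keeps the full case trace across handoffs might be sketched like this. The `Coordinator` class, the cap of four subagents, the subagent names, and the rule-based `route` method are all illustrative assumptions; this is plain Python, not the A2A protocol or any Google Cloud runtime API, and a real coordinator would typically route with a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_SUBAGENTS = 4  # hypothetical cap on the agent set


@dataclass
class Coordinator:
    subagents: dict = field(default_factory=dict)
    dispatch_log: list = field(default_factory=list)

    def register(self, name: str, handler: Callable[[dict], dict]) -> None:
        if len(self.subagents) >= MAX_SUBAGENTS:
            raise RuntimeError("agent set is capped; merge or remove a subagent first")
        self.subagents[name] = handler

    def route(self, case: dict) -> Optional[str]:
        """Routing stub: choose the next specialty from what the trace lacks."""
        done = {entry["agent"] for entry in case["trace"]}
        if "intake" not in done:
            return "intake"
        if case.get("payer") and "payer_policy" not in done:
            return "payer_policy"
        if "clinical_summary" not in done:
            return "clinical_summary"
        return None

    def run_case(self, case: dict) -> dict:
        case.setdefault("trace", [])
        while (target := self.route(case)) is not None:
            result = self.subagents[target](case)
            # The shared trace travels with the case across every handoff.
            case["trace"].append({"agent": target, "result": result})
            self.dispatch_log.append({"case_id": case["case_id"], "agent": target})
        return case


coordinator = Coordinator()
coordinator.register("intake", lambda case: {"normalized": True})
coordinator.register("payer_policy", lambda case: {"criteria_found": 2})
coordinator.register("clinical_summary", lambda case: {"summary": "stub"})
```

A case with a payer passes through all three subagents, while a case without one skips the policy lookup. In both situations the dispatch log records which agent handled which case, which is what makes the routing auditable afterwards.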
[Diagram: Prior authorization case routed through specialized healthcare agents]
Do not create subagents without a real boundary
If two agents share the same tools, policy boundary, and reasoning role, they are usually better modeled as one agent with cleaner prompts or tools. Specialization should reflect different expertise, approval rules, or runtime needs, not a desire for architectural novelty.
Multi-agent AI system
Google Cloud Architecture Center reference architecture for coordinator-driven multi-agent systems, including A2A-based agent communication and runtime options.
Sessions and approvals keep multi-step behavior inspectable
Because agentic workflows span multiple calls, they need durable session state: what the user asked, what tools were used, what evidence was retrieved, and whether a reviewer already accepted or rejected the proposed action. Without session discipline, healthcare agents become impossible to explain after the fact.
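The four things a session must remember can be captured in a small serializable record. The `AgentSession` class and its field names are hypothetical, and JSON text stands in for whatever durable store a managed runtime would provide.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class AgentSession:
    session_id: str
    user_request: str                               # what the user asked
    tool_calls: list = field(default_factory=list)  # what tools were used
    evidence: list = field(default_factory=list)    # what was retrieved
    review_decision: Optional[str] = None           # None, "accepted", or "rejected"

    def record_tool_call(self, tool: str, args: dict, evidence_ref: str) -> None:
        self.tool_calls.append({"tool": tool, "args": args})
        self.evidence.append({"ref": evidence_ref, "from_tool": tool})

    def to_json(self) -> str:
        """Serialize for durable storage between calls."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "AgentSession":
        """Restore a session so a later step or an auditor can inspect it."""
        return cls(**json.loads(raw))
```

Because the record round-trips through plain JSON, the same structure that drives the next step of the workflow can also be handed to an auditor who needs to reconstruct why an action was proposed.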
[Diagram: Session-aware healthcare agent interaction]
Approval is part of architecture, not only UI
If a workflow requires sign-off, that constraint should live in the orchestration and tool policy layer so a future client or automation job cannot bypass it accidentally.
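One way to keep sign-off out of the UI is to enforce it inside the single function through which every client reaches the tools. In this sketch, `SENSITIVE_TOOLS`, `ApprovalRequired`, and the `approvals` lookup keyed by tool and case are hypothetical, and the execution itself is a stub.

```python
class ApprovalRequired(Exception):
    """Raised when a sensitive tool is called without a recorded sign-off."""


# Hypothetical policy: tools that need an accepted review before running.
SENSITIVE_TOOLS = {"chart_writeback", "submit_prior_auth"}


def execute_tool(tool: str, args: dict, approvals: dict) -> dict:
    """Single entry point for chat UIs, batch jobs, and API clients alike.

    `approvals` maps (tool, case_id) to a recorded reviewer decision.
    """
    if tool in SENSITIVE_TOOLS:
        approval = approvals.get((tool, args["case_id"]))
        if approval is None or approval["decision"] != "accepted":
            raise ApprovalRequired(f"{tool} on {args['case_id']} needs reviewer sign-off")
    # Stub execution; a real gateway would dispatch to the backing service.
    return {"tool": tool, "case_id": args["case_id"], "status": "executed"}
```

A new client that never renders an approval dialog still cannot complete a chart writeback, because the check runs in the tool policy layer rather than in the interface.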
Agent Engine overview
Google Cloud documentation describing managed deployment, sessions, memory, and enterprise operations for agent runtimes.