Data foundation layer
Agents are only as good as their data access. The data foundation layer provides the knowledge bases, vector stores, structured APIs, document stores, and real-time feeds that agents use to retrieve evidence, validate context, and take informed action.
Data foundation components for agent systems
| Component | Purpose | Design consideration |
|---|---|---|
| Knowledge bases | Curated, domain-specific reference data | Maintainability, update cadence, and source attribution |
| Vector stores | Semantic search and retrieval for RAG workflows | Embedding model choice, chunking strategy, and freshness |
| Structured APIs | Transaction systems and operational data | API design, rate limits, and authentication |
| Document stores | Unstructured content such as policies, manuals, or contracts | Indexing strategy, access controls, and versioning |
| Real-time feeds | Live data streams for time-sensitive decisions | Latency, ordering guarantees, and failure handling |
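As a concrete illustration of the vector-store row, retrieval for RAG reduces to embedding text chunks and ranking them by similarity to the query. The sketch below uses a toy in-memory index with cosine similarity; the `embed` callable is a stand-in for whatever embedding model you actually deploy, not a real API.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class InMemoryVectorStore:
    """Toy vector store: index (chunk, embedding) pairs, rank by similarity."""

    def __init__(self, embed):
        self.embed = embed  # stand-in embedding function: str -> list[float]
        self.entries: list[tuple[str, list[float]]] = []

    def add(self, chunk: str) -> None:
        """Index one chunk; the chunking strategy happens upstream of this call."""
        self.entries.append((chunk, self.embed(chunk)))

    def search(self, query: str, k: int = 3) -> list[str]:
        """Return the k chunks most similar to the query."""
        q = self.embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]
```

A production store replaces the linear scan with an approximate nearest-neighbor index, and freshness, the third design consideration above, becomes a question of when chunks are re-embedded.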
Data governance is critical for agent-accessible data. Ensure that agents only access data they are authorized to see, that retrieval is auditable, and that stale or incorrect data does not lead to flawed decisions. Treat data products as contracts with clear schemas and service level objectives.
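To make "data products as contracts" concrete, a minimal sketch: declare the schema once and reject records that violate it before any agent consumes them. The `CustomerRecord` fields here are hypothetical, not a standard schema.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class CustomerRecord:
    """Declared schema for a hypothetical data product.

    This is the contract: consumers can rely on these fields being present.
    """
    customer_id: str
    region: str
    last_updated: str  # ISO 8601 timestamp; freshness is part of the contract

def validate(raw: dict) -> CustomerRecord:
    """Reject records that break the contract instead of passing them to agents."""
    expected = {f.name for f in fields(CustomerRecord)}
    missing = expected - raw.keys()
    if missing:
        raise ValueError(f"contract violation, missing fields: {missing}")
    return CustomerRecord(**{k: raw[k] for k in expected})
```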
The Agentic Enterprise — Data Layer and Semantic Layer
Salesforce Architect guide covering the Data Layer (VectorDB, Lakehouse, Data Contracts, AI-Ready Data Fabric) and the Semantic Layer (Enterprise Knowledge Graph, Semantic Query Engine) that underpin agent reasoning.
Knowledge Bases for Amazon Bedrock
AWS documentation on managed RAG capabilities, enabling foundation models and agents to access company data for grounded responses.
Agent runtime and execution environments
The agent runtime determines where agents execute and how state is managed. Runtime choices affect isolation, persistence, scaling, and observability.
Runtime considerations for agent systems
| Consideration | What it means for agents | Example implementation |
|---|---|---|
| Execution isolation | How agents are separated from each other and from the host | Containers, sandboxes, or serverless functions |
| Session persistence | How conversational and workflow state is stored | In-memory cache, durable store, or distributed cache |
| Timeout handling | How long-running operations are managed | Async workflows, polling, or webhooks |
| Resource limits | Constraints on compute, memory, and concurrency | Quotas, throttling, or auto-scaling policies |
| Scaling model | How the system responds to load | Horizontal scaling, function auto-scaling, or provisioned capacity |
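As one example of the timeout row, long-running tool calls can be bounded and handed off to an async path when they exceed the budget. A sketch using Python's asyncio; `call_tool` is a hypothetical stand-in for a slow downstream operation:

```python
import asyncio

async def call_tool(name: str) -> str:
    """Stand-in for a slow downstream tool call."""
    await asyncio.sleep(5)
    return f"{name}: done"

async def invoke_with_timeout(name: str, timeout_s: float = 2.0) -> str:
    """Bound the synchronous wait; hand off to an async path on timeout."""
    try:
        return await asyncio.wait_for(call_tool(name), timeout=timeout_s)
    except asyncio.TimeoutError:
        # In a real runtime you would enqueue the job and return a job id
        # the caller can poll, or register a webhook for completion.
        return f"{name}: accepted, poll for result"

print(asyncio.run(invoke_with_timeout("report_generator")))
```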
Agent runtime request flow
Session state management patterns vary in complexity. In-memory state is simple but lost on restart. Persisted state survives restarts but adds latency. Distributed state scales across instances but requires coordination. Choose based on session durability requirements and scaling needs.
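One way to defer that choice is to hide the session store behind a small interface so implementations can be swapped as durability requirements change. A sketch with an in-memory and a file-backed variant; both class names are illustrative, not a particular framework's API:

```python
import json
from pathlib import Path

class InMemorySessions:
    """Fast but lost on restart: fine for short-lived sessions."""

    def __init__(self):
        self._data: dict[str, dict] = {}

    def load(self, session_id: str) -> dict:
        return self._data.get(session_id, {})

    def save(self, session_id: str, state: dict) -> None:
        self._data[session_id] = state

class FileSessions:
    """Durable but slower: state survives process restarts."""

    def __init__(self, root: str = "./sessions"):
        self.root = Path(root)
        self.root.mkdir(exist_ok=True)

    def load(self, session_id: str) -> dict:
        path = self.root / f"{session_id}.json"
        return json.loads(path.read_text()) if path.exists() else {}

    def save(self, session_id: str, state: dict) -> None:
        (self.root / f"{session_id}.json").write_text(json.dumps(state))
```

A distributed cache slots in behind the same two methods when sessions must be shared across instances.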
Scaling and resilience patterns
Agent systems need production-grade reliability. Resilience patterns prevent cascading failures, handle overload gracefully, and ensure that the system can recover from errors without losing data or trust. Two of these patterns are sketched in code after the table below.
Resilience patterns for agent workflows
| Pattern | What it does | When to use it |
|---|---|---|
| Circuit breaker | Stops calling a failing service after a threshold | Downstream services are failing or timing out |
| Rate limiting | Throttles requests to protect downstream systems | APIs have quotas or scale limits |
| Backpressure | Signals the producer to slow down when the consumer is overwhelmed | Processing pipelines are congested |
| Retry with backoff | Retries failed operations with increasing delays | Transient failures are common |
| Graceful degradation | Reduces functionality instead of failing completely | Non-critical features are unavailable |
| Dead letter queue | Captures failed messages for later inspection and retry | Message processing fails and needs manual review |
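To ground two of these patterns, here is a minimal sketch combining retry with exponential backoff and a simple circuit breaker. The threshold, cooldown, and jitter values are illustrative defaults, not recommendations:

```python
import random
import time

class CircuitBreaker:
    """Stop calling a failing dependency after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 3, cooldown_s: float = 30.0):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None  # timestamp when the breaker tripped

    def call(self, fn, *args, retries: int = 3):
        # Fail fast while the breaker is open and still cooling down.
        if self.opened_at and time.time() - self.opened_at < self.cooldown_s:
            raise RuntimeError("circuit open: skipping call")
        self.opened_at = None  # cooldown elapsed: allow a trial call
        for attempt in range(retries):
            try:
                result = fn(*args)
                self.failures = 0  # any success resets the breaker
                return result
            except Exception:
                self.failures += 1
                if self.failures >= self.threshold:
                    self.opened_at = time.time()  # trip the breaker
                    raise
                # Exponential backoff with jitter for transient failures.
                time.sleep(2 ** attempt + random.random())
        raise RuntimeError("retries exhausted")
```

Usage is `breaker.call(fetch_inventory, "sku-123")`, where `fetch_inventory` is any callable (hypothetical here) that hits the flaky dependency.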
Cost-aware scaling is essential because agent invocations are expensive: each one consumes model tokens and compute time. Cache retrieval results where possible, batch requests when appropriate, and set budgets and alerts to prevent cost overruns.
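Caching retrieval results is often the cheapest of these wins, since repeated queries skip both the embedding call and the search. A minimal TTL cache sketch; the five-minute lifetime is an arbitrary placeholder:

```python
import time

class TTLCache:
    """Cache retrieval results so repeated queries avoid model and search costs."""

    def __init__(self, ttl_s: float = 300.0):
        self.ttl_s = ttl_s
        self._store: dict = {}  # key -> (timestamp, value)

    def get_or_compute(self, key: str, compute):
        entry = self._store.get(key)
        if entry is not None and time.time() - entry[0] < self.ttl_s:
            return entry[1]  # fresh hit: no tokens spent
        value = compute()  # miss or stale: pay the retrieval cost once
        self._store[key] = (time.time(), value)
        return value
```

With the earlier vector-store sketch, usage would be `cache.get_or_compute(query, lambda: store.search(query))`.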
Monitor token usage and latency as first-class operational metrics
Every agent invocation consumes tokens and time. Track average and p95 latency, token counts per request, and cost per invocation. Use these metrics to detect anomalies, optimize prompts, and set budget controls.
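A sketch of recording these numbers per invocation and summarizing them; the per-1K-token prices are placeholders to show the arithmetic, not any provider's actual rates:

```python
import statistics

class InvocationMetrics:
    """Track latency, tokens, and cost per agent invocation."""

    # Placeholder prices per 1K tokens; substitute your provider's real rates.
    PRICE_IN, PRICE_OUT = 0.003, 0.015

    def __init__(self):
        self.latencies_ms: list[float] = []
        self.costs: list[float] = []

    def record(self, latency_ms: float, tokens_in: int, tokens_out: int) -> None:
        cost = tokens_in / 1000 * self.PRICE_IN + tokens_out / 1000 * self.PRICE_OUT
        self.latencies_ms.append(latency_ms)
        self.costs.append(cost)

    def summary(self) -> dict:
        """Requires at least two recorded invocations for the quantile math."""
        cuts = statistics.quantiles(self.latencies_ms, n=20)  # 19 cut points
        return {
            "avg_latency_ms": statistics.fmean(self.latencies_ms),
            "p95_latency_ms": cuts[18],  # 19th cut point = 95th percentile
            "avg_cost_usd": statistics.fmean(self.costs),
            "total_cost_usd": sum(self.costs),
        }
```

An anomaly such as a jump in p95 latency or cost per invocation then surfaces in the summary rather than on the monthly bill.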