High-Throughput Database Selection
The core workflow engine of a RIS requires an exceptionally robust database layer to manage complex patient states, dynamic schedules, and dense clinical metadata.
Amazon RDS vs DynamoDB: Architectural Comparison for RIS Workloads
| Architectural Dimension | Amazon RDS (Relational) | Amazon DynamoDB (NoSQL) |
|---|---|---|
| Data Structure | Structured schemas, highly normalized relational tables. | Flexible schema, key-value or document-based architecture. |
| Query Flexibility | High. Supports ad-hoc queries, deep SQL joins, and complex aggregations. | Low. Queries must be specifically designed around partition keys. |
| Latency Profile | Low latency, but dependent on query complexity and table locks. | Consistent single-digit millisecond latency for key-based access, provided traffic is spread evenly across partition keys ([AWS DynamoDB SLA Documentation](https://aws.amazon.com/dynamodb/sla/)). |
| Event-Driven Triggers | Complex to implement native event triggers to external serverless functions. | Native integration via DynamoDB Streams for capturing data changes. |
| RIS Application Fit | Ideal for complex retrospective business analytics and structured billing ledgers. | Ideal for real-time patient tracking boards, HL7 ingest, and state machine transitions. |
An optimal RIS architecture frequently utilizes a polyglot persistence strategy. DynamoDB manages the high-velocity operational state machine and real-time patient tracking, while Amazon RDS (often Aurora PostgreSQL) serves as the durable system of record for complex relational billing data.
DynamoDB Partition Key Design Best Practices
Choose partition keys with high cardinality and uniform access patterns to avoid hot partitions. For patient tracking, key on patient-level identifiers, for example a patientId#encounterId composite, so that load spreads evenly across many key values. Avoid coarse time-based values (such as the current date or hour) as the sole partition key: every write in that window shares one key value and therefore lands on a single partition.
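To make the hot-partition risk concrete, the sketch below compares two candidate partition keys over the same batch of writes. The `WriteEvent` shape, the `maxKeyShare` helper, and the sample data are illustrative assumptions, not part of any AWS API; the sketch only measures how unevenly writes fall across key values.

```typescript
// Illustrative sketch: how concentrated are writes under each candidate key?
interface WriteEvent {
  patientId: string;
  encounterId: string;
  timestamp: string; // ISO 8601
}

// Candidate A: calendar date as the sole partition key (few distinct values).
const dateKey = (e: WriteEvent): string => e.timestamp.slice(0, 10);

// Candidate B: composite patient/encounter key (many distinct values).
const patientEncounterKey = (e: WriteEvent): string =>
  `${e.patientId}#${e.encounterId}`;

// Fraction of all writes that land on the single busiest key value.
function maxKeyShare(
  events: WriteEvent[],
  keyOf: (e: WriteEvent) => string
): number {
  const counts = new Map<string, number>();
  let busiest = 0;
  for (const e of events) {
    const key = keyOf(e);
    const next = (counts.get(key) ?? 0) + 1;
    counts.set(key, next);
    busiest = Math.max(busiest, next);
  }
  return events.length === 0 ? 0 : busiest / events.length;
}

// 100 writes on one day: 50 patients with 2 encounters each (made-up data).
const sampleEvents: WriteEvent[] = [];
for (let i = 0; i < 100; i++) {
  sampleEvents.push({
    patientId: `P${i % 50}`,
    encounterId: `E${Math.floor(i / 50)}`,
    timestamp: '2024-01-15T08:00:00Z',
  });
}

const dateShare = maxKeyShare(sampleEvents, dateKey);
const compositeShare = maxKeyShare(sampleEvents, patientEncounterKey);
console.log({ dateShare, compositeShare });
```

With this sample, the date key puts 100% of the writes on one key value, while the composite key puts 1% on its busiest value.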
DynamoDB vs RDS: A Practical Guide
Detailed analysis of when to utilize Relational vs NoSQL for modern architectures.
SQL Schema for RIS Relational Data
The following SQL schema demonstrates a normalized relational database design for RIS billing and reporting workflows. This schema leverages foreign key constraints to maintain referential integrity across patient, order, report, and billing tables.
RDS PostgreSQL Schema: Patient, Order, Report, and Billing Tables
-- Patient table: Core demographic information
CREATE TABLE patient (
patient_id VARCHAR(36) PRIMARY KEY DEFAULT gen_random_uuid(),
mrn VARCHAR(20) UNIQUE NOT NULL,
first_name VARCHAR(100) NOT NULL,
last_name VARCHAR(100) NOT NULL,
date_of_birth DATE NOT NULL,
gender VARCHAR(20),
insurance_provider VARCHAR(100),
insurance_id VARCHAR(50),
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
-- Radiology order table: Examination requests
CREATE TABLE radiology_order (
order_id VARCHAR(36) PRIMARY KEY DEFAULT gen_random_uuid(),
patient_id VARCHAR(36) NOT NULL REFERENCES patient(patient_id),
ordering_physician VARCHAR(200) NOT NULL,
procedure_code VARCHAR(20) NOT NULL, -- CPT/HCPCS codes
procedure_description TEXT,
priority VARCHAR(20) DEFAULT 'ROUTINE', -- STAT, URGENT, ROUTINE
clinical_indication TEXT,
order_status VARCHAR(20) DEFAULT 'PENDING', -- PENDING, SCHEDULED, IN_PROGRESS, COMPLETED, CANCELLED
scheduled_datetime TIMESTAMP WITH TIME ZONE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
-- Radiology report table: Diagnostic findings
CREATE TABLE radiology_report (
report_id VARCHAR(36) PRIMARY KEY DEFAULT gen_random_uuid(),
order_id VARCHAR(36) NOT NULL REFERENCES radiology_order(order_id),
radiologist_id VARCHAR(36) NOT NULL,
findings_text TEXT,
impression_text TEXT,
report_status VARCHAR(20) DEFAULT 'DRAFT', -- DRAFT, PRELIMINARY, FINAL, AMENDED
dicom_study_uid VARCHAR(64),
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
finalized_at TIMESTAMP WITH TIME ZONE,
amended_at TIMESTAMP WITH TIME ZONE
);
-- Billing ledger table: Financial transactions
CREATE TABLE billing_ledger (
transaction_id VARCHAR(36) PRIMARY KEY DEFAULT gen_random_uuid(),
order_id VARCHAR(36) NOT NULL REFERENCES radiology_order(order_id),
patient_id VARCHAR(36) NOT NULL REFERENCES patient(patient_id),
service_date DATE NOT NULL,
procedure_code VARCHAR(20) NOT NULL,
charge_amount DECIMAL(10, 2) NOT NULL,
insurance_amount DECIMAL(10, 2),
patient_responsibility DECIMAL(10, 2),
payment_status VARCHAR(20) DEFAULT 'PENDING', -- PENDING, SUBMITTED, PAID, DENIED, ADJUSTED
claim_number VARCHAR(50),
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
-- Indexes for common query patterns
CREATE INDEX idx_order_patient ON radiology_order(patient_id);
CREATE INDEX idx_order_status ON radiology_order(order_status);
CREATE INDEX idx_report_order ON radiology_report(order_id);
CREATE INDEX idx_billing_patient ON billing_ledger(patient_id);
CREATE INDEX idx_billing_order ON billing_ledger(order_id);
RDS Read Replicas vs Multi-AZ
Multi-AZ provides synchronous standby replication for high availability and automatic failover (RPO ≈ 0; failover for a Multi-AZ DB instance deployment typically completes in 60 to 120 seconds). Read Replicas provide asynchronous replication for read scaling and offloading analytical queries, so replica reads can lag the primary. For RIS workloads, use Multi-AZ for production billing databases and Read Replicas for reporting dashboards.
SQL Query Examples for RIS Analytics
Complex SQL joins enable comprehensive reporting across patient, order, report, and billing data. The following query demonstrates a typical revenue cycle analytics query joining all four tables.
-- Revenue cycle analytics: Join patient, order, report, and billing tables
SELECT
p.mrn,
p.first_name || ' ' || p.last_name AS patient_name,
ro.procedure_code,
ro.procedure_description,
ro.order_status,
rr.report_status,
rr.findings_text,
rr.impression_text,
bl.charge_amount,
bl.insurance_amount,
bl.patient_responsibility,
bl.payment_status,
ro.scheduled_datetime,
rr.finalized_at
FROM radiology_order ro
INNER JOIN patient p ON ro.patient_id = p.patient_id
LEFT JOIN radiology_report rr ON ro.order_id = rr.order_id
LEFT JOIN billing_ledger bl ON ro.order_id = bl.order_id
WHERE ro.scheduled_datetime >= CURRENT_DATE - INTERVAL '30 days'
AND ro.order_status IN ('COMPLETED', 'IN_PROGRESS')
ORDER BY ro.scheduled_datetime DESC
LIMIT 100;
-- Aggregation: Monthly revenue by procedure code
SELECT
DATE_TRUNC('month', ro.scheduled_datetime) AS month,
ro.procedure_code,
COUNT(DISTINCT ro.order_id) AS total_orders, -- count orders, not billing rows (an order may have several ledger entries)
SUM(bl.charge_amount) AS total_charges,
SUM(bl.insurance_amount) AS total_insurance,
SUM(bl.patient_responsibility) AS total_patient_responsibility,
AVG(bl.charge_amount) AS avg_charge
FROM radiology_order ro
INNER JOIN billing_ledger bl ON ro.order_id = bl.order_id
WHERE ro.scheduled_datetime >= CURRENT_DATE - INTERVAL '12 months'
AND bl.payment_status IN ('PAID', 'SUBMITTED')
GROUP BY DATE_TRUNC('month', ro.scheduled_datetime), ro.procedure_code
ORDER BY month DESC, total_charges DESC;
These queries enable critical business intelligence: tracking order-to-report turnaround times, monitoring payment status across insurance providers, and analyzing procedure volume trends for capacity planning.
Polyglot Persistence Architecture
The following architecture diagram illustrates the complete polyglot persistence data flow for a cloud-native RIS. Patient tracking data flows through DynamoDB for low-latency operations, while structured billing and reports are stored in RDS, with documents archived in S3.
Polyglot Persistence: RIS Data Flow Architecture
The architecture uses DynamoDB Streams for change data capture (CDC) in near real time. When a patient status updates in DynamoDB, the stream record triggers a Lambda function that replicates the change to RDS, typically within about a second. Because replication is asynchronous, RDS is eventually consistent with the operational table. See quiz questions 4 and 17 for CDC pattern details. For multi-region replication, see Resilience for DynamoDB Global Tables.
Amazon S3 serves as the durable document store for radiologist reports (PDF), scanned consent forms, and archival data. The RIS orchestrates movement of decade-old objects to S3 Glacier Instant Retrieval, minimizing storage costs while preserving accessibility for comparative analysis.
S3 Intelligent-Tiering Monitoring Fee
S3 Intelligent-Tiering charges a small monthly monitoring and automation fee per object (~$0.0025 per 1,000 objects). Objects smaller than 128 KB are not monitored or auto-tiered and do not incur the fee. The fee is negligible for large objects but can accumulate across millions of small eligible files. For RIS workloads with large DICOM studies, the savings from automatic tiering typically exceed the monitoring fee.
Polyglot Persistence on AWS - Prescriptive Guidance
AWS guidance on multi-database strategies for modern applications.
Change Data Capture with DynamoDB Streams and Lambda
Official documentation for DynamoDB Streams integration with AWS Lambda.
DynamoDB Table Schema and Query Examples
The following DynamoDB schema demonstrates a patient tracking table optimized for real-time RIS workflows. Model it as a patient-centric item collection: one partition per patient, several sorted rows for encounters/orders/studies, and GSIs that re-project those same items into operational worklists.
DynamoDB Item Collection View: Patient Partition and GSIs
DynamoDB Table Schema: Patient Tracking with GSIs
The schema uses a single-table design pattern where PK stores PATIENT#{patientId} and SK stores entity types such as PATIENT#PROFILE, ENCOUNTER#{encounterId}, ORDER#{orderId}, and STUDY#{accessionNumber}. GSI1 re-groups items by current workflow status for shared worklists, while GSI2 re-groups them by modality and scheduled time for operational queues.
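The key layout described above can be sketched as a small item builder. This is a minimal sketch under stated assumptions: the `PATIENT#`, `ORDER#`, and `STATUS#` prefixes come from the text and the query code, while the `MODALITY#` prefix, the use of the scheduled time as both GSI sort keys, and the `buildOrderItem` helper itself are hypothetical choices made for illustration.

```typescript
// Hypothetical key builder for the single-table layout described above.
type WorkflowStatus =
  | 'PENDING'
  | 'SCHEDULED'
  | 'IN_PROGRESS'
  | 'COMPLETED'
  | 'CANCELLED';

interface OrderItem {
  PK: string;     // PATIENT#{patientId}: one partition per patient
  SK: string;     // ORDER#{orderId}: one sorted row per order
  GSI1PK: string; // STATUS#{status}: shared worklist partition
  GSI1SK: string; // scheduled time (assumed), so worklists sort chronologically
  GSI2PK: string; // MODALITY#{modality} (assumed prefix)
  GSI2SK: string; // scheduled time (assumed)
  orderStatus: WorkflowStatus;
}

function buildOrderItem(
  patientId: string,
  orderId: string,
  status: WorkflowStatus,
  modality: string,
  scheduledAt: string // ISO 8601 in UTC sorts lexicographically
): OrderItem {
  return {
    PK: `PATIENT#${patientId}`,
    SK: `ORDER#${orderId}`,
    GSI1PK: `STATUS#${status}`,
    GSI1SK: scheduledAt,
    GSI2PK: `MODALITY#${modality}`,
    GSI2SK: scheduledAt,
    orderStatus: status,
  };
}

const exampleItem = buildOrderItem(
  '12345',
  'ORD-1',
  'SCHEDULED',
  'CT',
  '2024-01-15T09:30:00Z'
);
console.log(exampleItem);
```

One consequence of this design is that a status transition must rewrite `GSI1PK` along with `orderStatus`, which is what moves the item from one worklist to another in the index.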
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, QueryCommand } from '@aws-sdk/lib-dynamodb';
const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);
// Query the full patient timeline from the base table
async function getPatientTimeline(patientId: string) {
const command = new QueryCommand({
TableName: 'ris-patient-tracking',
KeyConditionExpression: 'PK = :pk',
ExpressionAttributeValues: {
':pk': `PATIENT#${patientId}`
}
});
const response = await docClient.send(command);
return response.Items;
}
// Query all orders with a shared workflow status from GSI1
async function getOrdersByStatus(status: string) {
const command = new QueryCommand({
TableName: 'ris-patient-tracking',
IndexName: 'GSI1-StatusBoard',
KeyConditionExpression: 'GSI1PK = :statusPk',
ExpressionAttributeValues: {
':statusPk': `STATUS#${status}`
},
ScanIndexForward: false
});
const response = await docClient.send(command);
return response.Items;
}
DynamoDB On-Demand vs Provisioned Pricing
On-Demand mode charges per request (~$1.25 per million writes and ~$0.25 per million reads in the figures used here; list prices vary by Region and change over time, so confirm against the current pricing page) with no capacity planning. Provisioned mode requires specifying RCU/WCU but can cost roughly 70% less for steady-state workloads that keep capacity well utilized. For RIS patient tracking with unpredictable spikes, on-demand is recommended. For stable billing workloads, provisioned with auto-scaling is more cost-effective.
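The trade-off can be estimated with a few lines of arithmetic. The sketch below is a rough model, not a pricing tool: the on-demand prices are the per-million figures quoted above, while the hourly WCU/RCU prices, the 30-day month, the 70% utilization target, and the sample request volumes are assumptions that should be replaced with current values for the target Region.

```typescript
// All prices are parameters; the defaults below are assumptions, not quotes.
interface Pricing {
  onDemandPerMillionWrites: number; // USD, figure quoted in the text
  onDemandPerMillionReads: number;  // USD, figure quoted in the text
  provisionedPerWcuHour: number;    // USD, assumed
  provisionedPerRcuHour: number;    // USD, assumed
}

const assumedPricing: Pricing = {
  onDemandPerMillionWrites: 1.25,
  onDemandPerMillionReads: 0.25,
  provisionedPerWcuHour: 0.00065,
  provisionedPerRcuHour: 0.00013,
};

const HOURS_PER_MONTH = 720; // 30-day month
const SECONDS_PER_MONTH = HOURS_PER_MONTH * 3600;

function onDemandMonthlyCost(writes: number, reads: number, p: Pricing): number {
  return (
    (writes / 1e6) * p.onDemandPerMillionWrites +
    (reads / 1e6) * p.onDemandPerMillionReads
  );
}

// Sizes capacity for the average request rate at a target utilization.
// Assumes items of at most 1 KB per write and 4 KB per strongly consistent read.
function provisionedMonthlyCost(
  writes: number,
  reads: number,
  p: Pricing,
  targetUtilization = 0.7
): number {
  const wcu = Math.ceil(writes / SECONDS_PER_MONTH / targetUtilization);
  const rcu = Math.ceil(reads / SECONDS_PER_MONTH / targetUtilization);
  return (
    (wcu * p.provisionedPerWcuHour + rcu * p.provisionedPerRcuHour) *
    HOURS_PER_MONTH
  );
}

// Sample volume: 1M writes/day and 10M reads/day over 30 days.
const monthlyWrites = 30e6;
const monthlyReads = 300e6;
const onDemandUsd = onDemandMonthlyCost(monthlyWrites, monthlyReads, assumedPricing);
const provisionedUsd = provisionedMonthlyCost(monthlyWrites, monthlyReads, assumedPricing);
console.log({ onDemandUsd, provisionedUsd });
```

The model sizes provisioned capacity for the average rate, so it flatters provisioned mode for spiky traffic; a workload with short, sharp peaks would need capacity sized for the peak rather than the mean.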
DynamoDB Streams CDC Lambda Handler
The following TypeScript Lambda function processes DynamoDB Streams events to replicate patient status changes to RDS in real-time. This change data capture (CDC) pattern ensures billing systems have access to durable records while operational workflows benefit from low-latency NoSQL.
import { DynamoDBStreamEvent, Context } from 'aws-lambda';
import { Client } from 'pg';
interface PatientStatusChange {
patientId: string;
encounterId: string;
oldStatus: string;
newStatus: string;
timestamp: string;
}
export const handler = async (event: DynamoDBStreamEvent, context: Context): Promise<void> => {
const rdsClient = new Client({
host: process.env.RDS_HOST!,
port: parseInt(process.env.RDS_PORT || '5432'),
database: process.env.RDS_DATABASE!,
user: process.env.RDS_USER!,
password: process.env.RDS_PASSWORD!,
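ssl: { rejectUnauthorized: true } // verify the RDS server certificate; pass the RDS CA bundle via the "ca" option if it is not already trusted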
});
try {
await rdsClient.connect();
for (const record of event.Records) {
if (record.eventName === 'MODIFY' && record.dynamodb) {
const oldImage = record.dynamodb.OldImage;
const newImage = record.dynamodb.NewImage;
if (oldImage && newImage) {
const statusChange: PatientStatusChange = {
patientId: oldImage.patientId?.S || '',
encounterId: oldImage.encounterId?.S || '',
oldStatus: oldImage.orderStatus?.S || '',
newStatus: newImage.orderStatus?.S || '',
timestamp: newImage.updatedAt?.S || new Date().toISOString()
};
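// Record the change in the RDS audit table (billing_ledger_audit is assumed to exist
// with a unique constraint on patient_id, encounter_id, changed_at)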
await rdsClient.query(
`INSERT INTO billing_ledger_audit
(patient_id, encounter_id, old_status, new_status, changed_at)
VALUES ($1, $2, $3, $4, $5)
ON CONFLICT (patient_id, encounter_id, changed_at)
DO UPDATE SET new_status = EXCLUDED.new_status`,
[
statusChange.patientId,
statusChange.encounterId,
statusChange.oldStatus,
statusChange.newStatus,
statusChange.timestamp
]
);
console.log(`Replicated status change: ${statusChange.patientId} ${statusChange.oldStatus} -> ${statusChange.newStatus}`);
}
}
}
} catch (error) {
console.error('Error processing DynamoDB Stream:', error);
throw error; // Lambda will retry on failure
} finally {
await rdsClient.end();
}
};
This Lambda function processes each stream record, extracts the old and new images, and replicates status changes to an RDS audit table. It uses parameterized queries to prevent SQL injection, and it rethrows any error so that Lambda retries the batch. The ON CONFLICT clause makes a retried record overwrite its earlier row instead of inserting a duplicate.
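One weakness of the handler above is that it falls back to empty strings and to the current wall-clock time when attributes are missing, and it also writes a row when a MODIFY event left the status unchanged. A wall-clock fallback changes on every retry, so the ON CONFLICT key no longer deduplicates. The sketch below shows a stricter extraction step as a pure function. The local `StreamRecord` types are simplified stand-ins for the `aws-lambda` typings, and `toStatusChange` is a hypothetical helper, not part of any library.

```typescript
// Simplified local stand-ins for the DynamoDB Streams record shape used above.
interface AttributeValue {
  S?: string;
}
type Image = { [attr: string]: AttributeValue | undefined };
interface StreamRecord {
  eventName?: 'INSERT' | 'MODIFY' | 'REMOVE';
  dynamodb?: { OldImage?: Image; NewImage?: Image };
}

interface StatusChange {
  patientId: string;
  encounterId: string;
  oldStatus: string;
  newStatus: string;
  timestamp: string;
}

// Returns null for records that should not be replicated.
function toStatusChange(record: StreamRecord): StatusChange | null {
  if (record.eventName !== 'MODIFY' || !record.dynamodb) return null;
  const oldImage = record.dynamodb.OldImage;
  const newImage = record.dynamodb.NewImage;
  if (!oldImage || !newImage) return null;

  const oldStatus = oldImage.orderStatus?.S;
  const newStatus = newImage.orderStatus?.S;
  const patientId = newImage.patientId?.S;
  const encounterId = newImage.encounterId?.S;
  const timestamp = newImage.updatedAt?.S;

  // Skip incomplete records instead of writing empty strings or a
  // retry-dependent timestamp to the audit table.
  if (!oldStatus || !newStatus || !patientId || !encounterId || !timestamp) {
    return null;
  }
  // Skip updates that touched other attributes but not the status.
  if (oldStatus === newStatus) return null;

  return { patientId, encounterId, oldStatus, newStatus, timestamp };
}

const sampleRecord: StreamRecord = {
  eventName: 'MODIFY',
  dynamodb: {
    OldImage: { orderStatus: { S: 'SCHEDULED' } },
    NewImage: {
      patientId: { S: '12345' },
      encounterId: { S: 'ENC-1' },
      orderStatus: { S: 'IN_PROGRESS' },
      updatedAt: { S: '2024-01-15T09:30:00Z' },
    },
  },
};
const sampleChange = toStatusChange(sampleRecord);
console.log(sampleChange);
```

Whether incomplete records should be skipped, logged, or routed to a dead-letter queue is a policy decision; skipping silently is used here only to keep the sketch short.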
Data Residency Requirements
Healthcare data residency regulations (e.g., GDPR, Australia's Privacy Act) may require patient data to remain within specific geographic boundaries. When implementing CDC replication, ensure both DynamoDB and RDS are deployed in the same AWS Region. For cross-border scenarios, implement data masking or aggregation before replication.
Database Selection Decision Tree
Architects must carefully choose between Amazon RDS and DynamoDB based on specific data structures and access patterns required by the RIS application.
RDS vs DynamoDB: Architectural Decision Tree
DynamoDB provides native mechanisms for building event-driven systems. By default there is a 1:1:1 relationship between a DynamoDB partition, its corresponding open stream shard, and the concurrent AWS Lambda invocation that processes records from that shard, which preserves per-item ordering. When a patient's status updates in the DynamoDB table, DynamoDB Streams captures the change (CDC) and triggers a Lambda function, so downstream systems are typically updated with sub-second latency.
Terraform Infrastructure Templates
The following Terraform templates demonstrate infrastructure-as-code definitions for the core RIS data architecture components: DynamoDB table with GSIs, RDS PostgreSQL instance, S3 bucket with lifecycle policy, and DMS replication instance.
resource "aws_dynamodb_table" "ris_patient_tracking" {
name = "ris-patient-tracking"
billing_mode = "PAY_PER_REQUEST"
hash_key = "PK"
range_key = "SK"
attribute {
name = "PK"
type = "S"
}
attribute {
name = "SK"
type = "S"
}
attribute {
name = "GSI1PK"
type = "S"
}
attribute {
name = "GSI1SK"
type = "S"
}
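attribute {
name = "GSI2PK"
type = "S"
}
attribute {
name = "GSI2SK"
type = "S"
}
# GSI1: workflow-status worklists; the name matches the IndexName used in the query example
global_secondary_index {
name = "GSI1-StatusBoard"
hash_key = "GSI1PK"
range_key = "GSI1SK"
projection_type = "ALL"
}
# GSI2: modality queues ordered by scheduled time (index name is illustrative)
global_secondary_index {
name = "GSI2-ModalitySchedule"
hash_key = "GSI2PK"
range_key = "GSI2SK"
projection_type = "ALL"
}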
stream_enabled = true
stream_view_type = "NEW_AND_OLD_IMAGES"
point_in_time_recovery {
enabled = true
}
tags = {
Environment = "production"
Application = "RIS"
}
}
resource "aws_db_instance" "ris_billing" {
identifier = "ris-billing-db"
engine = "postgres"
engine_version = "15.4"
instance_class = "db.r6g.large"
allocated_storage = 100
storage_type = "gp3"
storage_encrypted = true
db_name = "ris_billing"
username = var.rds_username
password = var.rds_password
multi_az = true
publicly_accessible = false
vpc_security_group_ids = [aws_security_group.rds.id]
db_subnet_group_name = aws_db_subnet_group.rds.name
backup_retention_period = 30
backup_window = "03:00-04:00"
maintenance_window = "Mon:04:00-Mon:05:00"
enabled_cloudwatch_logs_exports = ["postgresql"]
tags = {
Environment = "production"
Application = "RIS"
}
}
resource "aws_s3_bucket" "ris_documents" {
bucket = "ris-documents-${var.environment}-${var.region}"
tags = {
Environment = var.environment
Application = "RIS"
}
}
resource "aws_s3_bucket_lifecycle_configuration" "ris_documents_lifecycle" {
bucket = aws_s3_bucket.ris_documents.id
rule {
id = "intelligent-tiering"
status = "Enabled"
filter {} # apply the rule to every object in the bucket
transition {
days = 30
storage_class = "INTELLIGENT_TIERING"
}
transition {
days = 90
storage_class = "GLACIER_IR" # S3 Glacier Instant Retrieval
}
transition {
days = 365
storage_class = "DEEP_ARCHIVE" # S3 Glacier Deep Archive
}
}
rule {
id = "expire-temp-uploads"
status = "Enabled"
filter {
prefix = "temp/"
}
expiration {
days = 7
}
}
}
resource "aws_dms_replication_instance" "ris_cdc" {
allocated_storage = 100
apply_immediately = true
auto_minor_version_upgrade = true
# availability_zone is omitted: it cannot be set when multi_az is true
engine_version = "3.5.2"
multi_az = true
preferred_maintenance_window = "sun:06:00-sun:08:00"
replication_instance_class = "dms.c6g.large"
replication_instance_id = "ris-cdc-replication"
publicly_accessible = false
vpc_security_group_ids = [aws_security_group.dms.id]
# Assumes an aws_dms_replication_subnet_group named "dms" is defined elsewhere
replication_subnet_group_id = aws_dms_replication_subnet_group.dms.id
tags = {
Environment = "production"
Application = "RIS"
}
}
Backup and Restore Strategies
For production RIS databases, implement automated backups with 30-day retention (RDS) and Point-in-Time Recovery (DynamoDB). Test restore procedures quarterly. For disaster recovery, consider cross-region read replicas (RDS) or DynamoDB Global Tables for multi-region active-active deployments.
Cost Comparison and Estimation
Understanding the cost implications of different database and storage choices is critical for RIS architecture planning. The following tables compare RDS, DynamoDB, and Aurora pricing, along with S3 storage class costs.
Database Cost Comparison: RDS vs DynamoDB vs Aurora (Monthly Estimates for Medium RIS)
| Service | Configuration | Monthly Cost (USD) | Best For |
|---|---|---|---|
| RDS PostgreSQL | db.r6g.large, Multi-AZ, 100GB GP3 | ~$350 | Structured billing, complex joins |
| Aurora PostgreSQL | db.r6g.large, 2 replicas, 100GB | ~$500 | High availability, read scaling |
| DynamoDB On-Demand | 1M writes/day, 10M reads/day, 50GB | ~$150 | Patient tracking, real-time state |
| DynamoDB Provisioned | 100 RCU, 50 WCU, 50GB | ~$80 | Predictable steady-state workloads |
S3 Storage Class Pricing Comparison (per GB/month)
| Storage Class | Cost (USD/GB/month) | Retrieval Latency | Minimum Storage Duration |
|---|---|---|---|
| S3 Standard | $0.023 | Milliseconds | None |
| S3 Intelligent-Tiering | $0.023 + monitoring fee | Milliseconds | None |
| S3 Glacier Instant Retrieval | $0.004 | Milliseconds | 90 days |
| S3 Glacier Flexible Retrieval | $0.0036 | Minutes to hours | 90 days |
| S3 Glacier Deep Archive | $0.00099 | 12-48 hours | 180 days |
S3 Lifecycle Policy Timeline: Intelligent-Tiering to Glacier Deep Archive
Monthly Cost Estimator for Medium-Sized Radiology Department (50,000 studies/year, 500 patients/day):
- DynamoDB (patient tracking): ~$150/month (on-demand, 2M writes, 20M reads)
- RDS PostgreSQL (billing): ~$350/month (db.r6g.large, Multi-AZ)
- S3 Standard (active studies): ~$230/month (10TB)
- S3 Glacier Instant (archival): ~$40/month (10TB >90 days old)
- Data transfer & API requests: ~$50/month
- Total Estimated Monthly Cost: ~$820 USD
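The total above can be reproduced, and the two S3 line items cross-checked against the per-GB prices in the storage class table, with a short calculation. The `LineItem` type and variable names are illustrative; the dollar figures are the estimates listed above, and 1 TB is taken as 1,024 GB.

```typescript
interface LineItem {
  label: string;
  monthlyUsd: number;
}

// Estimates copied from the list above.
const lineItems: LineItem[] = [
  { label: 'DynamoDB (patient tracking)', monthlyUsd: 150 },
  { label: 'RDS PostgreSQL (billing)', monthlyUsd: 350 },
  { label: 'S3 Standard (active studies)', monthlyUsd: 230 },
  { label: 'S3 Glacier Instant (archival)', monthlyUsd: 40 },
  { label: 'Data transfer & API requests', monthlyUsd: 50 },
];

const totalMonthlyUsd = lineItems.reduce(
  (sum, item) => sum + item.monthlyUsd,
  0
);

// Cross-check of the S3 items: 10 TB at the listed per-GB monthly prices.
const GB_PER_TB = 1024;
const s3StandardUsd = 10 * GB_PER_TB * 0.023;
const glacierInstantUsd = 10 * GB_PER_TB * 0.004;

console.log({ totalMonthlyUsd, s3StandardUsd, glacierInstantUsd });
```

The cross-check gives about $235.52 and $40.96, which the list rounds to ~$230 and ~$40.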
HTJ2K Compression Ratios
High Throughput JPEG 2000 (HTJ2K) compression can reach 10:1 to 20:1 ratios for medical imaging, but ratios in that range imply lossy encoding, so diagnostic acceptability should be validated per modality; lossless HTJ2K yields considerably lower ratios. For a medium RIS storing 50,000 studies/year at 500MB/study uncompressed, 10:1 to 20:1 compression reduces annual storage from 25TB to ~1.25-2.5TB, resulting in significant S3 cost savings.
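The storage arithmetic above can be written out as a small helper. This is a sketch under the same assumptions as the text (50,000 studies per year, 500 MB per study, decimal units with 1 TB = 1,000,000 MB); the function name and the S3 Standard price of $0.023 per GB-month used for the cost line are illustrative.

```typescript
// Annual storage footprint in TB for a given compression ratio.
function annualStorageTb(
  studiesPerYear: number,
  mbPerStudy: number,
  compressionRatio: number
): number {
  const uncompressedTb = (studiesPerYear * mbPerStudy) / 1e6;
  return uncompressedTb / compressionRatio;
}

// Monthly S3 Standard cost for that footprint (price is an assumption).
function monthlyStandardCostUsd(tb: number, usdPerGbMonth = 0.023): number {
  return tb * 1000 * usdPerGbMonth;
}

const uncompressedTb = annualStorageTb(50000, 500, 1);
const tbAt10to1 = annualStorageTb(50000, 500, 10);
const tbAt20to1 = annualStorageTb(50000, 500, 20);

console.log({
  uncompressedTb,
  tbAt10to1,
  tbAt20to1,
  uncompressedUsd: monthlyStandardCostUsd(uncompressedTb),
  at10to1Usd: monthlyStandardCostUsd(tbAt10to1),
});
```

Passing a lower ratio, such as one measured for lossless encoding on a site's own studies, shows how sensitive the savings are to the compression mode.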
AWS Pricing Calculator
Pre-filled templates for RIS architecture cost estimation.
Performance Benchmarks
Understanding performance characteristics of each AWS service is essential for capacity planning and SLA compliance. The following benchmarks represent typical performance metrics for RIS workloads.
DynamoDB Throughput Benchmarks (PAY_PER_REQUEST Mode)
| Operation | Throughput | Latency (p99) | Notes |
|---|---|---|---|
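| GetItem (single) | Scales on demand, within table and per-partition limits | <10ms | Consistent single-digit ms |
| Query (100 items) | Scales on demand, within table and per-partition limits | <50ms | Depends on item size |
| PutItem | Scales on demand, within table and per-partition limits | <10ms | Automatic scaling |
| BatchWriteItem (25) | Scales on demand, within table and per-partition limits | <100ms | 25 items per batch |
| TransactWriteItems | Scales on demand, within table and per-partition limits | <100ms | ACID across items |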
RDS PostgreSQL Query Performance (db.r6g.large)
| Query Type | Avg Latency | Max Concurrent | Notes |
|---|---|---|---|
| Simple SELECT (indexed) | <5ms | 500+ | With proper indexes |
| JOIN (2-3 tables) | <50ms | 200+ | Optimized queries |
| Complex analytics | <500ms | 50+ | Use read replicas |
| INSERT/UPDATE | <10ms | 300+ | Connection pooling recommended |
S3 Retrieval Latency by Storage Class
| Storage Class | First Byte Latency | Throughput | Use Case |
|---|---|---|---|
| S3 Standard | <100ms | Up to 100+ Gbps | Active studies |
| S3 Intelligent-Tiering | <100ms | Up to 100+ Gbps | Variable access |
| Glacier Instant | <100ms | Up to 100+ Gbps | Archival with fast access |
| Glacier Flexible | 1-5 minutes (expedited) to hours (standard or bulk) | Up to 100+ Gbps | Deep archival |
AWS HealthImaging Metadata API Performance: The GetDICOMSeriesMetadata API typically returns JSON metadata in <100ms for standard studies. For large studies with thousands of instances, expect 200-500ms response times. This enables rapid quality assurance audits across thousands of studies without downloading heavy image payloads.
Amazon ElastiCache for Query Caching
For read-heavy RIS workloads (e.g., radiologist worklist queries), consider Amazon ElastiCache (Redis) as a caching layer. Cache frequently accessed patient demographics and order status to reduce RDS load. With a cache hit ratio of 80-90%, only 10-20% of reads reach the database, a 5x to 10x reduction in query volume.
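The caching pattern described above is commonly implemented as cache-aside: read from the cache, fall back to the database on a miss, then populate the cache. The sketch below uses an in-memory `Map` as a stand-in for Redis so that it runs without any AWS dependency; the `TtlCache` class, its TTL handling, and the `dbLoadReductionFactor` helper are illustrative assumptions rather than an ElastiCache API.

```typescript
// In-memory stand-in for a Redis-style cache with per-entry expiry.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  hits = 0;
  misses = 0;

  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = () => Date.now()
  ) {}

  // Cache-aside read: return a fresh cached value, else load and store it.
  async getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt > this.now()) {
      this.hits++;
      return entry.value;
    }
    this.misses++;
    const value = await load(); // e.g. a query against RDS
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

// Only (1 - hitRatio) of reads reach the database.
function dbLoadReductionFactor(hitRatio: number): number {
  return 1 / (1 - hitRatio);
}

// Usage: two reads of the same order status trigger one simulated database call.
const statusCache = new TtlCache<string>(30000);
let databaseCalls = 0;
const loadStatus = async (): Promise<string> => {
  databaseCalls++;
  return 'SCHEDULED';
};
statusCache
  .getOrLoad('order:ORD-1', loadStatus)
  .then(() => statusCache.getOrLoad('order:ORD-1', loadStatus))
  .then((status) => console.log(status, databaseCalls, statusCache.hitRatio()));
```

A short TTL matters for order status in particular, since a stale worklist entry is more harmful than a slightly slower read; demographics can usually tolerate a longer TTL.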
Extracting Value with Amazon HealthLake
Amazon HealthLake is a HIPAA-eligible service designed to store, transform, and analyze complex healthcare data in the standardized FHIR R4 format.
A primary challenge in radiology is extracting actionable intelligence from unstructured free text dictated by radiologists. HealthLake addresses this with integrated medical NLP built on Amazon Comprehend Medical (Amazon Textract can be used upstream to OCR scanned documents into text). When a report is ingested, the NLP extracts clinical entities (medications, diagnoses, anatomies) and maps them to medical ontologies. The report text is carried base64-encoded in a FHIR DocumentReference resource, and the extracted entities are appended to that resource as extensions, transforming a static text report into a queryable asset.
HealthLake FHIR R4 Version Support
Amazon HealthLake supports FHIR R4 (Release 4), the most widely adopted FHIR version. All resources conform to the HL7 FHIR R4 specification, ensuring interoperability with other FHIR-compliant systems. HealthLake automatically validates incoming resources against the FHIR R4 schema.
HealthLake NLP Extraction Pipeline
AWS HealthLake Developer Guide
Comprehensive guide to structuring FHIR data in HealthLake.
Transform Unstructured Healthcare Data Using HealthLake
How NLP extracts clinical entities from radiology reports.
HealthLake Terminology Services
Amazon HealthLake integrates with medical terminology services to map extracted clinical entities to standardized codes. This enables interoperability with billing systems, quality reporting, and clinical decision support.
Medical Terminology Standards Supported by HealthLake
| Terminology | Purpose | Example Code | RIS Use Case |
|---|---|---|---|
| ICD-10-CM | Diagnosis coding | R93.1 (Abnormal findings on diagnostic imaging of heart and coronary circulation) | Billing, quality reporting |
| SNOMED CT | Clinical terminology | 39825004 (Radiography) | Clinical documentation, decision support |
| RxNorm | Medication coding | 1049502 (Iodinated contrast) | Contrast agent tracking, allergy checking |
| LOINC | Lab/test codes | 36638-3 (CT Chest) | Procedure standardization, reporting |
| CPT/HCPCS | Procedure billing | 71250 (CT Thorax without contrast) | Revenue cycle, claims submission |
FHIR Observation Resource with ICD-10 and SNOMED CT Coding
HealthLake automatically maps extracted entities to these terminologies during NLP processing. For example, when Comprehend Medical identifies "pulmonary nodule" in a radiology report, the finding can be mapped to an ICD-10-CM code such as R91.1 (solitary pulmonary nodule) and to the corresponding SNOMED CT concept, enabling standardized querying and analytics.
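As a rough illustration of what a dual-coded finding looks like in FHIR, the sketch below builds a minimal Observation whose `valueCodeableConcept` carries both an ICD-10-CM and a SNOMED CT coding. The interfaces cover only the fields used here, `buildFindingObservation` is a hypothetical helper, R91.1 is the ICD-10-CM code for a solitary pulmonary nodule, and the SNOMED CT concept ID is deliberately left as a caller-supplied placeholder rather than asserted. The two system URIs are the standard FHIR identifiers for these code systems.

```typescript
interface Coding {
  system: string;
  code: string;
  display?: string;
}
interface CodeableConcept {
  coding?: Coding[];
  text: string;
}
// Minimal subset of the FHIR R4 Observation resource.
interface FindingObservation {
  resourceType: 'Observation';
  status: 'preliminary' | 'final' | 'amended';
  code: CodeableConcept;
  subject: { reference: string };
  valueCodeableConcept: CodeableConcept;
}

const ICD10CM_SYSTEM = 'http://hl7.org/fhir/sid/icd-10-cm';
const SNOMED_SYSTEM = 'http://snomed.info/sct';

function buildFindingObservation(
  patientId: string,
  findingText: string,
  icd10cmCode: string,
  snomedConceptId: string
): FindingObservation {
  return {
    resourceType: 'Observation',
    status: 'final',
    code: { text: 'Imaging finding' },
    subject: { reference: `Patient/${patientId}` },
    valueCodeableConcept: {
      text: findingText,
      coding: [
        { system: ICD10CM_SYSTEM, code: icd10cmCode },
        { system: SNOMED_SYSTEM, code: snomedConceptId },
      ],
    },
  };
}

// The SNOMED CT ID below is a placeholder, not a real concept identifier.
const noduleObservation = buildFindingObservation(
  '12345',
  'pulmonary nodule',
  'R91.1',
  'SNOMED-ID-PLACEHOLDER'
);
console.log(JSON.stringify(noduleObservation, null, 2));
```

Keeping both codings on one concept lets billing queries filter on the ICD-10-CM system while clinical analytics filter on SNOMED CT, without duplicating the resource.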
AWS HealthImaging (AHI)
AWS HealthImaging revolutionizes the retrieval of massive pixel payloads by fundamentally separating the DICOM metadata from the actual pixel data during the ingestion phase.
It provides sub-second image retrieval latencies at scale by streaming pixels directly to web-based diagnostic viewers using High Throughput JPEG 2000 (HTJ2K) compression (see DICOM PS3.5, HTJ2K Transfer Syntax, and ISO/IEC 15444-1:2019, JPEG 2000 Standard). The RIS interacts primarily with the GetDICOMSeriesMetadata API, which returns developer-friendly JSON metadata without pulling the heavy image payload. This allows the RIS to trigger billing for specific contrast agents and audit radiation doses quickly.
In a RIS data-architecture context, the value of this reference diagram is the ingest boundary rather than the viewer path. It shows where departmental imaging systems hand DICOM over to a managed platform that can preserve audit artifacts, normalize metadata, and expose study or series context without forcing the RIS to treat pixel payloads as its primary data model.
AWS HealthImaging Metadata Extraction Flow
A critical integration point for the RIS workflow engine is the GetDICOMSeriesMetadata API provided by AWS HealthImaging. The RIS can query this API to retrieve JSON metadata at the Study, Series, or Instance level without the overhead of downloading the image payload. This fast metadata extraction drives automated billing workflows and supports departmental quality assurance protocols.
S3 Lifecycle Policy Flow
Implement S3 Lifecycle policies to automatically transition aging studies: Day 0-30 in S3 Standard for active reading, Day 30-90 in Intelligent-Tiering for variable access patterns, Day 90-365 in Glacier Instant Retrieval for archival with fast access, and beyond 365 days in Glacier Deep Archive for long-term retention. Depending on the age distribution of the archive, this automated tiering can reduce storage costs by 60-80%. Because Deep Archive retrievals take hours, prior studies needed for comparison must be restored ahead of reading.
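The tiering schedule above can be expressed as a small function, which is handy for estimating what a given archive would cost once the policy has taken effect. This is a sketch: the age thresholds come from the schedule above and the per-GB prices from the storage class table earlier in this section, while the function names, the sample archive, and the decision to ignore the Intelligent-Tiering monitoring fee and retrieval charges are simplifying assumptions.

```typescript
type StorageClass =
  | 'STANDARD'
  | 'INTELLIGENT_TIERING'
  | 'GLACIER_IR'
  | 'DEEP_ARCHIVE';

// Age thresholds from the lifecycle schedule described above.
function storageClassForAge(ageDays: number): StorageClass {
  if (ageDays < 30) return 'STANDARD';
  if (ageDays < 90) return 'INTELLIGENT_TIERING';
  if (ageDays < 365) return 'GLACIER_IR';
  return 'DEEP_ARCHIVE';
}

// USD per GB-month, taken from the pricing table; fees and retrievals ignored.
const usdPerGbMonth: Record<StorageClass, number> = {
  STANDARD: 0.023,
  INTELLIGENT_TIERING: 0.023,
  GLACIER_IR: 0.004,
  DEEP_ARCHIVE: 0.00099,
};

interface StudyBatch {
  ageDays: number;
  sizeGb: number;
}

function tieredMonthlyCostUsd(batches: StudyBatch[]): number {
  return batches.reduce(
    (sum, b) => sum + b.sizeGb * usdPerGbMonth[storageClassForAge(b.ageDays)],
    0
  );
}

// Sample archive: 1,000 GB in each age band (made-up distribution).
const sampleArchive: StudyBatch[] = [
  { ageDays: 10, sizeGb: 1000 },
  { ageDays: 60, sizeGb: 1000 },
  { ageDays: 200, sizeGb: 1000 },
  { ageDays: 800, sizeGb: 1000 },
];
const tieredUsd = tieredMonthlyCostUsd(sampleArchive);
const allStandardUsd = 4000 * usdPerGbMonth.STANDARD;
console.log({ tieredUsd, allStandardUsd });
```

The savings depend heavily on how much of the archive is older than 90 days; an archive dominated by multi-year studies lands much closer to the Deep Archive price than this evenly spread sample does.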
Getting DICOM series metadata from HealthImaging
API documentation for retrieving sub-second diagnostic metadata.
Integration of On-Premises Medical Imaging Data with AWS HealthImaging
How to migrate and integrate existing DICOM archives with AHI.
DICOM Standard - HTJ2K Compression
DICOM PS3.5 specification for High Throughput JPEG 2000 transfer syntax.
Data Architecture Best Practices Summary
The following best practices summarize key architectural decisions for building a cloud-native RIS on AWS:
- Polyglot Persistence: Use DynamoDB for real-time patient tracking and RDS/Aurora for structured billing data. Leverage DynamoDB Streams for CDC replication.
- Storage Tiering: Implement S3 Lifecycle policies to automatically transition studies from Standard → Intelligent-Tiering → Glacier Instant → Glacier Deep Archive based on access patterns.
- Metadata Separation: Use AWS HealthImaging to separate DICOM metadata from pixel data, enabling sub-second metadata queries without downloading heavy images.
- NLP Integration: Leverage Amazon HealthLake with Comprehend Medical to extract clinical entities from unstructured reports and map to standardized terminologies (ICD-10, SNOMED CT, LOINC).
- Compression: Utilize HTJ2K compression for medical imaging. Ratios of 10:1 to 20:1 are achievable but imply lossy encoding, so validate diagnostic quality for each modality.
- High Availability: Deploy RDS with Multi-AZ for automatic failover and consider Read Replicas for read scaling. Use DynamoDB Global Tables for multi-region deployments.
- Cost Optimization: Use DynamoDB on-demand for unpredictable workloads and provisioned mode for steady-state. Monitor S3 storage class distribution and adjust lifecycle policies quarterly.
- Security: Enable encryption at rest (KMS) and in transit (TLS). Implement VPC endpoints for private connectivity. Use IAM roles with least-privilege permissions.
AWS Well-Architected Analytics Lens
Best practices for data-intensive workloads on AWS.
Knowledge Check
Test your understanding with this quiz. You need to answer all questions correctly to mark this section as complete.