AWS DynamoDB Data Modeling Patterns
Practical DynamoDB modeling patterns including single-table design, composite keys, GSIs, and access-pattern-first thinking that keeps queries cheap at scale.
What you'll learn
- Why DynamoDB modeling is access-pattern-first
- Partition keys, sort keys, and composite keys
- Single-table design fundamentals
- When to add a Global Secondary Index
- Hot partition warnings and how to avoid them
Prerequisites
- Familiarity with key-value or NoSQL databases
What and Why
DynamoDB is fast and cheap when you model for it, slow and expensive when you treat it like a relational store. The single rule that matters: design for your access patterns, not for normalized entities. Everything else — single-table design, composite keys, GSIs — flows from that.
This post walks through the patterns I reach for every time I open a new DynamoDB project.
Mental Model
A DynamoDB table is a giant distributed hash map. The partition key is hashed to decide which partition an item lives on. The sort key (optional) orders items within a partition and enables range queries. You can GetItem by PK+SK in single-digit milliseconds. You can Query a partition with a sort key condition almost as fast. Anything that cannot be expressed as a key lookup falls back to a Scan, which reads the whole table and is slow and expensive.
So before you write a line of code, list every query your app needs and design keys that answer each one with a Query or GetItem.
Table: AppData
+--------------+------------------+---------------+
| PK           | SK               | attributes    |
+--------------+------------------+---------------+
| USER#42      | PROFILE          | name, email   |
| USER#42      | ORDER#2026-06-01 | total, status |
| USER#42      | ORDER#2026-06-15 | total, status |
| ORDER#abc123 | METADATA         | userId, total |
| ORDER#abc123 | ITEM#sku-1       | qty, price    |
| ORDER#abc123 | ITEM#sku-2       | qty, price    |
+--------------+------------------+---------------+
Access patterns:
GetUser: PK = USER#<id>, SK = PROFILE
ListUserOrders: PK = USER#<id>, SK begins_with ORDER#
GetOrderItems: PK = ORDER#<id>, SK begins_with ITEM#
Hands-on Example
Define a table once, store many entity types. PutItem for a profile and an order:
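import boto3
from decimal import Decimal

table = boto3.resource('dynamodb').Table('AppData')

# User profile item
table.put_item(Item={
    'PK': 'USER#42',
    'SK': 'PROFILE',
    'name': 'Asha',
    'email': 'asha@example.com'
})

# Order item; GSI1PK/GSI1SK make it reachable by order ID through GSI1
table.put_item(Item={
    'PK': 'USER#42',
    'SK': 'ORDER#2026-06-15#abc123',
    'GSI1PK': 'ORDER#abc123',
    'GSI1SK': 'METADATA',
    'total': Decimal('99.95'),  # boto3 rejects Python floats; use Decimal
    'status': 'shipped'
})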
Query a user’s orders in date order:
from boto3.dynamodb.conditions import Key

resp = table.query(
    KeyConditionExpression=Key('PK').eq('USER#42') &
                           Key('SK').begins_with('ORDER#'),
    ScanIndexForward=False  # newest first
)
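A single Query call returns at most 1 MB of data per page, so a user with many orders needs pagination. Here is a minimal sketch, assuming a hypothetical helper named `query_all` and any object that exposes a boto3-style `query(**kwargs)` method:

```python
def query_all(table, **query_kwargs):
    """Collect every page of a Query by following LastEvaluatedKey.

    `query_all` is an illustrative helper, not part of boto3. `table` is
    anything with a boto3-style `query(**kwargs)` method.
    """
    items = []
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        resp = table.query(**kwargs)
        items.extend(resp.get("Items", []))
        last_key = resp.get("LastEvaluatedKey")
        if last_key is None:
            return items  # no more pages
        # Resume the next page where the previous one stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

For user-facing lists it is usually better to return one page plus the `LastEvaluatedKey` as a cursor rather than draining the whole partition.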
Look up the same order by order ID via the global secondary index GSI1:
resp = table.query(
    IndexName='GSI1',
    KeyConditionExpression=Key('GSI1PK').eq('ORDER#abc123')
)
The trick: every item carries the keys for every index it participates in. One row, many access paths.
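One way to keep those keys consistent is to build every item through a small function instead of hand-writing key strings at each call site. The sketch below mirrors the key formats used above; the function name and its parameters are illustrative assumptions.

```python
from decimal import Decimal


def order_item(user_id, order_id, order_date, total, status):
    """Build one order item that carries main-table and GSI1 keys.

    Hypothetical helper: the key formats follow the examples in this post.
    `order_date` is an ISO date string so sort keys order chronologically.
    """
    return {
        "PK": f"USER#{user_id}",                 # main table: list a user's orders
        "SK": f"ORDER#{order_date}#{order_id}",  # date first, then ID for uniqueness
        "GSI1PK": f"ORDER#{order_id}",           # GSI1: look up by order ID
        "GSI1SK": "METADATA",
        "total": Decimal(str(total)),            # boto3 expects Decimal, not float
        "status": status,
    }


item = order_item(42, "abc123", "2026-06-15", "99.95", "shipped")
```

The resulting dictionary can be passed directly as `Item=` to `table.put_item`.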
CDK snippet for the table:
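// CDK v2 import path (assumption; CDK v1 uses '@aws-cdk/aws-dynamodb')
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

// Inside a Stack or Construct, where `this` is the construct scope
const table = new dynamodb.Table(this, 'AppData', {
  partitionKey: { name: 'PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'SK', type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
  pointInTimeRecovery: true,
});

// GSI1 backs the order-by-ID lookup
table.addGlobalSecondaryIndex({
  indexName: 'GSI1',
  partitionKey: { name: 'GSI1PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'GSI1SK', type: dynamodb.AttributeType.STRING },
});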
Common Pitfalls
Modeling like SQL. Separate tables per entity sounds tidy but forces you to do joins in application code. Single-table design embraces denormalization to make reads cheap.
Hot partitions. A partition key like STATUS#PENDING concentrates writes on one shard. Spread load with a suffix: STATUS#PENDING#<random 0-9> and query all ten on read.
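The suffix approach can be sketched as follows. The helper names are hypothetical, and `query_partition` stands in for whatever function runs a Query against one partition key and returns its items sorted by `SK`.

```python
import heapq
import random

SHARD_COUNT = 10  # matches the 0-9 suffix described above


def sharded_status_key(status, shard_count=SHARD_COUNT, rng=random):
    """Pick a random shard for a write, e.g. STATUS#PENDING#7."""
    return f"STATUS#{status}#{rng.randrange(shard_count)}"


def all_status_keys(status, shard_count=SHARD_COUNT):
    """Every partition key a reader must query for one status."""
    return [f"STATUS#{status}#{n}" for n in range(shard_count)]


def query_all_shards(query_partition, status, shard_count=SHARD_COUNT):
    """Query each shard and merge the results back into SK order.

    `query_partition(pk)` is an assumed callable that returns a list of
    items for one partition key, already sorted by their "SK" attribute.
    """
    per_shard = [query_partition(pk) for pk in all_status_keys(status, shard_count)]
    return list(heapq.merge(*per_shard, key=lambda item: item["SK"]))
```

The trade-off is that reads cost one Query per shard, so pick the smallest shard count that removes the throttling.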
Filter expressions instead of key conditions. FilterExpression runs after the query has read the items, so you still pay for every item read, including the ones that get discarded. Push predicates into the key wherever you can and keep filters for what remains.
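To make the difference concrete, here are two sets of Query arguments for "a user's orders in June 2026", written as plain dictionaries. This is a sketch: the function names and the `orderDate` attribute are hypothetical, and the key version relies on the `ORDER#<date>#<id>` sort key format.

```python
def june_orders_with_filter(user_id):
    """Reads every order for the user, then discards the non-June ones."""
    return {
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "FilterExpression": "orderDate BETWEEN :lo AND :hi",  # hypothetical attribute
        "ExpressionAttributeValues": {
            ":pk": f"USER#{user_id}",
            ":prefix": "ORDER#",
            ":lo": "2026-06-01",
            ":hi": "2026-06-30",
        },
    }


def june_orders_with_key(user_id):
    """Reads only June orders, because the date range lives in the sort key."""
    return {
        "KeyConditionExpression": "PK = :pk AND SK BETWEEN :lo AND :hi",
        "ExpressionAttributeValues": {
            ":pk": f"USER#{user_id}",
            ":lo": "ORDER#2026-06-01",
            # Sorts after every June key and before every July key.
            ":hi": "ORDER#2026-07",
        },
    }
```

Either dictionary can be passed as `table.query(**kwargs)`. Both return the same June orders, but only the first is billed for the user's entire order history.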
Scans in production. A Scan reads every item. Acceptable for nightly exports, never for user requests.
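Unbounded item growth. All items that share a partition key form an item collection. If the table has a local secondary index, each item collection is capped at 10 GB; without one, DynamoDB can split the collection across partitions, but a key that grows forever still concentrates traffic and makes every Query page through more data. If a user’s orders grow without bound, shard with a year or month suffix in the PK.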
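A year suffix can be sketched like this. The `USER#<id>#ORDERS#<year>` format and both function names are assumptions for illustration, not a DynamoDB convention.

```python
from datetime import date


def order_partition_key(user_id, order_date):
    """Partition key for one user's orders in one calendar year."""
    return f"USER#{user_id}#ORDERS#{order_date.year}"


def partition_keys_between(user_id, start, end):
    """Partition keys to query for orders between two dates, oldest first."""
    return [
        f"USER#{user_id}#ORDERS#{year}"
        for year in range(start.year, end.year + 1)
    ]


pk = order_partition_key(42, date(2026, 6, 15))
```

A reader that needs a date range runs one Query per returned key. Most requests touch only the current year, so the common case stays a single Query.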
Wrong billing mode. On-demand is great for unpredictable workloads. Provisioned with auto-scaling is cheaper for steady high traffic. Switch as you learn the shape.
Practical Tips
List access patterns first, then design keys. A useful table:
| Access pattern | PK | SK | Index |
|---|---|---|---|
| Get user profile | USER#<id> | PROFILE | main |
| List user orders by date | USER#<id> | ORDER#<date>#<id> | main |
| Get order by id | ORDER#<id> | METADATA | GSI1 |
| List orders by status | STATUS#<s> | <date>#<id> | GSI2 |
If the table grows another column, fine. If access patterns grow, add a GSI or rework keys before shipping.
Use sparse indexes for “items with a flag.” Only items that set the GSI key appear in the index, which saves storage and write costs.
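A sparse index needs the key attribute to be absent, not empty or null, on items that should stay out of it. Below is a minimal sketch, assuming a hypothetical index keyed on `GSI3PK`/`GSI3SK` that tracks orders needing manual review:

```python
def with_review_flag(base_item, needs_review):
    """Return a copy of the item, adding sparse-index keys only when flagged.

    GSI3PK / GSI3SK are hypothetical attribute names for an index that
    should contain only orders awaiting review.
    """
    item = dict(base_item)
    if needs_review:
        item["GSI3PK"] = "NEEDS_REVIEW"
        item["GSI3SK"] = item["SK"]  # reuse the sort key for ordering
    # When not flagged, the attributes are left out entirely so the item
    # never appears in the index.
    return item


order = {"PK": "USER#42", "SK": "ORDER#2026-06-15#abc123"}
flagged = with_review_flag(order, needs_review=True)
plain = with_review_flag(order, needs_review=False)
```

To take an item back out of the index later, remove the attributes with an update (for example a `REMOVE GSI3PK, GSI3SK` update expression) rather than setting them to an empty value.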
Use DynamoDB Streams to fan out changes to Lambda for materialized views, search indexing, and event-driven workflows. For higher fan-out, send the change stream to Kinesis Data Streams instead.
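A Lambda function attached to the stream receives batches of change records in DynamoDB's attribute-value format. The handler below is a sketch: it assumes the stream view type includes new images, and the returned summary stands in for a real write to a search index or materialized view.

```python
def handler(event, context):
    """Lambda entry point for a DynamoDB Streams event source.

    Collects inserted or modified order items. Assumes the stream is
    configured with NEW_IMAGE or NEW_AND_OLD_IMAGES.
    """
    changed_orders = []
    for record in event.get("Records", []):
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue  # skip REMOVE events in this sketch
        image = record["dynamodb"].get("NewImage", {})
        sk = image.get("SK", {}).get("S", "")
        if not sk.startswith("ORDER#"):
            continue  # single-table design: ignore other entity types
        changed_orders.append({
            "pk": image["PK"]["S"],
            "sk": sk,
            "status": image.get("status", {}).get("S"),
        })
    # A real handler would update a search index or materialized view here.
    return {"processed": len(changed_orders), "orders": changed_orders}
```

Because one table holds many entity types, the handler filters on the sort key prefix before doing any work.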
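For multi-region active-active, use Global Tables. Concurrent writes to the same item in different regions are resolved with last-writer-wins on the whole item, so model your data so that conflicts are rare or harmless: keep updates idempotent and, where you can, route writes for a given item to a single region. Do not count on concurrent ADD increments from different regions being merged.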
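Watch CloudWatch’s ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits, along with the throttling metrics, for the table and each index. These metrics are aggregated, so they hide per-key heat; CloudWatch Contributor Insights for DynamoDB reveals the most accessed and most throttled keys.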
Wrap-up
DynamoDB modeling starts with a list of access patterns, ends with a small number of composite keys and indexes, and almost never looks like a SQL schema. Embrace single-table design, denormalize for read speed, push predicates into keys, and use GSIs to support secondary access paths. Avoid scans, watch for hot partitions, and turn on point-in-time recovery from day one. Done right, DynamoDB scales from prototype to billions of items without architectural rewrites — and that is the entire point.