Terraform State Management Basics
Why Terraform state exists, how it maps configuration to real infrastructure, and how to set up remote backends, locking, and workspaces without losing your mind.
What you'll learn
- Why Terraform needs state in the first place
- Local vs remote backends and why remote always wins
- State locking and how to recover from stuck locks
- Workspaces, state splitting, and import strategies
- How to keep state files safe and auditable
Prerequisites
- Basic Terraform familiarity
What and why
Terraform state is a JSON file that maps the resources in your configuration to the real-world objects in your provider (AWS instance IDs, GCP project numbers, Postgres role names). Without state, Terraform would have no way to know that the aws_instance.web in your code is the same EC2 instance it created yesterday.
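The state file is also a cache of resource attributes. By default Terraform refreshes it from the provider during plan, but on large estates teams often skip that step (-refresh=false) so Terraform does not have to query every property of every resource. That speedup is real, but it means a stale state can produce wrong plans.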
Mental model
State sits between your configuration and the real world. Configuration is what you want. The real world is what exists. State is what Terraform last saw.
+-------------------+   terraform plan    +------------------+
|      main.tf      | ------------------> |  Provider APIs   |
|   desired state   |                     |  real resources  |
+-------------------+                     +------------------+
          |                                        ^
          | compares with                          |
          v                                        |
+-------------------+   refresh / apply            |
| terraform.tfstate | -----------------------------+
| last-known state  |
+-------------------+

Plan  = (config) - (state)
Apply = call provider APIs to make real resources match config,
        then update the state file
Hands-on example
A remote backend on S3 with DynamoDB locking:
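terraform {
  required_version = ">= 1.7"

  backend "s3" {
    bucket = "acme-tf-state"
    key    = "prod/network/terraform.tfstate"
    region = "us-east-1"
    # DynamoDB table used for locking. Terraform 1.10 and later can
    # instead lock natively in S3 with use_lockfile = true.
    dynamodb_table = "tf-locks"
    encrypt        = true
  }
}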
Bootstrap the backend with another small Terraform project (or by hand) before you can use it:
resource "aws_s3_bucket" "tf_state" {
  bucket = "acme-tf-state"
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "tf_locks" {
  name         = "tf-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
Versioning on the bucket gives you point-in-time recovery. DynamoDB provides the lock so two engineers cannot apply at the same time.
Initialize and apply:
terraform init
terraform plan -out tfplan
terraform apply tfplan
init configures the backend and downloads providers. plan writes a binary plan file you can review. apply tfplan executes exactly that plan, with no surprises from a config change in between.
When you need to bring an existing resource under management, first add a matching resource block in code (terraform import refuses an address that has no configuration), then import it:
terraform import aws_s3_bucket.logs my-existing-logs-bucket
Then adjust the resource block and run plan until the diff is empty.
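On Terraform 1.5 and later, the same import can also be declared in configuration instead of being run as a one-off command. A minimal sketch, reusing the resource address and bucket name from the command above; whether the single bucket argument is enough for an empty diff depends on how the real bucket is configured:

```hcl
# Declarative import (Terraform 1.5+): planned and applied like any other change.
import {
  to = aws_s3_bucket.logs
  id = "my-existing-logs-bucket"
}

# The resource block the import targets. Extend it until plan shows no changes.
resource "aws_s3_bucket" "logs" {
  bucket = "my-existing-logs-bucket"
}
```

With this form, terraform plan shows the pending import next to any attribute differences, so the import can be reviewed before state is touched. Once the apply has succeeded, the import block can be deleted.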
For multiple environments, prefer separate state files per environment (different key paths in the backend) over workspaces. Workspaces share one configuration and one backend, so a mistake in either reaches every environment; separate states isolate the blast radius.
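One way to give each environment its own key path is partial backend configuration: the backend block stays empty in code, and each environment supplies its settings at init time. A sketch, assuming hypothetical file names under envs/ and reusing the bucket and lock table from the example above:

```hcl
# main.tf: backend settings are left out here and supplied per environment.
terraform {
  backend "s3" {}
}

# envs/prod.s3.tfbackend (hypothetical file, passed via -backend-config)
bucket         = "acme-tf-state"
key            = "prod/network/terraform.tfstate"
region         = "us-east-1"
dynamodb_table = "tf-locks"
encrypt        = true

# envs/staging.s3.tfbackend (hypothetical file) would differ only in its key:
# key = "staging/network/terraform.tfstate"
```

Running terraform init -backend-config=envs/prod.s3.tfbackend selects the prod state. Switching environments in the same working directory needs terraform init -reconfigure with the other file, which is a useful speed bump in its own right.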
Common pitfalls
Committing state to git. State files contain secrets in plain text (database passwords, generated keys) and grow until git is unhappy. Add *.tfstate* to .gitignore from day one.
Running terraform apply from a laptop against shared infrastructure. With no lock and no central state, two engineers can race and corrupt resources. Always use a remote backend with locking.
Editing state by hand. terraform state has subcommands (mv, rm, replace-provider) for safe surgery. Editing the JSON directly is a last resort and almost always wrong.
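For the two most common kinds of state surgery, renaming a resource and dropping one from management, Terraform also offers declarative blocks that go through the normal plan and review cycle. A sketch, assuming Terraform 1.7 or later for the removed block; aws_instance.frontend and aws_s3_bucket.legacy_logs are hypothetical addresses used only for illustration:

```hcl
# Rename in state without destroying anything (equivalent to state mv).
# The resource block must already be renamed to aws_instance.frontend.
moved {
  from = aws_instance.web
  to   = aws_instance.frontend
}

# Stop managing a resource but leave the real object in place
# (equivalent to state rm). Its resource block must be deleted from code.
removed {
  from = aws_s3_bucket.legacy_logs

  lifecycle {
    destroy = false
  }
}
```

Because these blocks live in version control, the state change is reviewed in the same pull request as the code change that motivates it.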
terraform destroy on the wrong workspace. Workspaces look identical at the CLI prompt; setting TF_WORKSPACE or putting the workspace in the shell prompt prevents catastrophes.
Drifting state. Someone clicks in the console, the real resource changes, the state file does not. The next plan tries to “fix” the drift, sometimes destructively. Run terraform plan regularly in CI on every environment, and treat unexpected diffs as alerts.
Production tips
Encrypt state at rest and in transit. The S3 backend with encrypt = true and a KMS key is the standard pattern. Restrict who can read the bucket; state is privileged.
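On the bucket side, that pattern might look like the following addition to the bootstrap project. This is a sketch under assumptions: aws_kms_key.tf_state is a hypothetical key introduced here, and aws_s3_bucket.tf_state is the bucket from the bootstrap example.

```hcl
# Hypothetical customer-managed key for state encryption.
resource "aws_kms_key" "tf_state" {
  description         = "Encrypts Terraform state objects"
  enable_key_rotation = true
}

# Encrypt every object written to the state bucket with that key by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.tf_state.arn
    }
  }
}

# Block every form of public access to the bucket.
resource "aws_s3_bucket_public_access_block" "tf_state" {
  bucket                  = aws_s3_bucket.tf_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

The backend block can point at the same key through its kms_key_id argument. Anyone who reads or writes state then also needs permission to use the key, which gives a second place to restrict access.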
Split state by blast radius, not by tidiness. Keeping network, IAM, data, and application layers in separate states means a bad apply in the app layer cannot accidentally destroy the VPC.
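Split states still need to share a few values, such as subnet IDs. One common approach is the terraform_remote_state data source, which reads another state's outputs without managing its resources. A sketch for an application layer, assuming the network state exposes a hypothetical output named private_subnet_id; app_ami_id is likewise a hypothetical variable:

```hcl
# Read-only view of the outputs published by the network layer's state.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "acme-tf-state"
    key    = "prod/network/terraform.tfstate"
    region = "us-east-1"
  }
}

# Hypothetical input for the application layer.
variable "app_ami_id" {
  type = string
}

# The app layer consumes the subnet but cannot plan changes against it.
resource "aws_instance" "app" {
  ami           = var.app_ami_id
  instance_type = "t3.micro"
  subnet_id     = data.terraform_remote_state.network.outputs.private_subnet_id
}
```

Reading remote state requires read access to the whole state object, so for sensitive layers it may be preferable to publish the shared values somewhere narrower, such as a parameter store.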
Pin everything. Pin the Terraform version in required_version, pin providers in required_providers with exact versions, and commit the dependency lock file (.terraform.lock.hcl). Reproducible plans depend on it.
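A pinned terraform block might look like this. The version numbers are illustrative assumptions, not recommendations; use whatever your team has actually tested.

```hcl
terraform {
  # Allow patch releases of 1.7 only; a new minor version is a deliberate upgrade.
  required_version = "~> 1.7.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Exact pin (illustrative version number).
      version = "5.40.0"
    }
  }
}
```

terraform init records the selected provider versions and their checksums in .terraform.lock.hcl, and terraform init -upgrade is the explicit step that moves them forward.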
Use -refresh-only plans periodically to detect drift without changing config.
Treat state as an audit trail. Versioned S3 plus access logs on the bucket give you who-applied-what across time. Pair with an OIDC-based CI role so only the pipeline can touch state.
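The permissions such a pipeline role needs on the state bucket and lock table are small. A sketch of a least-privilege policy for the prod/network state, assuming it lives in the bootstrap project next to aws_s3_bucket.tf_state and aws_dynamodb_table.tf_locks; the policy name is hypothetical, and attaching the policy to the OIDC-federated role is left out:

```hcl
data "aws_iam_policy_document" "tf_state_access" {
  # The backend lists the bucket to locate state objects.
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.tf_state.arn]
  }

  # Read and write exactly one state object, not the whole bucket.
  statement {
    sid       = "ReadWriteStateObject"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["${aws_s3_bucket.tf_state.arn}/prod/network/terraform.tfstate"]
  }

  # Acquire and release the lock in DynamoDB.
  statement {
    sid = "StateLocking"
    actions = [
      "dynamodb:DescribeTable",
      "dynamodb:GetItem",
      "dynamodb:PutItem",
      "dynamodb:DeleteItem",
    ]
    resources = [aws_dynamodb_table.tf_locks.arn]
  }
}

# Hypothetical policy name; attach it only to the pipeline's role.
resource "aws_iam_policy" "tf_state_access" {
  name   = "tf-state-prod-network"
  policy = data.aws_iam_policy_document.tf_state_access.json
}
```

Scoping the object statement to a single key means a pipeline for one layer cannot read or overwrite another layer's state. If the bucket uses a KMS key, the role also needs permission to use that key.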
For very large estates, evaluate Terragrunt or a registry-backed module pattern. Both help you keep configuration DRY without coupling unrelated state files.
Wrap-up
State is the memory of your infrastructure. Use a remote backend with locking and encryption, never commit state to git, never edit it by hand, and split it by blast radius. Pin versions, plan in CI, watch for drift, and treat the state file like a database: backed up, versioned, and locked down. With those habits, Terraform becomes predictable instead of scary.