C++ Memory Model and Atomics
Understand the C++ memory model, memory orderings, and how std::atomic enables correct lock-free programming across threads.
What you'll learn
- Why a memory model exists
- std::atomic basics
- memory_order semantics
- Release/acquire pattern
- Common bugs
Prerequisites
- Basic familiarity with C++ threads
What and Why
Before C++11, the language pretended threads didn’t exist. Compilers and CPUs were free to reorder memory operations as long as a single-threaded program behaved correctly. With multithreading, that liberty becomes hostile. C++11 introduced a formal memory model that defines what one thread can observe of another thread’s writes.
std::atomic<T> is the primary tool for cross-thread communication without locks. It guarantees that reads and writes happen as indivisible operations and lets you constrain reordering through memory orderings.
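To make "indivisible" concrete, here is a minimal sketch in which several threads increment one shared counter. The helper name count_with_atomic and its parameters are hypothetical, chosen only for this illustration. It uses the default ordering, so no memory_order argument appears.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: starts `threads` workers that each increment a shared
// atomic counter `per_thread` times, then returns the final count.
int count_with_atomic(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // One indivisible read-modify-write; no increment is lost.
                counter.fetch_add(1);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}
```

With a plain int in place of std::atomic<int>, the same loop would be a data race and the result would be unpredictable.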
Mental Model
Your memory operations can be reordered at several levels: by the compiler, by the CPU's out-of-order execution, and by the store buffers and caches that sit between cores. Each std::atomic operation constrains which reorderings are legal around it, much like a fence. The memory_order argument controls how strict that constraint is:
- relaxed: only atomicity, no ordering guarantees
- acquire/release: pairwise synchronization
- acq_rel: both acquire and release
- seq_cst: total global order across all threads (default and safest)
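The total order that seq_cst provides can be illustrated with the classic "store buffering" test. The sketch below is only an illustration, and the names both_read_zero_once and count_both_zero are hypothetical. Each thread stores to one atomic and then loads the other. Because all seq_cst operations fall into a single global order, at least one thread must see the other's store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the store-buffering test once and reports
// whether both threads read 0 from the other thread's variable.
bool both_read_zero_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;
    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Hypothetical helper: repeats the test and counts the "both zero" outcomes.
// With seq_cst on every operation the count is always 0.
int count_both_zero(int rounds) {
    assert(rounds >= 0);
    int hits = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_read_zero_once()) {
            ++hits;
        }
    }
    return hits;
}
```

If the four operations used release stores and acquire loads instead, the "both zero" outcome would be permitted, because the two threads synchronize on different atomics.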
Thread A:                        Thread B:
data = 42;                       while (!ready.load(acquire)) {}
ready.store(true, release);      // sees data == 42
                                 use(data);
The release on A “publishes” all prior writes to whoever performs a matching acquire load on the same atomic.
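The two-thread diagram above can be turned into a small runnable sketch. The function name publish_and_consume is hypothetical, and the yield inside the spin loop is an addition for politeness rather than part of the pattern.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: a producer publishes a plain int through an atomic
// flag; returns the value the consumer observed.
int publish_and_consume() {
    int data = 0;                    // plain, non-atomic payload
    std::atomic<bool> ready{false};  // the atomic both threads synchronize on
    int seen = -1;

    std::thread producer([&] {
        data = 42;                                     // write the payload
        ready.store(true, std::memory_order_release);  // then publish it
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // spin until the flag is set
        }
        seen = data;  // the acquire load saw true, so this reads 42
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

The non-atomic data variable is safe to read here only because the acquire load that observed true orders the read after the producer's write.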
Hands-on Example
A simple spinlock built on std::atomic_flag:
#include <atomic>
#include <thread>

class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;

public:
    void lock() {
        while (flag.test_and_set(std::memory_order_acquire)) {
            // busy wait
        }
    }

    void unlock() {
        flag.clear(std::memory_order_release);
    }
};
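As a usage sketch, the spinlock can guard a plain variable shared by several threads. The class is repeated so the example stands alone. The helper guarded_sum is hypothetical, and the yield in the wait loop is an addition. Since SpinLock has lock() and unlock(), it works with std::lock_guard.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;

public:
    void lock() {
        while (flag.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // let other threads run while waiting
        }
    }

    void unlock() {
        flag.clear(std::memory_order_release);
    }
};

// Hypothetical helper: each worker increments a plain long under the lock.
long guarded_sum(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    SpinLock lock;
    long total = 0;  // non-atomic; every access happens while holding the lock
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&lock, &total, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                std::lock_guard<SpinLock> guard(lock);  // unlocks on scope exit
                ++total;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return total;
}
```

The acquire in lock() and the release in unlock() are what make one thread's update to total visible to the next thread that takes the lock.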
A safer pattern: publish a pointer once it’s fully initialized.
std::atomic<Config*> g_config{nullptr};

void publish(Config* c) {
    // build c fully first
    g_config.store(c, std::memory_order_release);
}

const Config* current() {
    return g_config.load(std::memory_order_acquire);
}
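Once the reader's acquire load returns the pointer stored by the writer, the reader is guaranteed to see every write the writer made before its release store. A load that still returns nullptr guarantees nothing about those writes.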
Common Pitfalls
Defaulting to relaxed for performance. relaxed only guarantees indivisibility. Without acquire/release, the reader may see the new pointer but stale fields it points to. Use it only for counters and statistics.
Assuming volatile is enough. volatile was designed for memory-mapped I/O. It prevents some compiler optimizations but provides zero cross-thread ordering. Never use volatile for thread sync in C++.
Double-checked locking without atomics. The classic broken pattern. If the singleton pointer isn’t atomic, another thread can observe a non-null pointer to a partially constructed object.
Mixing atomic and non-atomic access to the same variable produces a data race, which is undefined behavior, full stop.
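For the double-checked locking pitfall above, one way to write the pattern correctly is to make the instance pointer atomic. This is a minimal sketch: Service and ServiceHolder are hypothetical names, and the instance is intentionally never deleted.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Service {  // hypothetical payload type
    int id = 7;
};

class ServiceHolder {
    static std::atomic<Service*> instance_;
    static std::mutex mutex_;

public:
    static Service* get() {
        // First check, without the lock. Acquire pairs with the release
        // store below, so a non-null pointer implies a fully built object.
        Service* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> guard(mutex_);
            // Second check, under the lock. The mutex already orders this
            // load against the store made by whichever thread got here first.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Service();
                instance_.store(p, std::memory_order_release);
            }
        }
        assert(p != nullptr);
        return p;
    }
};

std::atomic<Service*> ServiceHolder::instance_{nullptr};
std::mutex ServiceHolder::mutex_;
```

In practice a function-local static is usually simpler, since C++11 guarantees its initialization is thread-safe.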
Practical Tips
- Default to memory_order_seq_cst. It’s the easiest to reason about. Only relax orderings after profiling shows real cost.
- Pair release stores with acquire loads on the same atomic; that’s the dominant correctness pattern.
- For shared counters with no dependent data, fetch_add(1, relaxed) is correct and fast.
- Prefer std::shared_ptr (atomic ref counts) or std::mutex for most use cases. Hand-rolled lock-free code is hard to get right.
- Use tools: ThreadSanitizer (-fsanitize=thread) catches many ordering bugs you can’t catch by inspection.
Wrap-up
The C++ memory model is dense but learnable. Internalize three patterns: release/acquire for one-way publication, sequential consistency when you want total order, and relaxed only for independent counters. Pair every cross-thread variable with a synchronization mechanism, atomic or otherwise. Lock-free code is rewarding when you need it, but reach for mutexes first; correctness is more valuable than nanoseconds.