Python Dataclasses Tutorial
Replace boilerplate-heavy classes with Python dataclasses — learn @dataclass, default values, field(), frozen instances, comparisons, and when to reach for them.
What you'll learn
- What @dataclass generates for you
- Setting default values safely with field()
- Making instances immutable with frozen=True
- Comparing and sorting dataclass instances
- When dataclasses are the right tool
Prerequisites
- Basic Python familiarity
Dataclasses are Python’s answer to the question, “why am I writing __init__, __repr__, and __eq__ by hand for the hundredth time?” Introduced in Python 3.7, the dataclass decorator generates the obvious code for you while leaving your class fully customisable.
The Old Way
Here is the boilerplate version of a simple Point class.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)
Nothing in this code is interesting. It is all paperwork.
The Dataclass Way
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
print(p)  # Point(x=1.0, y=2.0)
print(p == Point(1.0, 2.0))  # True
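The decorator inspects the class body, sees two type-annotated attributes, and generates __init__, __repr__, and __eq__ for you. The type annotations are required: they tell the decorator which attributes are fields, and an attribute without an annotation is left as an ordinary class attribute. The annotations themselves are not checked at runtime.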
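To see which attributes the decorator actually picks up, the sketch below lists them with dataclasses.fields. The Sample class and its attribute names are made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Sample:  # hypothetical class, for illustration only
    x: float        # annotated: becomes a field
    y: float = 0.0  # annotated with a default: also a field
    scale = 10      # no annotation: a plain class attribute, not a field


field_names = [f.name for f in fields(Sample)]
print(field_names)  # ['x', 'y']

# Annotations are not enforced at runtime, so this call succeeds
# even though x is annotated as float.
s = Sample("not a float")
print(s.x)      # not a float
print(s.scale)  # 10
```

Only x and y appear in the generated __init__, __repr__, and __eq__; scale is shared by all instances like any other class attribute.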
Default Values
You can give attributes defaults like normal Python parameters.
from dataclasses import dataclass

@dataclass
class User:
    name: str
    role: str = "viewer"
    active: bool = True
As with regular functions, fields without defaults must come before fields with defaults.
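The following sketch shows both halves of that rule: defaults behave like default parameters in the generated __init__, and a field without a default placed after one with a default is rejected when the class is created. BadUser is a hypothetical class that exists only to trigger the error.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    role: str = "viewer"
    active: bool = True


# Defaults can be omitted or overridden, positionally or by keyword.
guest = User("ada")
admin = User("grace", role="admin")
print(guest.role, admin.role)  # viewer admin

# A field without a default after a field with one raises TypeError
# while the class is being defined, before any instance exists.
error_message = None
try:
    @dataclass
    class BadUser:  # hypothetical, used only to show the failure
        role: str = "viewer"
        name: str
except TypeError as exc:
    error_message = str(exc)

print(error_message is not None)  # True
```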
Mutable Defaults: Use field()
There is one trap. You cannot use a mutable default directly.
from dataclasses import dataclass

@dataclass
class Cart:
    items: list = []  # ValueError at class creation time
The reason is the same as for default arguments in functions — the list would be shared across every instance. The fix is field(default_factory=...).
from dataclasses import dataclass, field

@dataclass
class Cart:
    items: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
The factory is called fresh for each new instance.
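A quick way to convince yourself of this is to create two carts and mutate one of them. The variable names below are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    items: list = field(default_factory=list)


first = Cart()
second = Cart()
first.items.append("apple")

print(first.items)                   # ['apple']
print(second.items)                  # []
print(first.items is second.items)   # False: each instance owns its list
```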
Frozen Dataclasses
Pass frozen=True to make instances immutable. Assigning to a field afterwards raises FrozenInstanceError.
from dataclasses import dataclass

@dataclass(frozen=True)
class Currency:
    code: str
    symbol: str

usd = Currency("USD", "$")
# usd.code = "EUR"  # would raise FrozenInstanceError
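Because eq defaults to True, a frozen dataclass also gets a generated __hash__, so its instances are hashable as long as every field value is hashable. You can use them as dictionary keys or put them in sets, which is a common reason to choose frozen over mutable.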
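Both behaviours can be checked in a few lines. This is a small sketch; the rates dictionary is a hypothetical lookup table, not part of any API.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Currency:
    code: str
    symbol: str


usd = Currency("USD", "$")

# Assignment is blocked by the generated __setattr__.
assignment_blocked = False
try:
    usd.code = "EUR"
except FrozenInstanceError:
    assignment_blocked = True
print(assignment_blocked)  # True

# Equal instances hash equally, so a freshly built Currency("USD", "$")
# finds the entry stored under usd.
rates = {usd: 1.0}  # hypothetical lookup table
print(rates[Currency("USD", "$")])  # 1.0
```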
Ordering
The order=True option generates __lt__, __le__, __gt__, and __ge__ based on a tuple of the fields in declaration order.
from dataclasses import dataclass

@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int

versions = [Version(1, 2, 0), Version(1, 0, 5), Version(2, 0, 0)]
print(sorted(versions))
The sort works lexicographically over (major, minor, patch), which matches how plain numeric version numbers are ordered. Pre-release or build labels would need extra handling.
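The generated comparison methods behave like tuple comparison, which the sketch below makes visible. Because the fields are ints, 1.10.0 sorts after 1.2.0, something a naive string comparison would get wrong.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int


versions = [Version(1, 2, 0), Version(1, 0, 5), Version(2, 0, 0)]
ordered = [(v.major, v.minor, v.patch) for v in sorted(versions)]
print(ordered)  # [(1, 0, 5), (1, 2, 0), (2, 0, 0)]

# Fields are compared as ints, left to right.
print(Version(1, 2, 0) < Version(1, 10, 0))  # True

# max() and min() work too, since they only need the ordering methods.
latest = max(versions)
print(latest == Version(2, 0, 0))  # True
```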
Excluding Fields from comparisons or repr
The field() helper takes flags for fine-grained control.
from dataclasses import dataclass, field

@dataclass
class Task:
    title: str
    priority: int
    _id: int = field(repr=False, compare=False, default=0)
_id will not show up in the repr and won’t influence equality. Use this when an attribute is incidental — a cache, a UUID generated lazily, a database row ID.
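As an illustration, two tasks that differ only in _id compare equal, and the identifier stays out of the printed form. The titles and ID values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    priority: int
    _id: int = field(repr=False, compare=False, default=0)


a = Task("write docs", 1, _id=101)
b = Task("write docs", 1, _id=202)

print(a == b)               # True: _id is ignored by __eq__
print("_id" in repr(a))     # False: _id is left out of the repr
print(a._id, b._id)         # 101 202: the attribute itself is still there
```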
Post-Init Hooks
If you need to derive a field from others, use __post_init__. It runs at the end of the generated __init__.
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(3, 4)
print(r.area)  # 12
init=False keeps area out of the constructor signature.
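Two consequences are worth seeing in code: the constructor rejects a third argument, and the derived value is computed once, at construction, so it does not follow later changes to width or height. A minimal sketch:

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height


# area is not a constructor parameter, so passing it is an error.
rejected = False
try:
    Rectangle(3, 4, 12)
except TypeError:
    rejected = True
print(rejected)  # True

# __post_init__ runs only once; area is not recomputed afterwards.
r = Rectangle(3, 4)
r.width = 10
print(r.area)  # 12
```

If the derived value must always track the inputs, a property on the class is usually a better fit than a stored field.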
Converting to a Dict
dataclasses.asdict recursively converts a dataclass (and any nested dataclasses) to plain dictionaries — great for JSON serialisation.
from dataclasses import asdict, dataclass

@dataclass
class Address:
    city: str

@dataclass
class Person:
    name: str
    address: Address

p = Person("Ada", Address("London"))
print(asdict(p))  # {'name': 'Ada', 'address': {'city': 'London'}}
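There is also astuple, which returns the field values as a tuple. Like asdict, it recurses, so nested dataclasses become nested tuples rather than one flat tuple.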
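The sketch below feeds the asdict result to json.dumps and shows what astuple returns for the same object. Note that astuple recurses into nested dataclasses the same way asdict does.

```python
import json
from dataclasses import asdict, astuple, dataclass


@dataclass
class Address:
    city: str


@dataclass
class Person:
    name: str
    address: Address


p = Person("Ada", Address("London"))

# asdict yields plain dicts, which json.dumps can serialise directly.
payload = json.dumps(asdict(p))
print(payload)  # {"name": "Ada", "address": {"city": "London"}}

# The nested Address becomes a nested tuple.
print(astuple(p))  # ('Ada', ('London',))
```

This works out of the box only when every field value is itself JSON-serialisable; values such as datetime objects would need a custom encoder.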
When to Use Dataclasses
Reach for dataclasses when the class is mostly data: a record, a configuration object, a parsed payload, an event. Dataclasses keep the intent visible at the top of the class and remove dozens of lines you would otherwise have to skim past.
Stick with a regular class when behaviour dominates over data: a service object with five methods and no real state has nothing to gain from @dataclass. And if you need strict validation, look at libraries like pydantic or attrs, which apply the same declarative idea with more features.
Wrapping Up
Dataclasses are quiet but transformative. They shrink your class definitions to the parts that actually matter, and they nudge you toward immutability and clear types. Add @dataclass to your toolbox and your future self will thank you the next time you have to model “just some data.”