
Python Dataclasses Tutorial

Replace boilerplate-heavy classes with Python dataclasses — learn @dataclass, default values, field(), frozen instances, comparisons, and when to reach for them.

·4 min read · By Codeloom
Beginner 9 min read

What you'll learn

  • What @dataclass generates for you
  • Setting default values safely with field()
  • Making instances immutable with frozen=True
  • Comparing and sorting dataclass instances
  • When dataclasses are the right tool

Prerequisites

  • Basic Python familiarity

Dataclasses are Python’s answer to the question, “why am I writing __init__, __repr__, and __eq__ by hand for the hundredth time?” Introduced in Python 3.7, the dataclass decorator generates the obvious code for you while leaving your class fully customisable.

The Old Way

Here is the boilerplate version of a simple Point class.

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

Nothing in this code is interesting. It is all paperwork.

The Dataclass Way

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
print(p)             # Point(x=1.0, y=2.0)
print(p == Point(1.0, 2.0))  # True

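The decorator inspects the class body, sees two type-annotated attributes, and generates __init__, __repr__, and __eq__ for you. The type annotations are required: they tell the decorator which attributes belong to the dataclass, and an attribute without one is left as an ordinary class attribute. The annotations themselves are not checked at runtime.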
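The effect of a missing annotation can be checked directly. The sketch below uses a hypothetical Point3D class with an unannotated `units` attribute (both names are illustrative, not from the example above) and lists the fields the decorator actually picked up with the standard `fields()` helper.

```python
from dataclasses import dataclass, fields

@dataclass
class Point3D:  # hypothetical class, for illustration only
    x: float
    y: float
    z: float = 0.0
    units = "cm"  # no annotation, so this stays a plain class attribute

# Only the annotated attributes are registered as fields.
field_names = [f.name for f in fields(Point3D)]
print(field_names)  # ['x', 'y', 'z']

# The generated __init__ and __repr__ know about x, y and z, but not units.
p = Point3D(1.0, 2.0)
print(p)        # Point3D(x=1.0, y=2.0, z=0.0)
print(p.units)  # cm
```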

Default Values

You can give attributes defaults like normal Python parameters.

from dataclasses import dataclass

@dataclass
class User:
    name: str
    role: str = "viewer"
    active: bool = True

As with regular functions, fields without defaults must come before fields with defaults.
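Getting the order wrong fails when the class is defined, not when it is first instantiated. A minimal sketch, using a hypothetical BadUser class that deliberately puts a defaulted field first:

```python
from dataclasses import dataclass

error = None
try:
    @dataclass
    class BadUser:  # hypothetical class that breaks the ordering rule
        role: str = "viewer"
        name: str  # no default, but it follows a field that has one
except TypeError as exc:
    # The decorator rejects the class before any instance exists.
    error = exc
    print(f"TypeError: {exc}")
```

The exact wording of the message varies between Python versions, so the sketch only relies on the exception type.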

Mutable Defaults: Use field()

There is one trap. You cannot use a mutable default directly.

from dataclasses import dataclass

@dataclass
class Cart:
    items: list = []  # ValueError at class creation time

The reason is the same as for default arguments in functions — the list would be shared across every instance. The fix is field(default_factory=...).

from dataclasses import dataclass, field

@dataclass
class Cart:
    items: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

The factory is called fresh for each new instance.
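A quick way to confirm this is to create two instances and mutate one of them. The sketch below reuses the Cart idea and adds a hypothetical `tags` field to show that any zero-argument callable, such as a lambda, can serve as the factory.

```python
from dataclasses import dataclass, field

@dataclass
class Cart:
    items: list = field(default_factory=list)
    # Hypothetical extra field: a lambda works when the default is not empty.
    tags: list = field(default_factory=lambda: ["new"])

a = Cart()
b = Cart()
a.items.append("apple")

print(a.items)              # ['apple']
print(b.items)              # []
print(a.items is b.items)   # False
print(b.tags)               # ['new']
```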

Frozen Dataclasses

Pass frozen=True to make instances immutable. Assigning to a field afterwards raises FrozenInstanceError.

from dataclasses import dataclass

@dataclass(frozen=True)
class Currency:
    code: str
    symbol: str

usd = Currency("USD", "$")
# usd.code = "EUR"  # would raise FrozenInstanceError

Frozen instances are also hashable, as long as every field value is itself hashable, so you can use them as dictionary keys or put them in sets. That is a common reason to choose frozen over mutable.
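Both behaviours can be seen in a few lines. This is a sketch built on the Currency class above; the `rates` dictionary and its value are made up for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Currency:
    code: str
    symbol: str

usd = Currency("USD", "$")

try:
    usd.code = "EUR"
except FrozenInstanceError:
    print("cannot assign to a frozen instance")

# Equal frozen instances hash the same, so a fresh, equal object finds the key.
rates = {usd: 1.0}  # illustrative data
print(rates[Currency("USD", "$")])        # 1.0
print(len({usd, Currency("USD", "$")}))   # 1
```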

Ordering

The order=True option generates __lt__, __le__, __gt__, and __ge__ based on a tuple of the fields in declaration order.

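from dataclasses import dataclass

@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int

versions = [Version(1, 2, 0), Version(1, 0, 5), Version(2, 0, 0)]
print(sorted(versions))
# [Version(major=1, minor=0, patch=5), Version(major=1, minor=2, patch=0), Version(major=2, minor=0, patch=0)]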

The sort works lexicographically over (major, minor, patch), comparing each field as an integer, which is the ordering you want for plain major.minor.patch version numbers.
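The difference from comparing version strings shows up as soon as a component reaches two digits. A small sketch, with version numbers chosen only for illustration:

```python
from dataclasses import dataclass

@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int

# Fields are compared as integers, left to right: 2 < 10.
print(Version(1, 2, 0) < Version(1, 10, 0))  # True

# The same versions as strings compare character by character: "2" > "1".
print("1.2.0" < "1.10.0")  # False

newest = max([Version(1, 2, 0), Version(1, 10, 0), Version(1, 9, 9)])
print(newest)  # Version(major=1, minor=10, patch=0)
```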

Excluding Fields from Comparisons or repr

The field() helper takes flags for fine-grained control.

from dataclasses import dataclass, field

@dataclass
class Task:
    title: str
    priority: int
    _id: int = field(repr=False, compare=False, default=0)

_id will not show up in the repr and won’t influence equality. Use this when an attribute is incidental — a cache, a UUID generated lazily, a database row ID.
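To check both flags at once, the sketch below builds two tasks that differ only in `_id`. The titles and ID values are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    title: str
    priority: int
    _id: int = field(repr=False, compare=False, default=0)

a = Task("Write docs", 1, _id=7)
b = Task("Write docs", 1, _id=99)

print(a)             # Task(title='Write docs', priority=1)
print(a == b)        # True, because _id is ignored by __eq__
print(a._id, b._id)  # 7 99
```

The field is still a normal constructor argument and attribute; only the generated __repr__ and comparison methods skip it.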

Post-Init Hooks

If you need to derive a field from others, use __post_init__. It runs at the end of the generated __init__.

from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(3, 4)
print(r.area)  # 12

init=False keeps area out of the constructor signature.
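Two consequences are worth seeing in code. The sketch below, based on the Rectangle class above, shows that passing `area` to the constructor is rejected, and that the derived value is computed once and does not follow later changes to the other fields.

```python
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height

rejected = False
try:
    Rectangle(3, 4, 12)  # area is not a constructor parameter
except TypeError:
    rejected = True
    print("area is not a constructor argument")

r = Rectangle(3, 4)
r.width = 10
# __post_init__ ran only once, so area still reflects the original width.
print(r.area)  # 12
```

If the value has to track later changes, a read-only property in place of the stored field is one option.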

Converting to a Dict

dataclasses.asdict recursively converts a dataclass (and any nested dataclasses) to plain dictionaries. That makes it a convenient first step toward JSON serialisation, provided the field values are themselves JSON-serialisable.

from dataclasses import asdict, dataclass

@dataclass
class Address:
    city: str

@dataclass
class Person:
    name: str
    address: Address

p = Person("Ada", Address("London"))
print(asdict(p))  # {'name': 'Ada', 'address': {'city': 'London'}}

There is also astuple if you need a tuple of values. Like asdict, it recurses, so a nested dataclass becomes a nested tuple rather than being flattened.
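The sketch below feeds the result of asdict to the standard json module and prints what astuple returns for the same object. Note that astuple recurses too, so the nested Address comes back as an inner tuple.

```python
import json
from dataclasses import asdict, astuple, dataclass

@dataclass
class Address:
    city: str

@dataclass
class Person:
    name: str
    address: Address

p = Person("Ada", Address("London"))

# asdict yields plain dicts and strings, which json.dumps accepts directly.
payload = json.dumps(asdict(p))
print(payload)  # {"name": "Ada", "address": {"city": "London"}}

# astuple converts nested dataclasses to nested tuples.
print(astuple(p))  # ('Ada', ('London',))
```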

When to Use Dataclasses

Reach for dataclasses when the class is mostly data: a record, a configuration object, a parsed payload, an event. Dataclasses keep the intent visible at the top of the class and remove dozens of lines of boilerplate that readers would otherwise have to skim past.

Stick with a regular class when behaviour dominates over data: a service object with five methods and no real state has nothing to gain from @dataclass. And if you need strict validation, look at libraries like pydantic or attrs, which take the same declarative approach and add validation and other features on top.

Wrapping Up

Dataclasses are quiet but transformative. They shrink your class definitions to the parts that actually matter, and they nudge you toward immutability and clear types. Add @dataclass to your toolbox and your future self will thank you the next time you have to model “just some data.”