Python Sets and Set Operations
Learn how Python sets store unique values, the operations they support — union, intersection, difference — and when to choose a set over a list or dictionary.
What you'll learn
- How to create sets and what makes them different from lists
- Adding, removing, and testing membership
- The four core set operations: union, intersection, difference, symmetric difference
- When to reach for a set instead of a list or dictionary
- The frozenset: an immutable set you can use as a dictionary key
Prerequisites
- Familiarity with lists and dictionaries (see Lists and Dictionaries)
A set is an unordered collection of unique values. Two properties define it: duplicates are not stored, and membership tests take roughly constant time on average, no matter how large the set grows. Those two properties make sets the right tool for a surprising number of everyday problems.
Creating a set
The two ways to create a non-empty set:
# Literal — comma-separated values in braces
tags = {"python", "tutorial", "beginner"}
# Constructor — from any iterable
tags = set(["python", "tutorial", "beginner", "python"])
print(tags) # {'python', 'tutorial', 'beginner'} (order may vary)
Notice that the duplicate "python" is silently dropped. This is the defining feature of a set.
Empty sets need set(), not {}
Empty braces create an empty dictionary. Use set() for an empty set.
empty_set = set() # correct
empty_dict = {} # NOT an empty set
Unordered
Sets do not preserve insertion order, and the printed order may surprise you:
print({"c", "a", "b"}) # {'b', 'a', 'c'} or similar
If order matters, use a list. If uniqueness matters and order doesn’t, use a set.
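When you need a predictable order only at output time, one option is to keep the data in a set and sort it on demand. A minimal sketch (the variable names are illustrative):

```python
letters = {"c", "a", "b"}

# sorted() accepts any iterable and returns a new list;
# the set itself is left unchanged.
ordered = sorted(letters)
print(ordered)  # ['a', 'b', 'c']
```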
Adding and removing
tags = {"python", "tutorial"}
tags.add("beginner") # add one element
print(tags) # {'python', 'tutorial', 'beginner'} (order may vary)
tags.update(["intermediate", "advanced"]) # add multiple
print(tags)
tags.remove("tutorial") # remove — raises KeyError if missing
tags.discard("javascript") # remove — silently ignores if missing
popped = tags.pop() # remove & return an arbitrary element
print(popped)
tags.clear() # remove everything
print(tags) # set()
The distinction between remove and discard is the only one worth memorising — use discard when “not there” is acceptable.
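A short sketch of that difference, using a tag that is not in the set (the `removed` flag is only there to make the outcome visible):

```python
tags = {"python", "tutorial"}

# discard: no error when the element is absent
tags.discard("javascript")

# remove: raises KeyError when the element is absent
try:
    tags.remove("javascript")
    removed = True
except KeyError:
    removed = False

print(removed)       # False
print(sorted(tags))  # ['python', 'tutorial']
```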
Membership testing
Like a dictionary key lookup, in on a set is fast: constant time on average, regardless of size. This is why sets are the right answer when you need to repeatedly ask “does this collection contain X?”:
allowed = {"admin", "editor", "viewer"}
role = "editor"
if role in allowed:
    print("Access granted")
The same check on a long list would scan from the start every time. For thousands of items the difference is dramatic.
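To see the gap on your own machine, you could time both lookups with the standard library's timeit module. This is a rough sketch: the collection size and repeat count are arbitrary choices, and the exact numbers will vary from run to run.

```python
import timeit

n = 100_000               # illustrative size, not a recommendation
as_list = list(range(n))
as_set = set(as_list)
target = n - 1            # worst case for the list: the last element

# Run each membership test 100 times and report the total time.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```

On typical hardware the set lookup should come out far ahead, because the list has to compare against every element before it reaches the target.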
Iteration and size
tags = {"python", "tutorial", "beginner"}
print(len(tags)) # 3
for tag in tags:
    print(tag)
Iteration order is not guaranteed — never rely on it for logic.
The four set operations
The strength of sets is the algebra of operations between them. Each operation comes in two forms — an operator and a method. The operator form requires both sides to be sets; the method form accepts any iterable on the right.
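The operator-versus-method difference is easy to check for yourself. A small sketch, with a list on the right-hand side (the `operator_ok` flag is just for illustration):

```python
a = {1, 2, 3}

# Method form: any iterable works on the right
print(a.union([3, 4, 5]))  # {1, 2, 3, 4, 5}

# Operator form: a list on the right raises TypeError
try:
    a | [3, 4, 5]
    operator_ok = True
except TypeError:
    operator_ok = False

print(operator_ok)  # False
```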
Union — everything from both
a = {1, 2, 3}
b = {3, 4, 5}
print(a | b) # {1, 2, 3, 4, 5}
print(a.union(b)) # same
Intersection — only what is in both
print(a & b) # {3}
print(a.intersection(b)) # same
Difference — in the first but not the second
print(a - b) # {1, 2}
print(a.difference(b)) # same
Symmetric difference — in either but not both
print(a ^ b) # {1, 2, 4, 5}
print(a.symmetric_difference(b)) # same
In-place variants
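Append = to the operator (or use the matching methods: update, intersection_update, difference_update, symmetric_difference_update) to modify the left set in place: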
a |= b # union update
a &= b # intersection update
a -= b # difference update
a ^= b # symmetric difference update
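The method equivalents behave the same way and, like the other method forms, accept any iterable. A sketch that applies each one in turn and checks the result:

```python
a = {1, 2, 3}
b = {3, 4, 5}

a.update(b)                            # same as a |= b
print(a == {1, 2, 3, 4, 5})            # True

a.intersection_update({1, 2, 3, 9})    # same as a &= {1, 2, 3, 9}
print(a == {1, 2, 3})                  # True

a.difference_update([3])               # a list is fine here
print(a == {1, 2})                     # True

a.symmetric_difference_update({2, 7})  # same as a ^= {2, 7}
print(a == {1, 7})                     # True
```

Note that the in-place union method is named update, not union_update.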
Try it yourself. Given:
python_devs = {"Alice", "Bob", "Carol", "Dave"}
js_devs = {"Bob", "Eve", "Frank", "Carol"}
Find:
- Developers who know both languages
- Developers who know Python but not JavaScript
- Developers who know exactly one of the two
Express each using a set operator.
Subset and superset tests
Sets compare against each other with familiar comparison operators:
small = {1, 2}
big = {1, 2, 3, 4}
print(small <= big) # True — small is a subset
print(big >= small) # True — big is a superset
print(small < big) # True — proper subset (not equal)
a.isdisjoint(b) returns True if the two sets share no elements.
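Each comparison operator also has a method counterpart. A brief sketch, adding a third set (`other`, introduced here only for illustration) to show isdisjoint:

```python
small = {1, 2}
big = {1, 2, 3, 4}
other = {8, 9}

print(small.issubset(big))      # True, same as small <= big
print(big.issuperset(small))    # True, same as big >= small
print(small.isdisjoint(other))  # True: nothing in common
print(small.isdisjoint(big))    # False: 1 and 2 are shared
print(small < small)            # False: a set is not a proper subset of itself
```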
Set comprehensions
Like list and dictionary comprehensions:
words = ["apple", "banana", "apricot", "blueberry"]
initials = {w[0] for w in words}
print(initials) # {'a', 'b'}
Set comprehensions are ideal when you want a list-comprehension-style transformation with deduplication built in.
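A set comprehension can also filter and transform at the same time. In this sketch the word list and the length threshold are made up for illustration:

```python
words = ["apple", "Banana", "apricot", "blueberry", "kiwi"]

# Lowercased first letters of words longer than four characters
long_initials = {w[0].lower() for w in words if len(w) > 4}
print(sorted(long_initials))  # ['a', 'b']
```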
Common practical uses
A few patterns where sets are clearly the right tool.
1. Deduplicate a list
items = ["apple", "banana", "apple", "cherry", "banana"]
unique = list(set(items))
print(unique) # ['apple', 'banana', 'cherry'] — order may differ
If order matters, use dict.fromkeys() instead, which preserves insertion order:
unique_ordered = list(dict.fromkeys(items))
print(unique_ordered) # ['apple', 'banana', 'cherry']
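When "duplicate" means something looser than exact equality, one common approach is to walk the list once and track what you have seen in a set. The helper below is a hypothetical sketch (unique_by and its key parameter are not built-ins):

```python
def unique_by(items, key=lambda x: x):
    """Return items in original order, keeping the first of each key."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:   # fast membership test on the set
            seen.add(k)
            result.append(item)
    return result

items = ["Apple", "banana", "apple", "Cherry", "BANANA"]
print(unique_by(items, key=str.lower))  # ['Apple', 'banana', 'Cherry']
```

The set does the membership checks while the list preserves order, so each structure is used for what it is good at.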
2. Find common or unique elements
this_month = {"alice", "bob", "carol"}
last_month = {"alice", "carol", "dave"}
returning = this_month & last_month # active both months
new = this_month - last_month # new this month (not active last month)
churned = last_month - this_month # didn't return
3. Fast filtering
allowed_ids = {1, 5, 9, 13, 17}
events = [
    {"id": 5, "action": "login"},
    {"id": 9, "action": "purchase"},
    {"id": 99, "action": "spam"},
]
filtered = [e for e in events if e["id"] in allowed_ids]
If allowed_ids were a list instead of a set, every in check would scan the list, and the filter would slow down dramatically as allowed_ids grows.
Try it yourself. Take the string "abracadabra", convert it to a set, and confirm the result is {'a', 'b', 'r', 'c', 'd'}. Then use len() to find the number of distinct characters in one line.
frozenset — an immutable set
A set is mutable and therefore unhashable, which means it cannot be used as a dictionary key or as a member of another set. When you need an immutable version, use frozenset:
fs = frozenset(["a", "b", "c"])
groups = {
    frozenset(["alice", "bob"]): "team-a",
    frozenset(["carol", "dave"]): "team-b",
}
You rarely need this, but when you want a set as a dictionary key or as a member of another set, frozenset is the built-in way to make it work.
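A sketch of what goes wrong with plain sets and how frozenset avoids it (the example values are illustrative):

```python
# A set of plain sets fails: sets are unhashable
try:
    nested = {{1, 2}, {3, 4}}
except TypeError as exc:
    print(type(exc).__name__)  # TypeError

# A set of frozensets works
nested = {frozenset({1, 2}), frozenset({3, 4})}
print(frozenset({2, 1}) in nested)  # True: element order is irrelevant

# Lookup by an equal frozenset, whatever order it was built in
groups = {frozenset(["alice", "bob"]): "team-a"}
print(groups[frozenset(["bob", "alice"])])  # team-a

# Non-mutating operations still work and return a new frozenset
combined = frozenset({1, 2}) | {3}
print(combined == frozenset({1, 2, 3}))  # True
```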
When to use a set
Reach for a set when:
- You need to deduplicate a collection
- You need fast membership testing on a large group
- You are computing unions, intersections, or differences between collections
- The order of items genuinely does not matter
Stick with a list when order matters. Use a dictionary when you need to associate each item with a value.
Recap
You now know:
- Sets store unique, unordered values, created with {...} or set()
- add, update, remove, discard, pop, clear cover mutation
- in is fast: sets are ideal for “does X belong?” tests
- The four operations: union |, intersection &, difference -, symmetric difference ^
- Set comprehensions and dict.fromkeys give you deduplication tools at different cost levels
- frozenset is the immutable variant, suitable for use as a dictionary key
Next steps
The next post is the natural closer for this section: type conversion — how to safely move values between types when reading user input, parsing data, and combining numbers with strings.
→ Next: Type Conversion (Casting) in Python