Graphs in DSA: Introduction and Representations

Beginner 11 min read

What you'll learn

✓What a graph is and the vocabulary that goes with it
✓Directed vs undirected, weighted vs unweighted, cyclic vs acyclic
✓How to represent graphs in Python — adjacency list, matrix, edge list
✓The trade-offs between the three representations
✓When a graph problem is hiding in plain sight

Prerequisites

•Comfortable with Dictionaries and Classes

A graph is a collection of vertices (or nodes) connected by edges. Trees are a special case — a graph with no cycles and a designated root. Once you see graphs everywhere, you start spotting them in social networks, road maps, dependency resolvers, recommendation systems, and half the puzzles in any interview. This post sets up the vocabulary, the three standard representations, and the trade-offs between them.

What is a graph?

A graph G = (V, E) is defined by a set of vertices V and a set of edges E. Each edge connects two vertices.

   A --- B
   |     |
   C --- D
        / \
       E   F

Here V = {A, B, C, D, E, F} and E = {(A,B), (A,C), (B,D), (C,D), (D,E), (D,F)}.

A graph imposes no rules about hierarchy, shape, or order. Cycles are allowed. Disconnected pieces are allowed. A vertex with no edges is allowed. That generality is the source of both graphs’ power and their difficulty.

Vocabulary

A handful of terms you’ll keep meeting:

Vertex / node — a point in the graph.
Edge — a connection between two vertices.
Adjacent — two vertices are adjacent if an edge connects them.
Neighbor — same idea, used informally.
Degree — the number of edges touching a vertex. In a directed graph, in-degree and out-degree.
Path — a sequence of vertices connected by edges.
Cycle — a path that ends where it starts and uses no edge twice.
Connected — every pair of vertices has a path between them (undirected).
Component — a maximal connected subgraph.
Dense vs sparse — many edges (close to V²) vs few (close to V).

Flavours of graphs

A graph’s flavour is decided by three independent choices.

Directed vs undirected

In an undirected graph, an edge {A, B} works both ways: you can travel from A to B and from B to A.

   A --- B          A -> B
                    A <- B  (separately, if you want)

In a directed graph (sometimes called a digraph), edges have a direction: (A, B) lets you go from A to B but not necessarily back.

Examples:

Friendship on Facebook — undirected.
“Follows” on Twitter — directed.
A road network — usually directed (one-way streets exist).
A web page link graph — directed.

Weighted vs unweighted

A weighted edge carries a number — distance, cost, capacity, time. An unweighted edge just says “connected.”

   A --5-- B
   |       |
   2       3
   |       |
   C --1-- D

Shortest-path algorithms branch sharply here. BFS handles unweighted shortest paths. Dijkstra and friends handle non-negative weights. Bellman-Ford handles negatives.

Cyclic vs acyclic

A cyclic graph contains at least one cycle. An acyclic graph contains none.

A directed acyclic graph (DAG) is one of the most useful shapes in computing — task dependencies, build systems, version histories, neural network computation graphs are all DAGs. DAGs support topological sorting.

A connected, undirected, acyclic graph is exactly a tree.

Three representations

There are three standard ways to store a graph in code. Each is good at something different.

We’ll use this example throughout:

   0 --- 1
   |     |
   2 --- 3
   |
   4

Adjacency list

For every vertex, store a list of its neighbors. In Python a dict-of-lists or dict-of-sets is the natural fit.

graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3, 4],
    3: [1, 2],
    4: [2],
}

Notice that each undirected edge appears twice — once on each endpoint. For a directed graph you only list the edge on the source vertex.

Properties:

Space: O(V + E).
Check if an edge exists: O(degree(u)) (linear in the neighbor list).
Iterate neighbors of u: O(degree(u)).
Add an edge: O(1) amortized.

Adjacency lists are the default for most graph problems. Almost every BFS/DFS solution you’ll write expects one.

A dict-of-sets variant gives O(1) edge-existence checks:

graph = {
    0: {1, 2},
    1: {0, 3},
    2: {0, 3, 4},
    3: {1, 2},
    4: {2},
}

For weighted graphs, store (neighbor, weight) tuples:

weighted = {
    0: [(1, 5), (2, 2)],
    1: [(0, 5), (3, 3)],
    2: [(0, 2), (3, 1), (4, 7)],
    3: [(1, 3), (2, 1)],
    4: [(2, 7)],
}

Adjacency matrix

A V x V matrix where M[u][v] = 1 if there’s an edge from u to v, else 0. For weighted graphs, store the weight (and use float('inf') for no edge).

#       0  1  2  3  4
matrix = [
    [0, 1, 1, 0, 0],   # 0
    [1, 0, 0, 1, 0],   # 1
    [1, 0, 0, 1, 1],   # 2
    [0, 1, 1, 0, 0],   # 3
    [0, 0, 1, 0, 0],   # 4
]

For an undirected graph the matrix is symmetric across the diagonal.

Properties:

Space: O(V²) regardless of how many edges exist.
Check if an edge exists: O(1).
Iterate neighbors of u: O(V).
Add an edge: O(1).

Use a matrix when the graph is dense (close to V² edges) or when you need very fast edge lookups and V is small.

Edge list

Just a list of edges. The most compact representation.

edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4)]

Or with weights:

edges = [(0, 1, 5), (0, 2, 2), (1, 3, 3), (2, 3, 1), (2, 4, 7)]

Properties:

Space: O(E).
Check if an edge exists: O(E).
Iterate neighbors of u: O(E).

Edge lists are awful for traversal, but they shine for algorithms that process edges directly — like Kruskal’s MST, where you sort edges by weight and process them in order.

Try it yourself. Convert the example graph from edge list to adjacency list using a defaultdict(list). Then convert it to an adjacency matrix using nested list comprehensions. Print both and check they describe the same graph.

When to use which

Need	Best representation
BFS / DFS traversal	Adjacency list
Edge existence check, dense graph	Adjacency matrix
Sort edges by weight (Kruskal)	Edge list
Sparse graph, common case	Adjacency list
Tiny graph (V < 100), dense	Adjacency matrix
Streaming graph data in	Edge list

The rule of thumb: start with an adjacency list. Switch to a matrix only when you specifically need O(1) edge lookups and the graph is small or dense.

Building a graph in Python

A small reusable pattern using defaultdict:

from collections import defaultdict

def build_graph(edges, directed=False):
    g = defaultdict(list)
    for u, v in edges:
        g[u].append(v)
        if not directed:
            g[v].append(u)
    return g

edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4)]
g = build_graph(edges)
print(dict(g))
# {0: [1, 2], 1: [0, 3], 2: [0, 3, 4], 3: [1, 2], 4: [2]}

Two niceties to know about defaultdict(list): missing keys auto-create empty lists, and you don’t have to special-case the first edge for any vertex.

For weighted graphs, change the edge tuple to (u, v, w) and store (v, w) in the list.

Degree and basic properties

Quick functions you’ll write a lot:

def degree(g, v):
    return len(g[v])

def num_edges(g, directed=False):
    total = sum(len(neighbors) for neighbors in g.values())
    return total if directed else total // 2

def vertices(g):
    return list(g.keys())

print(degree(g, 2))        # 3
print(num_edges(g))         # 5
print(vertices(g))          # [0, 1, 2, 3, 4]

The handshake lemma: in any undirected graph, the sum of degrees equals 2 * |E| (every edge contributes to two vertices’ degrees). That’s why we divide by 2 above.

Spotting graph problems

A problem is a graph problem if you can phrase it as “things connected by relationships, and I need to traverse the connections.” Some examples you might not immediately think of:

Maze solving — cells are vertices, walls are absent edges.
Word ladder — words are vertices, edges connect words differing by one letter.
Course scheduling — courses are vertices, prerequisites are directed edges.
Friend recommendations — users are vertices, friendships are edges, recommendations come from second-degree neighbors.
Dependency resolution — packages are vertices, “requires” is a directed edge.
Email forwarding — addresses are vertices, forwarding rules are directed edges.

Train this eye and a lot of “tricky” problems become standard BFS/DFS questions.

Try it yourself. A maze is given as a 2D grid where 0 is open and 1 is a wall. Write a function that converts the grid into an adjacency list — each open cell (r, c) is a vertex, edges connect orthogonal open neighbors. You don’t need to traverse it yet; just build the graph.

A worked example

Build a small social graph and answer a few questions about it:

from collections import defaultdict

friendships = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("carol", "dave"),
    ("dave", "eve"),
]

def build_graph(edges, directed=False):
    g = defaultdict(set)
    for u, v in edges:
        g[u].add(v)
        if not directed:
            g[v].add(u)
    return g

g = build_graph(friendships)

print(sorted(g["alice"]))       # ['bob', 'carol']
print(len(g["dave"]))           # 3 (bob, carol, eve)
print("eve" in g["alice"])      # False — not direct friends

Already we have everything we need to answer “who are my friends” and “are these two people connected at all.” For the second question we need traversal — which is the next post.

Recap

You now know:

A graph is vertices connected by edges; trees are a special case
Vocabulary: vertex, edge, adjacent, degree, path, cycle, component
Graphs vary along three axes — directed/undirected, weighted/unweighted, cyclic/acyclic
A DAG is a directed acyclic graph — task dependencies, build orders
Three representations: adjacency list (default), adjacency matrix (dense or small), edge list (edge-centric algorithms)
Adjacency list with defaultdict(list) is the everyday workhorse in Python

Next steps

The next post applies the two essential graph traversals — BFS and DFS — to the adjacency list representation. With those two tools you can solve shortest-path-on-unweighted, connected components, cycle detection, and a fistful of classic interview problems.

→ Next: Graph Traversals — BFS and DFS

Questions or feedback? Email codeloomdevv@gmail.com.