Hashing: 8 Practice Problems with Solutions
Eight classic hash map problems with worked Python solutions — Two Sum, Group Anagrams, Subarray Sum Equals K, Longest Consecutive Sequence, Top K Frequent Elements, and more.
What you'll learn
- ✓How to convert O(n^2) brute force into O(n) with a hash map
- ✓The prefix-sum + hash map pattern for subarray problems
- ✓How to find longest consecutive runs in unsorted input in O(n)
- ✓How to compute top-k frequencies with a heap or bucket sort
- ✓When a set is enough and when you need a map
Prerequisites
- •Read Hashing and Hash Maps in DSA
- •Comfortable with Python dictionaries and sets
This post drills the eight most common hash-map interview problems. Each entry has a problem statement, examples, an approach with Big-O, and a clean Python solution. Try to write each solution yourself before scrolling.
1. Two Sum
Problem. Given an integer array nums and a target integer target, return the indices [i, j] (with i != j) such that nums[i] + nums[j] == target. You may assume exactly one solution exists.
Examples.
nums = [2, 7, 11, 15], target = 9 -> [0, 1]
nums = [3, 2, 4], target = 6 -> [1, 2]
nums = [3, 3], target = 6 -> [0, 1]
Approach
Brute force is O(n²): try every pair. The hash-map trick: as you walk the array, for each nums[i] compute complement = target - nums[i] and check if it’s already in a dictionary mapping value to index. If yes, return both indices; otherwise record nums[i] -> i.
Time: O(n). Space: O(n).
Solution
def two_sum(nums: list[int], target: int) -> list[int]:
seen = {}
for i, x in enumerate(nums):
complement = target - x
if complement in seen:
return [seen[complement], i]
seen[x] = i
return []
print(two_sum([2, 7, 11, 15], 9)) # [0, 1]
print(two_sum([3, 2, 4], 6)) # [1, 2]
print(two_sum([3, 3], 6)) # [0, 1]
The order matters — check the complement before recording nums[i], so the same element is not used twice.
2. Contains Duplicate
Problem. Given an integer array nums, return True if any value appears at least twice.
Examples.
[1, 2, 3, 1] -> True
[1, 2, 3, 4] -> False
Approach
Walk the array; record each value in a set; return True the moment you see one twice. Compared to the one-liner len(set(nums)) != len(nums), the loop short-circuits and uses less memory on inputs with an early duplicate.
Time: O(n). Space: O(n).
Solution
def contains_duplicate(nums: list[int]) -> bool:
seen = set()
for x in nums:
if x in seen:
return True
seen.add(x)
return False
print(contains_duplicate([1, 2, 3, 1])) # True
print(contains_duplicate([1, 2, 3, 4])) # False
One-liner equivalent:
def contains_duplicate(nums):
return len(set(nums)) != len(nums)
3. Group Anagrams
Problem. Group a list of strings such that anagrams are together. Order of groups and order within groups does not matter.
Examples.
["eat","tea","tan","ate","nat","bat"]
-> [["eat","tea","ate"], ["tan","nat"], ["bat"]]
Approach
Use the canonical form of each word as a hash key. Two natural choices:
- Sorted string —
sorted(word)as a tuple. O(m log m) per word. - Letter counts — a 26-tuple of counts. O(m) per word.
Time: O(n · m log m) with sorted keys, O(n · m) with count keys. Space: O(n · m).
Solution
from collections import defaultdict
def group_anagrams(strs: list[str]) -> list[list[str]]:
groups = defaultdict(list)
for word in strs:
key = tuple(sorted(word))
groups[key].append(word)
return list(groups.values())
print(group_anagrams(["eat","tea","tan","ate","nat","bat"]))
# [['eat','tea','ate'], ['tan','nat'], ['bat']]
The count-key variant for fixed lowercase ASCII:
def group_anagrams_counts(strs):
groups = defaultdict(list)
for word in strs:
counts = [0] * 26
for ch in word:
counts[ord(ch) - ord('a')] += 1
groups[tuple(counts)].append(word)
return list(groups.values())
4. Subarray Sum Equals K
Problem. Given an integer array nums and an integer k, return the number of contiguous subarrays whose sum equals k. Values may be negative.
Examples.
nums = [1, 1, 1], k = 2 -> 2 ([1,1] starting at 0 and at 1)
nums = [1, 2, 3], k = 3 -> 2 ([1,2] and [3])
nums = [3, 4, 7, -2, 2], k = 7 -> 4
Approach
The prefix-sum trick. Let prefix[i] be the sum of nums[0..i-1]. A subarray nums[l..r] sums to k iff prefix[r+1] - prefix[l] == k. So for each running prefix p, the number of earlier prefixes equal to p - k is the count of subarrays ending here that sum to k.
Use a hash map from prefix value to how many times it has occurred. Seed it with {0: 1} to count subarrays that start at index 0.
Time: O(n). Space: O(n).
Solution
def subarray_sum(nums: list[int], k: int) -> int:
counts = {0: 1}
prefix = 0
answer = 0
for x in nums:
prefix += x
answer += counts.get(prefix - k, 0)
counts[prefix] = counts.get(prefix, 0) + 1
return answer
print(subarray_sum([1, 1, 1], 2)) # 2
print(subarray_sum([1, 2, 3], 3)) # 2
print(subarray_sum([3, 4, 7, -2, 2], 7)) # 4
This is one of the most reusable patterns in interview problems — memorize it.
5. Longest Consecutive Sequence
Problem. Given an unsorted array of integers, return the length of the longest run of consecutive numbers. The algorithm must be O(n).
Examples.
[100, 4, 200, 1, 3, 2] -> 4 (the sequence 1, 2, 3, 4)
[0, 3, 7, 2, 5, 8, 4, 6, 0, 1] -> 9 (0..8)
Approach
Sorting solves the problem in O(n log n) — but the question demands O(n). Use a set. For each value, only start counting a run if it is the smallest of its run (i.e. x - 1 is not in the set). From there, walk forward x, x+1, x+2, ... until the chain breaks.
Each value is visited at most twice — once in the outer loop, and once during the walk that started at the run’s minimum. The total work is O(n).
Time: O(n). Space: O(n).
Solution
def longest_consecutive(nums: list[int]) -> int:
s = set(nums)
best = 0
for x in s:
if x - 1 in s:
continue # not the start of a run
length = 1
while x + length in s:
length += 1
best = max(best, length)
return best
print(longest_consecutive([100, 4, 200, 1, 3, 2])) # 4
print(longest_consecutive([0, 3, 7, 2, 5, 8, 4, 6, 0, 1])) # 9
The if x - 1 in s: continue check is what makes the algorithm linear. Without it, you’d re-walk the same run from every position.
6. First Unique Character in a String
Problem. Return the index of the first non-repeating character in a string, or -1 if none exists.
Examples.
"leetcode" -> 0
"loveleetcode" -> 2
"aabb" -> -1
Approach
Two passes with a Counter. First pass builds the frequency map; second pass walks left-to-right and returns the index of the first character whose count is 1.
Time: O(n). Space: O(k) for the alphabet.
Solution
from collections import Counter
def first_uniq_char(s: str) -> int:
counts = Counter(s)
for i, ch in enumerate(s):
if counts[ch] == 1:
return i
return -1
print(first_uniq_char("leetcode")) # 0
print(first_uniq_char("loveleetcode")) # 2
print(first_uniq_char("aabb")) # -1
7. Intersection of Two Arrays
Problem. Given two integer arrays nums1 and nums2, return their intersection. Each element in the result must be unique, in any order.
Examples.
nums1 = [1, 2, 2, 1], nums2 = [2, 2] -> [2]
nums1 = [4, 9, 5], nums2 = [9, 4, 9, 8, 4] -> [9, 4] (or [4, 9])
Approach
Convert both arrays to sets and intersect. Average O(n + m) time.
Time: O(n + m). Space: O(min(n, m)) for the smaller set.
Solution
def intersection(nums1: list[int], nums2: list[int]) -> list[int]:
return list(set(nums1) & set(nums2))
print(intersection([1, 2, 2, 1], [2, 2])) # [2]
print(intersection([4, 9, 5], [9, 4, 9, 8, 4])) # [9, 4] (order may vary)
If you want explicit control over which array drives the iteration (helpful when one is much larger):
def intersection_explicit(nums1, nums2):
small, large = (nums1, nums2) if len(nums1) < len(nums2) else (nums2, nums1)
seen = set(small)
result = set()
for x in large:
if x in seen:
result.add(x)
return list(result)
This is a useful optimization when one array fits in cache and the other does not.
8. Top K Frequent Elements
Problem. Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order. Assume k is at most the number of distinct elements.
Examples.
nums = [1, 1, 1, 2, 2, 3], k = 2 -> [1, 2]
nums = [1], k = 1 -> [1]
Approach
Three solid options:
- Counter + sort — O(n + u log u), where u is the number of distinct elements.
- Counter + heap — O(n + u log k), better when k is small.
- Bucket sort by frequency — O(n) worst case.
We’ll show options 2 and 3.
Solution: heap
from collections import Counter
import heapq
def top_k_frequent_heap(nums: list[int], k: int) -> list[int]:
counts = Counter(nums)
return [val for val, _ in heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])]
print(top_k_frequent_heap([1, 1, 1, 2, 2, 3], 2)) # [1, 2]
heapq.nlargest maintains a heap of size k while iterating, hence the O(u log k) factor.
Solution: bucket sort
Because the highest possible frequency is len(nums), we can index a list by frequency directly.
from collections import Counter
def top_k_frequent_bucket(nums: list[int], k: int) -> list[int]:
counts = Counter(nums)
buckets = [[] for _ in range(len(nums) + 1)]
for val, freq in counts.items():
buckets[freq].append(val)
result = []
for freq in range(len(buckets) - 1, 0, -1):
for val in buckets[freq]:
result.append(val)
if len(result) == k:
return result
return result
print(top_k_frequent_bucket([1, 1, 1, 2, 2, 3], 2)) # [1, 2]
Time: O(n). Space: O(n).
Bucket sort wins when k is close to the number of distinct elements; the heap wins when k is small relative to u. Either is acceptable in interviews.
Try it yourself. Pick three problems above and re-solve them from scratch. Then for each, write a one-sentence explanation of why the hash map turns brute-force O(n²) into O(n). The act of stating it out loud cements the pattern.
Stretch problem. Modify Two Sum to return all unique pairs that sum to target, with no duplicate pairs even if a value appears multiple times in the input. Hint: a Counter and careful iteration over distinct values.
Pattern recap
| Problem | Pattern |
|---|---|
| Two Sum | Complement lookup with a dict |
| Contains Duplicate | Membership via set |
| Group Anagrams | Canonical key (sorted string or counts) |
| Subarray Sum Equals K | Prefix-sum frequency map |
| Longest Consecutive Sequence | Set membership + start-of-run check |
| First Unique Character | Two-pass Counter |
| Intersection of Two Arrays | Set intersection |
| Top K Frequent Elements | Counter + heap or bucket sort |
The prefix-sum-plus-hashmap pattern and the start-of-run trick are the two ideas you’ll reuse most often beyond these exact problems.
Next steps
You now have a working toolkit for strings and hashing — the two patterns that show up in the largest share of interview questions. The natural next topic is linked lists: how their pointer-based structure changes which problems are easy and which are hard.
Questions or feedback? Email codeloomdevv@gmail.com.