Skip to content
C Codeloom
DSA

Hashing: 8 Practice Problems with Solutions

Eight classic hash map problems with worked Python solutions — Two Sum, Group Anagrams, Subarray Sum Equals K, Longest Consecutive Sequence, Top K Frequent Elements, and more.

·10 min read · By Yash Kesharwani
Intermediate 22 min read

What you'll learn

  • How to convert O(n^2) brute force into O(n) with a hash map
  • The prefix-sum + hash map pattern for subarray problems
  • How to find longest consecutive runs in unsorted input in O(n)
  • How to compute top-k frequencies with a heap or bucket sort
  • When a set is enough and when you need a map

Prerequisites

This post drills the eight most common hash-map interview problems. Each entry has a problem statement, examples, an approach with Big-O, and a clean Python solution. Try to write each solution yourself before scrolling.

1. Two Sum

Problem. Given an integer array nums and a target integer target, return the indices [i, j] (with i != j) such that nums[i] + nums[j] == target. You may assume exactly one solution exists.

Examples.

nums = [2, 7, 11, 15], target = 9    -> [0, 1]
nums = [3, 2, 4],      target = 6    -> [1, 2]
nums = [3, 3],         target = 6    -> [0, 1]

Approach

Brute force is O(n²): try every pair. The hash-map trick: as you walk the array, for each nums[i] compute complement = target - nums[i] and check if it’s already in a dictionary mapping value to index. If yes, return both indices; otherwise record nums[i] -> i.

Time: O(n). Space: O(n).

Solution

def two_sum(nums: list[int], target: int) -> list[int]:
    seen = {}
    for i, x in enumerate(nums):
        complement = target - x
        if complement in seen:
            return [seen[complement], i]
        seen[x] = i
    return []

print(two_sum([2, 7, 11, 15], 9))   # [0, 1]
print(two_sum([3, 2, 4], 6))        # [1, 2]
print(two_sum([3, 3], 6))           # [0, 1]

The order matters — check the complement before recording nums[i], so the same element is not used twice.

2. Contains Duplicate

Problem. Given an integer array nums, return True if any value appears at least twice.

Examples.

[1, 2, 3, 1]   -> True
[1, 2, 3, 4]   -> False

Approach

Walk the array; record each value in a set; return True the moment you see one twice. Compared to the one-liner len(set(nums)) != len(nums), the loop short-circuits and uses less memory on inputs with an early duplicate.

Time: O(n). Space: O(n).

Solution

def contains_duplicate(nums: list[int]) -> bool:
    seen = set()
    for x in nums:
        if x in seen:
            return True
        seen.add(x)
    return False

print(contains_duplicate([1, 2, 3, 1]))    # True
print(contains_duplicate([1, 2, 3, 4]))    # False

One-liner equivalent:

def contains_duplicate(nums):
    return len(set(nums)) != len(nums)

3. Group Anagrams

Problem. Group a list of strings such that anagrams are together. Order of groups and order within groups does not matter.

Examples.

["eat","tea","tan","ate","nat","bat"]
-> [["eat","tea","ate"], ["tan","nat"], ["bat"]]

Approach

Use the canonical form of each word as a hash key. Two natural choices:

  1. Sorted stringsorted(word) as a tuple. O(m log m) per word.
  2. Letter counts — a 26-tuple of counts. O(m) per word.

Time: O(n · m log m) with sorted keys, O(n · m) with count keys. Space: O(n · m).

Solution

from collections import defaultdict

def group_anagrams(strs: list[str]) -> list[list[str]]:
    groups = defaultdict(list)
    for word in strs:
        key = tuple(sorted(word))
        groups[key].append(word)
    return list(groups.values())

print(group_anagrams(["eat","tea","tan","ate","nat","bat"]))
# [['eat','tea','ate'], ['tan','nat'], ['bat']]

The count-key variant for fixed lowercase ASCII:

def group_anagrams_counts(strs):
    groups = defaultdict(list)
    for word in strs:
        counts = [0] * 26
        for ch in word:
            counts[ord(ch) - ord('a')] += 1
        groups[tuple(counts)].append(word)
    return list(groups.values())

4. Subarray Sum Equals K

Problem. Given an integer array nums and an integer k, return the number of contiguous subarrays whose sum equals k. Values may be negative.

Examples.

nums = [1, 1, 1], k = 2     -> 2     ([1,1] starting at 0 and at 1)
nums = [1, 2, 3], k = 3     -> 2     ([1,2] and [3])
nums = [3, 4, 7, -2, 2], k = 7 -> 4

Approach

The prefix-sum trick. Let prefix[i] be the sum of nums[0..i-1]. A subarray nums[l..r] sums to k iff prefix[r+1] - prefix[l] == k. So for each running prefix p, the number of earlier prefixes equal to p - k is the count of subarrays ending here that sum to k.

Use a hash map from prefix value to how many times it has occurred. Seed it with {0: 1} to count subarrays that start at index 0.

Time: O(n). Space: O(n).

Solution

def subarray_sum(nums: list[int], k: int) -> int:
    counts = {0: 1}
    prefix = 0
    answer = 0
    for x in nums:
        prefix += x
        answer += counts.get(prefix - k, 0)
        counts[prefix] = counts.get(prefix, 0) + 1
    return answer

print(subarray_sum([1, 1, 1], 2))            # 2
print(subarray_sum([1, 2, 3], 3))            # 2
print(subarray_sum([3, 4, 7, -2, 2], 7))     # 4

This is one of the most reusable patterns in interview problems — memorize it.

5. Longest Consecutive Sequence

Problem. Given an unsorted array of integers, return the length of the longest run of consecutive numbers. The algorithm must be O(n).

Examples.

[100, 4, 200, 1, 3, 2]      -> 4   (the sequence 1, 2, 3, 4)
[0, 3, 7, 2, 5, 8, 4, 6, 0, 1] -> 9   (0..8)

Approach

Sorting solves the problem in O(n log n) — but the question demands O(n). Use a set. For each value, only start counting a run if it is the smallest of its run (i.e. x - 1 is not in the set). From there, walk forward x, x+1, x+2, ... until the chain breaks.

Each value is visited at most twice — once in the outer loop, and once during the walk that started at the run’s minimum. The total work is O(n).

Time: O(n). Space: O(n).

Solution

def longest_consecutive(nums: list[int]) -> int:
    s = set(nums)
    best = 0
    for x in s:
        if x - 1 in s:
            continue                # not the start of a run
        length = 1
        while x + length in s:
            length += 1
        best = max(best, length)
    return best

print(longest_consecutive([100, 4, 200, 1, 3, 2]))         # 4
print(longest_consecutive([0, 3, 7, 2, 5, 8, 4, 6, 0, 1])) # 9

The if x - 1 in s: continue check is what makes the algorithm linear. Without it, you’d re-walk the same run from every position.

6. First Unique Character in a String

Problem. Return the index of the first non-repeating character in a string, or -1 if none exists.

Examples.

"leetcode"       -> 0
"loveleetcode"   -> 2
"aabb"           -> -1

Approach

Two passes with a Counter. First pass builds the frequency map; second pass walks left-to-right and returns the index of the first character whose count is 1.

Time: O(n). Space: O(k) for the alphabet.

Solution

from collections import Counter

def first_uniq_char(s: str) -> int:
    counts = Counter(s)
    for i, ch in enumerate(s):
        if counts[ch] == 1:
            return i
    return -1

print(first_uniq_char("leetcode"))       # 0
print(first_uniq_char("loveleetcode"))   # 2
print(first_uniq_char("aabb"))           # -1

7. Intersection of Two Arrays

Problem. Given two integer arrays nums1 and nums2, return their intersection. Each element in the result must be unique, in any order.

Examples.

nums1 = [1, 2, 2, 1], nums2 = [2, 2]      -> [2]
nums1 = [4, 9, 5],    nums2 = [9, 4, 9, 8, 4] -> [9, 4]  (or [4, 9])

Approach

Convert both arrays to sets and intersect. Average O(n + m) time.

Time: O(n + m). Space: O(min(n, m)) for the smaller set.

Solution

def intersection(nums1: list[int], nums2: list[int]) -> list[int]:
    return list(set(nums1) & set(nums2))

print(intersection([1, 2, 2, 1], [2, 2]))            # [2]
print(intersection([4, 9, 5], [9, 4, 9, 8, 4]))      # [9, 4] (order may vary)

If you want explicit control over which array drives the iteration (helpful when one is much larger):

def intersection_explicit(nums1, nums2):
    small, large = (nums1, nums2) if len(nums1) < len(nums2) else (nums2, nums1)
    seen = set(small)
    result = set()
    for x in large:
        if x in seen:
            result.add(x)
    return list(result)

This is a useful optimization when one array fits in cache and the other does not.

8. Top K Frequent Elements

Problem. Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order. Assume k is at most the number of distinct elements.

Examples.

nums = [1, 1, 1, 2, 2, 3], k = 2     -> [1, 2]
nums = [1], k = 1                    -> [1]

Approach

Three solid options:

  1. Counter + sort — O(n + u log u), where u is the number of distinct elements.
  2. Counter + heap — O(n + u log k), better when k is small.
  3. Bucket sort by frequency — O(n) worst case.

We’ll show options 2 and 3.

Solution: heap

from collections import Counter
import heapq

def top_k_frequent_heap(nums: list[int], k: int) -> list[int]:
    counts = Counter(nums)
    return [val for val, _ in heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])]

print(top_k_frequent_heap([1, 1, 1, 2, 2, 3], 2))    # [1, 2]

heapq.nlargest maintains a heap of size k while iterating, hence the O(u log k) factor.

Solution: bucket sort

Because the highest possible frequency is len(nums), we can index a list by frequency directly.

from collections import Counter

def top_k_frequent_bucket(nums: list[int], k: int) -> list[int]:
    counts = Counter(nums)
    buckets = [[] for _ in range(len(nums) + 1)]
    for val, freq in counts.items():
        buckets[freq].append(val)

    result = []
    for freq in range(len(buckets) - 1, 0, -1):
        for val in buckets[freq]:
            result.append(val)
            if len(result) == k:
                return result
    return result

print(top_k_frequent_bucket([1, 1, 1, 2, 2, 3], 2))   # [1, 2]

Time: O(n). Space: O(n).

Bucket sort wins when k is close to the number of distinct elements; the heap wins when k is small relative to u. Either is acceptable in interviews.

Try it yourself. Pick three problems above and re-solve them from scratch. Then for each, write a one-sentence explanation of why the hash map turns brute-force O(n²) into O(n). The act of stating it out loud cements the pattern.

Stretch problem. Modify Two Sum to return all unique pairs that sum to target, with no duplicate pairs even if a value appears multiple times in the input. Hint: a Counter and careful iteration over distinct values.

Pattern recap

ProblemPattern
Two SumComplement lookup with a dict
Contains DuplicateMembership via set
Group AnagramsCanonical key (sorted string or counts)
Subarray Sum Equals KPrefix-sum frequency map
Longest Consecutive SequenceSet membership + start-of-run check
First Unique CharacterTwo-pass Counter
Intersection of Two ArraysSet intersection
Top K Frequent ElementsCounter + heap or bucket sort

The prefix-sum-plus-hashmap pattern and the start-of-run trick are the two ideas you’ll reuse most often beyond these exact problems.

Next steps

You now have a working toolkit for strings and hashing — the two patterns that show up in the largest share of interview questions. The natural next topic is linked lists: how their pointer-based structure changes which problems are easy and which are hard.

Questions or feedback? Email codeloomdevv@gmail.com.