Java Stream Collectors Deep Dive
Master java.util.stream.Collectors with practical examples covering grouping, partitioning, downstream collectors, and building your own custom collector.
What you'll learn
- How a Collector is structured
- Common collectors: toList, toMap, joining
- Grouping and partitioning with downstream collectors
- Reducing with custom accumulators
- Writing a Collector from scratch
Prerequisites
- Comfort with Java lambdas and streams
What and Why
Stream.collect is the bridge between a lazy pipeline and a concrete result. The Collectors utility class packages dozens of ready-made reductions: collecting to lists and maps, grouping, partitioning, joining strings, summing fields, and more.
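Understanding collectors well lets you replace pages of imperative loops with a few declarative lines. The same pipeline can usually run on a parallel stream unchanged, provided the functions you pass in are stateless and do not modify the source.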
Mental Model
A Collector<T, A, R> has four moving parts: a supplier that creates a mutable accumulator A, an accumulator that folds elements of type T into it, a combiner that merges two accumulators for parallel runs, and a finisher that turns A into the final result R.
supplier() --> A (empty container)
accumulator(A, T) --> mutates A with each element
combiner(A, A) --> A (only used in parallel)
finisher(A) --> R (final shape)
You rarely implement all four. Most of the time you compose pre-built collectors.
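To make the four parts concrete, here is a minimal sketch that builds an averaging collector by hand with Collector.of. The class name FourPartsSketch and the method name averaging are made up for illustration; in real code the built-in Collectors.averagingInt already covers this case.

```java
import java.util.List;
import java.util.stream.Collector;

class FourPartsSketch {
    // Hypothetical hand-rolled collector: T = Integer, A = long[]{sum, count}, R = Double.
    static Collector<Integer, long[], Double> averaging() {
        return Collector.of(
            () -> new long[2],                                    // supplier: empty container
            (acc, n) -> { acc[0] += n; acc[1]++; },               // accumulator: fold one element in
            (l, r) -> { l[0] += r[0]; l[1] += r[1]; return l; },  // combiner: merge two partial results
            acc -> acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1]   // finisher: A -> R (0.0 for an empty stream)
        );
    }

    public static void main(String[] args) {
        System.out.println(List.of(2, 4, 9).stream().collect(averaging())); // 5.0
    }
}
```

Because the combiner simply adds the two partial sums and counts, the same collector gives the same answer on a parallel stream.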
Hands-on Example
Imagine processing a list of orders.
import java.util.*;
import java.util.stream.*;
import static java.util.stream.Collectors.*;

record Order(String customer, String category, double amount) {}

public class CollectorsDemo {
    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("ada", "books", 12.0),
            new Order("ada", "books", 8.5),
            new Order("bob", "toys", 30.0),
            new Order("bob", "books", 15.0),
            new Order("cleo", "toys", 50.0)
        );

        // Group totals per customer
        Map<String, Double> totalByCustomer = orders.stream()
            .collect(groupingBy(Order::customer, summingDouble(Order::amount)));

        // Count orders per category
        Map<String, Long> countByCategory = orders.stream()
            .collect(groupingBy(Order::category, counting()));

        // Partition by big spenders
        Map<Boolean, List<Order>> big = orders.stream()
            .collect(partitioningBy(o -> o.amount() > 20));

        // Nested grouping
        Map<String, Map<String, Double>> nested = orders.stream()
            .collect(groupingBy(Order::customer,
                groupingBy(Order::category,
                    summingDouble(Order::amount))));

        // Join a column
        String customers = orders.stream()
            .map(Order::customer).distinct()
            .collect(joining(", ", "[", "]"));

        System.out.println(totalByCustomer);
        System.out.println(nested);
        System.out.println(customers);
    }
}
The most powerful pattern is the downstream collector passed to groupingBy. Anywhere you would otherwise call groupingBy(...) and then post-process the values, you can fuse the work into a single pass.
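The difference between post-processing and fusing can be sketched side by side. The class and method names below (DownstreamSketch, twoStep, fused) are illustrative only, and the Order record is repeated so the block compiles on its own.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DownstreamSketch {
    record Order(String customer, String category, double amount) {}

    // Two steps: group into lists, then walk the map again to sum each list.
    static Map<String, Double> twoStep(List<Order> orders) {
        Map<String, List<Order>> grouped = orders.stream()
            .collect(Collectors.groupingBy(Order::customer));
        return grouped.entrySet().stream()
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().stream().mapToDouble(Order::amount).sum()));
    }

    // One pass: the downstream collector sums while grouping, with no intermediate lists.
    static Map<String, Double> fused(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(Order::customer,
                Collectors.summingDouble(Order::amount)));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("ada", "books", 12.0),
            new Order("ada", "books", 8.5),
            new Order("bob", "toys", 30.0));
        // Both approaches produce the same totals: ada=20.5, bob=30.0
        System.out.println(twoStep(orders).equals(fused(orders)));
    }
}
```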
Common Pitfalls
- toMap and duplicate keys: the two-argument form throws IllegalStateException on collision. Use the three-argument form with a merge function: toMap(k, v, (a, b) -> a).
- Null values in toMap: toMap throws NullPointerException when the value mapper returns null. Wrap values in Optional or filter nulls first.
- Mutability assumptions: Collectors.toList() historically returned an ArrayList, but the contract only says “some List”. Use toCollection(ArrayList::new) if you really need a specific type, or toUnmodifiableList() if you want immutability.
- Parallel without an associative reduction: collectors run safely in parallel only when the combine step really merges two partial accumulators. Custom collectors must respect this.
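The duplicate-key pitfall is easy to reproduce. The sketch below uses made-up names (ToMapSketch, strict, merged) and picks Double::sum as the merge function purely as an example; any merge policy that suits your data would do.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapSketch {
    record Order(String customer, String category, double amount) {}

    // Two-argument toMap: throws IllegalStateException if two orders share a customer.
    static Map<String, Double> strict(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.toMap(Order::customer, Order::amount));
    }

    // Three-argument toMap: the merge function decides what happens on collision (here: add).
    static Map<String, Double> merged(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.toMap(Order::customer, Order::amount, Double::sum));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("ada", "books", 12.0),
            new Order("ada", "books", 8.5),
            new Order("bob", "toys", 30.0));
        try {
            strict(orders);
        } catch (IllegalStateException e) {
            System.out.println("duplicate key rejected");
        }
        System.out.println(merged(orders));
    }
}
```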
Practical Tips
Use mapping to flatten transformations into downstream collectors, for example groupingBy(Order::customer, mapping(Order::category, toSet())) to get each customer’s unique categories.
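As a runnable version of that mapping example, under the assumption that the Order record looks like the one used earlier (MappingSketch and categoriesByCustomer are illustrative names):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class MappingSketch {
    record Order(String customer, String category, double amount) {}

    // mapping() transforms each Order to its category before the downstream toSet() sees it.
    static Map<String, Set<String>> categoriesByCustomer(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(Order::customer,
                Collectors.mapping(Order::category, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("ada", "books", 12.0),
            new Order("ada", "books", 8.5),
            new Order("bob", "toys", 30.0),
            new Order("bob", "books", 15.0),
            new Order("cleo", "toys", 50.0));
        System.out.println(categoriesByCustomer(orders));
    }
}
```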
Reach for teeing (Java 12+) when you need to compute two things in one pass, like average and max together, and combine them at the end.
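A small teeing sketch, assuming Java 16+ for records. The Stats record and the names TeeingSketch and stats are hypothetical; the two collectors are held in typed locals only to keep type inference easy to follow.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class TeeingSketch {
    record Order(String customer, String category, double amount) {}
    record Stats(double average, double max) {}   // hypothetical result holder

    static Stats stats(List<Order> orders) {
        Collector<Order, ?, Double> average = Collectors.averagingDouble(Order::amount);
        Collector<Order, ?, Optional<Order>> largest =
            Collectors.maxBy(Comparator.comparingDouble(Order::amount));
        // teeing feeds every element to both collectors, then merges the two results.
        return orders.stream().collect(Collectors.teeing(
            average,
            largest,
            (avg, top) -> new Stats(avg, top.map(Order::amount).orElse(Double.NaN))));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("ada", "books", 12.0),
            new Order("bob", "toys", 30.0),
            new Order("bob", "books", 15.0));
        System.out.println(stats(orders)); // Stats[average=19.0, max=30.0]
    }
}
```

For an empty list this sketch reports an average of 0.0 and a max of NaN; a real implementation might prefer an Optional result instead.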
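When the built-ins don’t fit, implement Collector directly with Collector.of(supplier, accumulator, combiner, finisher); the finisher can be omitted when the accumulator is already the result type. Here is a tiny custom collector that returns the first and last elements seen. An empty stream yields an empty list, and a single-element stream yields that element twice.
// Invariant: the accumulator is either empty or exactly [first, last].
Collector<Order, ?, List<Order>> firstAndLast = Collector.of(
    () -> new ArrayList<Order>(2),
    (acc, o) -> {
        if (acc.isEmpty()) { acc.add(o); acc.add(o); }  // first element is also the last so far
        else acc.set(1, o);                             // later elements replace "last"
    },
    (a, b) -> {
        if (a.isEmpty()) return b;                      // left chunk saw no elements
        if (!b.isEmpty()) a.set(1, b.get(1));           // right chunk's last is the overall last
        return a;
    }
);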
Prefer toUnmodifiableList and toUnmodifiableMap when handing results across module boundaries. They guard against accidental writes.
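A brief sketch of what that guard looks like in practice, assuming Java 10+ (UnmodifiableSketch and customerNames are illustrative names):

```java
import java.util.List;
import java.util.stream.Collectors;

class UnmodifiableSketch {
    // Returns a de-duplicated list that callers cannot modify.
    static List<String> customerNames(List<String> raw) {
        return raw.stream()
            .distinct()
            .collect(Collectors.toUnmodifiableList());
    }

    public static void main(String[] args) {
        List<String> names = customerNames(List.of("ada", "bob", "ada"));
        try {
            names.add("cleo");
        } catch (UnsupportedOperationException e) {
            System.out.println("write rejected");
        }
        System.out.println(names); // [ada, bob]
    }
}
```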
Wrap-up
Collectors are tiny composable strategies. Once you internalize the supplier/accumulator/combiner/finisher shape, you can read the Collectors Javadoc as a recipe book and confidently compose new behaviors. The result is concise pipelines that scale from single-threaded experiments to parallel batch jobs.