Java SE 8 Programmer II Exam Series: Advanced Stream Pipeline Concepts

5 min readJun 10, 2020

Introduction

In this post, we will cover advanced stream pipeline concepts in Java 8. We will analyze chaining Optionals, predefined collectors, and learn how to use groupingBy, partitioningBy, and mapping methods. You can find the source code on GitHub.

Chaining Optionals

Chaining optionals help eliminate the nested if-statements for Optional variables. The function below prints the virus name if it contains Covid-19 without chaining optional. Hopefully, it won’t print 🙂

private static void printVirus(Optional<String> optional) { 
  if (optional.isPresent()) { 
     String virus = optional.get(); 
        if (virus.contains("Covid-19")) 
           System.out.println(virus); 
  } 
}

Yet, this is a code smell. If you add more nested if-statements, you will end up with the Arrow Anti Pattern. The following version is more concise and descriptive. If you need to have conditions, you can add a new filter.

private static void printVirusUsingChaining(Optional<String> optional) { 
   optional.filter(s -> s.contains("Covid-19")
           .ifPresent(System.out::println);                                    
}

Grouping results

In this section, we will have a look at the predefined collectors, and learn how to use groupingBy, partitioningBy, and mapping methods.

Basic operations

Many of the collectors behave in the same way. All we need to do is to pass the collect(Collector<? super T,A,R> collector) method accumulating the elements of a stream into a final result. Let’s look at some of the collectors.

Collectors.joining

Concatenate the string elements, separated by the comma.

Stream<String> streamJoin = Stream.of("the", "new", "normal"); String resultJoin = streamJoin.collect(Collectors.joining(", ")); System.out.println(resultJoin); // the, new, normal

Collectors.averagingInt

Produce the arithmetic mean of the string elements

Stream<String> streamAverage = Stream.of("the", "new", "normal"); Double resultAverage = streamAverage.collect(Collectors.averagingInt(String::length)); System.out.println(resultAverage); // 4.0

Collectors.toCollection

Accumulate a stream into other collections. Sometimes you want to have more control on the return type.

Stream<String> streamCollection = Stream.of("the", "new", "normal"); Set<String> result = streamCollection
  .filter(s -> s.startsWith("n"))
  .collect(Collectors.toCollection(HashSet::new)); System.out.println(result); // [new, normal]

Collecting into maps

There are three overloaded functions for Collectors.toMap(). We will look at them in detail using the Coronavirus example. Let’s create a CoronavirusCase class.

public class CoronavirusCase { 
   private String country;
   private long numberOfCases;
   public CoronavirusCase(String country, long numberOfCases) { 
      this.country = country; 
      this.numberOfCases = numberOfCases; 
   } 
   //getters and setters
}

We are going to transform the list of coronavirus cases into a map using a Collectors.toMap() methods.

List<CoronavirusCase> cases = new ArrayList<>(); cases.add(new CoronavirusCase("TURKEY", 170000)); 
cases.add(new CoronavirusCase("SPAIN", 242000)); 
cases.add(new CoronavirusCase("SWEEDEN", 287000)); 
cases.add(new CoronavirusCase("ITALY", 235000)); 
cases.add(new CoronavirusCase("USA", 287000)); 
cases.add(new CoronavirusCase("UK", 287000)); // 1. Create a map using toMap(keyMapper, valueMapper) 
// 2. Create a map using toMap(keyMapper, valueMapper, mergeFunction) 
// 3. Create a tree map using toMap(keyMapper, valueMapper, mergeFunction, mapSupplier)

Collectors.toMap(keyMapper, valueMapper)

toMap(keyMapper, valueMapper) takes mapping functions to produce keys and values, then returns a Collector which collects elements into a Map. In this example, we choose the country as key and the number of cases as value.

Map<String, Long> map = cases.stream().collect(Collectors.toMap(CoronavirusCase::getCountry, CoronavirusCase::getNumberOfCases)); System.out.println(map);//{USA=287000, TURKEY=170000, UK=287000, ITALY=235000, SWEEDEN=287000, SPAIN=242000}

What if we choose the number of cases as a key?

Map<Long, String> mapDuplicated = 
cases.stream().collect(Collectors.toMap(CoronavirusCase::getNumberOfCases, CoronavirusCase::getCountry)); System.out.println(mapDuplicated); ////throws Exception in thread "main" java.lang.IllegalStateException: Duplicate key SWEEDEN

We would get the java.lang.IllegalStateException as the number of cases is equal for SWEEDEN, USA, and ITALY.

Collectors.toMap(keyMapper, valueMapper, mergeFunction)

toMap(keyMapper, valueMapper, mergeFunction) function deals with these collisions problems. It takes an additional parameter called a merge function. If the keys have duplicate values, the merge function is applied. The example below produces a map by mapping cases to a concatenated list of countries:

BinaryOperator<String> mergeFunction = (case1, case2) -> case1 + "-" + case2; 
Map<Long, String> map2 = cases.stream().collect(Collectors.toMap(CoronavirusCase::getNumberOfCases, CoronavirusCase::getCountry, mergeFunction)); System.out.println(map2); // {170000=TURKEY, 242000=SPAIN, 235000=ITALY, 287000=SWEEDEN-USA-UK}

Collectors.toMap(keyMapper, valueMapper, mergeFunction, mapSupplier)

If we wanted to sort the map, we would use the toMap(keyMapper, valueMapper, mergeFunction, mapSupplier) function. It takes mapping functions to produce keys and values, a merge function, and a supplier. The sorted map is created by a provided supplier TreeMap.

BinaryOperator<String> mergeFunction = (case1, case2) -> case1 + "-" + case2; TreeMap<Long, String> treeMap = 
cases
 .stream()
 .collect(Collectors.toMap(CoronavirusCase::getNumberOfCases,                                    CoronavirusCase::getCountry, mergeFunction, TreeMap::new)
);System.out.println(treeMap); 
//{170000=TURKEY, 235000=ITALY, 242000=SPAIN, 287000=SWEEDEN-USA-UK}System.out.println(treeMap.getClass());
//class java.util.TreeMap

Grouping, Partitioning, and Mapping

Sometimes we might need to do more operations that are not covered in previous sections. In that case, grouping, partitioning, and mapping are beneficial.

GroupingBy

groupingBy( Function classifier) groups the elements according to a classification function. We grouped the coronavirus cases with respect to the number of cases.

Map<Long, List<CoronavirusCase>> groupingBy = 
cases
 .stream()
 .collect(Collectors.groupingBy(CoronavirusCase::getNumberOfCases));

groupingBy( Function classifier, Collector downstream) groups the elements using a classifier and performs a reduction operation with a downstream collector. We changed the value of the map from List to Set using the downstream collector, Collectors.toSet().

Map<Long, Set<CoronavirusCase>> groupingBy = 
cases
 .stream()
 .collect(Collectors.groupingBy(CoronavirusCase::getNumberOfCases, Collectors.toSet()));

groupingBy( Function classifier, Supplier mapFactory, Collector downstream) groups the elements using a classifier and performs a reduction operation with a collector, and applies the supplier to change the return type. We returned TreeMap instead of Map using TreeMap::new supplier.

TreeMap<Long, List<CoronavirusCase>> groupingBy = 
cases
 .stream()
 .collect(Collectors.groupingBy(CoronavirusCase::getNumberOfCases, TreeMap::new, Collectors.toList()));

PartitioningBy

With partitioning, we split the elements into two groups — true and false.

partitioningBy( Predicate predicate) splits the input elements using a predicate. We partitioned the coronavirus cases into two groups — cases that are less than or equal to or greater than 200000.

Map<Boolean, List<CoronavirusCase>> partitioningBy = 
cases
 .stream()
 .collect(Collectors.partitioningBy(s -> s.getNumberOfCases() <= 200000));

partitioningBy( Predicate predicate, Collector downstream) splits the input elements using a predicate and performs reduction. We partitioned the input stream with respect to the number of cases. We also changed the value of the map from List to Set using the downstream collector, Collectors.toSet().

Map<Boolean, Set<CoronavirusCase>> partitioningBy = 
cases
 .stream()
 .collect(Collectors.partitioningBy(s -> s.getNumberOfCases() <= 12000, Collectors.toSet()));

Mapping

According to Java Doc, the mapping() collectors are most useful when used in a multi-level reduction, such as a groupingBy or partitioning. We can generalize the mapping formula as follows:

Given a stream of S, accumulate X of Y for/in each Z:

For example, given a stream of CoronavirusCase, accumulate the set of countries names for each case:

Map<Long, Set<String>> countryByCases = 
cases
 .stream()
 .collect(Collectors.groupingBy(CoronavirusCase::getNumberOfCases, Collectors.mapping(CoronavirusCase::getCountry, Collectors.toSet()))); System.out.println(countryByCases);
// {170000=[TURKEY], 242000=[SPAIN], 235000=[ITALY], 287000=[USA, UK, SWEEDEN]}

Likewise, given a stream of CoronavirusCase, accumulate the set of cases for each country:

Map<String, Set<Long>> casesByCountry = 
cases
 .stream()
 .collect(Collectors.groupingBy(CoronavirusCase::getCountry, Collectors.mapping(CoronavirusCase::getNumberOfCases,Collectors.toSet())));System.out.println(casesByCountry); 
//{USA=[287000], TURKEY=[170000], UK=[287000], ITALY=[235000], SWEEDEN=[287000], SPAIN=[242000]}

Summary

In this section, we covered the advanced stream pipeline concepts in Java 8. We learned how to chain Optionals, use basic collectors, and studied groupingBy, partitioningBy, and mapping methods. You can find the source code on GitHub.

Originally published at http://suleymanyildirim.org on June 10, 2020.