Part 3: Bulk data operations for Java collections
The goal of bulk data operations is to provide new data-processing features that use lambda functions, including parallel operations. The parallel implementation is the central element of this feature. It builds upon the Fork/Join framework from the java.util.concurrent package, introduced in Java 7.
Bulk operations – what’s in it?
As the original change spec says, the purpose of bulk operations is to:
Add functionality to the Java Collections Framework for bulk operations upon data. This is commonly referenced as “filter/map/reduce for Java.” The bulk data operations include both serial (on the calling thread) and parallel (using many threads) versions of the operations. Operations upon data are generally expressed as lambda functions.
With the addition of lambdas to the Java language and the new API for collections, we will be able to use the parallel features of the underlying platform with far less hand-written threading code.
The new java.util.stream package has been added to the JDK. It allows us to perform filter/map/reduce-like operations on collections in Java 8.
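To make the filter/map/reduce idea concrete, here is a minimal sketch. The class name, method name, and sample numbers are made up for illustration and do not come from the change spec; only the stream(), filter(), map() and reduce() calls are part of the Java 8 API.

```java
import java.util.Arrays;
import java.util.List;

class FilterMapReduceSketch {

    // Hypothetical helper: sums the squares of the even numbers in the list.
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // filter: keep only the even values
                .map(n -> n * n)           // map: square each remaining value
                .reduce(0, Integer::sum);  // reduce: add the squares together
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvenSquares(numbers)); // 4 + 16 + 36 = 56
    }
}
```

Each step is passed as a lambda or method reference, which is what the spec means by operations "expressed as lambda functions".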
The Stream API allows us to declare either sequential or parallel operations over a stream of data:
List<Person> persons = ..
// sequential version
Stream<Person> stream = persons.stream();
// parallel version
Stream<Person> parallelStream = persons.parallelStream();
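A fuller version of this snippet might look like the sketch below. The article does not show the Person type, so the class here, with its name and age fields, is an assumption made only to keep the example runnable. The same pipeline is run once sequentially and once in parallel.

```java
import java.util.Arrays;
import java.util.List;

class ParallelStreamSketch {

    // Hypothetical Person type; the article does not define it.
    static class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        int getAge() {
            return age;
        }
    }

    // Runs on the calling thread.
    static long countAdultsSequential(List<Person> persons) {
        return persons.stream()
                .filter(p -> p.getAge() >= 18)
                .count();
    }

    // Same pipeline, but the work may be split across several threads.
    static long countAdultsParallel(List<Person> persons) {
        return persons.parallelStream()
                .filter(p -> p.getAge() >= 18)
                .count();
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Ann", 34), new Person("Bob", 12), new Person("Cid", 18));
        System.out.println(countAdultsSequential(persons)); // 2
        System.out.println(countAdultsParallel(persons));   // 2
    }
}
```

Only the call that creates the stream differs between the two methods; the filter and count steps are identical. For a list this small the parallel version is unlikely to be faster, so the sketch shows the shape of the API, not a performance claim.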
A stream is something like an iterator: it can be traversed only once, and then it is used up. Streams are also "lazy": elements are computed only when a terminal operation asks for them. This is why a stream may even be infinite, and why we do not necessarily know in advance how many elements we will have to process.
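Both properties can be observed in a small sketch. The class and method names are illustrative only; Stream.iterate, limit, and the IllegalStateException thrown when a stream is reused are standard Java 8 behavior.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamSketch {

    // Takes the first `count` powers of two from an unbounded stream.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2)   // conceptually infinite: 1, 2, 4, 8, ...
                .limit(count)                  // laziness: nothing beyond `count` elements is needed
                .collect(Collectors.toList());
    }

    // Returns true if traversing the same stream a second time is rejected.
    static boolean secondTraversalFails() {
        Stream<String> letters = Stream.of("a", "b", "c");
        letters.forEach(s -> { });             // first traversal uses the stream up
        try {
            letters.forEach(s -> { });         // second traversal of the same instance
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5));    // [1, 2, 4, 8, 16]
        System.out.println(secondTraversalFails()); // true
    }
}
```

To traverse the data again, request a fresh stream from the source instead of holding on to the old one.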
The java.util.stream.Stream interface serves as a gateway to the bulk data operations. Once we have a reference to a stream instance, we can perform the interesting tasks on the collection's data.
One important thing to notice about the Stream API is that the source data is not mutated during the operations. This is because the source might not exist as a stored collection at all (for example, when elements are generated on demand), or the initial data might be required somewhere else in the application code.
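A short sketch shows this in practice, assuming a plain list of strings as the source. The class and method names are hypothetical. The map step produces new values that are collected into a new list, and the original list is left as it was.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SourceUnchangedSketch {

    // Returns a new list with upper-cased copies; `source` itself is not modified.
    static List<String> upperCased(List<String> source) {
        return source.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob");
        List<String> result = upperCased(names);
        System.out.println(result); // [ANN, BOB]
        System.out.println(names);  // [ann, bob]
    }
}
```

This holds as long as the lambdas passed to the pipeline do not themselves modify the source or its elements.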