In August 2015, a team of engineers at Google published a paper with a title so long it barely fits on a conference slide: “The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing.” Its introduction contained a declaration:
We as a field must stop trying to groom unbounded datasets into finite pools of information that eventually become complete.