No.136 Functional Stream Libraries and Fusion: What's Next?

Icon

NII Shonan Meeting Seminar 136

Transportation and Local Information

If you want to announce your arrival/departure flight and time and arrange to share transportation to/from Shonan together, please make a comment to this post. Comments are open.

Access to the Shonan Village

See Shonan Village Home Page

Overview

Stream processing defines a pipeline of operators that transform, combine, or reduce (even to a single scalar) large amounts of data. Characteristically, data is accessed strictly linearly rather than randomly and repeatedly—and processed uniformly. The upside of the limited expressiveness is the opportunity to process large amount of data efficiently, in constant and small space.

Stream processing underlies one of the first programming languages (COBOL), and it has been coming to forefront again, with the popularity of big data and MapReduce. The widespread automated trading is one example of extremely high-performant stream processing; software-defined radio is another such example. Widely-used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell, for small and big data. Streams also motivate a vast range of topics in computer science such as dependent types, termination and partial evaluation. Streams stress virtual machines and CPU memory access paths, and pose challenges for compiler writers and hardware designers.

Functional stream libraries let us easily build stream processing pipelines, by composing sequences of simple transformers such as map or filter with producers (backed by an array, a file, or a generating function) and consumers (reducers). The purely applicative approach of building a complex pipeline from simple immutable pieces simplifies programming and reasoning: the assembled pipeline is an executable specification. To be practical, however, a library has to be efficient: at the very least, it should avoid creating intermediate structures — especially structures like files and lists whose size grows with the length of the stream. Even the bounded-size intermediate structures significantly, up to two orders of magnitude, slow down the processing. Eliminating the intermediate structures is the central problem in stream processing: so-called stream fusion.

Stream fusion has been the subject of intensive research since late 1950’s. By now, the low-hanging fruit in stream processing has been all picked up — although some of it though quite recently, POPL 2017. Stream fusion made it to the front page of CACM. Java 8 Streams and Haskell compilers, among others, have implemented some of the earlier research results. We have attained a milestone. What are the further challenges?

We thus propose a discussion-heavy workshop to assess the state of the art and to answer:

  • Can the recent advances in `traditional’ stream fusion be extended to non-traditional domains such as GPGPU, heterogenous architectures (FPGA)? Can the methods of stream fusion be applied to functional reactive programming (FRP) and optimizing database queries?
  • Can we bridge functional and effectful stream libraries?
  • What are the next big challenges? Which real world scenarios require new advances in stream fusion?

Just as the two Shonan meetings (No.2012-4 “Bridging the Theory of Staged Programming Languages and the Practice of High-Performance Computing” and No. 2014-7 “Staging and High-Performance Computing: Theory and Practice”) aimed to solicit and discuss real-world applications of assured code generation in HPC (High-Performance Computing) that would drive programming language research, we likewise aim to compile the case studies requiring further research in stream fusion, and distill them into a benchmark.

To promote mutual understanding, we plan for the workshop to have lots of time for discussion. We will emphasize tutorial, brainstorming and working-group sessions rather than mere conference-like presentations.

Schedule

October 21, 2018 (Sunday)

  • 15:00 — Hotel Check In (early check-in from 12:00 is negotiable if informed in advance)
  • 19:00 — 21:00: Welcome Reception

October 22, 2018 (Monday)

  • 07:30 — 09:00: Breakfast
  • 09:00 — 09:15: Welcome, Overview, Administrative issues
  • 09:15 — 10:30: Self-introductions
  • 10:30 — 11:00: Tea break
  • 11:00 — 11:50: Self-introductions
  • 12:00 — 13:30: Lunch
  • 13:30 — 15:15: A brief history of streams (Aggelos Biboudis)
  • 15:15 — 15:45: Tea break
  • 15:45 — 17:00: Lucid, Lustre and its descendants (Marc Pouzet)
  • 17:00 — 18:00: Discussion
  • 18:00 — 19:30: Dinner

October 23, 2018 (Tuesday)

October 24, 2018 (Wednesday)

October 25, 2018 (Thursday)

  • 07:30 — 09:00: Breakfast
  • 09:00 — 10:50:
  •  – Is this even a good idea? (Trevor McDonell)
  •  – Vélus: A formally verified compiler for Lustre (Timothy Bourke)
  • 10:50 — 11:20: Tea break
  • 11:20 — 12:00: Final panel discussion: what’s next
  • 12:00 — 13:30: Lunch

Participants

  • Hidehiko Masuhara, Tokyo Intitute of Technology (JP)
  • Bračevac Oliver, Technical University Darmstadt (DE)
  • Jeremy Gibbons, University of Oxford (UK)
  • Jeremy Yallop, University of Cambridge (UK)
  • Ben Lippmeier, University of New South Wales (AU)
  • Ugo Dal Lago, Université degli Studi di Bologna
  • Marc Pouzet, Université Pierre et Marie Curie & École normale `s`upérieure
  • John Reppy, University of Chicago (US)
  • Geoffrey Mainland, Drexel University (US)
  • Sven Bodo-Scholz, Heriot-Watt University (Scotland)
  • Shigeru Chiba, University of Tokyo (Japan)
  • Christoph Koch, EPFL (Switzerland)
  • Trevor L. McDonell, University of New South Wales, AU
  • Timothy Bourke, INRIA & ENS, FR
  • Amir Shaikhha, EPFL
  • Peter Braam, Consultant

 

Organizers

  • Aggelos Biboudis (EPFL, Switzerland)
  • Oleg Kiselyov (Tohoku University, Japan)
  • Martin Odersky (EPFL, Switzerland)