By abstracting away the complexity of distributed systems, large-scale data processing platforms—MapReduce, Hadoop, Spark, Dryad, etc.—have provided developers with simple means for harnessing the power of the cloud. In this paper, we ask whether we can automatically synthesize MapReduce-style distributed programs from input–output examples. Our ultimate goal is to enable end users to specify large-scale data analyses through the simple interface of examples. We thus present a new algorithm and tool for synthesizing programs composed of efficient data-parallel operations that can execute on cloud computing infrastructure. We evaluate our tool on a range of real-world big-data analysis tasks and general computations. Our results demonstrate the efficiency of our approach and the small number of examples it requires to synthesize correct, scalable programs.
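To make the idea of synthesizing a MapReduce-style program from input–output examples concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes a tiny, hypothetical component library (`MAPPERS`, `REDUCERS`) and simply enumerates every map/reduce pair, returning the first one consistent with all examples. The names `run` and `synthesize` are illustrative only.

```python
from functools import reduce
from itertools import product

# Hypothetical component library, assumed purely for illustration.
MAPPERS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "one": lambda x: 1,
}
REDUCERS = {
    "add": lambda a, b: a + b,
    "max": lambda a, b: a if a >= b else b,
}


def run(mapper, reducer, data):
    """Apply the mapper to every element, then fold with the reducer.

    Assumes `data` is non-empty, since no initial value is supplied.
    """
    return reduce(reducer, map(mapper, data))


def synthesize(examples):
    """Return the names of the first (mapper, reducer) pair that
    reproduces every input-output example, or None if no pair does."""
    for (m_name, m), (r_name, r) in product(MAPPERS.items(), REDUCERS.items()):
        if all(run(m, r, inp) == out for inp, out in examples):
            return m_name, r_name
    return None


# Two examples suffice here to single out "sum of squares".
examples = [([1, 2, 3], 14), ([2, 2], 8)]
print(synthesize(examples))
```

A brute-force search like this only works for very small component sets; it is meant to show the specification-by-example interface, not how a practical synthesizer prunes its search space.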
Thu 16 Jun (displayed time zone: Tijuana, Baja California)
13:30 - 15:00
13:30 (30m) Talk: MapReduce Program Synthesis. Research Papers.
14:00 (30m) Talk: Programmatic and Direct Manipulation, Together at Last. Research Papers. Ravi Chugh, Brian Hempel, Mitchell Spradlin, Jacob Albers (University of Chicago).
14:30 (30m) Talk: Fast Synthesis of Fast Collections. Research Papers. Calvin Loncaric, Emina Torlak, Michael D. Ernst (University of Washington).