heathermiller · September 28, 2016 16:32
diff --git a/desc.txt b/desc.txt
 My colleague Heather Miller and I have been discussing a new project that would
 focus on increasing the reliability and performance of applications based on
 Apache’s “Spark” engine for big data processing. The programming model we aim to
 improve seeks to achieve parallelism via distribution, by transmitting
 computations (closures) to a collection of sites where distributed data resides.
 The work we have in mind would have two areas of focus: (i) design,
 implementation, and evaluation of programming models that make this paradigm of
 shipping computations to distributed data more robust, more usable, and less
 error-prone (e.g., by avoiding races and memory leaks), and (ii) design,
 implementation, and evaluation of tools for analyzing and refactoring Spark
 applications for improved reliability and performance (via analyses and
 refactorings that would target the new programming model). The work would be
 done in the context of the Scala programming language.