This article compares two approaches to data manipulation, data.table in R and the standard functions in Clojure, in order to evaluate whether Clojure needs a library that provides functionality similar to data.table's. At the conceptual level, data.table provides a DSL in R designed specifically for data analysis, with a unified, SQL-like syntax for selecting, grouping, updating and joining tabular data. In contrast, the standard libraries of Clojure provide general-purpose building blocks, including persistent data structures (e.g. sequence, vector, set, map) and generic transformation functions (e.g. map, filter, reduce). This article illustrates how the two approaches affect data manipulation in common use cases.
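As a rough illustration of the "building blocks" style, the sketch below filters, groups and aggregates a small vector of maps using only clojure.core. The rows, the keys (`:carrier`, `:origin`, `:dep-delay`) and the helper names `mean` and `mean-delay-by-carrier` are hypothetical and chosen only for this example; they are not taken from the dataset. In data.table, a comparable query would typically be a single expression such as `flights[origin == "JFK", .(mean_delay = mean(dep_delay)), by = carrier]`.

```clojure
;; Hypothetical sample rows; keys and values are made up for illustration.
(def flights
  [{:carrier "AA" :origin "JFK" :dep-delay 14}
   {:carrier "AA" :origin "JFK" :dep-delay -4}
   {:carrier "DL" :origin "JFK" :dep-delay 6}
   {:carrier "DL" :origin "LGA" :dep-delay 30}])

;; Arithmetic mean of a non-empty collection of numbers.
(defn mean [xs]
  (/ (reduce + xs) (count xs)))

;; Select rows by origin, group them by carrier, then aggregate each group.
(defn mean-delay-by-carrier [rows origin]
  (->> rows
       (filter #(= origin (:origin %)))        ; "where" step
       (group-by :carrier)                     ; "group by" step
       (map (fn [[carrier grp]]                ; aggregate each group
              [carrier (mean (map :dep-delay grp))]))
       (into {})))

(mean-delay-by-carrier flights "JFK")
;; => {"AA" 5, "DL" 6}
```

Each step here is an ordinary sequence function, so the pipeline composes freely with any other Clojure code, but the selection, grouping and aggregation have to be assembled by hand rather than expressed in one query form.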
The dataset used in this article is the NYC-flights14 data, which is On-Time flights data from the Bureau of Transportation Stati