Million Songs Dataset is probably one of the most popular datasets for those who want to start fiddle with Big Data analysis and Hadoop. In a nutshell, it's a set of million songs, described by a long set of characteristics, like year of publishing, where the artist comes from, but also shape of the wave, segments, etc.
In order for a time analysis (like, how does the tempo change throughout the years), it is good to know what is the distribution of the data among the time. And this is what the chart above is about - I just how many songs for each year are there in the dataset.
The analysis was performed on 10 small instances on Amazon Map Reduce, and it took nearly 10 hours, which means that the cost of the analysis was 10 instances * 10 hours * (0.015 + 0.06)$ = 7.50$. Pretty cheap, isn't it?
More to come!