In this document, I use a text file containing 40 million records, one record per line.
The following Scala code is run in a spark-shell:
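val filename = "<path to the file>"  // placeholder for the input path
val file = sc.textFile(filename)     // sc is the SparkContext provided by spark-shell; lazy, one element per line
file.count()                         // action: reads the file and returns the number of lines (records)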
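
The snippet above relies on the sc value that spark-shell creates automatically. Outside the shell, the SparkContext has to be built by hand. The following is only a minimal sketch of how the same count might look as a standalone program. The object name CountRecords, the helper countRecords, the application name, and the local[*] master are all assumptions made for illustration and do not come from the original setup.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CountRecords {
  // Returns the number of lines in the text file at `path`.
  // With one record per line, this is the record count.
  def countRecords(sc: SparkContext, path: String): Long =
    sc.textFile(path).count()

  def main(args: Array[String]): Unit = {
    // Assumed settings: the app name and local master are illustrative only.
    val conf = new SparkConf().setAppName("CountRecords").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val filename = args(0) // path to the file, passed on the command line
      println(countRecords(sc, filename))
    } finally {
      sc.stop() // release Spark resources even if the count fails
    }
  }
}
```

In both versions, textFile only describes the dataset. The file is actually read when count() runs.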