First, a disclaimer: this is an experimental API that exposes internals which are likely to change between Spark releases. As a result, most data sources should be written against the stable public API in org.apache.spark.sql.sources. We expose this mostly to get feedback on what optimizations we should add to the stable API in order to get the best performance out of data sources.
We'll start with a simple artificial data source that just returns ranges of consecutive integers.
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, TableScan}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

/** A data source that returns ranges of consecutive integers in a column named `a`. */
case class SimpleRelation(
    start: Int,
    end: Int)(
    @transient val sqlContext: SQLContext)
  extends BaseRelation with TableScan {

  // A single non-nullable-by-default integer column named `a`.
  override def schema = StructType(StructField("a", IntegerType) :: Nil)

  // Produce one Row per integer in the inclusive range [start, end].
  override def buildScan() = sqlContext.sparkContext.parallelize(start to end).map(Row(_))
}
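To see the relation in action, here is a hedged usage sketch. It assumes a live `SQLContext` named `sqlContext` is already in scope (as in a Spark shell or notebook) and uses Spark 1.3+'s `baseRelationToDataFrame` to turn the relation into a queryable DataFrame; the table name `simple` is an arbitrary choice for illustration.

```scala
// Instantiate the relation for the range [1, 10]; `sqlContext` is assumed to exist.
val relation = SimpleRelation(1, 10)(sqlContext)

// Wrap the relation in a DataFrame and expose it to SQL under a temporary name.
val df = sqlContext.baseRelationToDataFrame(relation)
df.registerTempTable("simple")

// Query it like any other table; this scans the relation and filters on `a`.
sqlContext.sql("SELECT a FROM simple WHERE a > 5").show()
```

Because `SimpleRelation` only implements a full table scan, the `a > 5` filter above is applied by Spark after `buildScan()` returns every row; pushing such filters into the data source is exactly what the richer scan interfaces are for.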