@timvw
Last active July 28, 2016 07:23
List the files matched by a Spark path (which may contain the wildcards ** and *)
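package be.icteam.demo

import org.apache.spark._

object Program {

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setAppName("demo")
      .setMaster("local[4]")
    val sc = new SparkContext(sparkConf)

    // The path may be a directory or a pattern containing wildcards.
    val path = "./src/main/resources/"
    val files = listFilesInPath(sc, path)
    println("files: ")
    println(files.mkString("\n"))
    println("done")

    // Release the local Spark resources before exiting.
    sc.stop()
  }

  // wholeTextFiles yields (file path, file content) pairs; keep only the paths.
  // Note: every matched file is read in full, so this suits small inputs only.
  def listFilesInPath(sc: SparkContext, path: String): Array[String] =
    sc.wholeTextFiles(path)
      .map(_._1)
      .collect()
}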
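Because wholeTextFiles loads the content of every matched file only to discard it, a lighter alternative may be to resolve the pattern through the Hadoop FileSystem API that Spark uses underneath. The following is a minimal sketch, not the approach of the gist itself: the names GlobListing and listPathsMatching are hypothetical, and it assumes the Hadoop client classes that ship with Spark are on the classpath. Unlike wholeTextFiles, a plain directory path is not expanded here, so the pattern has to name the files (for example with a trailing /*).

```scala
import org.apache.hadoop.fs.{FileStatus, Path}
import org.apache.spark.SparkContext

object GlobListing {

  // Hypothetical helper: expands a glob pattern on the file system
  // without reading any file content.
  def listPathsMatching(sc: SparkContext, pattern: String): Array[String] = {
    val globPath = new Path(pattern)
    // Pick the file system (local, HDFS, ...) that owns this path.
    val fs = globPath.getFileSystem(sc.hadoopConfiguration)
    // globStatus can return null when a non-glob path does not exist.
    Option(fs.globStatus(globPath))
      .getOrElse(Array.empty[FileStatus])
      .filter(_.isFile)          // drop directories that match the pattern
      .map(_.getPath.toString)
  }
}
```

Usage would look like `GlobListing.listPathsMatching(sc, "./src/main/resources/*")`. The returned strings are fully qualified paths, so they may carry a scheme prefix such as `file:`.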