Spark Simple Example
build.sbt:

// The simplest possible sbt build file is just one line:

scalaVersion := "2.13.3"
// That is, to create a valid sbt build, all you've got to do is define the
// version of Scala you'd like your project to use.

// ============================================================================

// Lines like the above defining `scalaVersion` are called "settings". Settings
// are key/value pairs. In the case of `scalaVersion`, the key is "scalaVersion"
// and the value is "2.13.3".

// It's possible to define many kinds of settings, such as:

name := "hello-world"
organization := "ch.epfl.scala"
version := "1.0"

// Note: it's not required for you to define these three settings. They are
// mostly only necessary if you intend to publish your library's binaries to a
// place like Sonatype.

// Want to use a published library in your project?
// You can define other libraries as dependencies in your build like this:

libraryDependencies += "org.scala-lang.modules" %% "scala-parser-combinators" % "1.1.2"
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.2.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0"
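With these three lines in place, sbt pulls the Spark artifacts from Maven Central on the first build. The `%%` operator appends the project's Scala binary version to the artifact name, so `"spark-core"` resolves to `spark-core_2.13` here (Spark 3.2.0 is the first Spark release that publishes Scala 2.13 artifacts). Assuming sbt 1.x is installed, the program below can then be compiled and started from the project root with `sbt run`.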
Main.scala:

import org.apache.spark.sql.{Row, SparkSession, types => T}
import org.apache.log4j.{Level, Logger}

object Main extends App {
  // Keep the console quiet: only show ERROR-level messages from Spark's internals.
  Logger.getLogger("org").setLevel(Level.ERROR)

  // Start a local Spark session with two worker threads.
  val spark = SparkSession
    .builder()
    .appName("hello-world")
    .master("local[2]")
    .getOrCreate()

  val data = List("Hello, world", "I'm running Spark!")

  // Wrap each message in a Row and attach an explicit one-column schema
  // to get a DataFrame with a single string column named "msg".
  val msgDF = spark.createDataFrame(
    spark.sparkContext.makeRDD(data.map(x => Row(x))),
    schema = T.StructType(Array(T.StructField("msg", T.StringType)))
  )

  // In local mode the executors run inside the driver's JVM, so these
  // printlns appear directly on the console.
  msgDF.foreach((row: Row) => println(row.getAs[String]("msg")))

  spark.stop()
}
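Running this with `sbt run` prints the two messages once Spark's startup output settles down.

As an aside, the same DataFrame can be built without spelling out the schema by using the implicit encoders bundled with SparkSession. A minimal sketch, assuming the `spark` and `data` values defined above are in scope (`msgDF2` is an illustrative name, not part of the original gist):

import spark.implicits._

// toDF derives a single string column from the List[String]; "msg" names it.
val msgDF2 = data.toDF("msg")
msgDF2.show(false) // renders the two rows as a small ASCII table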