Skip to content

Instantly share code, notes, and snippets.

(-> ^SQLContext sqtx
(.read)
(.format "parquet")
(.options (java.util.HashMap. {"mergeSchema" "false" "path" path}))
(.load))
val file = sqx.read.option("mergeSchema", "false").parquet(path)
(spark-conf/set "spark.hadoop.mapred.output.committer.class" "com.appsflyer.spark.DirectOutputCommitter")
(let [ctx (spark/spark-context conf)
hadoop-conf (.hadoopConfiguration ^JavaSparkContext ctx)]
(.set hadoop-conf "spark.sql.parquet.output.committer.class" "org.apache.spark.sql.parquet.DirectParquetOutputCommitter"))