@robenalt
Last active June 10, 2016 19:30
pyspark save dataframe
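# Uncomment if `sqlContext` is not already defined (the Spark 1.x pyspark shell predefines `sc` and `sqlContext`):
#from pyspark.sql import HiveContext
#sqlContext = HiveContext(sc)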
query = """
select * from db.sometable where col>50
"""
results = sqlContext.sql(query)
# DataFrame.write returns a DataFrameWriter; `path` is passed through as a writer option.
results.write.saveAsTable('db.new_table_name', format='parquet', mode='overwrite', path='/path/to/new/data/files')
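The same steps can be wrapped in a small reusable helper. This is only a sketch: the function name `save_query_as_table` and its parameters are hypothetical, not part of the gist or of PySpark. It assumes the caller passes in an existing context or session object that exposes `.sql()`, such as the `sqlContext` above.

```python
def save_query_as_table(sql_context, query, table_name, path,
                        fmt="parquet", mode="overwrite"):
    """Run `query` and save the result as a table stored at `path`.

    `sql_context` is assumed to be an object with a `.sql()` method that
    returns a DataFrame (e.g. a HiveContext). Hypothetical helper.
    """
    results = sql_context.sql(query)  # DataFrame holding the query result
    # DataFrame.write is the DataFrameWriter; `path` is forwarded as an option.
    results.write.saveAsTable(table_name, format=fmt, mode=mode, path=path)
    return results


# Example call, mirroring the snippet above (requires a live Spark context):
# save_query_as_table(sqlContext,
#                     "select * from db.sometable where col>50",
#                     "db.new_table_name",
#                     "/path/to/new/data/files")
```

Passing the context in as an argument keeps the helper independent of how the context was created.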