Skip to content

Instantly share code, notes, and snippets.

@rajeevshukla
Last active September 14, 2019 17:52
Show Gist options
  • Save rajeevshukla/f38ff48ad8719e256fab7ec5bde9f60b to your computer and use it in GitHub Desktop.
Save rajeevshukla/f38ff48ad8719e256fab7ec5bde9f60b to your computer and use it in GitHub Desktop.
R function to create ORC table dynamically and partitioning
saveAsORCTable<- function(df,master_table_name,partitionBy, mode="overwrite") {
dropSqlQuery<- paste("drop table if exists",master_table_name)
tempTableName<- paste("temp",master_table_name,sep = "_")
createTableQuery<-paste("create table ",master_table_name," using orc partitioned by (",toString(partitionBy),") as select * from",tempTableName,"limit 0")
createOrReplaceTempView(df,tempTableName)
columnNames<-colnames(df)
columnNames<-setdiff(columnNames,partitionBy)
columnNames<-append(columnNames,partitionBy)
insertQuery<-paste("insert into",master_table_name,"partition (",toString(partitionBy),") select",toString(columnNames),"from",tempTableName)
if(mode == "overwrite") {
sql(dropSqlQuery)
sql(createTableQuery)
sql(insertQuery)
} else if(mode=="append"){
TABLES<-listTables()
TABLES<-filter(TABLES,TABLES$name == master_table_name)
row_count<-count(TABLES)
if(row_count == 0) {
sql(createTableQuery)
}
sql(insertQuery)
} else {
print(paste("Error!!!! Invalid mode:",mode," for table ",master_table_name))
q()
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment