Created
October 26, 2021 11:26
## Provide the mount path of the directory where the CSV files exist
mount_path = '/mnt/<mount name>/<directory>'
## Register every CSV file in that directory as one table (header row, inferred schema, comma separator)
spark.sql(f"create table flights_data_2 using csv location '{mount_path}/*.csv' options(header 'true', inferSchema 'true', sep ',')")
## Run a group by on the registered table to count the rows that came from each source file
resultdf = spark.sql("select input_file_name() as filename, count(*) from flights_data_2 group by filename")
resultdf.display()
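
The per-file counts from the query above can be cross-checked without Spark. What follows is a minimal sketch, not part of the original snippet: `count_rows_per_file` is a hypothetical helper that uses only the Python standard library. It assumes the CSV files are readable through the local filesystem (on Databricks, a mount is typically also visible to the driver under `/dbfs/mnt/...`) and that each file has a single header row. The numbers may differ slightly from Spark's if the files contain malformed records.

```python
import csv
import glob
import os


def count_rows_per_file(directory, pattern="*.csv"):
    """Return {file name: number of non-empty data rows}, header excluded.

    Hypothetical helper for a local sanity check; it assumes every file
    starts with exactly one header row.
    """
    counts = {}
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        with open(path, newline="") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row, if the file has one
            # Blank lines come back as empty lists, so they are not counted.
            counts[os.path.basename(path)] = sum(1 for row in reader if row)
    return counts
```

Calling `count_rows_per_file('/dbfs/mnt/<mount name>/<directory>')` returns a dictionary keyed by file name, which can be compared against the `filename` column of `resultdf`.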