- Based on this, the PBF file is converted to Parquet.
- The library https://github.com/adrianulbona/osm-parquetizer converts the osm.pbf file into three Parquet files:
java -jar target/osm-parquetizer-1.0.1-SNAPSHOT.jar ../test1/romania-latest.osm.pbf
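To check that the conversion produced its three outputs, the expected paths can be sketched as follows. This is a minimal sketch, not part of the original workflow: it assumes osm-parquetizer writes one file per OSM entity type (node, way, relation) next to the input, named `<input>.<entity>.parquet`, as described in the project README. The helper name `expected_parquet_paths` is hypothetical.

```python
from pathlib import Path

# Assumption: one output file per OSM entity type, written next to the input
# and named <input>.<entity>.parquet (naming taken from the project README).
ENTITY_TYPES = ("node", "way", "relation")


def expected_parquet_paths(pbf_path):
    """Return a dict mapping each entity type to its expected Parquet path."""
    pbf = Path(pbf_path)
    return {
        entity: pbf.with_name(f"{pbf.name}.{entity}.parquet")
        for entity in ENTITY_TYPES
    }


if __name__ == "__main__":
    # Report which of the expected output files exist on disk.
    paths = expected_parquet_paths("../test1/romania-latest.osm.pbf")
    for entity, path in paths.items():
        status = "found" if path.exists() else "missing"
        print(f"{entity}: {path} ({status})")
```

If a file is reported missing, check the directory listing first, since the naming convention above is an assumption and may differ between versions of the tool.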
- The PySpark (Python) code to read the Parquet file, subset it by extent, and save the result as a shapefile
- The code is executed in a container, using the Docker image https://hub.docker.com/r/airpollutionstudyindia/matplotlib
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode