@Jviejo
Last active August 28, 2025 19:59
spark.py
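%spark.pyspark
from pyspark.sql import SparkSession

# Connect to the standalone cluster master.
# getOrCreate() reuses an existing session if one is already active.
spark = (
    SparkSession.builder
    .master("spark://localhost:7077")
    .appName("ConexionACluster")
    .getOrCreate()
)

# Small in-memory DataFrame to check that the session works.
df = spark.createDataFrame([("A", 1), ("B", 2)], ["col1", "col2"])

# Read the `orders` table from PostgreSQL over JDBC.
# The PostgreSQL JDBC driver jar must be available on the Spark classpath.
jdbc_url = "jdbc:postgresql://host.docker.internal:5432/northwind"
properties = {
    "user": "postgres",
    "password": "123456",
    "driver": "org.postgresql.Driver",
}
df2 = spark.read.jdbc(url=jdbc_url, table="orders", properties=properties)

print(df2.count())
df.show()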
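
The snippet above hard-codes the database host and password. As a minimal sketch of one alternative, the same JDBC URL and properties could be built from environment variables. The helper name `build_jdbc_config` and the variable names `PG_HOST`, `PG_PORT`, `PG_DATABASE`, `PG_USER` and `PG_PASSWORD` are hypothetical and not part of the original gist; the defaults mirror the values used above.

```python
import os


def build_jdbc_config(env=None):
    """Return (jdbc_url, properties) for a PostgreSQL JDBC read.

    Hypothetical helper: the PG_* variable names are assumptions,
    and the defaults mirror the values hard-coded in the snippet above.
    """
    env = os.environ if env is None else env

    host = env.get("PG_HOST", "host.docker.internal")
    port = env.get("PG_PORT", "5432")
    database = env.get("PG_DATABASE", "northwind")
    url = f"jdbc:postgresql://{host}:{port}/{database}"

    properties = {
        "user": env.get("PG_USER", "postgres"),
        # No default for the password: a missing PG_PASSWORD raises KeyError.
        "password": env["PG_PASSWORD"],
        "driver": "org.postgresql.Driver",
    }
    return url, properties
```

With `PG_PASSWORD` set in the environment, the result can be passed to the same call as before: `jdbc_url, properties = build_jdbc_config()` followed by `spark.read.jdbc(url=jdbc_url, table="orders", properties=properties)`.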