Databricks billing

  1. Ask a Databricks admin to enable system billing. After it is enabled, system.billing.usage and databricks billing billable-usage will work.

  2. Get the workspace ID

workspace_id = spark.conf.get("spark.databricks.workspaceUrl").split(".")[0].replace("adb-", "")
print(f"Workspace ID: {workspace_id}")
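
The one-liner above relies on the Azure-style hostname layout `adb-<workspace id>.<n>.azuredatabricks.net`. As a minimal sketch, the same parsing can be wrapped in a small function that fails loudly when the URL does not have that shape. The function name `workspace_id_from_url` and the sample URL in the usage note are hypothetical, not part of the original notes.

```python
def workspace_id_from_url(workspace_url: str) -> str:
    """Extract the numeric workspace ID from an Azure-style workspace URL.

    Assumes a hostname such as "adb-<digits>.<n>.azuredatabricks.net".
    """
    # The first dot-separated label carries the ID, e.g. "adb-1234567890123456".
    host = workspace_url.split(".")[0]
    # Drop the "adb-" prefix only when it is actually present.
    workspace_id = host[len("adb-"):] if host.startswith("adb-") else host
    # Anything that is not purely numeric is treated as an unexpected format.
    if not workspace_id.isdigit():
        raise ValueError(f"Unexpected workspace URL format: {workspace_url!r}")
    return workspace_id
```

In a notebook this could be called as `workspace_id_from_url(spark.conf.get("spark.databricks.workspaceUrl"))`; outside Databricks, a made-up value such as `"adb-1234567890123456.7.azuredatabricks.net"` is enough to try it.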

This script finds all files whose names start with protected_WAG_CNVR_FY26_Immz_2yr_1Vax_Ages and prints the row (line) count of each file

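# Loop over files whose names start with the prefix (a glob avoids parsing ls output)
for f in protected_WAG_CNVR_FY26_Immz_2yr_1Vax_Ages*; do
  [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
  count=$(wc -l < "$f")     # number of lines in the file
  echo "$f : $count"
done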

The output:

dvu4 / subtract_vs_left_join_in_pyspark.md
Last active April 23, 2025 19:30
Troubleshooting for dividing data into 2 separate groups with subtract vs left join

This approach leads to duplicates in df_holdout and df_target

  • subtract() is row-based and requires an exact row match

  • If df has duplicate rows, subtract() is not guaranteed to remove just one instance: it behaves like SQL EXCEPT DISTINCT, so every copy of a matching row is dropped.

# Filter 20% of the data for the holdout group
df_holdout = df.sample(fraction=0.2, seed=42)
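
The duplicate-row behaviour described in the bullets can be illustrated without a Spark cluster. The following is a plain-Python model of the semantics, not PySpark itself: `except_distinct` mimics what `subtract()` is documented to do (EXCEPT DISTINCT), and `anti_join_on_id` mimics a left anti join on a unique row id. Both function names and the sample rows are hypothetical, and the original note's own fix is cut off here, so this is only one possible reading of it.

```python
def except_distinct(left, right):
    """Model of EXCEPT DISTINCT: distinct rows of `left` that never appear in `right`."""
    right_rows = set(right)
    seen = set()
    result = []
    for row in left:
        if row not in right_rows and row not in seen:
            seen.add(row)
            result.append(row)
    return result


def anti_join_on_id(left_with_id, holdout_ids):
    """Model of a left anti join on a unique id: keep rows whose id is not held out."""
    return [(row_id, row) for row_id, row in left_with_id if row_id not in holdout_ids]


rows = [("a", 1), ("a", 1), ("b", 2), ("c", 3)]  # note the duplicated ("a", 1)
holdout = [("a", 1)]                              # pretend sampling picked one copy

# Row-based removal drops BOTH copies of ("a", 1), so one row goes missing overall.
target_subtract = except_distinct(rows, holdout)

# With a unique id per row, only the sampled copy (id 0) is removed.
rows_with_id = list(enumerate(rows))
target_anti = anti_join_on_id(rows_with_id, holdout_ids={0})

print(len(holdout) + len(target_subtract), "rows accounted for with subtract")
print(1 + len(target_anti), "rows accounted for with the id-based anti join")
```

In PySpark, the analogous id-based split would typically add a unique column (for example with `pyspark.sql.functions.monotonically_increasing_id()`) and use `df.join(df_holdout, on="<id column>", how="left_anti")`; whether that matches the fix the note goes on to describe is an assumption.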
    
dvu4 / write_count_record_to_file.md
Last active April 23, 2025 18:51
Troubleshooting for writing total rows to .ok file in Pyspark

Write count record to .ok file

This code is functionally correct, but it's inefficient and overly complex for what it does: writing a single integer (record count) to a file.

def write_table_to_file(df, container, storage_account, out_name = None, delimiter = "|", audit_prefix = ".ok"):

  ok_output_path  = f"abfss://{container}@{storage_account}.dfs.core.windows.net/tmp_ok_output/"
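
Since the goal is only to write one integer to a file, a simpler route may be to write the string directly instead of going through a DataFrame writer. The sketch below assumes a Databricks notebook, where `dbutils.fs.put(file, contents, overwrite)` is available and the cluster already has access to the storage account. The helper names and the `{out_name}{audit_prefix}` file layout are hypothetical; the original function's real output path is not shown in full, so it is not reproduced here.

```python
from typing import Callable


def ok_file_path(container: str, storage_account: str,
                 out_name: str, audit_prefix: str = ".ok") -> str:
    """Build a hypothetical abfss path for the audit file, e.g. ".../my_table.ok"."""
    return (
        f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
        f"{out_name}{audit_prefix}"
    )


def write_count_to_ok_file(count: int, path: str,
                           put: Callable[[str, str, bool], object]) -> str:
    """Write the record count as plain text using the supplied `put` function.

    In Databricks, pass `dbutils.fs.put`; any callable taking
    (path, contents, overwrite) works, which keeps this testable locally.
    """
    put(path, str(count), True)  # True = overwrite an existing file
    return path
```

In a notebook the call might look like `write_count_to_ok_file(df.count(), ok_file_path(container, storage_account, "my_table"), dbutils.fs.put)`, where `"my_table"` is a placeholder name. Passing the writer in as an argument is a design choice that lets the logic be exercised with a stub outside Databricks.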

Check the GPU version on a Mac

1. Check the GPU info

system_profiler SPDisplaysDataType

This will display something like:

Check the Mac chip version

1. Check the Chip Type

sysctl -n machdep.cpu.brand_string

This will display something like:

dvu4 / insert_in_sublime.md
Last active April 8, 2025 23:56
Insert text at the beginning or end of every line in Sublime

Context:

I have 1,000 lines of text structured like the following:

f54g
f5g546
2122v