SamarthGarge / data_cleaning_functions.py
Created March 26, 2025 10:58
Outlier feature, usable in two ways: (1) directly via the function in outliers.py, or (2) through clean_dataframe, which integrates it.
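The outlier defaults in the signature (outlier_method='zscore', outlier_threshold=3, outlier_processing='remove') suggest z-score filtering. The gist's own outliers.py is not shown here, so the following is only a minimal sketch of what that combination might amount to. The helper name remove_zscore_outliers and the sample data are hypothetical, and the population standard deviation (ddof=0) is an assumption rather than the author's confirmed choice.

```python
import pandas as pd


def remove_zscore_outliers(df, columns, threshold=3.0):
    """Hypothetical helper: drop rows whose |z-score| exceeds `threshold`
    in any of the given columns. Not the gist's actual implementation."""
    keep = pd.Series(True, index=df.index)
    for col in columns:
        std = df[col].std(ddof=0)  # assumption: population standard deviation
        if pd.isna(std) or std == 0:
            continue  # constant or empty column: nothing can be flagged
        z = (df[col] - df[col].mean()) / std
        # Rows with a missing value give NaN here; NaN > threshold is False,
        # so those rows are kept rather than silently dropped.
        keep &= ~(z.abs() > threshold)
    return df[keep]


# Twenty ordinary readings plus one extreme value.
df = pd.DataFrame({"value": [10.0] * 20 + [1000.0]})
cleaned = remove_zscore_outliers(df, ["value"])
```

In this sample the extreme reading has a z-score of roughly 4.47, so it is dropped and the twenty ordinary rows remain. With very small samples a single extreme value can never reach a z-score of 3, which is worth keeping in mind when choosing the threshold.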
def clean_dataframe(df, null_method='nan', fix_numeric=True, remove_dups=True, dup_subset=None,
                    detect_outliers_flag=False, outlier_columns=None, outlier_method='zscore',
                    outlier_threshold=3, outlier_processing='remove', outlier_cap_values=None):
    """
    Apply all cleaning functions in sequence.

    Parameters:
    -----------
    df : pandas.DataFrame
        The dataframe to clean