Gist aculich/fb2769414850d20911eb (last active May 16, 2024 06:18)
remove-empty-columns
foo | bar | baz
---|---|---
a | | 1
b | | 2
c | |
 | | 4
e | | 5
#!/usr/bin/env python
# coding: utf-8
# To drop all empty columns (while still keeping the headers) using the Python [Pandas library](http://pandas.pydata.org/), we can use the following 4-line script to read in the csv file, drop the columns where **all** the elements are missing, and save the data to a new csv file.
# In[1]:
from pandas import read_csv
data = read_csv('remove-empty-columns.csv')
filtered_data = data.dropna(axis='columns', how='all')
filtered_data.to_csv('empty-columns-removed.csv', index=False)
# As shown below, the sample data included in the csv file has 3 columns which contain missing values.
#
# The second column, labeled **bar**, is completely empty except for the header; columns like this should be dropped. The other columns contain data and should not be dropped, even though they contain some missing values.
# In[2]:
data
# Using the [pandas.DataFrame.dropna()](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html) function with the **columns** axis, we can drop any column where **all** the entries are **NaN** (missing values).
# In[3]:
filtered_data = data.dropna(axis='columns', how='all')
filtered_data
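To try the `dropna` step without a csv file on disk, here is a minimal sketch that builds a small frame in memory. The values and the `empty_columns` helper variable are assumptions made for illustration, not part of the original script; the frame has one column that is entirely missing and two that are only partly missing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the csv contents; NaN marks a missing cell.
data = pd.DataFrame({
    "foo": ["a", "b", "c", np.nan, "e"],
    "bar": [np.nan] * 5,
    "baz": [1, 2, np.nan, 4, 5],
})

# Names of the columns in which every entry is missing.
empty_columns = data.columns[data.isna().all()].tolist()

# Drop only the all-missing columns; partly filled columns are kept.
filtered_data = data.dropna(axis="columns", how="all")

print(empty_columns)
print(filtered_data.columns.tolist())
```

Listing `empty_columns` first is a quick way to check what is about to be removed before the filtered file is written.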
doesn't work for me ;(
doesn't work for me :(
@zhixe @JayProXter I've updated the script based on the suggestion from @ganeshan
Does it work for you now? If not, can you let me know what errors or unexpected behavior you are seeing?
The statement
filtered_data.to_csv('empty-columns-removed.csv')
writes the DataFrame index as the first column of the file, and that column's header is empty. To avoid writing the index column, use this instead:
filtered_data.to_csv('empty-columns-removed.csv', index=False)
Thanks very much for the script.
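The effect of `index=False` can be seen without touching the file system, because `to_csv` returns the csv text as a string when no path is given. A minimal sketch, with a made-up two-row frame used only for illustration:

```python
import pandas as pd

# Hypothetical two-row frame, only for comparing the two outputs.
frame = pd.DataFrame({"foo": ["a", "b"], "baz": [1, 2]})

# With no path argument, to_csv returns the csv text instead of writing a file.
with_index = frame.to_csv()
without_index = frame.to_csv(index=False)

# The header line shows the difference: the default output starts with an
# empty field for the unnamed index column.
print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```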