aculich · May 16, 2024 06:18 · EricPeter · Jun 18, 2019 · ganeshan · Jan 23, 2020
diff --git a/remove-empty-columns.csv b/remove-empty-columns.csv
diff --git a/remove-empty-columns.ipynb b/remove-empty-columns.ipynb
diff --git a/remove-empty-columns.py b/remove-empty-columns.py
 #!/usr/bin/env python
 # coding: utf-8

 # To drop every column that is empty apart from its header using the Python [Pandas library](http://pandas.pydata.org/), we can use the following 4-line script. It reads in the csv file, drops the columns where **all** the elements are missing, and saves the data to a new csv file. The remaining columns keep their headers.

 # In[1]:


 import pandas as pd
 data = pd.read_csv('remove-empty-columns.csv')
 filtered_data = data.dropna(axis='columns', how='all')
 filtered_data.to_csv('empty-columns-removed.csv', index=False)


 # As shown below, the sample data included in the csv file has 3 columns which contain missing values.
 # 
 # The second column, labeled **bar**, is completely empty except for the header; columns like this should be dropped. The other columns contain data and should not be dropped, even though some of their values are missing.

 # In[2]:


 data
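
 # To confirm which columns are entirely empty before dropping anything, a check along the following lines may help. This is a minimal sketch, not part of the original script: the in-memory csv text is a hypothetical stand-in for remove-empty-columns.csv, and only the **bar** header comes from the description above; the other column names and all values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for remove-empty-columns.csv: only the "bar" header
# comes from the text; "foo", "baz" and every value are illustrative.
sample_csv = "foo,bar,baz\n1,,a\n,,b\n3,,\n"
data = pd.read_csv(io.StringIO(sample_csv))

# isna() marks each missing cell; all() reduces every column to one boolean
# that is True only when the whole column is missing.
is_empty = data.isna().all()
empty_columns = is_empty[is_empty].index.tolist()
print(empty_columns)
```

 # Under these assumptions only **bar** is reported, because the other two columns each hold at least one value.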


 # Using the [pandas.DataFrame.dropna()](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html) method with `axis='columns'` and `how='all'`, we can drop every column in which **all** the entries are **NaN** (missing values).

 # In[3]:


 filtered_data = data.dropna(axis='columns', how='all')
 filtered_data
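
 # The choice of `how='all'` matters here. The sketch below contrasts it with `how='any'` on a small in-memory table. It is an illustration under stated assumptions rather than the original data: only the **bar** header is taken from the description above, while the other column names and the values are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample: every column has at least one missing value,
# but only "bar" is missing in every row.
sample_csv = "foo,bar,baz\n1,,a\n,,b\n3,,\n"
data = pd.read_csv(io.StringIO(sample_csv))

# how='all' drops a column only when every entry in it is NaN.
kept_all = data.dropna(axis="columns", how="all")

# how='any' drops a column as soon as it contains a single NaN.
kept_any = data.dropna(axis="columns", how="any")

print(list(kept_all.columns))
print(list(kept_any.columns))
```

 # With this sample, `how='all'` removes only **bar**, whereas `how='any'` removes every column, which is why the script relies on `how='all'`.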