Skip to content

Instantly share code, notes, and snippets.

@rrosasl
Created November 25, 2020 17:10
Show Gist options
  • Save rrosasl/1ed7943cdc3f69ec3fa4ec74ae0d9184 to your computer and use it in GitHub Desktop.
Save rrosasl/1ed7943cdc3f69ec3fa4ec74ae0d9184 to your computer and use it in GitHub Desktop.
Cleaning spreadsheet data
# Convert Rows into DataFrame and clean data
new_header = df.iloc[0] #grab the first row for the header
df = df[1:] #take the data less the header row
df.columns = new_header #set the header row as the df header
df = df.iloc[:,1:] # Remove time stamp
# Convert to a DataFrame and render.
df = pd.DataFrame.from_records(rows)
new_header = df.iloc[0] #grab the first row for the header
df = df[1:] #take the data less the header row
df.columns = new_header #set the header row as the df header
df = df.iloc[:,1:] # Remove time stamp
#Convert votes to int
for col in df.columns:
df[col] = pd.to_numeric(df[col])
#Change column names to candidate
votes = []
for col in df.columns:
name = col[37:-1]
votes.append(name)
df.columns = votes
df = df.reset_index().iloc[:,1:]
@rrosasl
Copy link
Author

rrosasl commented Aug 27, 2021

Hi Capncrockett!

You first need to have the dataframe DF with data.
Did you arrive here by looking at the medium article?
https://rrosasl.medium.com/ranked-choice-voting-with-google-forms-and-python-c471ea568a60

If so, there's the previous step that might be helpful.
If you're still stuck, let me know and I can try helping you :)

auth.authenticate_user()

https://colab.research.google.com/notebooks/snippets/sheets.ipynb

import gspread
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())
#worksheet = gc.open('Your spreadsheet name').sheet1
wb = gc.open_by_url('https://docs.google.com/spreadsheets/d/1cH8SE6ba4LqYA0kIS8n9LFt2zJQvh7EZaldcJ_MzgyA/edit#gid=48937339')

get_all_values gives a list of rows.

wb = wb.worksheet('Form Responses 1')
rows = wb.get_all_values()

Convert to a DataFrame and render.

df = pd.DataFrame.from_records(rows)

https://gist.githubusercontent.com/rrosasl/9a79cf881ff5322bbbaebf6af6e0d9ad/raw/8db757416e4fe9ef85263f3b8053595b9a2cde9f/importing_g_sheet.py

@capncrockett
Copy link

Hey hey, thanks for the quick response. Yes I was following that medium tutorial. I did get it to work though. I think by following the previous step a little more closely? Honestly not sure.

I did wind up switching a few things to work for me. If you can spot any differences I'd be keen to know if what I did differs from what you intended or what was on that medium tutorial. https://colab.research.google.com/drive/1FIcdoo8ELCzPeltEIKixWNPIdYQZPXrl?usp=sharing

Nice job by the way! This worked really well once I got it. I'm keen to try some alternative methods of input for G Forms. Like a drag and drop reorder approach might work better for voters, be more intuitive, and you can leave some off if you'd prefer. Would this code be able to handle varied index lengths? Using the
name = col[34:-1] # Can adjust the length of the column input approach would work I think. Maybe this isn't the place for that particular convo though.

@rrosasl
Copy link
Author

rrosasl commented Aug 30, 2021 via email

@capncrockett
Copy link

Hey thanks!

I was not using it for anything specifically. I've just been a fan of it since I saw Krist Novoselic speak about it 15 some odd years ago. I've always wanted to see it take hold across the country. But, for now, I'm learning about programming and data and wanted to mess around with the back end side of things of a RCV system.

I'll check out g-forms for more. Maybe it would make sense to write an API for something like that. I'm just getting started so any projects that would look good my GitHub seem like time well spent. I appreciate your willingness to help! You will probably hear from me again :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment