@fabsta
Last active November 21, 2017 18:05
#DataScience #InputOutput

[TOC]

Read

csv

import pandas as pd

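data = pd.read_csv("http://www.google.de") # read_csv also accepts a URL; this one is only a placeholder and does not serve CSV
df = pd.read_csv('../data/example.csv', header=None) # the file has no header row, so columns are numbered 0, 1, ...
df = pd.read_csv('../data/example.csv', na_values=['.']) # treat "." as a missing value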

# treat "." and "NA" as missing in the Last Name column and "." as missing in the Pre-Test Score column
sentinels = {'Last Name': ['.', 'NA'], 'Pre-Test Score': ['.']}
df = pd.read_csv('../data/example.csv', na_values=sentinels)

df = pd.read_csv('../data/example.csv', na_values=sentinels, skiprows=3) # skip the first 3 lines of the file
df = pd.read_csv('../data/example.csv', thousands=',') # interpret "," inside numbers as a thousands separator
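
The `na_values` and `thousands` options above can be tried without a file on disk. The following is a minimal sketch that assumes a small in-memory CSV; the column names and values are invented for illustration and are not the contents of `../data/example.csv`.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for ../data/example.csv
raw = io.StringIO(
    "First Name,Last Name,Pre-Test Score,Salary\n"
    'Jason,Miller,4,"12,000"\n'
    'Molly,.,.,"9,500"\n'
    'Tina,NA,31,"1,250,000"\n'
)

# Per-column sentinels: "." and "NA" in Last Name, "." in Pre-Test Score
sentinels = {"Last Name": [".", "NA"], "Pre-Test Score": ["."]}

# thousands="," lets the quoted Salary values parse as integers
df = pd.read_csv(raw, na_values=sentinels, thousands=",")

print(df.isna().sum())   # missing values per column
print(df.dtypes)         # Salary should be numeric, not object
```

A dict for `na_values` keeps the sentinels scoped to the named columns, so a literal "." in another column is left untouched.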

Parsing dates

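from datetime import datetime # pd.datetime is not available in recent pandas versions
dateparse = lambda dates: datetime.strptime(dates, '%Y') # adjust the format string to match the file
# in_file is the path to the CSV; parse_dates expects a list of column names; newer pandas prefers date_format over date_parser
data = pd.read_csv(in_file, parse_dates=['Month'], index_col='Month', date_parser=dateparse)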
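
As an alternative that does not rely on `date_parser`, the index can be converted after loading with `pd.to_datetime`. This is a sketch under assumptions: the in-memory data, the `Passengers` column, and the `%Y-%m` format are made up for illustration and may not match the file the snippet above was written for.

```python
import io
import pandas as pd

# Hypothetical monthly data; the real file and its date format may differ
raw = io.StringIO(
    "Month,Passengers\n"
    "1949-01,112\n"
    "1949-02,118\n"
    "1949-03,132\n"
)

# Load with Month as a plain string index first
data = pd.read_csv(raw, index_col="Month")

# Convert the index explicitly; an explicit format avoids ambiguous parsing
data.index = pd.to_datetime(data.index, format="%Y-%m")

print(data.index)
```

Converting after the read keeps the parsing step visible and works the same way across pandas versions.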

Excel

Open the Excel file as xls_file

xls_file = pd.ExcelFile('../data/example.xls')

Load the xls file's Sheet1 as a dataframe

df = xls_file.parse('Sheet1')

Save

CSV

df.to_csv("../submission.csv", index=False) # index=False leaves the row index out of the file
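
To see what `index=False` changes, a frame can be written to an in-memory buffer and read back. This is a minimal sketch; the `id` and `prediction` columns are hypothetical and only stand in for a submission file.

```python
import io
import pandas as pd

# Hypothetical frame standing in for a submission
df = pd.DataFrame({"id": [1, 2, 3], "prediction": [0.1, 0.5, 0.9]})

# Write to a string buffer instead of ../submission.csv
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Read the text back to confirm no extra index column was written
restored = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(restored.columns.tolist())
```

Without `index=False`, the row index is written as an extra unnamed first column, which then shows up as `Unnamed: 0` when the file is read back.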