Skip to content

Instantly share code, notes, and snippets.

@stiles
Last active August 3, 2019 17:37
Show Gist options
  • Save stiles/341c0360cbc60e2b2b5941b677f68fe0 to your computer and use it in GitHub Desktop.
Save stiles/341c0360cbc60e2b2b5941b677f68fe0 to your computer and use it in GitHub Desktop.
Pandas cheat sheet

The pandas equivalent to...

select * from table where column_name = some_value

is:

table[table.column_name == some_value]

Multiple conditions:

table[(table.column_name == some_value) | (table.column_name2 == some_value2)]

or...

table.query('column_name == some_value | column_name2 == some_value2')

###Code example import pandas as pd

###Create data set d = {'foo':[100, 111, 222], 'bar':[333, 444, 555]} df = pd.DataFrame(d)

###Full dataframe: df

###Shows: #|bar|foo ==|==|== 0|333|100 1|444|111 2|555|222

Output only the row(s) in df where foo is 222:

df[df.foo == 222]

Shows:

bar foo

2 555 222

In the above code it is the line df[df.foo == 222] that gives the rows based on the column value, 222 in this case.

Multiple conditions are also possible:

df[(df.foo == 222) | (df.bar == 444)]

bar foo

1 444 111

2 555 222

But at that point I would recommend using the query function, since it's less verbose and yields the same result:

df.query('foo == 222 | bar == 444')

#md
tl;dr
The pandas equivalent to
`select * from table where column_name = some_value`
is
table[table.column_name == some_value]
Multiple conditions:
table[(table.column_name == some_value) | (table.column_name2 == some_value2)]
or
table.query('column_name == some_value | column_name2 == some_value2')
Code example
import pandas as pd
# Create data set
d = {'foo':[100, 111, 222],
'bar':[333, 444, 555]}
df = pd.DataFrame(d)
# Full dataframe:
df
# Shows:
# bar foo
# 0 333 100
# 1 444 111
# 2 555 222
# Output only the row(s) in df where foo is 222:
df[df.foo == 222]
# Shows:
# bar foo
# 2 555 222
In the above code it is the line df[df.foo == 222] that gives the rows based on the column value, 222 in this case.
Multiple conditions are also possible:
df[(df.foo == 222) | (df.bar == 444)]
# bar foo
# 1 444 111
# 2 555 222
But at that point I would recommend using the query function, since it's less verbose and yields the same result:
df.query('foo == 222 | bar == 444')
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment