- Reading files
- Reading from the web
- Creating example DataFrames
- Creating columns
- Renaming columns
- Selecting rows and columns
- Filtering rows by condition
- Manipulating strings
- Working with data types
- Encoding data
- Extracting data from lists
- Working with time series data
- Handling missing values
- Using aggregation functions
- Using cumulative functions
- Random sampling
- Merging DataFrames
- Styling DataFrames
- Exploring a dataset
- Handling warnings
- Other
🐼🤹‍♂️ pandas trick:
5 useful "read_csv" parameters that are often overlooked:
➡️ names: specify column names
➡️ usecols: which columns to keep
➡️ dtype: specify data types
➡️ nrows: # of rows to read
➡️ na_values: strings to recognize as NaN
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) August 19, 2019
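
The five parameters above can be combined in a single call. The sketch below is only an illustration: the CSV content, the column names, and the "?" missing-value marker are made up for the example.

```python
import io

import pandas as pd

# Hypothetical CSV content with no header row; "?" marks a missing value.
raw = io.StringIO(
    "1,Alice,34,NY\n"
    "2,Bob,?,LA\n"
    "3,Cara,29,SF\n"
    "4,Dan,41,NY\n"
)

df = pd.read_csv(
    raw,
    names=["id", "name", "age", "city"],  # supply column names (file has no header)
    usecols=["id", "age", "city"],        # keep only these columns
    dtype={"city": "category"},           # set the data type of a column
    nrows=3,                              # read only the first 3 rows
    na_values=["?"],                      # treat "?" as NaN
)
print(df)
```
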
🐼🤹‍♂️ pandas trick:
⚠️ Got bad data (or empty rows) at the top of your CSV file? Use these read_csv parameters:
➡️ header = row number of header (start counting at 0)
➡️ skiprows = list of row numbers to skip
See example: pic.twitter.com/t1M6XkkPYG
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 3, 2019
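
A minimal sketch of the two parameters working together, assuming a made-up file whose first line is junk and whose third line is a units row that should be ignored:

```python
import io

import pandas as pd

# Hypothetical file: line 0 is junk, line 1 is the header, line 2 is a units row.
text = (
    "Exported by a reporting tool\n"
    "id,score\n"
    "n/a,points\n"
    "1,90\n"
    "2,85\n"
)

# header=1: the column names are on line 1 (counting from 0)
# skiprows=[2]: drop the units row on line 2
df = pd.read_csv(io.StringIO(text), header=1, skiprows=[2])
print(df)
```
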
🐼🤹‍♂️ pandas trick:
Two easy ways to reduce DataFrame memory usage:
1. Only read in columns you need
2. Use 'category' data type with categorical data.
Example:
df = pd.read_csv('file.csv', usecols=['A', 'C', 'D'], dtype={'D':'category'})
#Python #pandastricks
Kevin Markham (@justmarkham) June 21, 2019
🐼🤹‍♂️ pandas trick:
You can read directly from a compressed file:
df = pd.read_csv('https://t.co/3JAwA8h7FJ')
Or write to a compressed file:
df.to_csv('https://t.co/3JAwA8h7FJ')
Also supported: .gz, .bz2, .xz
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) July 4, 2019
🐼🤹‍♂️ pandas trick:
Are your dataset rows spread across multiple files, but you need a single DataFrame?
Solution:
1. Use glob() to list your files
2. Use a generator expression to read files and concat() to combine them
3. 🥳
See example: pic.twitter.com/qtKpzEoSC3
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) June 20, 2019
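
The glob-and-concat steps might look like the sketch below. The file names and contents are hypothetical, and the files are written to a temporary directory only so that the example is self-contained.

```python
import tempfile
from glob import glob
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Write two hypothetical CSV files that share the same columns
    Path(tmp, "sales_1.csv").write_text("day,amount\nMon,10\nTue,20\n")
    Path(tmp, "sales_2.csv").write_text("day,amount\nWed,30\n")

    # 1. List the files (sorted so the row order is predictable)
    files = sorted(glob(str(Path(tmp, "sales_*.csv"))))

    # 2. Read each file lazily and stack the rows into one DataFrame
    df = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

print(df)
```

Passing ignore_index=True gives the combined DataFrame a fresh 0..n-1 index instead of repeating each file's own index.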
🐼🤹‍♂️ pandas trick:
Need to quickly get data from Excel or Google Sheets into pandas?
1. Copy data to clipboard
2. df = pd.read_clipboard()
3. 🥳
See example: pic.twitter.com/M2Yw0NAXRe
Learn 25 more tips & tricks: https://t.co/6akbxXG6SI
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) July 15, 2019
🐼🤹‍♂️ pandas trick:
Want to read a JSON file from the web? Use read_json() to read it directly from a URL into a DataFrame!
See example: pic.twitter.com/gei6eeudiq
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 9, 2019
🐼🤹‍♂️ pandas trick #68:
Want to scrape a web page? Try read_html()!
Definitely worth trying before bringing out a more complex tool (Beautiful Soup, Selenium, etc.)
See example: pic.twitter.com/sPKrea9wk1
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 18, 2019
🐼🤹‍♂️ pandas trick:
Need to create an example DataFrame? Here are 3 easy options:
pd.DataFrame({'col_one':[10, 20], 'col_two':[30, 40]})
pd.DataFrame(np.random.rand(2, 3), columns=list('abc'))
pd.util.testing.makeMixedDataFrame()
See output: pic.twitter.com/SSlZsd6OEj
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) June 28, 2019
🐼🤹‍♂️ pandas trick:
Need to create a DataFrame for testing?
pd.util.testing.makeDataFrame() ➡️ contains random values
.makeMissingDataframe() ➡️ some values missing
.makeTimeDataFrame() ➡️ has DatetimeIndex
.makeMixedDataFrame() ➡️ mixed data types
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) July 10, 2019
🐼🤹‍♂️ pandas trick:
Want to create new columns (or overwrite existing columns) within a method chain? Use "assign"!
See example: pic.twitter.com/y0wEfbz0VA
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 17, 2019
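
A small sketch of assign() inside a method chain, using made-up price and quantity columns. Each assign() returns a new DataFrame, so the original is left untouched.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"price": [10.0, 20.0], "qty": [3, 1]})

result = (
    df
    .assign(total=lambda d: d["price"] * d["qty"])  # create a new column
    .assign(price=lambda d: d["price"] * 2)         # overwrite an existing column
)
print(result)
```

Using a lambda lets each step refer to the DataFrame as it exists at that point in the chain, which is why "total" is computed from the original prices.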
🐼🤹‍♂️ pandas trick:
Need to create a bunch of new columns based on existing columns? Use this pattern:
for col in df.columns:
    df[f'{col}_new'] = df[col].apply(my_function)
See example: pic.twitter.com/7qvKn9UypE
Thanks to @pmbaumgartner for this trick!
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 16, 2019
🐼🤹‍♂️ pandas trick:
3 ways to rename columns:
1. Most flexible option:
df = df.rename({'A':'a', 'B':'b'}, axis='columns')
2. Overwrite all column names:
df.columns = ['a', 'b']
3. Apply string method:
df.columns = df.columns.str.lower()
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) July 16, 2019
🐼🤹‍♂️ pandas trick:
Add a prefix to all of your column names:
df.add_prefix('X_')
Add a suffix to all of your column names:
df.add_suffix('_Y')
#Python #DataScience
Kevin Markham (@justmarkham) June 11, 2019
🐼🤹‍♂️ pandas trick:
Need to rename all of your columns in the same way? Use a string method:
Replace spaces with _:
df.columns = df.columns.str.replace(' ', '_')
Make lowercase & remove trailing whitespace:
df.columns = df.columns.str.lower().str.rstrip()
#Python #pandastricks
Kevin Markham (@justmarkham) June 25, 2019
🐼🤹‍♂️ pandas trick:
You can use f-strings (Python 3.6+) when selecting a Series from a DataFrame!
See example: pic.twitter.com/8qHEXiGBaB
#Python #DataScience #pandas #pandastricks @python_tip
Kevin Markham (@justmarkham) September 13, 2019
🐼🤹‍♂️ pandas trick:
Need to select multiple rows/columns? "loc" is usually the solution:
select a slice (inclusive):
df.loc[0:4, 'col_A':'col_D']
select a list:
df.loc[[0, 3], ['col_A', 'col_C']]
select by condition:
df.loc[df.col_A=='val', 'col_D']
#Python #pandastricks
Kevin Markham (@justmarkham) July 3, 2019
🐼🤹‍♂️ pandas trick:
"loc" selects by label, and "iloc" selects by position.
But what if you need to select by label *and* position? You can still use loc or iloc!
See example: pic.twitter.com/SpFkjWYEE0
P.S. Don't use "ix", it has been deprecated since 2017.
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 1, 2019
🐼🤹‍♂️ pandas trick:
Reverse column order in a DataFrame:
df.loc[:, ::-1]
Reverse row order:
df.loc[::-1]
Reverse row order and reset the index:
df.loc[::-1].reset_index(drop=True)
Want more #pandastricks? Working on a video right now, stay tuned...
#Python #DataScience
Kevin Markham (@justmarkham) June 12, 2019
🐼🤹‍♂️ pandas trick:
Filter DataFrame by multiple OR conditions:
df[(df.color == 'red') | (df.color == 'green') | (df.color == 'blue')]
Shorter way:
df[df.color.isin(['red', 'green', 'blue'])]
Invert the filter:
df[~df.color.isin(['red', 'green', 'blue'])]
#Python #pandastricks
Kevin Markham (@justmarkham) June 13, 2019
🐼🤹‍♂️ pandas trick:
Are you trying to filter a DataFrame using lots of criteria? It can be hard to write and to read!
Instead, save the criteria as objects and use them to filter. Or, use reduce() to combine the criteria!
See example: pic.twitter.com/U9NV27RIjQ
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) August 28, 2019
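
One way to sketch both approaches: save each criterion as a boolean Series, then combine them either with & or with functools.reduce(). The DataFrame and column names here are invented for the example.

```python
import operator
from functools import reduce

import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "color": ["red", "green", "red", "blue"],
    "size": [1, 5, 7, 9],
    "in_stock": [True, True, False, True],
})

# Save each criterion as its own boolean Series
crit1 = df["color"].isin(["red", "green"])
crit2 = df["size"] > 2
crit3 = df["in_stock"]

# Option 1: combine the saved criteria by hand
filtered_manual = df[crit1 & crit2 & crit3]

# Option 2: let reduce() AND together any number of criteria
combined = reduce(operator.and_, [crit1, crit2, crit3])
filtered = df[combined]
print(filtered)
```
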
🐼🤹‍♂️ pandas trick:
Want to filter a DataFrame that doesn't have a name?
Use the query() method to avoid creating an intermediate variable!
See example: pic.twitter.com/NyUOOSr7Sc
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) July 25, 2019
🐼🤹‍♂️ pandas trick:
Need to refer to a local variable within a query() string? Just prefix it with the @ symbol!
See example: pic.twitter.com/PfXcASWDdC
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 13, 2019
🐼🤹‍♂️ pandas trick:
If you want to use query() on a column name containing a space, just surround it with backticks! (New in pandas 0.25)
See example: pic.twitter.com/M5ZSRVr3no
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) July 30, 2019
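
The two query() features above (the @ prefix for local variables and backticks for column names with spaces) can be sketched together. The column names and threshold are hypothetical, and the backtick syntax assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical data; note the space in the first column name
df = pd.DataFrame({"unit price": [5, 12, 20], "qty": [1, 2, 3]})

min_price = 10  # a local Python variable

# Backticks quote the column name; @ refers to the local variable
result = df.query("`unit price` > @min_price and qty < 3")
print(result)
```
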
🐼🤹‍♂️ pandas trick:
Want to concatenate two string columns?
Option 1: Use a string method
Option 2: Use plus signs
See example: pic.twitter.com/SsjBAMqkxB
Which option do you prefer, and why?
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 22, 2019
🐼🤹‍♂️ pandas trick:
Need to split a string into multiple columns? Use str.split() method, expand=True to return a DataFrame, and assign it to the original DataFrame.
See example: pic.twitter.com/wZ4okQZ9Dy
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) July 9, 2019
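
A minimal sketch of the split-and-assign pattern, assuming a made-up "name" column in which each value contains exactly one space:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Turing"]})

# expand=True returns a DataFrame with one column per piece,
# which is then assigned to two new columns of the original DataFrame
df[["first", "last"]] = df["name"].str.split(" ", expand=True)
print(df)
```
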
🐼🤹‍♂️ pandas trick:
Numbers stored as strings? Try astype():
df.astype({'col1':'int', 'col2':'float'})
But it will fail if you have any invalid input. Better way:
df.apply(pd.to_numeric, errors='coerce')
Converts invalid input to NaN
#Python #pandastricks
Kevin Markham (@justmarkham) June 17, 2019
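
The difference between the two approaches shows up as soon as one value cannot be parsed. The sketch below uses invented string columns containing "x" and "-" as invalid entries.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings, with some invalid entries
df = pd.DataFrame({"col1": ["1", "2", "x"], "col2": ["4.5", "-", "6.0"]})

# astype() raises as soon as it meets a value it cannot convert
try:
    df.astype({"col1": "int"})
except ValueError as exc:
    print("astype failed:", exc)

# to_numeric with errors='coerce' turns invalid input into NaN instead
converted = df.apply(pd.to_numeric, errors="coerce")
print(converted)
```
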
🐼🤹‍♂️ pandas trick:
Select columns by data type:
df.select_dtypes(include='number')
df.select_dtypes(include=['number', 'category', 'object'])
df.select_dtypes(exclude=['datetime', 'timedelta'])
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) June 14, 2019
🐼🤹‍♂️ pandas trick:
Two useful properties of ordered categories:
1. You can sort the values in logical (not alphabetical) order
2. Comparison operators also work logically
See example: pic.twitter.com/HeYZ3P3gPP
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 8, 2019
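
Both properties can be seen in a short sketch. The "quality" column and its three levels are hypothetical.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical data for illustration
df = pd.DataFrame({"quality": ["good", "excellent", "poor", "good"]})

# Define the logical order of the categories, lowest first
quality_type = CategoricalDtype(["poor", "good", "excellent"], ordered=True)
df["quality"] = df["quality"].astype(quality_type)

# 1. Sorting follows the logical order rather than the alphabet
sorted_values = df["quality"].sort_values().tolist()

# 2. Comparison operators respect the same order
better_than_poor = df[df["quality"] > "poor"]
print(sorted_values)
print(better_than_poor)
```
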
🐼🤹‍♂️ pandas trick:
Need to convert a column from continuous to categorical? Use cut():
df['age_groups'] = pd.cut(df.age, bins=[0, 18, 65, 99], labels=['child', 'adult', 'elderly'])
0 to 18 ➡️ 'child'
18 to 65 ➡️ 'adult'
65 to 99 ➡️ 'elderly'
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) July 2, 2019
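
A small sketch with made-up ages shows how the boundary values are handled. By default cut() uses right-closed bins, so 18 falls in 'child' and 65 falls in 'adult'.

```python
import pandas as pd

# Hypothetical ages, including the two boundary values 18 and 65
df = pd.DataFrame({"age": [5, 18, 30, 65, 80]})

# Bins are (0, 18], (18, 65], (65, 99] because right=True is the default
df["age_groups"] = pd.cut(
    df["age"],
    bins=[0, 18, 65, 99],
    labels=["child", "adult", "elderly"],
)
print(df)
```
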
🐼🤹‍♂️ pandas trick:
Want to dummy encode (or "one hot encode") your DataFrame? Use pd.get_dummies(df) to encode all object & category columns.
Want to drop the first level since it provides redundant info? Set drop_first=True.
See example & read thread: pic.twitter.com/g0XjJ44eg2
#Python #pandastricks
Kevin Markham (@justmarkham) August 5, 2019
🐼🤹‍♂️ pandas trick:
Need to apply the same mapping to multiple columns at once? Use "applymap" (DataFrame method) with "get" (dictionary method).
See example: pic.twitter.com/WU4AmeHP4O
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 30, 2019
🐼🤹‍♂️ pandas trick:
Has your data ever been TRAPPED in a Series of Python lists?
Expand the Series into a DataFrame by using apply() and passing it the Series constructor
See example: pic.twitter.com/ZvysqaRz6S
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) June 27, 2019
🐼🤹‍♂️ pandas trick:
Do you have a Series containing lists of items? Create one row for each item using the "explode" method
New in pandas 0.25! See example: pic.twitter.com/ix5d8CLg57
🤯
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 12, 2019
🐼🤹‍♂️ pandas trick:
Does your Series contain comma-separated items? Create one row for each item:
- "str.split" creates a list of strings
- "assign" overwrites the existing column
- "explode" creates the rows (new in pandas 0.25)
See example: pic.twitter.com/OqZNWdarP0
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) August 14, 2019
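
The three steps chain together as in the sketch below. The sandwich data is invented, and explode() assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical data: one comma-separated string per row
df = pd.DataFrame({
    "sandwich": ["PB&J", "BLT"],
    "ingredients": ["peanut butter,jelly", "bacon,lettuce,tomato"],
})

long_df = (
    df
    .assign(ingredients=df["ingredients"].str.split(","))  # strings -> lists
    .explode("ingredients")                                # one row per list item
)
print(long_df)
```

The exploded rows keep the index of the row they came from, so the index contains repeats unless it is reset afterwards.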
🐼🤹‍♂️ pandas trick:
"explode" takes a list of items and creates one row for each item (new in pandas 0.25)
You can also do the reverse! See example: pic.twitter.com/4UBxbzHS51
Thanks to @EForEndeavour for this tip
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 16, 2019
🐼🤹‍♂️ pandas trick:
If you need to create a single datetime column from multiple columns, you can use to_datetime()
See example: pic.twitter.com/0bip6SRDdF
You must include: month, day, year
You can also include: hour, minute, second
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) July 8, 2019
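
A minimal sketch, assuming a made-up DataFrame whose columns are named year, month, day, and hour. to_datetime() recognizes these column names when it is passed a DataFrame.

```python
import pandas as pd

# Hypothetical data: date parts stored in separate columns
df = pd.DataFrame({
    "year": [2019, 2020],
    "month": [7, 1],
    "day": [8, 15],
    "hour": [9, 18],
})

# Assemble one datetime column from the component columns
df["when"] = pd.to_datetime(df[["year", "month", "day", "hour"]])
print(df)
```
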
🐼🤹‍♂️ pandas trick:
One reason to use the datetime data type is that you can access many useful attributes via "dt", like:
df.column.dt.hour
Other attributes include: year, month, day, dayofyear, week, weekday, quarter, days_in_month...
See full list: pic.twitter.com/z405STKqKY
#Python #pandastricks
Kevin Markham (@justmarkham) August 2, 2019
🐼🤹‍♂️ pandas trick:
Need to perform an aggregation (sum, mean, etc.) with a given frequency (monthly, yearly, etc.)?
Use resample! It's like a "groupby" for time series data. See example: pic.twitter.com/nweqbHXEtd
"Y" means yearly. See list of frequencies: https://t.co/oPDx85yqFT
#Python #pandastricks
Kevin Markham (@justmarkham) July 18, 2019
🐼🤹‍♂️ pandas trick:
Want to calculate the difference between each row and the previous row? Use df.col_name.diff()
Want to calculate the percentage change instead? Use df.col_name.pct_change()
See example: pic.twitter.com/5EGYqpNPC3
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 27, 2019
🐼🤹‍♂️ pandas trick:
Need to convert a datetime Series from UTC to another time zone?
1. Set current time zone ➡️ tz_localize('UTC')
2. Convert ➡️ tz_convert('America/Chicago')
Automatically handles Daylight Saving Time!
See example: pic.twitter.com/ztzMXcgkFY
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) July 31, 2019
🐼🤹‍♂️ pandas trick:
Calculate % of missing values in each column:
df.isna().mean()
Drop columns with any missing values:
df.dropna(axis='columns')
Drop columns in which more than 10% of values are missing:
df.dropna(thresh=len(df)*0.9, axis='columns')
#Python #pandastricks
Kevin Markham (@justmarkham) June 19, 2019
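
The three calls can be tried on a small invented DataFrame in which column "b" is 5% missing and column "c" is 25% missing. As an assumption of this sketch, the threshold is wrapped in int(), since thresh is documented as an integer count of non-missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 20 rows, three numeric columns
df = pd.DataFrame({
    "a": np.arange(20.0),
    "b": np.arange(20.0),
    "c": np.arange(20.0),
})
df.loc[0, "b"] = np.nan   # 1 of 20 values missing (5%)
df.loc[:4, "c"] = np.nan  # rows 0 through 4 missing (25%); loc slices are inclusive

# Share of missing values per column
missing_share = df.isna().mean()

# Keep only columns with no missing values
no_missing = df.dropna(axis="columns")

# Keep columns that have at least 90% non-missing values
mostly_complete = df.dropna(thresh=int(len(df) * 0.9), axis="columns")

print(missing_share)
print(list(no_missing.columns), list(mostly_complete.columns))
```
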
🐼🤹‍♂️ pandas trick:
Need to fill missing values in your time series data? Use df.interpolate()
Defaults to linear interpolation, but many other methods are supported!
Want more pandas tricks? Watch this:
https://t.co/6akbxXXHKg
#Python #DataScience #pandas #pandastricks pic.twitter.com/JjH08dvjMK
Kevin Markham (@justmarkham) July 12, 2019
🐼🤹‍♂️ pandas trick:
Do you need to store missing values ("NaN") in an integer Series? Use the "Int64" data type!
See example: pic.twitter.com/mN7Ud53Rls
(New in v0.24, API is experimental/subject to change)
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 15, 2019
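
A brief sketch of the difference, using made-up values. Without an explicit dtype, a missing value forces the Series to float; with "Int64" (capital I) the values stay integers.

```python
import pandas as pd

# Default behavior: the missing value turns the Series into floats
as_float = pd.Series([10, None, 30])

# Nullable integer type: the values stay integers, the gap stays missing
as_int = pd.Series([10, None, 30], dtype="Int64")

print(as_float.dtype, as_int.dtype)
print(as_int)
```
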
🐼🤹‍♂️ pandas trick:
Instead of aggregating by a single function (such as 'mean'), you can aggregate by multiple functions by using 'agg' (and passing it a list of functions) or by using 'describe' (for summary statistics)
See example: pic.twitter.com/Emg3zLAocB
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) July 19, 2019
🐼🤹‍♂️ pandas trick:
Did you know that "last" is an aggregation function, just like "sum" and "mean"?
Can be used with a groupby to extract the last value in each group. See example: pic.twitter.com/WKJtNIUxwz
P.S. You can also use "first" and "nth" functions!
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 9, 2019
🐼🤹‍♂️ pandas trick:
Are you applying multiple aggregations after a groupby? Try "named aggregation":
✅ Allows you to name the output columns
✅ Avoids a column MultiIndex
New in pandas 0.25! See example: pic.twitter.com/VXJz6ShZbc
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 21, 2019
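
Named aggregation might be sketched as follows, with each keyword naming an output column and each tuple giving the input column and the function. The team and points data are hypothetical, and the syntax assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"team": ["a", "a", "b"], "points": [1, 3, 10]})

# keyword = (input column, aggregation function)
summary = df.groupby("team").agg(
    total=("points", "sum"),
    best=("points", "max"),
)
print(summary)
```
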
🐼🤹‍♂️ pandas trick:
Want to combine the output of an aggregation with the original DataFrame?
Instead of: df.groupby('col1').col2.func()
Use: df.groupby('col1').col2.transform(func)
"transform" changes the output shape
See example: pic.twitter.com/9dkcAGpTYK
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 4, 2019
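
The change in output shape is easiest to see side by side. In this sketch (with invented order data), sum() returns one value per group, while transform('sum') returns one value per original row, so it can be stored as a new column.

```python
import pandas as pd

# Hypothetical data: two orders, three line items
df = pd.DataFrame({"order": [1, 1, 2], "price": [5, 7, 4]})

# Aggregation: one row per group
per_order = df.groupby("order")["price"].sum()

# Transform: one value per original row, aligned with df
df["order_total"] = df.groupby("order")["price"].transform("sum")

print(per_order)
print(df)
```
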
🐼🤹‍♂️ pandas trick:
Need to calculate a running total (or "cumulative sum")? Use the cumsum() function! Also works with groupby()
See example: pic.twitter.com/H4whqlV2ky
Other cumulative functions: cummax(), cummin(), cumprod()
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 6, 2019
🐼🤹‍♂️ pandas trick:
Need to calculate a running count within groups? Do this:
df.groupby('col').cumcount() + 1
See example: pic.twitter.com/jSz231QmmS
Thanks to @kjbird15 and @EForEndeavour for this trick!
#Python #DataScience #pandas #pandastricks @python_tip
Kevin Markham (@justmarkham) September 11, 2019
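
The cumulative functions from the last two tricks can be sketched on one small, made-up sales table:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "salesperson": ["ann", "bob", "ann", "ann", "bob"],
    "sale": [10, 5, 20, 30, 15],
})

# Running total over the whole column
df["running_total"] = df["sale"].cumsum()

# Running total within each group
df["running_total_by_person"] = df.groupby("salesperson")["sale"].cumsum()

# Running count within each group (cumcount starts at 0, hence the + 1)
df["sale_number"] = df.groupby("salesperson").cumcount() + 1

print(df)
```
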
🐼🤹‍♂️ pandas trick:
Randomly sample rows from a DataFrame:
df.sample(n=10)
df.sample(frac=0.25)
Useful parameters:
➡️ random_state: use any integer for reproducibility
➡️ replace: sample with replacement
➡️ weights: weight based on values in a column
#Python #pandastricks pic.twitter.com/j2AyoTLRKb
Kevin Markham (@justmarkham) August 20, 2019
🐼🤹‍♂️ pandas trick:
Want to shuffle your DataFrame rows?
df.sample(frac=1, random_state=0)
Want to reset the index after shuffling?
df.sample(frac=1, random_state=0).reset_index(drop=True)
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 26, 2019
🐼🤹‍♂️ pandas trick:
Split a DataFrame into two random subsets:
df_1 = df.sample(frac=0.75, random_state=42)
df_2 = df.drop(df_1.index)
(Only works if df's index values are unique)
P.S. Working on a video of my 25 best #pandastricks, stay tuned!
#Python #pandas #DataScience
Kevin Markham (@justmarkham) June 18, 2019
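
A quick sketch on an invented eight-row DataFrame confirms that the two subsets do not overlap and together cover every row. It relies on the default integer index being unique, as the trick requires.

```python
import pandas as pd

# Hypothetical data with a unique default index (0 through 7)
df = pd.DataFrame({"x": range(8)})

df_1 = df.sample(frac=0.75, random_state=42)  # 75% of the rows, chosen at random
df_2 = df.drop(df_1.index)                    # whatever was not sampled

print(len(df_1), len(df_2))
```
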
🐼🤹‍♂️ pandas trick:
When you are merging DataFrames, you can identify the source of each row (left/right/both) by setting indicator=True.
See example: pic.twitter.com/tkb2LiV4eh
P.S. Learn 25 more #pandastricks in 25 minutes: https://t.co/6akbxXG6SI
#Python #DataScience #pandas
Kevin Markham (@justmarkham) July 23, 2019
🐼🤹‍♂️ pandas trick:
Merging datasets? Check that merge keys are unique in BOTH datasets:
pd.merge(left, right, validate='one_to_one')
✅ Use 'one_to_many' to only check uniqueness in LEFT
✅ Use 'many_to_one' to only check uniqueness in RIGHT
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) June 26, 2019
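
The indicator and validate options from the last two tricks can be sketched together. The DataFrames and the "key" column are hypothetical, and the second merge is expected to fail because its right-hand keys are duplicated.

```python
import pandas as pd

# Hypothetical data for illustration
left = pd.DataFrame({"key": [1, 2, 3], "l": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "r": ["x", "y", "z"]})

# indicator=True adds a "_merge" column naming the source of each row
merged = pd.merge(
    left, right, on="key", how="outer",
    indicator=True, validate="one_to_one",
)
print(merged)

# validate raises MergeError when the keys are not unique as promised
dup_right = pd.DataFrame({"key": [2, 2], "r": ["x", "y"]})
raised = False
try:
    pd.merge(left, dup_right, on="key", validate="one_to_one")
except pd.errors.MergeError as exc:
    raised = True
    print("validation failed:", exc)
```
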
🐼🤹‍♂️ pandas trick:
Two simple ways to style a DataFrame:
1. df.style.hide_index()
2. df.style.set_caption('My caption')
See example: pic.twitter.com/8yzyQYz9vr
For more style options, watch trick #25: https://t.co/6akbxXG6SI
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) August 6, 2019
🐼🤹‍♂️ pandas trick:
Want to add formatting to your DataFrame? For example:
- hide the index
- add a caption
- format numbers & dates
- highlight min & max values
Watch to learn how: pic.twitter.com/AKQr7zVR7S
Code: https://t.co/HKroWYVIEs
25 more tricks: https://t.co/6akbxXG6SI
#Python #pandastricks
Kevin Markham (@justmarkham) July 17, 2019
🐼🤹‍♂️ pandas trick:
Want to explore a new dataset without too much work?
1. Pick one:
➡️ pip install pandas-profiling
➡️ conda install -c conda-forge pandas-profiling
2. import pandas_profiling
3. df.profile_report()
4. 🥳
See example: pic.twitter.com/srq5rptEUj
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) July 29, 2019
🐼🤹‍♂️ pandas trick:
Need to check if two Series contain the same elements?
❌ Don't do this:
df.A == df.B
✅ Do this:
df.A.equals(df.B)
✅ Also works for DataFrames:
df.equals(df2)
equals() properly handles NaNs, whereas == does not
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) June 24, 2019
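
The NaN behavior is the key point, and a two-row sketch with invented columns makes it visible. NaN never compares equal to NaN with ==, whereas equals() treats NaNs in the same position as equal.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two identical columns, each containing a NaN
df = pd.DataFrame({"A": [1.0, np.nan], "B": [1.0, np.nan]})

# == compares element by element, and NaN == NaN is False
elementwise = df["A"] == df["B"]

# equals() returns a single boolean and treats matching NaNs as equal
same = df["A"].equals(df["B"])

print(elementwise.tolist(), same)
```
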
🐼🤹‍♂️ pandas trick #69:
Need to check if two Series are "similar"? Use this:
pd.testing.assert_series_equal(df.A, df.B, ...)
Useful arguments include:
➡️ check_names=False
➡️ check_dtype=False
➡️ check_exact=False
See example: pic.twitter.com/bdJBkiFxne
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 19, 2019
🐼🤹‍♂️ pandas trick:
Want to examine the "head" of a wide DataFrame, but can't see all of the columns?
Solution #1: Change display options to show all columns
Solution #2: Transpose the head (swaps rows and columns)
See example: pic.twitter.com/9sw7O7cPeh
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) July 24, 2019
🐼🤹‍♂️ pandas trick:
Want to plot a DataFrame? It's as easy as:
df.plot(kind='...')
You can use:
line
bar
barh
hist
box
kde
area
scatter
hexbin
pie
Other plot types are available via pd.plotting!
Examples: https://t.co/fXYtPeVpZX
#Python #dataviz #pandastricks pic.twitter.com/kp82wA15S4
Kevin Markham (@justmarkham) August 23, 2019
🐼🤹‍♂️ pandas trick:
Did you encounter the dreaded SettingWithCopyWarning?
The usual solution is to rewrite your assignment using "loc":
❌ df[df.col == val1].col = val2
✅ df.loc[df.col == val1, 'col'] = val2
See example: pic.twitter.com/6L6IukTpBO
#Python #DataScience #pandastricks @python_tip
Kevin Markham (@justmarkham) September 10, 2019
🐼🤹‍♂️ pandas trick:
Did you get a "SettingWithCopyWarning" when creating a new column? You are probably assigning to a DataFrame that was created from another DataFrame.
Solution: Use the "copy" method when copying a DataFrame!
See example: pic.twitter.com/LrRNFyN6Qn
#Python #DataScience #pandastricks
Kevin Markham (@justmarkham) September 12, 2019
🐼🤹‍♂️ pandas trick:
If you've created a groupby object, you can access any of the groups (as a DataFrame) using the get_group() method.
See example: pic.twitter.com/6Ya0kxMpgk
#Python #DataScience #pandas #pandastricks
Kevin Markham (@justmarkham) September 2, 2019
🐼🤹‍♂️ pandas trick:
Do you have a Series with a MultiIndex?
Reshape it into a DataFrame using the unstack() method. It's easier to read, plus you can interact with it using DataFrame methods!
See example: pic.twitter.com/DKHwN03A7J
P.S. Want a video with my top 25 #pandastricks?
#Python #pandas
Kevin Markham (@justmarkham) July 1, 2019
🐼🤹 pandas trick:
There are many display options you can change:
max_rows
max_columns
max_colwidth
precision
date_dayfirst
date_yearfirst
How to use:
pd.set_option('display.max_rows', 80)
pd.reset_option('display.max_rows')
See all:
pd.describe_option()
#Python #pandastricks
Kevin Markham (@justmarkham) July 26, 2019
🐼🤹‍♂️ pandas trick:
Show total memory usage of a DataFrame:
df.info(memory_usage='deep')
Show memory used by each column:
df.memory_usage(deep=True)
Need to reduce? Drop unused columns, or convert object columns to 'category' type.
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) July 5, 2019
🐼🤹‍♂️ pandas trick #70:
Need to know which version of pandas you're using?
➡️ pd.__version__
Need to know the versions of its dependencies (numpy, matplotlib, etc.)?
➡️ pd.show_versions()
Helpful when reading the documentation!
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) September 20, 2019
🐼🤹‍♂️ pandas trick:
Want to use NumPy without importing it? You can access ALL of its functionality from within pandas! See example: pic.twitter.com/pZbXwuj6Kz
This is probably *not* a good idea since it breaks with a long-standing convention. But it's a neat trick.
#Python #pandas #pandastricks
Kevin Markham (@justmarkham) July 22, 2019