JonathanLoscalzo · April 24, 2019 18:08
diff --git a/03-DataCamp-Merging DataFrames with pandas.md b/03-DataCamp-Merging DataFrames with pandas.md
diff --git a/concat_horizontally_multiindex.py b/concat_horizontally_multiindex.py
 """
 Concatenating horizontally to get MultiIndexed columns
 It is also possible to construct a DataFrame with hierarchically indexed columns. For this exercise, you'll start with pandas imported and a list of three DataFrames called dataframes. All three DataFrames contain 'Company', 'Product', and 'Units' columns with a 'Date' column as the index pertaining to sales transactions during the month of February, 2015. The first DataFrame describes Hardware transactions, the second describes Software transactions, and the third, Service transactions.

 Your task is to concatenate the DataFrames horizontally and to create a MultiIndex on the columns. From there, you can summarize the resulting DataFrame and slice some information from it.
 """
 february = pd.concat(dataframes, keys=['Hardware', 'Software', 'Service'], axis=1)
 # Print february.info()
 print(february.info())

 # Assign pd.IndexSlice: idx
 idx = pd.IndexSlice

 # Create the slice: slice_2_8
 slice_2_8 = february.loc['Feb. 2, 2015':'Feb. 8, 2015', idx[:, 'Company']]

 # Print slice_2_8
 print(slice_2_8)
diff --git a/Converting_currency_stocks.py b/Converting_currency_stocks.py
 """
 In this exercise, stock prices in US Dollars for the S&P 500 in 2015 have been obtained from Yahoo Finance. The files sp500.csv for sp500 and exchange.csv for the exchange rates are both provided to you.

 Using the daily exchange rate to Pounds Sterling, your task is to convert both the Open and Close column prices.
 """"
 # Import pandas
 import pandas as pd

 # Read 'sp500.csv' into a DataFrame: sp500
 sp500 = pd.read_csv('sp500.csv', parse_dates=True, index_col='Date')

 # Read 'exchange.csv' into a DataFrame: exchange
 exchange = pd.read_csv('exchange.csv', parse_dates=True, index_col='Date')

 # Subset 'Open' & 'Close' columns from sp500: dollars
 dollars = sp500.loc[:, ['Open', 'Close']]

 # Print the head of dollars
 print(dollars.head())

 # Convert dollars to pounds: pounds
 pounds = dollars.multiply(exchange['GBP/USD'], axis='rows')

 # Print the head of pounds
 print(pounds.head())
diff --git a/merge_ordered.py b/merge_ordered.py
 # Perform the first ordered merge: tx_weather
 tx_weather = pd.merge_ordered(austin, houston)

 # Print tx_weather
 print(tx_weather)

 # Perform the second ordered merge: tx_weather_suff
 tx_weather_suff = pd.merge_ordered(austin, houston, on='date', suffixes=['_aus', '_hus'])


 # Print tx_weather_suff
 print(tx_weather_suff)

 # Perform the third ordered merge: tx_weather_ffill
 tx_weather_ffill = pd.merge_ordered(austin, houston, on='date', suffixes=['_aus', '_hus'], fill_method='ffill')

 # Print tx_weather_ffill
 print(tx_weather_ffill)
diff --git a/merge_with_outer.py b/merge_with_outer.py
 # Perform the first merge: merge_default
 merge_default = pd.merge(sales_and_managers, revenue_and_sales)

 # Print merge_default
 print(merge_default)

 # Perform the second merge: merge_outer
 merge_outer =  pd.merge(sales_and_managers, revenue_and_sales, how='outer')

 # Print merge_outer
 print(merge_outer)

 # Perform the third merge: merge_outer_on
 merge_outer_on = pd.merge(sales_and_managers, revenue_and_sales, on=['city','state'], how='outer')

 # Merge revenue and sales: revenue_and_sales
 revenue_and_sales = pd.merge(revenue, sales, how='right', on=['city', 'state'])

 # Print revenue_and_sales
 print(revenue_and_sales)

 # Merge sales and managers: sales_and_managers
 sales_and_managers = pd.merge(sales, managers, how='left', right_on=['branch', 'state'], left_on=['city', 'state'])

 # Print sales_and_managers
 print(sales_and_managers)
diff --git a/resample_and_concat.py b/resample_and_concat.py
 # Resample and tidy china: china_annual
 china_annual = china.resample('A').last().pct_change(10).dropna()

 # Resample and tidy us: us_annual
 us_annual = us.resample('A').last().pct_change(10).dropna()

 # Concatenate china_annual and us_annual: gdp
 gdp = pd.concat([china_annual, us_annual], axis=1, join='inner')

 # Resample gdp and print
 print(gdp.resample('10A').last())
diff --git a/slicing_multiindex.py b/slicing_multiindex.py
 """
 This exercise picks up where the last ended (again using The Guardian's Olympic medal dataset).

 You are provided with the MultiIndexed DataFrame as produced at the end of the preceding exercise. Your task is to sort the DataFrame and to use the pd.IndexSlice to extract specific slices. Check out this exercise from Manipulating DataFrames with pandas to refresh your memory on how to deal with MultiIndexed DataFrames.

 pandas has been imported for you as pd and the DataFrame medals is already in your namespace.
 """

 # Sort the entries of medals: medals_sorted
 medals_sorted = medals.sort_index(level=0)

 # Print the number of Bronze medals won by Germany
 print(medals_sorted.loc[('bronze','Germany')])

 # Print data about silver medals
 print(medals_sorted.loc['silver'])

 # Create alias for pd.IndexSlice: idx
 idx = pd.IndexSlice

 # Print all the data on medals won by the United Kingdom
 print(medals_sorted.loc[idx[:,'United Kingdom'], :])
	"""
	Concatenating horizontally to get MultiIndexed columns
	It is also possible to construct a DataFrame with hierarchically indexed columns. For this exercise, you'll start with pandas imported and a list of three DataFrames called dataframes. All three DataFrames contain 'Company', 'Product', and 'Units' columns with a 'Date' column as the index pertaining to sales transactions during the month of February, 2015. The first DataFrame describes Hardware transactions, the second describes Software transactions, and the third, Service transactions.

	Your task is to concatenate the DataFrames horizontally and to create a MultiIndex on the columns. From there, you can summarize the resulting DataFrame and slice some information from it.
	"""
	february = pd.concat(dataframes, keys=['Hardware', 'Software', 'Service'], axis=1)
	# Print february.info()
	print(february.info())

	# Assign pd.IndexSlice: idx
	idx = pd.IndexSlice

	# Create the slice: slice_2_8
	slice_2_8 = february.loc['Feb. 2, 2015':'Feb. 8, 2015', idx[:, 'Company']]

	# Print slice_2_8
	print(slice_2_8)
	"""
	In this exercise, stock prices in US Dollars for the S&P 500 in 2015 have been obtained from Yahoo Finance. The files sp500.csv for sp500 and exchange.csv for the exchange rates are both provided to you.

	Using the daily exchange rate to Pounds Sterling, your task is to convert both the Open and Close column prices.
	""""
	# Import pandas
	import pandas as pd

	# Read 'sp500.csv' into a DataFrame: sp500
	sp500 = pd.read_csv('sp500.csv', parse_dates=True, index_col='Date')

	# Read 'exchange.csv' into a DataFrame: exchange
	exchange = pd.read_csv('exchange.csv', parse_dates=True, index_col='Date')

	# Subset 'Open' & 'Close' columns from sp500: dollars
	dollars = sp500.loc[:, ['Open', 'Close']]

	# Print the head of dollars
	print(dollars.head())

	# Convert dollars to pounds: pounds
	pounds = dollars.multiply(exchange['GBP/USD'], axis='rows')

	# Print the head of pounds
	print(pounds.head())
	# Perform the first ordered merge: tx_weather
	tx_weather = pd.merge_ordered(austin, houston)

	# Print tx_weather
	print(tx_weather)

	# Perform the second ordered merge: tx_weather_suff
	tx_weather_suff = pd.merge_ordered(austin, houston, on='date', suffixes=['_aus', '_hus'])


	# Print tx_weather_suff
	print(tx_weather_suff)

	# Perform the third ordered merge: tx_weather_ffill
	tx_weather_ffill = pd.merge_ordered(austin, houston, on='date', suffixes=['_aus', '_hus'], fill_method='ffill')

	# Print tx_weather_ffill
	print(tx_weather_ffill)
	# Perform the first merge: merge_default
	merge_default = pd.merge(sales_and_managers, revenue_and_sales)

	# Print merge_default
	print(merge_default)

	# Perform the second merge: merge_outer
	merge_outer = pd.merge(sales_and_managers, revenue_and_sales, how='outer')

	# Print merge_outer
	print(merge_outer)

	# Perform the third merge: merge_outer_on
	merge_outer_on = pd.merge(sales_and_managers, revenue_and_sales, on=['city','state'], how='outer')

	# Merge revenue and sales: revenue_and_sales
	revenue_and_sales = pd.merge(revenue, sales, how='right', on=['city', 'state'])

	# Print revenue_and_sales
	print(revenue_and_sales)

	# Merge sales and managers: sales_and_managers
	sales_and_managers = pd.merge(sales, managers, how='left', right_on=['branch', 'state'], left_on=['city', 'state'])

	# Print sales_and_managers
	print(sales_and_managers)
	# Resample and tidy china: china_annual
	china_annual = china.resample('A').last().pct_change(10).dropna()

	# Resample and tidy us: us_annual
	us_annual = us.resample('A').last().pct_change(10).dropna()

	# Concatenate china_annual and us_annual: gdp
	gdp = pd.concat([china_annual, us_annual], axis=1, join='inner')

	# Resample gdp and print
	print(gdp.resample('10A').last())
	"""
	This exercise picks up where the last ended (again using The Guardian's Olympic medal dataset).

	You are provided with the MultiIndexed DataFrame as produced at the end of the preceding exercise. Your task is to sort the DataFrame and to use the pd.IndexSlice to extract specific slices. Check out this exercise from Manipulating DataFrames with pandas to refresh your memory on how to deal with MultiIndexed DataFrames.

	pandas has been imported for you as pd and the DataFrame medals is already in your namespace.
	"""

	# Sort the entries of medals: medals_sorted
	medals_sorted = medals.sort_index(level=0)

	# Print the number of Bronze medals won by Germany
	print(medals_sorted.loc[('bronze','Germany')])

	# Print data about silver medals
	print(medals_sorted.loc['silver'])

	# Create alias for pd.IndexSlice: idx
	idx = pd.IndexSlice

	# Print all the data on medals won by the United Kingdom
	print(medals_sorted.loc[idx[:,'United Kingdom'], :])