David Cochran data-enhanced

Change the Default Browser for Jupyter Notebooks in OS X

Step 1. Create an editable config file for Jupyter notebooks.

To do this, open Terminal and type:

jupyter notebook --generate-config

This generates the file:

~/.jupyter/jupyter_notebook_config.py
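
The note stops at Step 1. A possible next step, sketched under assumptions: the generated file is plain Python, and the classic Notebook server has a `c.NotebookApp.browser` option that takes a command string in which `%s` stands for the notebook URL (Jupyter Server based releases call it `c.ServerApp.browser`). The Chrome path below is an assumption about the local machine. To my understanding the string is handed to Python's standard `webbrowser` module, so that module can be used to check how the command will be split:

```python
import webbrowser

# Assumption: Chrome is installed at the standard macOS location.
# "%s" is the placeholder that gets replaced with the notebook URL.
browser_command = 'open -a "/Applications/Google Chrome.app" %s'

# The standard-library webbrowser module splits a command containing "%s"
# into a program name and arguments; nothing is launched at this point.
controller = webbrowser.get(browser_command)
print(controller.name, controller.args)

# In ~/.jupyter/jupyter_notebook_config.py the same string would be assigned as:
#   c.NotebookApp.browser = 'open -a "/Applications/Google Chrome.app" %s'
```

Quoting the application path keeps the space in "Google Chrome.app" inside a single argument.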

Pandas .describe() formatted

Format the numbers returned by the pandas df.describe() method. For instance, instead of scientific notation, show numbers with thousands separators and a chosen number of decimals.

When using .describe with an entire dataframe, use .apply and a lambda function to apply the formatting to every number.

To change the number of decimals, change the number before the f.
To remove the thousands separator, remove the comma.

df.describe().apply(lambda s: s.apply('{:,.0f}'.format))
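
The format string can be tried on a plain Python float before applying it to a dataframe. A small sketch (the sample value is made up) showing what changing the number before the f and removing the comma do:

```python
value = 1234567.891  # made-up sample number

no_decimals = '{:,.0f}'.format(value)     # thousands separator, 0 decimals
two_decimals = '{:,.2f}'.format(value)    # thousands separator, 2 decimals
no_separator = '{:.2f}'.format(value)     # no comma, so no thousands separator

print(no_decimals)    # 1,234,568
print(two_decimals)   # 1,234,567.89
print(no_separator)   # 1234567.89
```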

When using .describe with a single column or a series, use the .map method instead:
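
The snippet for this case is missing from the note, so the following is only a sketch of what the sentence describes, assuming a dataframe with a hypothetical numeric column named `price` filled with made-up values:

```python
import pandas as pd

# Made-up sample values, chosen only to show the formatting.
df = pd.DataFrame({"price": [1500.0, 25000.0, 1250000.0]})

# describe() on one column returns a Series, so map the formatter over its values.
summary = df["price"].describe().map('{:,.0f}'.format)
print(summary)
```

Every value in the result is a string, so this is best kept as a last step for display rather than for further calculation.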

	AussieAnimals = ['kangaroo', 'cassowary', 'wombat', 'possum', 'echidna', 'ibis', 'wallaby', 'koala',
	    'tasmanian devil', 'kookaburra', 'numbat', 'platypus', 'lyre bird', 'quokka', 'quoll',
	    'sugar glider', 'bandicoot', 'thorny devil', 'dingo', 'wallaroo', 'yabby', 'bilby']

	GalapagosAnimals = ['rice rat', 'hoary bat', 'bottlenose dolphin', 'beaked whale', 'lava lizard',
	    'tortoise', 'flightless cormorant', 'green sea turtle', 'blue-footed booby',
	    'marine iguana', 'pink land iguana', 'darwins finches', 'brown noddy']

	# Define a function to get the name field from the first item in a dictionary list
	import json

	def get_genre1(x):
	    # Parse the JSON string into a list of dictionaries
	    x = json.loads(x)
	    # Return the first name; an empty list falls through and returns None
	    if len(x) > 0:
	        return x[0]['name']

	# Now use pandas.apply to run the function on one column
	# In this case, create a new column called genre1 to hold the new data
	movies['genre1'] = movies['genres'].apply(get_genre1)
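
To see what the function returns without loading a dataset, it can be run on a few strings directly. This is a sketch with made-up values shaped like a JSON genres field; the explicit `return None` only spells out what the original does implicitly for an empty list:

```python
import json

def get_genre1(x):
    """Return the 'name' of the first dictionary in a JSON list string, or None."""
    items = json.loads(x)
    if len(items) > 0:
        return items[0]['name']
    return None

# Made-up sample values shaped like a JSON "genres" field.
sample_genres = [
    '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]',
    '[{"id": 18, "name": "Drama"}]',
    '[]',
]

first_genres = [get_genre1(g) for g in sample_genres]
print(first_genres)  # ['Action', 'Drama', None]
```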

	# Import ast for ast.literal_eval
	import ast

	# Remove JSON from TMDB fields
	# for genres, spoken_languages, production_companies, production_countries
	# Works only with non-null values, so filter out null values before applying
	# Requires import ast; plain eval would also work, but ast.literal_eval only evaluates literals and is safer
	def remove_json(content):
	    # Interpret the content as a Python list of dictionaries
	    content = ast.literal_eval(content)
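
The rest of `remove_json` is not shown, so the following does not reconstruct it. It is a separate sketch, using a hypothetical helper `names_from_literal` and made-up values, of the two points in the comments: `ast.literal_eval` parses a string written as a Python literal (including single-quoted strings, which `json.loads` rejects), and null values have to be filtered out first:

```python
import ast

def names_from_literal(content):
    """Hypothetical helper: parse a list-of-dicts string and collect the 'name' values."""
    items = ast.literal_eval(content)
    return [item["name"] for item in items]

# Made-up values; the first uses single quotes, which json.loads would reject.
raw_values = [
    "[{'id': 18, 'name': 'Drama'}, {'id': 35, 'name': 'Comedy'}]",
    None,
    "[]",
]

# Filter out nulls first, since ast.literal_eval expects a string.
cleaned = [names_from_literal(v) for v in raw_values if v is not None]
print(cleaned)  # [['Drama', 'Comedy'], []]
```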

	# Function for generating model scores and confusion matrices with custom colors and descriptive labels
	# https://stackoverflow.com/questions/70097754/confusion-matrix-with-different-colors
	# https://medium.com/@dtuk81/confusion-matrix-visualization-fc31e3f30fea

	def report_scores(model, features, labels):
	    '''
	    Generate model scores and confusion matrices with custom colors and descriptive labels
	    model = model variable
	    features = features of desired split
	    labels = labels of desired split
	    '''
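
The body of `report_scores` is cut off here. As an illustration of the descriptive-labels idea only, and not the author's implementation, here is a dependency-free sketch that counts the four cells of a binary confusion matrix and builds a text label for each. The function name `confusion_labels` and the sample predictions are made up:

```python
from collections import Counter

def confusion_labels(y_true, y_pred):
    """Return a descriptive label for each cell of a binary (0/1) confusion matrix."""
    # Count how often each (actual, predicted) pair occurs.
    counts = Counter(zip(y_true, y_pred))
    total = len(y_true)
    names = {(0, 0): "True Neg", (0, 1): "False Pos",
             (1, 0): "False Neg", (1, 1): "True Pos"}
    # Each label combines the cell name, its count, and its share of all samples.
    return {name: f"{name}: {counts[cell]} ({counts[cell] / total:.0%})"
            for cell, name in names.items()}

# Made-up labels and predictions.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

cell_labels = confusion_labels(y_true, y_pred)
for text in cell_labels.values():
    print(text)
```

Labels like these could then be passed as annotations to whatever plotting function draws the matrix.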

	# Format using Markdown ================================================

	# Format Jupyter Code Output using Markdown
	# https://ipython.readthedocs.io/en/stable/api/generated/IPython.display.html?highlight=display_markdown#IPython.display.display_markdown
	from IPython.display import display_markdown

	markdowntext = 'Markdown Heading Level 3'
	display_markdown(f'### Code Output Formatted as {markdowntext}', raw=True)
	display_markdown('_Code output italicized using_ `display_markdown`', raw=True)
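
Since `display_markdown` with `raw=True` takes a plain string, any Markdown can be assembled in code first. A sketch that builds a small Markdown table; the helper name `markdown_table` and the sample rows are made up, and the display call is left as a comment so the block also runs outside a notebook:

```python
def markdown_table(rows, headers):
    """Build a Markdown table string from a list of row tuples."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

# Made-up sample rows.
table = markdown_table([("kangaroo", 2), ("wombat", 5)], ["animal", "count"])
print(table)

# In a notebook cell, render it with:
#   from IPython.display import display_markdown
#   display_markdown(table, raw=True)
```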