Gary Biggs GDBSD

Data Science Consultant. Applying my many years of management, marketing, engineering, and AI experience to help companies implement successful AI projects.

2 followers · 1 following

Indieus AI
Mountain View CA
https://www.linkedin.com/in/gbiggs/


GDBSD / compare_bq_table_schemas.py

Created December 12, 2022 18:54

Compare BigQuery table schemas

	def compare_table_schemas(client, project: str, dataset: str,
	                          table_a: str, table_b: str) -> bool:
	    """Compare the schemas of two BigQuery tables. Useful for instance
	    to confirm that there hasn't been a drift in the schemas for the
	    production and development tables.

	    :param client: BigQuery client object
	    :param project: string - GCP project ID
	    :param dataset: string - dataset name
	    :param table_a: string - table name
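
The preview above ends before the function body. As a minimal sketch of how such a comparison might be done, assuming `client` is a `google.cloud.bigquery.Client` whose `get_table()` returns a table with a `schema` list of fields exposing `name`, `field_type`, and `mode`. The helper names `schema_signature` and `schemas_match` are hypothetical and do not come from the gist.

```python
def schema_signature(table):
    """Reduce a table's schema to a set of comparable (name, type, mode) tuples."""
    return {(field.name, field.field_type, field.mode) for field in table.schema}


def schemas_match(client, project: str, dataset: str,
                  table_a: str, table_b: str) -> bool:
    """Return True when both tables expose the same top-level fields."""
    # Assumption: get_table accepts a "project.dataset.table" string.
    sig_a = schema_signature(client.get_table(f"{project}.{dataset}.{table_a}"))
    sig_b = schema_signature(client.get_table(f"{project}.{dataset}.{table_b}"))
    return sig_a == sig_b
```

This sketch ignores column order and does not descend into nested RECORD fields, so it may need extending for tables that rely on either.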

GDBSD / calc_fbeta.py

Created June 11, 2022 18:21

Calculate F-Beta

	def calc_beta_f1(beta, precision, recall):
	    """Calculate F-beta given the values for precision and recall and the beta value"""
	    beta_sq = pow(beta, 2)
	    num = (1 + beta_sq) * precision * recall
	    denom = beta_sq * precision + recall
	    return num / denom
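
A short usage sketch of the same formula, F-beta = (1 + beta^2) * P * R / (beta^2 * P + R). The name `f_beta` is hypothetical, and returning 0.0 when precision and recall are both zero is an assumed convention, not something the gist specifies.

```python
def f_beta(beta: float, precision: float, recall: float) -> float:
    """Compute F-beta from precision and recall."""
    beta_sq = beta ** 2
    denom = beta_sq * precision + recall
    if denom == 0:
        # Assumption: report 0.0 rather than raise ZeroDivisionError.
        return 0.0
    return (1 + beta_sq) * precision * recall / denom


# beta = 1 gives the harmonic mean of precision and recall (F1).
print(f_beta(1, 0.5, 0.5))  # 0.5

# beta = 2 weights recall more heavily than precision.
print(round(f_beta(2, 0.5, 1.0), 4))
```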

GDBSD / nonprint-char_remover.py

Created October 25, 2021 14:24

Remove non-printing characters from a Pandas dataframe

	def remove_non_printing_chars(df):
	    """Clean a dataframe column to remove any non-printing characters.
	    We've encountered values like tabs in some of the data.

	    :param df: Pandas dataframe
	    :return: Pandas dataframe
	    """
	    clean_df = df.copy(deep=True)
	    clean_df = clean_df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
	    for col in list(clean_df.columns):
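
The preview above ends at the start of the per-column loop, so the gist's actual cleaning step is not visible. As a minimal sketch of one possible per-value cleaner, using only the standard library: `strip_non_printing` is a hypothetical helper, and treating "non-printing" as "characters for which `str.isprintable()` is false" is an assumption.

```python
def strip_non_printing(value):
    """Drop non-printable characters from strings; pass other values through."""
    if not isinstance(value, str):
        return value
    return "".join(ch for ch in value if ch.isprintable())


print(repr(strip_non_printing("a\tb\nc")))  # 'abc'
```

In a dataframe, a helper like this could plausibly be applied to each object column with `Series.map`, which leaves numeric columns untouched.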

GDBSD / path_setter.py

Last active July 9, 2021 17:25

jupyter-easy-import-modules-from-higher-directories

	import os
	import sys

	"""Utility function to update sys.path so in our notebooks you can import modules from
	any folder in the application. It will also allow you to import any module in your virtual
	environment. Note that in my project the virtual environment is named "venv".

	In the notebook, in the first cell, import this script. It will run
	automatically
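
The preview above ends inside the docstring, so the gist's own implementation is not shown. As a minimal sketch of the general idea, assuming the goal is to put a project root and its subfolders on `sys.path`: the function name `add_project_dirs_to_path` and the `skip` parameter are hypothetical, and the virtual-environment handling the docstring mentions is not covered here.

```python
import os
import sys


def add_project_dirs_to_path(root, skip=(".git", "__pycache__")):
    """Append root and its subdirectories to sys.path, skipping the named folders."""
    added = []
    for dirpath, dirnames, _ in os.walk(root):
        # Prune skipped folders in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        if dirpath not in sys.path:
            sys.path.append(dirpath)
            added.append(dirpath)
    return added
```

Calling `add_project_dirs_to_path(os.path.abspath(".."))` in a notebook's first cell would then make modules in sibling folders importable, at the cost of a longer `sys.path`.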

GDBSD / pytest_compare_arrays_floats.py

Created January 12, 2021 23:35

PyTest - comparing arrays and floats

	# Use Case: Here we have a dict "stats" with four keys with arrays and floats as values, both of which can trip you up.
	# We solve it by using NumPy .all() and pytest.approx()

	import numpy as np
	import pytest

	assert type(stats['observed']) is np.ndarray
	assert type(stats['expected']) is np.ndarray
	assert (stats['observed'] == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]).all()
	assert (stats['expected'] == [[1.6, 2.0, 2.4], [4.0, 5.0, 6.0], [6.4, 8.0, 9.6]]).all()
	assert stats['G'] == pytest.approx(0.49173089057312613)
	assert stats['p'] == pytest.approx(0.9743009689044624)

GDBSD / get_global.py

Last active November 16, 2020 17:18

"global" makes a previously declared variable global

	# Consider this code
	x = 5
	def func1():
	    print(x)
	func1()
	# Output
	# 5

	# Since x is declared before the function call, func1 can access it.
	# However, if you try to change it:
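
The preview above ends at this point. As a minimal sketch of what typically happens when a function assigns to a module-level name, and how the `global` statement changes that. The function names are illustrative and not taken from the gist.

```python
x = 5


def read_only():
    # Reading a module-level name needs no declaration.
    return x


def broken_update():
    # Assigning to x makes it local to this function, so reading it
    # on the right-hand side raises UnboundLocalError.
    x = x + 1
    return x


def working_update():
    global x  # rebind the module-level x instead of creating a local
    x = x + 1
    return x


try:
    broken_update()
except UnboundLocalError as err:
    print("broken_update failed:", type(err).__name__)

print(working_update())  # 6
```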

GDBSD / compare_dicts.py

Created October 17, 2020 23:35

Python - Compare Dictionaries

	import numpy as np

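	def test_dict_equality(dict_1, dict_2):
	    """Return True if every key shared by both dicts maps to equal values.
	    Keys present in only one dict are ignored.
	    """
	    false_matches = 0
	    for key in dict_1:
	        if key in dict_2:
	            # np.array_equal handles array values as well as scalars
	            if not np.array_equal(dict_1[key], dict_2[key]):
	                false_matches += 1
	    return false_matches == 0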

GDBSD / gcp_jupyter_setup.txt

Last active October 15, 2020 00:56

GCP VM - Working With Jupyter Notebook On Your Local Device

	# I've seen a lot of posts with instructions for opening a Jupyter Notebook on your
	# local device with the Jupyter server running on a GCP VM. They make it seem really
	# complicated. It ain't that hard folks!

	1. On the GCP Compute Engine UI click on the drop-down menu on the upper left side
	under Remote access. Select "view gcloud command" and copy the command.

	2. To that command append -- -L localhost:8887:127.0.0.1:8889
	Example:
	gcloud beta compute ssh --zone "<zone-name>" "<vm-instance-name>" --project "<project-name>" -- -L localhost:8887:127.0.0.1:8889

GDBSD / compress_dict.py

Last active September 19, 2020 15:20

Compress and decompress a Python dictionary

	import gzip
	import json


	source_dict = {
	    "New Year's Day": "Fri, Jan 1, 2021",
	    "Martin Luther King Jr. Day": "Mon, Jan 18, 2021",
	    "Washington's Birthday": "Mon, Feb 15, 2021",
	    "Arbor Day": "Fri, Apr 30, 2021",
	    "Memorial Day": "Mon, May 31, 2021",
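
The preview above ends partway through the dictionary, before any compression code. As a minimal sketch of the round trip the title describes, assuming the dictionary is serialized to JSON and then gzipped (the gist imports both modules, but its actual steps are not visible). The names `compress_dict` and `decompress_dict` are hypothetical.

```python
import gzip
import json


def compress_dict(data: dict) -> bytes:
    """Serialize to JSON, encode as UTF-8, then gzip the bytes."""
    return gzip.compress(json.dumps(data).encode("utf-8"))


def decompress_dict(blob: bytes) -> dict:
    """Reverse compress_dict: gunzip, decode, and parse the JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


holidays = {
    "New Year's Day": "Fri, Jan 1, 2021",
    "Arbor Day": "Fri, Apr 30, 2021",
}
blob = compress_dict(holidays)
assert decompress_dict(blob) == holidays
```

Because the sketch goes through JSON, it only round-trips dictionaries with string keys and JSON-serializable values.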