These functions are exactly equivalent:

| Function | Alias | Reference |
|---|---|---|
| filter | where | pyspark.sql.DataFrame.filter |
| drop_duplicates | dropDuplicates | pyspark.sql.DataFrame.drop_duplicates |
| avg | mean | pyspark.sql.GroupedData.avg |
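A quick illustration of the aliases above; the `spark` session, the sample rows, and the column names `city` and `price` are invented here purely for demonstration:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("Dublin", 120.0), ("Dublin", 120.0), ("Cork", 95.0)],
    ["city", "price"],
)

# filter() and where() are the same method under two names
expensive_1 = df.filter(F.col("price") > 100)
expensive_2 = df.where(F.col("price") > 100)

# drop_duplicates() is an alias of dropDuplicates()
dedup_1 = df.dropDuplicates(["city", "price"])
dedup_2 = df.drop_duplicates(["city", "price"])

# avg() and mean() on GroupedData produce the same aggregation
means_1 = df.groupBy("city").avg("price")
means_2 = df.groupBy("city").mean("price")
```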
| """ | |
| Python script for batch geocoding of addresses using the Google Geocoding API. | |
| This script allows for massive lists of addresses to be geocoded for free by pausing when the | |
| geocoder hits the free rate limit set by Google (2500 per day). If you have an API key for paid | |
| geocoding from Google, set it in the API key section. | |
| Addresses for geocoding can be specified in a list of strings "addresses". In this script, addresses | |
| come from a csv file with a column "Address". Adjust the code to your own requirements as needed. | |
| After every 500 successul geocode operations, a temporary file with results is recorded in case of | |
| script failure / loss of connection later. | |
| Addresses and data are held in memory, so this script may need to be adjusted to process files line |
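As a rough sketch of the pause-and-resume pattern the docstring describes (the helper names, the `OVER_QUERY_LIMIT` handling via a 30-minute sleep, and the `temp_results.csv` path are illustrative assumptions, not taken from the original script):

```python
import csv
import time

import requests

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def geocode_address(address, api_key=None):
    """Query the Google Geocoding API for a single address and return the JSON body."""
    params = {"address": address}
    if api_key is not None:
        params["key"] = api_key
    return requests.get(GEOCODE_URL, params=params).json()


def run_batch(addresses, api_key=None, temp_path="temp_results.csv"):
    """Geocode a list of addresses, pausing when the free quota is exhausted."""
    results = []
    for address in addresses:
        while True:
            result = geocode_address(address, api_key)
            if result.get("status") == "OVER_QUERY_LIMIT":
                # Daily quota hit: sleep for 30 minutes, then retry the same address.
                time.sleep(30 * 60)
                continue
            break
        results.append({"address": address, "status": result.get("status")})
        # After every 500 geocodes, write a temporary results file so a crash
        # or dropped connection does not lose all progress.
        if len(results) % 500 == 0:
            with open(temp_path, "w", newline="") as f:
                writer = csv.DictWriter(f, fieldnames=["address", "status"])
                writer.writeheader()
                writer.writerows(results)
    return results
```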
```python
import numpy as np

EPSILON = 1e-10


def _error(actual: np.ndarray, predicted: np.ndarray):
    """Simple error."""
    return actual - predicted
```
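`EPSILON` is typically there to keep ratio-based metrics finite when `actual` contains zeros; a sketch of how it might be used, building on the definitions above (the function names `_percentage_error` and `mape` are illustrative, not from the original module):

```python
def _percentage_error(actual: np.ndarray, predicted: np.ndarray):
    """Percentage error relative to the actual values.

    EPSILON keeps the division finite when `actual` contains zeros.
    Note: the result is not multiplied by 100.
    """
    return _error(actual, predicted) / (actual + EPSILON)


def mape(actual: np.ndarray, predicted: np.ndarray):
    """Mean Absolute Percentage Error."""
    return np.mean(np.abs(_percentage_error(actual, predicted)))
```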
```python
'''
spark/bin/spark-submit \
    --master local --driver-memory 4g \
    --num-executors 2 --executor-memory 4g \
    --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0 \
    sstreaming-spark-final.py
'''
from pyspark.sql import SparkSession
from pyspark.sql.types import *
from pyspark.sql.functions import expr
```
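The `--packages` flag pulls in the Kafka source for Structured Streaming. A minimal sketch of what the read side of such a job could look like; the broker address and the topic name `transactions` are placeholders, not values from `sstreaming-spark-final.py`:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import expr

spark = SparkSession.builder.appName("sstreaming-example").getOrCreate()

# Subscribe to a Kafka topic; bootstrap servers and topic are placeholders.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "transactions")
    .load()
)

# Kafka delivers key/value as binary; cast them to strings before any parsing.
events = raw.select(
    expr("CAST(key AS STRING) AS key"),
    expr("CAST(value AS STRING) AS value"),
)

# Write the stream to the console for inspection.
query = (
    events.writeStream
    .format("console")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```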
```python
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score, explained_variance_score
import mlflow
import mlflow.sklearn
import numpy as np

# Launch the experiment on mlflow
experiment_name = "electricityconsumption-forecast"
```
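A hedged sketch of how the experiment could be initialized and a single run logged with these imports; the toy data, parameter values, and metric names are illustrative, not taken from the original script:

```python
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

experiment_name = "electricityconsumption-forecast"
mlflow.set_experiment(experiment_name)  # creates the experiment if it does not exist

# Toy data, purely illustrative.
X = np.arange(100).reshape(-1, 1).astype(float)
y = np.sin(X).ravel()

with mlflow.start_run(run_name="knn-baseline"):
    model = KNeighborsRegressor(n_neighbors=5)
    model.fit(X, y)
    preds = model.predict(X)

    # Log the hyperparameter, an evaluation metric, and the fitted model.
    mlflow.log_param("n_neighbors", 5)
    mlflow.log_metric("mse", mean_squared_error(y, preds))
    mlflow.sklearn.log_model(model, "model")
```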
```python
from sklearn.model_selection import GridSearchCV


def log_run(gridsearch: GridSearchCV, experiment_name: str, model_name: str,
            run_index: int, conda_env, tags={}):
    """Log cross-validation results to the MLflow tracking server.

    Args:
        gridsearch (GridSearchCV): fitted grid search whose cross-validation results are logged
        experiment_name (str): experiment name
        model_name (str): name of the model
        run_index (int): index of the run (in the grid search)
        conda_env (dict): a dictionary that describes the conda environment (MLflow format)
        tags (dict): dictionary of extra data and tags (usually features)
    """
```