Allie .S Ubisse AllieUbisse

AllieUbisse / Spark Dataframe Cheat Sheet.py
Created August 21, 2020 23:19 — forked from crawles/Spark Dataframe Cheat Sheet.py
Cheat sheet for Spark Dataframes (using Python)
# A simple cheat sheet of Spark Dataframe syntax
# Current for Spark 1.6.1
# import statements
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql.functions import *
# creating dataframes
# sqlContext is assumed to exist already (the pyspark shell creates one on startup)
df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"])  # from manual data
AllieUbisse / pyspark_help.md
Created August 23, 2020 12:13 — forked from hammadzz/pyspark_help.md
PySpark HelpSheet
#Import All Functions
from pyspark.sql import SQLContext
from pyspark.sql import functions as F
from pyspark.sql import SparkSession
from pyspark.sql.functions import unix_timestamp, to_date, date_format, month, year, dayofyear, dayofweek, col
from pyspark.sql.types import TimestampType
AllieUbisse / forecasting_metrics.py
Created January 25, 2021 13:15 — forked from Kalki5/forecasting_metrics.py
Python Numpy functions for most common forecasting metrics
import numpy as np
EPSILON = 1e-10
def _error(actual: np.ndarray, predicted: np.ndarray):
    """ Simple error """
    return actual - predicted
AllieUbisse / add_policy.py
Created January 25, 2021 13:32 — forked from Kalki5/add_policy.py
Add own policy to a lambda function
import boto3
lambda_client = boto3.client('lambda', region_name='REGION_NAME')
# Allow EventBridge (CloudWatch Events) rules to invoke the function
lambda_client.add_permission(
    FunctionName='create_lab',
    StatementId='AWSEventsRule',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn='arn:aws:events:REGION_NAME:ACCOUNT_NUMBER:rule/*',
)
AllieUbisse / forecasting_metrics.py
Created January 26, 2021 13:44 — forked from bshishov/forecasting_metrics.py
Python Numpy functions for most common forecasting metrics
import numpy as np
EPSILON = 1e-10
def _error(actual: np.ndarray, predicted: np.ndarray):
    """ Simple error """
    return actual - predicted
AllieUbisse / python batch geocoding.py
Created May 14, 2021 07:22 — forked from shanealynn/python batch geocoding.py
Geocode as many addresses as you'd like with a powerful Python and Google Geocoding API combination
"""
Python script for batch geocoding of addresses using the Google Geocoding API.
This script allows for massive lists of addresses to be geocoded for free by pausing when the
geocoder hits the free rate limit set by Google (2500 per day). If you have an API key for paid
geocoding from Google, set it in the API key section.
Addresses for geocoding can be specified in a list of strings "addresses". In this script, addresses
come from a csv file with a column "Address". Adjust the code to your own requirements as needed.
After every 500 successful geocode operations, a temporary file with results is recorded in case of
script failure / loss of connection later.
Addresses and data are held in memory, so this script may need to be adjusted to process files line