This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- Active issues | |
-- Count of total active issues in the specified time frame | |
-- Source: githubarchive public data set via Google BigQuery http://githubarchive.org/ | |
SELECT | |
COUNT(DISTINCT JSON_EXTRACT_SCALAR(events.payload, '$.issue.id')) AS events_issue_count | |
FROM (SELECT * FROM TABLE_DATE_RANGE([githubarchive:day.],TIMESTAMP('2015-09-01'),TIMESTAMP('2016-08-31'))) | |
AS events | |
-- 10,723,492 active issues |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Country,CountryCode,Currency,Code | |
New Zealand,NZ,New Zealand Dollars,NZD | |
Cook Islands,CK,New Zealand Dollars,NZD | |
Niue,NU,New Zealand Dollars,NZD | |
Pitcairn,PN,New Zealand Dollars,NZD | |
Tokelau,TK,New Zealand Dollars,NZD | |
Australia,AU,Australian Dollars,AUD | |
Christmas Island,CX,Australian Dollars,AUD | |
Cocos (Keeling) Islands,CC,Australian Dollars,AUD | |
Heard and Mc Donald Islands,HM,Australian Dollars,AUD |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#' Email a user a report is ready | |
#' | |
#' Requires an account at Mailgun: https://mailgun.com | |
#' Pre-verification can only send to a whitelist of emails you configure | |
#' | |
#' @param email Email to send to | |
#' @param mail_message Any extra info | |
#' | |
#' @return TRUE if successful email sent | |
#' @import httr |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# BB-8 -------------------------------------------------------------------- | |
# Inspired by Brian Hough in http://codepen.io/bhough/pen/wawrPL | |
# Packages ---------------------------------------------------------------- | |
library("dplyr") | |
library("ggplot2") |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
library(tidyverse) | |
library(paletteer) | |
library(gt) | |
pizzaplace %>% | |
mutate(type = case_when( | |
type == "chicken" ~ "chicken (pizzas with chicken as a major ingredient)", | |
type == "classic" ~ "classic (classical pizzas)", | |
type == "supreme" ~ "supreme (pizzas that try a little harder)", | |
type == "veggie" ~ "veggie (pizzas without any meats whatsoever)", |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
root | |
|-- artist: string (nullable = true) | |
|-- auth: string (nullable = true) | |
|-- firstName: string (nullable = true) | |
|-- gender: string (nullable = true) | |
|-- itemInSession: long (nullable = true) | |
|-- lastName: string (nullable = true) | |
|-- length: double (nullable = true) | |
|-- level: string (nullable = true) | |
|-- location: string (nullable = true) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
+--------------------+---------+---------+------+-------------+--------+---------+-----+--------------------+------+---------------+-------------+---------+--------------------+------+-------------+--------------------+------+ | |
| artist| auth|firstName|gender|itemInSession|lastName| length|level| location|method| page| registration|sessionId| song|status| ts| userAgent|userId| | |
+--------------------+---------+---------+------+-------------+--------+---------+-----+--------------------+------+---------------+-------------+---------+--------------------+------+-------------+--------------------+------+ | |
| Martha Tilston|Logged In| Colin| M| 50| Freeman|277.89016| paid| Bakersfield, CA| PUT| NextSong|1538173362000| 29| Rockpools| 200|1538352117000|Mozilla/5.0 (Wind...| 30| | |
| Five Iron Frenzy|Logged In| Micah| M| 79| Long|236.09424| free|Boston-Cambridge-...| PUT| |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def topn_values(column_name, n): | |
""" | |
Take a column name and find the most frequent values | |
""" | |
return df.groupby(column_name).count().sort('count', ascending=False).limit(n).toPandas() |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def get_churned_users(df): | |
""" | |
Find out the users that cancelled so we can identify who churned. | |
Return updated dataframe with additional column identifying as such | |
""" | |
cancelled_ids = df.filter('page == "Cancellation Confirmation"').select("userId").distinct() | |
# Convert to list to be used to filter later | |
cancelled_ids = cancelled_ids.toPandas()['userId'].tolist() | |
# 1 when a user churned and 0 when they did not | |
df = df.withColumn("Churn", when((col("userId").isin(cancelled_ids)),lit('1')).otherwise(lit('0'))) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def aggregate_to_user_level(df): | |
""" | |
Aggregate the selected features to the user level | |
""" | |
exprs = [\ | |
sparkMax(col('churn')).alias('churn')\ | |
,sparkMax(col('Gender')).alias('gender')\ | |
,sparkMax(col('level')).alias('subscription_level')\ | |
,sparkMax(col('device_type')).alias('device_type')\ | |
,sparkMax(when(col("page") == 'Upgrade', 1).otherwise(0)).alias('page_upgraded') |