from pyspark.sql.functions import array, col, explode, lit, struct
from pyspark.sql import DataFrame
from typing import Iterable

def melt(
        df: DataFrame,
        id_vars: Iterable[str], value_vars: Iterable[str],
        var_name: str="variable", value_name: str="value") -> DataFrame:
    """Convert :class:`DataFrame` from wide to long format."""
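The `melt` preview stops at its docstring, so the PySpark body is not shown here. To illustrate what a wide-to-long conversion produces, the following is a minimal pure-Python sketch over a list of dicts. The helper name `melt_rows` and the sample data are hypothetical, and this is not the gist's implementation.

```python
from typing import Dict, Iterable, List


def melt_rows(rows: List[Dict], id_vars: Iterable[str], value_vars: Iterable[str],
              var_name: str = "variable", value_name: str = "value") -> List[Dict]:
    """Hypothetical helper: emit one output row per (input row, value column)."""
    id_vars, value_vars = list(id_vars), list(value_vars)
    out = []
    for row in rows:
        for var in value_vars:
            long_row = {k: row[k] for k in id_vars}  # keep identifier columns
            long_row[var_name] = var                 # which wide column this came from
            long_row[value_name] = row[var]          # the value held in that column
            out.append(long_row)
    return out


# Hypothetical sample data: two wide rows with value columns "a" and "b".
wide = [{"id": 1, "a": 10, "b": 20}, {"id": 2, "a": 30, "b": 40}]
long = melt_rows(wide, id_vars=["id"], value_vars=["a", "b"])
```

Each wide row becomes one long row per entry in `value_vars`, so two rows with two value columns yield four rows.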
import pandas as pd

def create_date_window(in_date, window_size):
    date_lower = pd.to_datetime(in_date) - pd.DateOffset(days=window_size)
    date_upper = pd.to_datetime(in_date) + pd.DateOffset(days=window_size)
    return date_lower, date_upper

if __name__ == "__main__":
    print(create_date_window("2019-09-17", 28))
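For comparison, the same symmetric window can be sketched with only the standard library. The names `date_window` and `in_window` are hypothetical, and the sketch assumes the input is an ISO `YYYY-MM-DD` string, which is narrower than what `pd.to_datetime` accepts.

```python
from datetime import date, timedelta


def date_window(in_date: str, window_size: int):
    """Hypothetical stdlib counterpart: (center - N days, center + N days)."""
    center = date.fromisoformat(in_date)
    delta = timedelta(days=window_size)
    return center - delta, center + delta


def in_window(d: date, window) -> bool:
    """True when d falls inside the window, bounds included."""
    lower, upper = window
    return lower <= d <= upper


window = date_window("2019-09-17", 28)
```

A window built this way can then be used to filter records by date, for example `in_window(date(2019, 9, 1), window)`.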
library(sparklyr)
SparkR::sparkR.session()
sc <- spark_connect(method="databricks")
snow.df.sparklyr <- spark_read_source(
  sc=sc,
  name = "adult",
  source = "snowflake",
  options = list(
library(dplyr)
sdisplay <- function(x) {
  x %>% sample_n(1000) %>% collect() %>% display
}
from threading import Thread

def producer_method():
    dbutils.notebook.run(
        path="./kinesis-producer",
        timeout_seconds=600,
        arguments={
            "kinesisRegion": KINESIS_REGION,
            "inputStream": INPUT_STREAM,
            "newsgroupDataLocation": NEWSGROUP_DATA_PATH
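The preview is cut off before the thread is created, so how `producer_method` is launched is not shown. A minimal sketch of the general pattern with `threading.Thread` follows. `fake_producer` and `results` are hypothetical stand-ins for the long-running notebook call and its output.

```python
from threading import Thread

results = []


def fake_producer(n_records: int) -> None:
    """Hypothetical stand-in for a long-running producer call."""
    for i in range(n_records):
        results.append(f"record-{i}")


producer = Thread(target=fake_producer, args=(3,))
producer.start()  # returns immediately; the producer runs in the background
producer.join()   # wait here only so the example is deterministic
```

In a notebook, the `join()` would typically be skipped or deferred so that later cells can consume the stream while the producer is still running.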
import boto3
import json
import numpy as np
import pandas as pd
from math import ceil

class KinesisWriter:
    def __init__(self, region, stream, classes):
        self.kinesis_client = None
import mlflow.pyfunc
import mlflow.keras

class KerasWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, keras_model_name):
        self.keras_model_name = keras_model_name

    def load_context(self, context):
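# Select columns at positions 5 to the end, followed by column 1 (assumes `import numpy as np` and a pandas DataFrame `data`).
data.iloc[:, np.r_[5:data.columns.size, 1]]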
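To make the index expression concrete, here is a small pure-Python sketch of the position list that `np.r_[5:n, 1]` expands to. The helper `r_style_indices` and the column names are hypothetical. It mimics only this slice-plus-scalars case, not the rest of `np.r_`.

```python
def r_style_indices(start: int, stop: int, *extra: int) -> list:
    """Hypothetical helper: positions start..stop-1 followed by the extra positions."""
    return list(range(start, stop)) + list(extra)


# Hypothetical column labels for an 8-column table.
columns = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
positions = r_style_indices(5, len(columns), 1)
selected = [columns[i] for i in positions]
```

Columns not listed in `positions` are dropped from the result, so the expression both selects and reorders.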
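from pyspark.sql.functions import pandas_udf, PandasUDFType

# Convert Excel serial day numbers to timestamps, counting days from 1899-12-30.
@pandas_udf("timestamp", PandasUDFType.SCALAR)
def from_xltime(x):
    import pandas as pd
    import datetime as dt
    return (pd.TimedeltaIndex(x, unit='d') + dt.datetime(1899,12,30)).to_series()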
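The arithmetic inside `from_xltime` can be checked on single values with the standard library alone. The sketch below is a hypothetical scalar counterpart, not the UDF itself. The names `EXCEL_EPOCH` and `xltime_to_datetime` are assumptions introduced for illustration.

```python
from datetime import datetime, timedelta

EXCEL_EPOCH = datetime(1899, 12, 30)  # same origin the UDF adds its day offsets to


def xltime_to_datetime(serial: float) -> datetime:
    """Hypothetical scalar counterpart: serial is days (with fraction) since the epoch."""
    return EXCEL_EPOCH + timedelta(days=serial)


new_year = xltime_to_datetime(43831)   # whole-day serial
noon = xltime_to_datetime(43831.5)     # fractional part is the time of day
```

Serial 43831 corresponds to 2020-01-01, and the extra 0.5 moves the result to noon on the same day.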
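import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.functions.udf

// Expose an ML Vector as a plain array, both as a column function and as a SQL UDF.
val toArray = udf { v: Vector => v.toArray }
spark.sqlContext.udf.register("toArray", toArray)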