Nick Pentreath MLnick

  • Cape Town, South Africa
  • X @MLnick
MLnick / Ensemble.scala
Created August 30, 2017 07:28
Ensemble pipeline component in Spark
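import org.apache.spark.ml.Model
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.regression.RegressionModel
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.{DataFrame, Dataset}
// Model is F-bounded (Model[M <: Model[M]]), so the class parameterizes it with itself.
class Ensemble(val uid: String, models: Seq[RegressionModel[_, _]]) extends Model[Ensemble] {
  import org.apache.spark.sql.functions._
  def this(models: Seq[RegressionModel[_, _]]) = this(Identifiable.randomUID("ensemble"), models)
  override def copy(extra: ParamMap): Ensemble = ???
  override def transform(dataset: Dataset[_]): DataFrame = {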
1. Error: gapply() and gapplyCollect() on a DataFrame (@test_sparkSQL.R#2569) --
org.apache.spark.SparkException: Job aborted due to stage failure: Task 114 in stage 957.0 failed 1 times, most recent failure: Lost task 114.0 in stage 957.0 (TID 13209, localhost, executor driver): org.apache.spark.SparkException: R computation failed with
[1] 1
[1] 3
[1] 2
[1][1] 1 2
[1] 3
[1] 2
[1] 2
MLnick / onnx.pb
Last active October 17, 2019 11:39
graph {
  node {
    input: "X"
    input: "W"
    output: "Y"
    name: "matmult"
    op_type: "Mul"
  }
  input {
    name: "X"