bfraiche

bfraiche / build_grid.py

Last active April 2, 2019 22:08

This gist contains code snippets for my blogpost: 'Random Forest with Python and Spark ML'

	from pyspark.ml.tuning import ParamGridBuilder
	import numpy as np

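	# `rf` is the RandomForestRegressor defined in add_rf.py.
	# np.linspace returns floats, so cast to int: numTrees in [10, 30, 50], maxDepth in [5, 15, 25].
	paramGrid = (
	    ParamGridBuilder()
	    .addGrid(rf.numTrees, [int(x) for x in np.linspace(start=10, stop=50, num=3)])
	    .addGrid(rf.maxDepth, [int(x) for x in np.linspace(start=5, stop=25, num=3)])
	    .build()
	)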
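
A minimal sketch, in plain Python with no Spark or NumPy required, of what the grid in build_grid.py expands to. The helper `linspace_ints` is hypothetical and only mirrors the `int(x)` cast over `np.linspace` for `num >= 2`. The dictionaries stand in for Spark's param maps and are not the objects `ParamGridBuilder` actually returns.

```python
from itertools import product


def linspace_ints(start, stop, num):
    """Evenly spaced values from start to stop (inclusive), truncated to int.

    Hypothetical stand-in for [int(x) for x in np.linspace(start, stop, num)].
    """
    step = (stop - start) / (num - 1)
    return [int(start + i * step) for i in range(num)]


num_trees_values = linspace_ints(10, 50, 3)
max_depth_values = linspace_ints(5, 25, 3)

# build() produces one param map per combination, i.e. the Cartesian product.
grid = [
    {"numTrees": n, "maxDepth": d}
    for n, d in product(num_trees_values, max_depth_values)
]

print(num_trees_values)  # [10, 30, 50]
print(max_depth_values)  # [5, 15, 25]
print(len(grid))         # 9
```

Three values for each of two parameters give nine candidate settings, which is the number of param maps the cross-validator has to evaluate.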

bfraiche / build_cv.py

Created April 2, 2019 17:41

	from pyspark.ml.tuning import CrossValidator
	from pyspark.ml.evaluation import RegressionEvaluator

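	# `pipeline` is assumed to be a Pipeline whose stages include `rf` (defined outside this gist); `paramGrid` comes from build_grid.py.
	# RegressionEvaluator() defaults to RMSE on the "label" and "prediction" columns.
	crossval = CrossValidator(estimator=pipeline,
	                          estimatorParamMaps=paramGrid,
	                          evaluator=RegressionEvaluator(),
	                          numFolds=3)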
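
A rough sketch, in plain Python, of the bookkeeping behind a cross-validated grid search like the one configured in build_cv.py. Each param map is fitted once per fold, the metric is averaged across folds, and the best average wins (lowest for RMSE, the default metric of `RegressionEvaluator`), after which the winner is refitted on the full dataset. The function names `count_fits` and `pick_best` are hypothetical, and the RMSE numbers are made up purely for illustration.

```python
from statistics import mean


def count_fits(num_param_maps, num_folds):
    """Model fits during the search, plus one final refit of the winner."""
    return num_param_maps * num_folds + 1


def pick_best(fold_metrics, larger_is_better=False):
    """Return the index of the param map with the best average metric.

    fold_metrics[i] holds the per-fold metric values for param map i.
    """
    averages = [mean(values) for values in fold_metrics]
    chooser = max if larger_is_better else min
    return averages.index(chooser(averages))


# A grid of 9 param maps evaluated with numFolds=3.
print(count_fits(num_param_maps=9, num_folds=3))  # 28

# Made-up RMSE values for three hypothetical param maps over three folds.
example_rmse = [[4.0, 5.0, 6.0], [3.0, 3.5, 4.0], [4.5, 4.5, 4.5]]
print(pick_best(example_rmse))  # 1
```

The fit count grows multiplicatively with grid size and fold count, which is worth keeping in mind before widening either parameter range.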

bfraiche / best_hp.py

Last active April 2, 2019 22:18

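	# `bestModel` is assumed to be the fitted RandomForestRegressionModel (for example, the RF stage of the best PipelineModel found by crossval).
	# getNumTrees is a property on the fitted model, so it takes no parentheses.
	print('numTrees - ', bestModel.getNumTrees)
	print('maxDepth - ', bestModel.getOrDefault('maxDepth'))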

bfraiche / add_rf.py

Last active April 2, 2019 17:43

	from pyspark.ml.regression import RandomForestRegressor

	rf = RandomForestRegressor(labelCol="label", featuresCol="features")