The goal of this class is to investigate basic concepts surrounding text mining.
Databricks Talk
Create a Databricks Community Edition Account
Gentle Introduction To Spark
Word2Vec
Presentation
Databricks Demo
Introduction to MapReduce Links: local github slides
Introduction to Spark Links: local github slides
Lab2 Word Count
Notes: Be sure to install the library test_helper.
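As a rough illustration of what a word-count exercise computes, here is a minimal sketch in plain Python rather than Spark. It is not taken from the lab: the function names `map_words`, `reduce_counts`, and `word_count`, the tokenizing rule, and the sample lines are all hypothetical.

```python
import re
from collections import Counter


def map_words(line):
    """Map step: lowercase the line and emit a (word, 1) pair per word."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    return [(word, 1) for word in re.findall(r"[a-z0-9']+", line.lower())]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(lines):
    """Run the map step over every line, then reduce all pairs together."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return reduce_counts(pairs)


# Hypothetical sample input, for illustration only.
sample_lines = ["the quick brown fox", "The lazy dog and the fox"]
counts = word_count(sample_lines)
```

In Spark, the same two steps would typically be expressed as a `flatMap` or `map` over the lines followed by `reduceByKey` on the resulting pairs.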
Unfortunately, this class is closed. Sorry.
(1) https://www.edx.org/course/big-data-analysis-apache-spark-uc-berkeleyx-cs110x
However, the video lectures are still available on this YouTube Playlist.
(2) Lab2 Word Count
This work is licensed under a Creative Commons Attribution 4.0 International License.