\RequirePackage{tikz}
\documentclass[varwidth]{standalone}
\usepackage{import}
\usepackage{pgfplots}
\usepgfplotslibrary{groupplots}
\pgfplotsset{compat=newest}
-- Dataset available from http://multimedia-commons.s3-website-us-west-2.amazonaws.com/?prefix=tools/etc/ in Sqlite3 database format ('yfcc100m_dataset.sql' file)
SET NAMES utf8;
SET time_zone = '+00:00';
SET foreign_key_checks = 0;
SET sql_mode = 'NO_AUTO_VALUE_ON_ZERO';
DROP TABLE IF EXISTS `yfcc100m_dataset`;
CREATE TABLE `yfcc100m_dataset` (
`photoid` int NOT NULL,
# Loading PySpark modules
from pyspark.sql import DataFrame
from pyspark.sql.types import *
#from pyspark.context import SparkContext
#from pyspark.sql.session import SparkSession
# sc = SparkContext('local')
# spark = SparkSession(sc)
-- 'xxx.stackexchange.com' is the name of the database
-- `xxx.stackexchange.com`.badges definition
CREATE TABLE `badges` (
`Id` int(11) NOT NULL,
`UserId` int(11) NOT NULL,
`Name` varchar(30) NOT NULL,
`Date` datetime NOT NULL,
`Class` int(11) NOT NULL,
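// Loading the posts
// Columns are addressed by position (row[0], row[3], row[5]), so the file is read
// without WITH HEADERS (which would turn each row into a map keyed by column name)
// and the header line is skipped explicitly.
// Sanity check: count the data rows
LOAD CSV FROM 'file:///posts_all_csv.csv' AS row
WITH row SKIP 1
WITH toInteger(row[0]) AS postId, row[5] AS postBody, toInteger(row[3]) AS postScore
RETURN count(*);
// Create or update one Post node per row of the tab-separated file;
// numeric columns are converted so they are not stored as strings
LOAD CSV FROM 'file:///posts_all_csv.csv' AS row FIELDTERMINATOR '\t'
WITH row SKIP 1
WITH toInteger(row[0]) AS postId, toInteger(row[3]) AS postScore, row[5] AS postBody
MERGE (p:Post {postId: postId})
SET p.postBody = postBody, p.postScore = postScore
RETURN p;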
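The load queries above pick fields by position: column 0 for the post id, column 3 for the score, and column 5 for the body. A minimal Python sketch of the same positional mapping can help when checking the file before importing it. The sample rows, the header names in `SAMPLE`, and the helper `read_posts` are hypothetical and only illustrate the layout; they are not taken from the real `posts_all_csv.csv`.

```python
import csv
import io

# Hypothetical tab-separated sample with a header line; the real file's
# column names and contents are not known here.
SAMPLE = (
    "Id\tc1\tc2\tScore\tc4\tBody\n"
    "1\tx\ty\t7\tz\tHello\n"
    "2\tx\ty\t-1\tz\tWorld\n"
)


def read_posts(handle):
    """Read tab-separated rows and map columns 0, 3 and 5 to post fields."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line
    posts = []
    for row in reader:
        posts.append(
            {
                "postId": int(row[0]),
                "postScore": int(row[3]),
                "postBody": row[5],
            }
        )
    return posts


posts = read_posts(io.StringIO(SAMPLE))
print(len(posts), posts[0]["postBody"])
```

To inspect a real file, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, newline="", encoding="utf-8")`.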
a
about
above
after
again
against
ain
all
am
an
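A stopword list like the one above is typically used to drop common words from tokenized text. The sketch below shows one possible way to do that in Python, using only the ten entries visible above. The tokenizer regex and the helper name `remove_stopwords` are assumptions made for illustration, not part of the original files.

```python
import re

# Only the entries visible in the list above; the full list is longer.
STOPWORDS = {"a", "about", "above", "after", "again",
             "against", "ain", "all", "am", "an"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in stopwords]


print(remove_stopwords("I am reading about an answer again"))
# -> ['i', 'reading', 'answer']
```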
## Run the following commands in Bash
start-master.sh
# Open http://localhost:8080 and check that the Spark master service has started
start-slave.sh spark://$(hostname):7077
# If the worker service started successfully, the worker is listed at http://localhost:8080 in the workers section
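Instead of opening the browser, the two ports used above (8080 for the master web UI, 7077 for the master itself) can be probed from a script. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical, and a successful TCP connection only shows that something is listening on the port, not that it is Spark.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the commands above: web UI on 8080, master on 7077.
    for name, port in [("master web UI", 8080), ("master", 7077)]:
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```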
## Run the following commands in Bash
cd # go back to your home directory
wget -c https://www-us.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz # download Spark
tar xvfz spark-2.4.5-bin-hadoop2.7.tgz # extract the downloaded archive into the current directory
mv spark-2.4.5-bin-hadoop2.7 spark # rename the extracted folder to "spark"
# Append the JAVA_HOME and SPARK_HOME environment variables to the end of your Bash startup script
# This assumes that JRE 8 is installed in "/usr/lib/jvm/java-1.8.0-openjdk-amd64"
cat >> .bashrc <<'EOF'