@kshirsagarsiddharth
Created June 8, 2021 04:53
recommendation loading data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('rec').getOrCreate()
books_rating_path = "book_recommendation/book_recommendation/BX-Book-Ratings.csv"
books_path = "book_recommendation/book_recommendation/BX-Books.csv"
books_user_path = "book_recommendation/book_recommendation/BX-Users.csv"
# All three files are semicolon-delimited CSVs with a header row.
# inferSchema costs an extra pass over each file to guess column types.
# DROPMALFORMED silently discards any row that fails to parse.
books = spark.read.format("csv") \
    .option("inferSchema", "true") \
    .option("header", "true") \
    .option("mode", "DROPMALFORMED") \
    .option("delimiter", ";") \
    .load(books_path)
users = spark.read.format("csv") \
    .option("inferSchema", "true") \
    .option("header", "true") \
    .option("mode", "DROPMALFORMED") \
    .option("delimiter", ";") \
    .load(books_user_path)
ratings = spark.read.format("csv") \
    .option("inferSchema", "true") \
    .option("header", "true") \
    .option("mode", "DROPMALFORMED") \
    .option("delimiter", ";") \
    .load(books_rating_path)
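The three reads use identical options, so they could be folded into one small helper. The sketch below is one possible way to do that, not part of the original gist: `BX_CSV_OPTIONS` and `load_bx_csv` are hypothetical names, and the helper assumes it is handed an existing `SparkSession` such as the `spark` object created earlier.

```python
# Hypothetical helper (names are illustrative, not from the gist).
# Shared reader options, copied from the three reads in the gist.
BX_CSV_OPTIONS = {
    "inferSchema": "true",
    "header": "true",
    "mode": "DROPMALFORMED",
    "delimiter": ";",
}


def load_bx_csv(spark, path, **overrides):
    """Read one CSV with the shared options; keyword overrides win."""
    options = {**BX_CSV_OPTIONS, **overrides}
    # DataFrameReader.options accepts keyword arguments for several options at once.
    return spark.read.format("csv").options(**options).load(path)
```

With this in place, `books = load_bx_csv(spark, books_path)` should behave like the original read. Because `DROPMALFORMED` drops bad rows without reporting them, it may be worth loading a file once more with `mode="PERMISSIVE"` and comparing `count()` on both DataFrames to see how many rows were discarded.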