Last active
November 21, 2024 23:55
-
-
Save jugaldb/12fd9622dd5b32ea84e9a6f6d7061781 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# SQL Basics with DuckDB: Restaurant Analysis\n", | |
"\n", | |
"This notebook demonstrates SQL fundamentals using DuckDB and a restaurant dataset. We'll cover:\n", | |
"- Creating and connecting to a DuckDB database\n", | |
"- Designing and creating relational tables\n", | |
"- Inserting and manipulating data\n", | |
"- Writing complex queries\n", | |
"- Using regular expressions for data validation\n", | |
"- Analyzing restaurant performance metrics" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Setup\n", | |
"First, let's import required libraries and create our DuckDB connection." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"import duckdb\n", | |
"import pandas as pd\n", | |
"import re\n", | |
"from datetime import datetime, timedelta\n", | |
"\n", | |
"# Create a connection to DuckDB (in-memory database)\n", | |
"conn = duckdb.connect(':memory:')\n", | |
"print(\"DuckDB connection established successfully!\")" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Creating Database Schema\n", | |
"We'll create three related tables:\n", | |
"1. restaurants - Main table with restaurant information\n", | |
"2. inspections - Health inspection records\n", | |
"3. reviews - Customer reviews and ratings" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"def create_tables():\n", | |
" # Restaurants table\n", | |
" conn.execute(\"\"\"\n", | |
" CREATE TABLE restaurants (\n", | |
" restaurant_id INTEGER PRIMARY KEY,\n", | |
" name VARCHAR NOT NULL,\n", | |
" cuisine_type VARCHAR,\n", | |
" price_range INTEGER CHECK (price_range BETWEEN 1 AND 4),\n", | |
" neighborhood VARCHAR,\n", | |
" opening_date DATE\n", | |
" )\n", | |
" \"\"\")\n", | |
" \n", | |
" # Inspections table\n", | |
" conn.execute(\"\"\"\n", | |
" CREATE TABLE inspections (\n", | |
" inspection_id INTEGER PRIMARY KEY,\n", | |
" restaurant_id INTEGER,\n", | |
" inspection_date DATE,\n", | |
" score INTEGER CHECK (score BETWEEN 0 AND 100),\n", | |
" grade CHAR(1),\n", | |
" violation_description TEXT,\n", | |
" FOREIGN KEY (restaurant_id) REFERENCES restaurants(restaurant_id)\n", | |
" )\n", | |
" \"\"\")\n", | |
" \n", | |
" # Reviews table\n", | |
" conn.execute(\"\"\"\n", | |
" CREATE TABLE reviews (\n", | |
" review_id INTEGER PRIMARY KEY,\n", | |
" restaurant_id INTEGER,\n", | |
" rating INTEGER CHECK (rating BETWEEN 1 AND 5),\n", | |
" review_text TEXT,\n", | |
" review_date DATE,\n", | |
" reviewer_name VARCHAR,\n", | |
" FOREIGN KEY (restaurant_id) REFERENCES restaurants(restaurant_id)\n", | |
" )\n", | |
" \"\"\")\n", | |
" \n", | |
" print(\"Tables created successfully!\")\n", | |
"\n", | |
"create_tables()" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Inserting Sample Data\n", | |
"Let's populate our tables with sample restaurant data, including inspection records and customer reviews." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Insert restaurants\n", | |
"restaurants_data = [\n", | |
" (1, \"The Rustic Spoon\", \"Italian\", 2, \"Downtown\", \"2020-01-15\"),\n", | |
" (2, \"Sushi Wave\", \"Japanese\", 3, \"Marina\", \"2019-06-22\"),\n", | |
" (3, \"Taco Fiesta\", \"Mexican\", 1, \"Mission\", \"2021-03-10\"),\n", | |
" (4, \"Le Petit Bistro\", \"French\", 4, \"Hayes Valley\", \"2018-12-05\"),\n", | |
" (5, \"Dragon Palace\", \"Chinese\", 2, \"Chinatown\", \"2020-09-30\")\n", | |
"]\n", | |
"\n", | |
"conn.execute(\"\"\"\n", | |
"INSERT INTO restaurants VALUES (?, ?, ?, ?, ?, ?)\n", | |
"\"\"\", restaurants_data)\n", | |
"\n", | |
"# Insert inspections\n", | |
"inspections_data = [\n", | |
" (1, 1, \"2023-01-15\", 92, \"A\", \"Minor cleaning required\"),\n", | |
" (2, 2, \"2023-02-20\", 88, \"B\", \"Temperature control issues\"),\n", | |
" (3, 3, \"2023-03-10\", 95, \"A\", \"All standards met\"),\n", | |
" (4, 4, \"2023-04-05\", 90, \"A\", \"Update cleaning logs\"),\n", | |
" (5, 5, \"2023-05-12\", 85, \"B\", \"Storage improvements needed\"),\n", | |
" (6, 1, \"2023-06-15\", 94, \"A\", \"All standards met\"),\n", | |
" (7, 2, \"2023-07-20\", 91, \"A\", \"Minor violations corrected\")\n", | |
"]\n", | |
"\n", | |
"conn.execute(\"\"\"\n", | |
"INSERT INTO inspections VALUES (?, ?, ?, ?, ?, ?)\n", | |
"\"\"\", inspections_data)\n", | |
"\n", | |
"# Insert reviews\n", | |
"reviews_data = [\n", | |
" (1, 1, 4, \"Great pasta!\", \"2023-01-20\", \"John D.\"),\n", | |
" (2, 1, 5, \"Authentic Italian taste\", \"2023-02-15\", \"Maria S.\"),\n", | |
" (3, 2, 3, \"Good sushi but pricey\", \"2023-03-10\", \"Alex K.\"),\n", | |
" (4, 3, 5, \"Best tacos in town\", \"2023-04-05\", \"Carlos M.\"),\n", | |
" (5, 4, 4, \"Very romantic atmosphere\", \"2023-05-12\", \"Sophie L.\"),\n", | |
" (6, 5, 2, \"Service needs improvement\", \"2023-06-18\", \"David W.\"),\n", | |
" (7, 5, 4, \"Delicious dim sum\", \"2023-07-22\", \"Linda H.\")\n", | |
"]\n", | |
"\n", | |
"conn.execute(\"\"\"\n", | |
"INSERT INTO reviews VALUES (?, ?, ?, ?, ?, ?)\n", | |
"\"\"\", reviews_data)\n", | |
"\n", | |
"print(\"Sample data inserted successfully!\")" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Basic Queries\n", | |
"Let's start with some basic SELECT queries to explore our data." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# List all restaurants with their cuisine type and price range\n", | |
"print(\"Restaurants Overview:\")\n", | |
"result = conn.execute(\"\"\"\n", | |
"SELECT \n", | |
" name,\n", | |
" cuisine_type,\n", | |
" price_range,\n", | |
" neighborhood\n", | |
"FROM restaurants\n", | |
"ORDER BY price_range, name\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## JOIN Operations\n", | |
"Now let's explore relationships between tables using JOIN operations." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Get restaurants with their latest inspection scores\n", | |
"print(\"Latest Restaurant Inspections:\")\n", | |
"result = conn.execute(\"\"\"\n", | |
"SELECT \n", | |
" r.name,\n", | |
" r.cuisine_type,\n", | |
" i.inspection_date,\n", | |
" i.score,\n", | |
" i.grade\n", | |
"FROM restaurants r\n", | |
"JOIN inspections i ON r.restaurant_id = i.restaurant_id\n", | |
"WHERE i.inspection_date = (\n", | |
" SELECT MAX(inspection_date)\n", | |
" FROM inspections i2\n", | |
" WHERE i2.restaurant_id = r.restaurant_id\n", | |
")\n", | |
"ORDER BY i.score DESC\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Aggregation and GROUP BY\n", | |
"Let's analyze aggregated data using GROUP BY and aggregate functions." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Calculate average ratings and inspection scores by cuisine type\n", | |
"result = conn.execute(\"\"\"\n", | |
"SELECT \n", | |
" r.cuisine_type,\n", | |
" ROUND(AVG(rev.rating), 2) as avg_customer_rating,\n", | |
" ROUND(AVG(i.score), 2) as avg_inspection_score,\n", | |
" COUNT(DISTINCT r.restaurant_id) as num_restaurants\n", | |
"FROM restaurants r\n", | |
"LEFT JOIN reviews rev ON r.restaurant_id = rev.restaurant_id\n", | |
"LEFT JOIN inspections i ON r.restaurant_id = i.restaurant_id\n", | |
"GROUP BY r.cuisine_type\n", | |
"ORDER BY avg_customer_rating DESC\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Window Functions\n", | |
"Let's use window functions to perform more complex analysis." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Rank restaurants within their price range based on average rating\n", | |
"result = conn.execute(\"\"\"\n", | |
"SELECT \n", | |
" name,\n", | |
" cuisine_type,\n", | |
" price_range,\n", | |
" ROUND(avg_rating, 2) as average_rating,\n", | |
" rank_in_category\n", | |
"FROM (\n", | |
" SELECT \n", | |
" r.name,\n", | |
" r.cuisine_type,\n", | |
" r.price_range,\n", | |
" AVG(rev.rating) as avg_rating,\n", | |
" RANK() OVER (\n", | |
" PARTITION BY r.price_range \n", | |
" ORDER BY AVG(rev.rating) DESC\n", | |
" ) as rank_in_category\n", | |
" FROM restaurants r\n", | |
" LEFT JOIN reviews rev ON r.restaurant_id = rev.restaurant_id\n", | |
" GROUP BY r.restaurant_id, r.name, r.cuisine_type, r.price_range\n", | |
")\n", | |
"ORDER BY price_range, rank_in_category\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Data Validation with Regular Expressions\n", | |
"Let's create a function to validate and process new review submissions." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"def validate_review(raw_review):\n", | |
" \"\"\"Validate and process a raw review string using regex\"\"\"\n", | |
" try:\n", | |
" # Example format: \"Restaurant: Dragon Palace | Rating: 4 | Review: Great food! | Reviewer: Jane D.\"\n", | |
" restaurant = re.search(r'Restaurant:\\s*([^|]+)', raw_review).group(1).strip()\n", | |
" rating = int(re.search(r'Rating:\\s*(\\d+)', raw_review).group(1))\n", | |
" review = re.search(r'Review:\\s*([^|]+)', raw_review).group(1).strip()\n", | |
" reviewer = re.search(r'Reviewer:\\s*([^|]+)', raw_review).group(1).strip()\n", | |
" \n", | |
" if not (1 <= rating <= 5):\n", | |
" raise ValueError(f\"Invalid rating: {rating}. Must be between 1 and 5.\")\n", | |
" \n", | |
" return restaurant, rating, review, reviewer\n", | |
" except AttributeError:\n", | |
" raise ValueError(\"Invalid review format\")\n", | |
"\n", | |
"# Test the validation function\n", | |
"test_review = \"Restaurant: Dragon Palace | Rating: 4 | Review: Great food! | Reviewer: Jane D.\"\n", | |
"restaurant, rating, review, reviewer = validate_review(test_review)\n", | |
"print(f\"Validated Review:\nRestaurant: {restaurant}\nRating: {rating}\nReview: {review}\nReviewer: {reviewer}\")" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Practice Exercises\n", | |
"\n", | |
"Try these exercises to reinforce your understanding of SQL with DuckDB:\n", | |
"\n", | |
"1. Create a query to find all restaurants that have never received a B grade in their inspections\n", | |
"2. Calculate the average time between inspections for each restaurant\n", | |
"3. Find restaurants whose average rating is higher than the average rating of their price category\n", | |
"4. Create a monthly summary of reviews showing the number of reviews and average rating\n", | |
"5. Identify restaurants that have improved their inspection scores over time" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Exercise 1: Restaurants with no B grades\n", | |
"print(\"Restaurants that have never received a B grade:\")\n", | |
"result = conn.execute(\"\"\"\n", | |
"SELECT DISTINCT r.name, r.cuisine_type\n", | |
"FROM restaurants r\n", | |
"WHERE NOT EXISTS (\n", | |
" SELECT 1\n", | |
" FROM inspections i\n", | |
" WHERE i.restaurant_id = r.restaurant_id\n", | |
" AND i.grade = 'B'\n", | |
")\n", | |
"ORDER BY r.name\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Exercise 2: Average time between inspections\n", | |
"print(\"Average days between inspections by restaurant:\")\n", | |
"result = conn.execute(\"\"\"\n", | |
"WITH inspection_gaps AS (\n", | |
" SELECT \n", | |
" r.name,\n", | |
" i.inspection_date,\n", | |
" LEAD(i.inspection_date) OVER (\n", | |
" PARTITION BY r.restaurant_id \n", | |
" ORDER BY i.inspection_date\n", | |
" ) as next_inspection,\n", | |
" DATEDIFF('day', i.inspection_date, \n", | |
" LEAD(i.inspection_date) OVER (\n", | |
" PARTITION BY r.restaurant_id \n", | |
" ORDER BY i.inspection_date\n", | |
" )\n", | |
" ) as days_between\n", | |
" FROM restaurants r\n", | |
" JOIN inspections i ON r.restaurant_id = i.restaurant_id\n", | |
")\n", | |
"SELECT \n", | |
" name,\n", | |
" ROUND(AVG(days_between), 1) as avg_days_between_inspections,\n", | |
" COUNT(*) as total_inspections\n", | |
"FROM inspection_gaps\n", | |
"WHERE days_between IS NOT NULL\n", | |
"GROUP BY name\n", | |
"ORDER BY avg_days_between_inspections\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Exercise 3: Above-average restaurants in their price category\n", | |
"print(\"Restaurants performing above their price category average:\")\n", | |
"result = conn.execute(\"\"\"\n", | |
"WITH avg_ratings AS (\n", | |
" SELECT \n", | |
" r.restaurant_id,\n", | |
" r.name,\n", | |
" r.price_range,\n", | |
" AVG(rev.rating) as restaurant_avg,\n", | |
" (SELECT AVG(rating)\n", | |
" FROM restaurants r2\n", | |
" JOIN reviews rev2 ON r2.restaurant_id = rev2.restaurant_id\n", | |
" WHERE r2.price_range = r.price_range\n", | |
" ) as category_avg\n", | |
" FROM restaurants r\n", | |
" JOIN reviews rev ON r.restaurant_id = rev.restaurant_id\n", | |
" GROUP BY r.restaurant_id, r.name, r.price_range\n", | |
")\n", | |
"SELECT \n", | |
" name,\n", | |
" price_range,\n", | |
" ROUND(restaurant_avg, 2) as restaurant_rating,\n", | |
" ROUND(category_avg, 2) as category_average,\n", | |
" ROUND(restaurant_avg - category_avg, 2) as difference\n", | |
"FROM avg_ratings\n", | |
"WHERE restaurant_avg > category_avg\n", | |
"ORDER BY difference DESC\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Exercise 4: Monthly review summary\n", | |
"print(\"Monthly review summary:\")\n", | |
"result = conn.execute(\"\"\"\n", | |
"SELECT \n", | |
" strftime('%Y-%m', review_date) as month,\n", | |
" COUNT(*) as number_of_reviews,\n", | |
" ROUND(AVG(rating), 2) as average_rating,\n", | |
" MIN(rating) as lowest_rating,\n", | |
" MAX(rating) as highest_rating\n", | |
"FROM reviews\n", | |
"GROUP BY strftime('%Y-%m', review_date)\n", | |
"ORDER BY month\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Exercise 5: Inspection score improvements\n", | |
"print(\"Restaurants with improving inspection scores:\")\n", | |
"result = conn.execute(\"\"\"\n", | |
"WITH score_changes AS (\n", | |
" SELECT \n", | |
" r.name,\n", | |
" i.inspection_date,\n", | |
" i.score,\n", | |
" LAG(i.score) OVER (\n", | |
" PARTITION BY r.restaurant_id \n", | |
" ORDER BY i.inspection_date\n", | |
" ) as previous_score\n", | |
" FROM restaurants r\n", | |
" JOIN inspections i ON r.restaurant_id = i.restaurant_id\n", | |
")\n", | |
"SELECT \n", | |
" name,\n", | |
" inspection_date,\n", | |
" previous_score,\n", | |
" score,\n", | |
" score - previous_score as improvement\n", | |
"FROM score_changes\n", | |
"WHERE score > previous_score\n", | |
"ORDER BY improvement DESC\n", | |
"\"\"\").fetchdf()\n", | |
"display(result)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Bonus Exercise: Creating a Restaurant Performance Dashboard\n", | |
"\n", | |
"Create a comprehensive dashboard view for restaurant owners that combines inspection scores, customer ratings, and trends over time." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"def generate_restaurant_dashboard(restaurant_name):\n", | |
" # Get restaurant overview\n", | |
" overview = conn.execute(\"\"\"\n", | |
" SELECT \n", | |
" r.name,\n", | |
" r.cuisine_type,\n", | |
" r.price_range,\n", | |
" r.neighborhood,\n", | |
" r.opening_date,\n", | |
" COUNT(DISTINCT i.inspection_id) as total_inspections,\n", | |
" COUNT(DISTINCT rev.review_id) as total_reviews,\n", | |
" ROUND(AVG(rev.rating), 2) as avg_rating,\n", | |
" ROUND(AVG(i.score), 2) as avg_inspection_score\n", | |
" FROM restaurants r\n", | |
" LEFT JOIN inspections i ON r.restaurant_id = i.restaurant_id\n", | |
" LEFT JOIN reviews rev ON r.restaurant_id = rev.restaurant_id\n", | |
" WHERE r.name = ?\n", | |
" GROUP BY r.restaurant_id, r.name, r.cuisine_type, r.price_range, r.neighborhood, r.opening_date\n", | |
" \"\"\", [restaurant_name]).fetchdf()\n", | |
" \n", | |
" print(f\"Dashboard for {restaurant_name}\")\n", | |
" print(\"=\"* 50)\n", | |
" display(overview)\n", | |
" \n", | |
" # Get inspection history\n", | |
" print(\"\nInspection History:\")\n", | |
" inspections = conn.execute(\"\"\"\n", | |
" SELECT \n", | |
" i.inspection_date,\n", | |
" i.score,\n", | |
" i.grade,\n", | |
" i.violation_description\n", | |
" FROM restaurants r\n", | |
" JOIN inspections i ON r.restaurant_id = i.restaurant_id\n", | |
" WHERE r.name = ?\n", | |
" ORDER BY i.inspection_date DESC\n", | |
" \"\"\", [restaurant_name]).fetchdf()\n", | |
" display(inspections)\n", | |
" \n", | |
" # Get review summary\n", | |
" print(\"\nRecent Reviews:\")\n", | |
" reviews = conn.execute(\"\"\"\n", | |
" SELECT \n", | |
" rev.review_date,\n", | |
" rev.rating,\n", | |
" rev.review_text,\n", | |
" rev.reviewer_name\n", | |
" FROM restaurants r\n", | |
" JOIN reviews rev ON r.restaurant_id = rev.restaurant_id\n", | |
" WHERE r.name = ?\n", | |
" ORDER BY rev.review_date DESC\n", | |
" LIMIT 5\n", | |
" \"\"\", [restaurant_name]).fetchdf()\n", | |
" display(reviews)\n", | |
"\n", | |
"# Generate dashboard for a specific restaurant\n", | |
"generate_restaurant_dashboard(\"The Rustic Spoon\")" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Cleanup\n", | |
"Finally, let's clean up our database connection." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": {}, | |
"source": [ | |
"# Function to reset the notebook environment\n", | |
"def reset_database():\n", | |
" \"\"\"Reset the database by dropping all tables and recreating them\"\"\"\n", | |
" conn.execute(\"DROP TABLE IF EXISTS reviews\")\n", | |
" conn.execute(\"DROP TABLE IF EXISTS inspections\")\n", | |
" conn.execute(\"DROP TABLE IF EXISTS restaurants\")\n", | |
" print(\"Tables dropped successfully!\")\n", | |
" \n", | |
" # Recreate tables if needed\n", | |
" create_tables()\n", | |
" print(\"Database reset complete!\")\n", | |
"\n", | |
"# Close the connection\n", | |
"conn.close()\n", | |
"print(\"Database connection closed!\")" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Next Steps\n", | |
"\n", | |
"Here are some suggestions for further exploration:\n", | |
"\n", | |
"1. Add more complex data validation rules\n", | |
"2. Implement trending analysis for ratings over time\n", | |
"3. Create visualization of inspection scores and ratings\n", | |
"4. Add seasonal analysis of customer ratings\n", | |
"5. Implement a recommendation system based on cuisine type and ratings" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.8.0" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment