* All data can be found at s3://databrickssql/
* Create the following databases:
  * {UNIQUE_IDENTIFIER}_retail
  * {UNIQUE_IDENTIFIER}_nyse
* Number of orders placed every month. We have to consider only COMPLETE as well as CLOSED orders. Make sure output is sorted by month.
  * Database: retail
  * Table: orders
* Number of orders placed in the months of 2013 August, 2013 September and 2013 October. We have to consider all PENDING orders (PENDING, PENDING_PAYMENT)
  * Database: retail
  * Table: orders
* Get revenue for each order (using order_items). Output should contain order id and revenue, revenue should be rounded off to 2 digits. Also output should be saved to a new table called order_revenue (file format parquet).
  * Source Database: retail
  * Source Table: order_items
  * Target Database: {UNIQUE_IDENTIFIER}_retail
  * Target Table: order_revenue
  * Reference for [Creating Tables using Spark SQL](https://docs.databricks.com/spark/latest/spark-sql/language-manual/create-table.html)
* Create tables for products, categories and departments, then load the data into all the tables. Use the below data model to come up with column names. 
  * Base Directory for all data sets **/data/retail_db**
  * File Format: TEXTFILE
  * Delimiter: Comma ","
* Create a new table for NYSE Data.
  * Database Name: **{UNIQUE_IDENTIFIER}_nyse**
  * Table Name: **nyse_eod**
  * Type: **EXTERNAL**
  * File Type: **TEXTFILE** (default)
  * Location: **s3://databrickssql/nyse_data/**
  * Fields: **stockticker, tradedate, openprice, highprice, lowprice, closeprice, volume**
  * Data Types: **Prices are of type float, stockticker is of type string, tradedate is of type int and volume is of type bigint.**
  * Field Delimiter: **Comma (,)**
  * Run `SELECT * FROM ` the table name to ensure data is loaded properly. Also run the count to make sure that all the data is loaded into the table.
* Compute Daily Product Revenue
  * Source Database: **retail**
  * Consider only COMPLETE and CLOSED orders
  * Output should contain - order_date, product_name, revenue based on order_item_subtotal
  * Round off revenue to 2 digits
  * Save output to daily_product_revenue (sort the data by order_date in ascending and revenue in descending).
  * Target Database: **{UNIQUE_IDENTIFIER}_retail**
  * Target File Format: TEXTFILE
* Get the details of orders where there are no records in order_items.
  * Source Database: retail
  * Tables: orders and order_items
* Data Validation: Validate if order_item_subtotal is equal to product of order_item_product_price and order_item_quantity.
  * Source Database: retail
  * Source Table: order_items
* Get the order count for each day and status sorted by order_date in ascending order and then by order_status where data is sorted COMPLETE, CLOSED and rest in ascending order.
  * Source Database: retail
  * Source Table: orders
* Create FACT Table(s) for the following reports:
  * Daily Product Revenue
  * Daily Category Revenue
  * Daily Department Revenue
  * Roll up to Weekly, Monthly, Yearly
  * Support To Date Reports