Skip to content

Instantly share code, notes, and snippets.

@rominirani
Last active September 6, 2018 05:24
Show Gist options
  • Save rominirani/6509e764fcc5a3d1d452123ee81d0408 to your computer and use it in GitHub Desktop.
Save rominirani/6509e764fcc5a3d1d452123ee81d0408 to your computer and use it in GitHub Desktop.

Public Data Set to use

bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2017

Dataset details

Table size 12.79 GB

Number of rows 113,496,874

Find the most expensive taxi trip taken

SELECT total_amount, pickup_datetime, trip_distance FROM `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2017` ORDER BY total_amount DESC LIMIT 1;

Find the most common hour of the day

SELECT HOUR(pickup_datetime) as hour, COUNT(*) as count FROM [bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2017] GROUP BY hour ORDER BY count DESC;

Find the most common day of the week + hour of the day

SELECT DAYOFWEEK(pickup_datetime) as day, HOUR(pickup_datetime) as hour, COUNT(*) as count FROM [bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2017] GROUP BY day, hour ORDER BY count DESC;

Create a New Table

Sample Data

1493033027,1493033627,8.42
1493033004,1493033943,18.61
1493033102,1493033609,9.17
1493032027,1493033801,24.97

Save the above data into a CSV File. Either use that file to upload or better still add it to a GCS Bucket.

Sample Schema

pickup_time,TIMESTAMP,REQUIRED
dropoff_time,TIMESTAMP,REQUIRED
fare_amount,TIMESTAMP,REQUIRED

You will need to create this in the UI.

Go ahead and create a Job to create the dataset + load data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment