Skip to content

Instantly share code, notes, and snippets.

@sharoonthomas
Created October 10, 2024 01:19
Show Gist options
  • Save sharoonthomas/11e8abaac70f5236182afa3a3d3529d8 to your computer and use it in GitHub Desktop.
Save sharoonthomas/11e8abaac70f5236182afa3a3d3529d8 to your computer and use it in GitHub Desktop.
delivery times = happier customers = better repurchase rates ?? May be

Shipment Analysis Query Explanation

This SQL query performs an in-depth analysis of shipment data, focusing on delivery times, customer behavior, and carrier performance. The query is structured in three main parts: two Common Table Expressions (CTEs) and a final SELECT statement.

This was inspired by a post by Mehtab Karta.

CleanShot 2024-10-09 at 21 19 15

Prerequisites

This query is designed to work with the fulfil data warehouse. Ensure you have access to this data warehouse before attempting to run the query.

1. Customer Metrics CTE

The first CTE, customer_metrics, calculates various metrics for each customer:

  • Average delivery time (in days)
  • Order count
  • Total amount spent
  • Whether the customer has made repeat purchases

It joins the shipments and sales_orders tables to gather this information. The WHERE clause ensures that only completed shipments with valid data are included in the analysis.

2. Delivery Time Bins CTE

The second CTE, delivery_time_bins, categorizes the average delivery times into bins:

  • 0-3 days
  • 4-5 days
  • 6-7 days
  • 8-10 days
  • 10+ days

This categorization allows for easier analysis of delivery performance.

3. Main Query

The final SELECT statement aggregates the data from the delivery_time_bins CTE to provide insights on:

  • Number of customers in each delivery time bin
  • Average customer lifetime value (LTV) for each bin
  • Repurchase rate for each bin

The results are grouped by delivery time bin, carrier, and carrier service, then ordered for easy interpretation.

Key Insights

This query can help answer questions such as:

  1. Which carriers and services have the fastest delivery times?
  2. How does delivery time affect customer lifetime value?
  3. Is there a correlation between delivery speed and likelihood of repurchase?
  4. Which carriers are handling the most customers?

By analyzing this data, the company can make informed decisions about which carriers to use, how to improve customer satisfaction, and potentially how to increase customer lifetime value and repurchase rates.

Instructions for Creating Custom Reports

To create a custom report using this query in the fulfil data warehouse:

  1. Log in to your fulfil account and navigate to the Reports section.
  2. Click on "Create New Report" or "New Custom Report".
  3. In the query editor, paste the entire SQL query provided above.
  4. You may need to adjust table names or field names to match your specific data warehouse schema.
  5. Click "Run Query" to test the report and ensure it executes without errors.
  6. Once satisfied with the results, give your report a name and description.
  7. Set up any desired scheduling or sharing options for the report.
  8. Save the report.

Remember to review and update the report periodically to ensure it continues to meet your business needs and reflect any changes in your data structure or business processes.

Note: Depending on the size of your data warehouse and the complexity of the query, this report might take some time to run. Consider adding date range filters if you need to limit the data processed for performance reasons.

-- Common Table Expression (CTE) to calculate customer metrics
WITH customer_metrics AS (
SELECT
s.customer_id,
s.carrier,
s.carrier_service,
AVG(DATE_DIFF(s.delivered_at, s.shipped_at, DAY)) AS avg_delivery_time,
COUNT(DISTINCT so.id) AS order_count,
SUM((SELECT SUM(line.amount) FROM UNNEST(so.lines) AS line)) AS total_spent,
CASE WHEN COUNT(DISTINCT so.id) > 1 THEN 1 ELSE 0 END AS has_repurchased
FROM
shipments s
JOIN
sales_orders so ON s.customer_id = so.customer_id AND so.order_date >= CAST(s.delivered_at AS date)
WHERE
s.state = 'done'
AND s.delivered_at IS NOT NULL
AND s.shipped_at IS NOT NULL
AND s.carrier IS NOT NULL
-- Uncomment the following line to filter for shipments from 2024 onwards
-- AND s.shipped_date >= '2024-01-01'
GROUP BY
s.customer_id,
s.carrier,
s.carrier_service
),
-- CTE to categorize delivery times into bins
delivery_time_bins AS (
SELECT
customer_id,
carrier,
carrier_service,
avg_delivery_time,
order_count,
total_spent,
has_repurchased,
CASE
WHEN avg_delivery_time <= 3 THEN '0-3 days'
WHEN avg_delivery_time <= 5 THEN '4-5 days'
WHEN avg_delivery_time <= 7 THEN '6-7 days'
WHEN avg_delivery_time <= 10 THEN '8-10 days'
ELSE '10+ days'
END AS delivery_time_bin
FROM
customer_metrics
)
-- Main query to aggregate and analyze data
SELECT
delivery_time_bin,
carrier,
carrier_service,
COUNT(DISTINCT customer_id) AS customer_count,
ROUND(AVG(total_spent), 2) AS avg_ltv,
ROUND(AVG(has_repurchased), 2) AS repurchase_rate
FROM
delivery_time_bins
GROUP BY
delivery_time_bin,
carrier,
carrier_service
ORDER BY
carrier,
carrier_service,
delivery_time_bin
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment