@jrosell
Last active March 27, 2025 09:58
An example data model for clickstream analysis in R, for an ecommerce website or app.
``` r
suppressPackageStartupMessages({
  library(dplyr)
  library(tidyr)
})
# One row per tracked event. click_id repeats across rows (u1, u2, u3),
# so it groups events by visitor rather than identifying a single click.
click_data <- tibble(
  click_id = c("u1", "u1", "u1", "u1", "u1", "u1", "u1", "u1", "u1", "u2", "u2", "u2", "u3", "u3", "u2"),
  click_name = c("first_click", "page_view", "view_item", "add_to_wishlist", "add_to_cart", "page_view", "begin_checkout", "page_view", "purchase", "first_click", "page_view", "view_item", "first_click", "page_view", "page_view"),
  click_value = c(0, 0, 0, 1, 2, 0, 4, 0, 20, 0, 0, 0, 0, 0, 0),
  click_revenue = c(0, 0, 0, 0, 0, 0, 0, 0, 201.2, 0, 0, 0, 0, 0, 0),
  # Placeholder strings naming the browser properties these columns would hold.
  page_referrer = "document.referrer",
  page_location = "window.location.href",
  # Campaign, geo and device attributes are left empty in this example.
  utm_source = "",
  utm_medium = "",
  utm_campaign = "",
  utm_content = "",
  utm_term = "",
  country = "",
  device_category = "",
  device_os = "",
  # Length-one values are recycled across all 15 rows.
  timestamp = Sys.time()
)
click_data
#> # A tibble: 15 × 15
#>    click_id click_name     click_value click_revenue page_referrer page_location
#>    <chr>    <chr>                <dbl>         <dbl> <chr>         <chr>
#>  1 u1       first_click              0            0  document.ref… window.locat…
#>  2 u1       page_view                0            0  document.ref… window.locat…
#>  3 u1       view_item                0            0  document.ref… window.locat…
#>  4 u1       add_to_wishli…           1            0  document.ref… window.locat…
#>  5 u1       add_to_cart              2            0  document.ref… window.locat…
#>  6 u1       page_view                0            0  document.ref… window.locat…
#>  7 u1       begin_checkout           4            0  document.ref… window.locat…
#>  8 u1       page_view                0            0  document.ref… window.locat…
#>  9 u1       purchase                20          201. document.ref… window.locat…
#> 10 u2       first_click              0            0  document.ref… window.locat…
#> 11 u2       page_view                0            0  document.ref… window.locat…
#> 12 u2       view_item                0            0  document.ref… window.locat…
#> 13 u3       first_click              0            0  document.ref… window.locat…
#> 14 u3       page_view                0            0  document.ref… window.locat…
#> 15 u2       page_view                0            0  document.ref… window.locat…
#> # ℹ 9 more variables: utm_source <chr>, utm_medium <chr>, utm_campaign <chr>,
#> #   utm_content <chr>, utm_term <chr>, country <chr>, device_category <chr>,
#> #   device_os <chr>, timestamp <dttm>
# Entrances: the number of distinct click_id values.
click_data |> summarise(entrances = n_distinct(click_id))
#> # A tibble: 1 × 1
#>   entrances
#>       <int>
#> 1         3
# Number of events of each type, in long format.
click_data |> summarise(.by = click_name, event_count = n())
#> # A tibble: 7 × 2
#>   click_name      event_count
#>   <chr>                 <int>
#> 1 first_click               3
#> 2 page_view                 6
#> 3 view_item                 2
#> 4 add_to_wishlist           1
#> 5 add_to_cart               1
#> 6 begin_checkout            1
#> 7 purchase                  1
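# Same counts in wide format. The columns are named `name` and `value`
# because those are pivot_wider()'s defaults for names_from and values_from.
click_data |>
  rename(name = click_name) |>
  summarise(.by = c(name), value = n()) |>
  pivot_wider(values_fill = 0)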
#> # A tibble: 1 × 7
#>   first_click page_view view_item add_to_wishlist add_to_cart begin_checkout
#>         <int>     <int>     <int>           <int>       <int>          <int>
#> 1           3         6         2               1           1              1
#> # ℹ 1 more variable: purchase <int>
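# Event counts per click_id: one row per click_id, one column per event type.
# Events a click_id never triggered are filled with 0.
click_data |>
  rename(name = click_name) |>
  summarise(.by = c(click_id, name), value = n()) |>
  group_by(click_id) |>
  pivot_wider(values_fill = 0)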
#> # A tibble: 3 × 8
#> # Groups:   click_id [3]
#>   click_id first_click page_view view_item add_to_wishlist add_to_cart
#>   <chr>          <int>     <int>     <int>           <int>       <int>
#> 1 u1                 1         3         1               1           1
#> 2 u2                 1         2         1               0           0
#> 3 u3                 1         1         0               0           0
#> # ℹ 2 more variables: begin_checkout <int>, purchase <int>
```
<sup>Created on 2025-03-27 with [reprex v2.1.1](https://reprex.tidyverse.org)</sup>
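The per-event counts above can be taken one step further into a simple funnel. The following is only a sketch under assumptions the gist does not state: the order of steps in `funnel_steps` and the column names `users`, `share_of_entrances` and `step_conversion` are illustrative choices. It rebuilds just the `click_id` and `click_name` columns from the example data so that it runs on its own.

``` r
suppressPackageStartupMessages(library(dplyr))

# Only the two columns needed here, with the same values as click_data above.
clicks <- tibble(
  click_id = c(rep("u1", 9), rep("u2", 3), rep("u3", 2), "u2"),
  click_name = c(
    "first_click", "page_view", "view_item", "add_to_wishlist", "add_to_cart",
    "page_view", "begin_checkout", "page_view", "purchase",
    "first_click", "page_view", "view_item",
    "first_click", "page_view", "page_view"
  )
)

# Assumed funnel order; the gist itself does not define one.
funnel_steps <- c("first_click", "view_item", "add_to_cart", "begin_checkout", "purchase")

funnel <- clicks |>
  filter(click_name %in% funnel_steps) |>
  # Distinct click_id values that triggered each step at least once.
  summarise(.by = click_name, users = n_distinct(click_id)) |>
  # Sort rows by funnel position rather than by first appearance.
  mutate(click_name = factor(click_name, levels = funnel_steps)) |>
  arrange(click_name) |>
  mutate(
    share_of_entrances = users / first(users), # relative to the first step
    step_conversion = users / lag(users)       # relative to the previous step (NA for the first)
  )
funnel
```

A step that no `click_id` ever reached would be missing from the result rather than shown as zero, and the sketch does not check that each visitor hit the steps in order, so treat it as a starting point rather than a complete funnel definition.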