Are mutate() and summarise() different?
library(dplyr, warn.conflicts = FALSE)

g <- starwars %>%
  select(name, mass, species) %>%
  group_by(species)

# typical use of mutate(): normalise each mass by its species mean
d1 <- g %>% 
  mutate(mass_norm = mass / mean(mass, na.rm = TRUE))

d1
#> # A tibble: 87 x 4
#> # Groups:   species [38]
#>    name                mass species mass_norm
#>    <chr>              <dbl> <chr>       <dbl>
#>  1 Luke Skywalker        77 Human       0.930
#>  2 C-3PO                 75 Droid       1.08 
#>  3 R2-D2                 32 Droid       0.459
#>  4 Darth Vader          136 Human       1.64 
#>  5 Leia Organa           49 Human       0.592
#>  6 Owen Lars            120 Human       1.45 
#>  7 Beru Whitesun lars    75 Human       0.906
#>  8 R5-D4                 32 Droid       0.459
#>  9 Biggs Darklighter     84 Human       1.01 
#> 10 Obi-Wan Kenobi        77 Human       0.930
#> # … with 77 more rows

# can we do the same thing with summarise()?
d2 <- g %>% 
  summarise(
    across(), # needed to keep the existing columns
    mass_norm = mass / mean(mass, na.rm = TRUE)
  )
#> `summarise()` regrouping output by 'species' (override with `.groups` argument)

d2
#> # A tibble: 87 x 4
#> # Groups:   species [38]
#>    species  name             mass mass_norm
#>    <chr>    <chr>           <dbl>     <dbl>
#>  1 Aleena   Ratts Tyerell      15     1    
#>  2 Besalisk Dexter Jettster   102     1    
#>  3 Cerean   Ki-Adi-Mundi       82     1    
#>  4 Chagrian Mas Amedda         NA    NA    
#>  5 Clawdite Zam Wesell         55     1    
#>  6 Droid    C-3PO              75     1.08 
#>  7 Droid    R2-D2              32     0.459
#>  8 Droid    R5-D4              32     0.459
#>  9 Droid    IG-88             140     2.01 
#> 10 Droid    R4-P17             NA    NA    
#> # … with 77 more rows

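# d2 is sorted by species and has species as its first column,
# so align the column and row order before comparing
all.equal(
  d1 %>% select(name, mass, species, mass_norm) %>% arrange(name),
  d2 %>% select(name, mass, species, mass_norm) %>% arrange(name)
)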
#> [1] TRUE

Created on 2020-06-28 by the reprex package (v0.3.0)
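
# A minimal sketch for newer dplyr, assuming version 1.1.0 or later (the run
# above predates it). In those releases, a summarise() call that returns more
# than one row per group and an across() call without .cols are both
# deprecated; reframe() and pick() are the suggested replacements.
d3 <- g %>%
  reframe(
    pick(everything()), # keeps the non-grouping columns (name, mass)
    mass_norm = mass / mean(mass, na.rm = TRUE)
  )

# reframe() always returns an ungrouped data frame, so drop the grouping
# from d1 before comparing the two results.
all.equal(
  d1 %>% ungroup() %>% select(name, mass, species, mass_norm) %>% arrange(name),
  d3 %>% select(name, mass, species, mass_norm) %>% arrange(name)
)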
