Have: several treatments measured across time, many parameters recorded
dat <- expand.grid(
time = 1:3,
trt = letters[1:2],
param = LETTERS[1:3])
dat$value <- c(
rep(1.0, 6), # A: constant, should drop
rep(c(1.0, 1.5, 2.0), 2), # B: trts identical at every time, should drop
1.6, 1.7, 1.8, 1.6, 1.7, 2.0) # C: trt a and b differ at time 3, should keep
dat
# time trt param value
# 1 1 a A 1.0
# 2 2 a A 1.0
# 3 3 a A 1.0
# 4 1 b A 1.0
# 5 2 b A 1.0
# 6 3 b A 1.0
# 7 1 a B 1.0
# 8 2 a B 1.5
# 9 3 a B 2.0
# 10 1 b B 1.0
# 11 2 b B 1.5
# 12 3 b B 2.0
# 13 1 a C 1.6
# 14 2 a C 1.7
# 15 3 a C 1.8
# 16 1 b C 1.6
# 17 2 b C 1.7
# 18 3 b C 2.0
Want to: Drop all observations from parameters where treatment never has any effect. "No effect" really is numerically zero in my case, but bonus points for answers that allow a tolerance.
Do NOT want to: Drop any observations from parameters where treatments differ at some times but not at others.
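To make the keep/drop rule concrete, it can be sketched as a per-parameter check in base R. This is only a statement of the rule under assumptions, not the streamlined version I am asking for: `param_responds` and `tol` are hypothetical names, and "differ" is taken to mean that the spread (max minus min) across treatments at a single time point exceeds `tol`.

```r
# Hypothetical helper (not part of my current approach): TRUE if, at any time
# point, the treatment values for one parameter are spread by more than `tol`.
param_responds <- function(d, tol = 0) {
  # Spread (max - min) of value across treatments, one entry per time point
  spread <- tapply(d$value, d$time, function(v) diff(range(v)))
  any(spread > tol)
}

# Same example data as above
dat <- expand.grid(
  time = 1:3,
  trt = letters[1:2],
  param = LETTERS[1:3])
dat$value <- c(
  rep(1.0, 6),
  rep(c(1.0, 1.5, 2.0), 2),
  1.6, 1.7, 1.8, 1.6, 1.7, 2.0)

# Apply the rule to each parameter's rows; tol = 1e-8 is an arbitrary choice
responds <- sapply(split(dat, dat$param), param_responds, tol = 1e-8)

# Keep only rows belonging to parameters that respond at some time
kept <- dat[dat$param %in% names(responds)[responds], ]
```

With the example data, `responds` should be FALSE for A and B and TRUE for C, so `kept` holds only the six rows of parameter C. With `tol = 0` the check reduces to exact inequality, which matches my "numerically zero" case.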
My current approach:
library(tidyverse)
dat_responding <- (dat
%>% group_by(param, time)
%>% mutate(trts_differ = var(value) > 0) # TRUE where trts differ at this time
%>% group_by(param)
%>% filter(any(trts_differ)) # keep params that differ at any time
%>% select(-trts_differ)
)
dat_responding
# # A tibble: 6 x 4
# # Groups: param [1]
# time trt param value
# <int> <fct> <fct> <dbl>
# 1 1 a C 1.6
# 2 2 a C 1.7
# 3 3 a C 1.8
# 4 1 b C 1.6
# 5 2 b C 1.7
# 6 3 b C 2
This does what I want (drops A and B, keeps C), but group-assign-regroup-filter-drop feels like a pretty clunky way of expressing "drop parameters that never differ between treatments".
Question: Can this be streamlined?