@wbazant
Created October 20, 2016 11:01
Differential analytics download
Abbreviations
----
v: value
exp: experiment
c: contrast
id: identifier
as: analytics search field, which a query either matches or doesn't
cs: conditions search field, which a query either matches or doesn't
Sets considered
----
D = {(v, exp, c, id)}: all data we have
A = {(as, v, exp, c, id): (v, exp, c, id) is in D, v > threshold, as = annotateAll(v, exp, c, id)}
C = {(cs, exp, c): (v, exp, c, id) is in D for some v and id, cs = annotateConditions(exp, c)}
So as and cs are calculated fields.
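A minimal sketch of how A and C could be built from D. The note names annotateAll and annotateConditions but doesn't define them, so the lookup tables, the threshold value and the example rows below are all assumptions made up for illustration.

```python
# Hypothetical stand-ins: annotateAll / annotateConditions are named in the
# note but not defined there, so these lookups are illustration only.
THRESHOLD = 1.0  # assumed cutoff

GENE_ANNOTATIONS = {"g1": "kinase", "g2": "transporter"}
CONDITION_ANNOTATIONS = {
    ("E1", "c1"): "liver cancer",
    ("E1", "c2"): "heat stress",
}


def annotate_all(v, exp, c, id_):
    # Assumption: the analytics text combines gene and condition annotations.
    return f"{GENE_ANNOTATIONS[id_]} {CONDITION_ANNOTATIONS[(exp, c)]}"


def annotate_conditions(exp, c):
    return CONDITION_ANNOTATIONS[(exp, c)]


# D: all data we have, as (v, exp, c, id) tuples.
D = {
    (2.5, "E1", "c1", "g1"),
    (0.3, "E1", "c1", "g2"),  # not above the threshold, so absent from A
    (1.7, "E1", "c2", "g2"),
}

# A: rows above the threshold, each carrying its calculated field `as`.
A = {
    (annotate_all(v, exp, c, id_), v, exp, c, id_)
    for (v, exp, c, id_) in D
    if v > THRESHOLD
}

# C: one entry per (exp, c) occurring anywhere in D, with calculated field `cs`.
C = {(annotate_conditions(exp, c), exp, c) for (_, exp, c, _) in D}
```

Note that C has no threshold: a contrast shows up in C as soon as any row of D mentions it, while A only holds rows whose value passes the cutoff.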
Current search
----
R = ({id: there is (as, v, exp, c, id) in A such that as matches query}
x {(exp, c): there is (cs, exp, c) in C such that cs matches query}) intersect D
Here (exp, c, id) counts as being in D when (v, exp, c, id) is in D for some v.
Search in analytics index only
----
S = { (v,exp,c,id) : there is (as, v, exp,c,id) in A such that as matches query}
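The two searches, sketched as set comprehensions so they can be compared side by side. Everything here is an assumption for illustration: "matches" is taken to be a substring test, the function names are made up, the rows are invented, and both functions return full (v, exp, c, id) rows so the results are directly comparable.

```python
def matches(text, query):
    # Assumption: a query matches when it occurs as a substring.
    return query in text


def current_search(A, C, D, query):
    """R: ids matched via A, crossed with contrasts matched via C, kept if in D."""
    ids = {id_ for (as_, v, exp, c, id_) in A if matches(as_, query)}
    contrasts = {(exp, c) for (cs, exp, c) in C if matches(cs, query)}
    return {
        (v, exp, c, id_)
        for (v, exp, c, id_) in D
        if id_ in ids and (exp, c) in contrasts
    }


def analytics_only_search(A, query):
    """S: rows of A whose own analytics field matches."""
    return {(v, exp, c, id_) for (as_, v, exp, c, id_) in A if matches(as_, query)}


# Invented rows: one gene measured in two contrasts of one experiment.
D = {(3.0, "E1", "c1", "g1"), (2.0, "E1", "c2", "g1")}
A = {
    ("g1 liver cancer", 3.0, "E1", "c1", "g1"),
    ("g1 heat stress", 2.0, "E1", "c2", "g1"),
}
C = {("liver cancer", "E1", "c1"), ("heat stress", "E1", "c2")}

R = current_search(A, C, D, "cancer")
S = analytics_only_search(A, "cancer")
```

On these rows both searches return the single row for (E1, c1, g1). They can only disagree when an id and a contrast each match the query separately but never in the same row of A.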
What data will we miss?
----
Consider (exp', c', id') in R\S.
If there are (as, v) such that (as, v, exp', c', id') is in A, then as doesn't match the query.
Hence for all (as, v, exp'', c'', id') in A such that as matches the query, (exp'', c'') != (exp', c').
Since (exp', c', id') is in R, there are some (as*, cs*) such that:
  (as*, v, exp'', c'', id') is in A for some v
  (cs*, exp', c') is in C
  as* matches the query and cs* matches the query
  exp'' is not exp', or exp'' is exp' but c'' is not c'
I claim we don't want these results: id' matches the query only through a different contrast (exp'', c''), and only the conditions match ties it to (exp', c').
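A small made-up case where R\S is non-empty, with the witnesses (as*, cs*) from the argument above made explicit. The rows, the substring matching and the threshold are assumptions; here the missed row is in D but under the cutoff, so it has no entry in A at all.

```python
QUERY = "cancer"


def matches(text, query):
    return query in text  # assumed substring matching


# Invented rows: g1 is measured in two experiments.
D = {(3.0, "E2", "c1", "g1"), (0.2, "E1", "c1", "g1")}
# Only the E2 row passes the assumed threshold, so only it appears in A.
A = {("g1 lung cancer", 3.0, "E2", "c1", "g1")}
C = {("liver cancer", "E1", "c1"), ("lung cancer", "E2", "c1")}

# R: current search, S: analytics-only search, both as (v, exp, c, id) rows.
ids = {id_ for (as_, _, _, _, id_) in A if matches(as_, QUERY)}
contrasts = {(exp, c) for (cs, exp, c) in C if matches(cs, QUERY)}
R = {row for row in D if row[3] in ids and (row[1], row[2]) in contrasts}
S = {(v, exp, c, id_) for (as_, v, exp, c, id_) in A if matches(as_, QUERY)}
missed = R - S

# For each missed row, collect (as*, cs*): as* matches via another contrast
# of the same id, cs* matches via the conditions of the missed contrast.
witnesses = [
    (as_star, cs_star)
    for (_, exp_m, c_m, id_m) in missed
    for (as_star, _, exp_o, c_o, id_o) in A
    for (cs_star, exp_k, c_k) in C
    if id_o == id_m
    and (exp_k, c_k) == (exp_m, c_m)
    and (exp_o, c_o) != (exp_m, c_m)
    and matches(as_star, QUERY)
    and matches(cs_star, QUERY)
]
```

The only row dropped by the analytics-only search is (0.2, E1, c1, g1): g1 matched because of its E2 row, and (E1, c1) matched because of its conditions, but the value in that contrast never passed the threshold.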