Created
July 30, 2021 01:53
-
-
Save joowkim/c50873ed9fb8f3b36f28b163d1ccbeb2 to your computer and use it in GitHub Desktop.
Question about batch effect
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### I am going to make up the tech variable as a batch effect. For example, tech1 is measured by technician 1 and tech2 is measure by technician 2. And I am not interested in the effect of the tech variale, but would like to ignore or remove the effect of the tech variable. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<table class=\"dataframe\">\n", | |
"<caption>A data.frame: 6 × 2</caption>\n", | |
"<thead>\n", | |
"\t<tr><th></th><th scope=col>weight</th><th scope=col>group</th></tr>\n", | |
"\t<tr><th></th><th scope=col><dbl></th><th scope=col><fct></th></tr>\n", | |
"</thead>\n", | |
"<tbody>\n", | |
"\t<tr><th scope=row>1</th><td>4.17</td><td>ctrl</td></tr>\n", | |
"\t<tr><th scope=row>2</th><td>5.58</td><td>ctrl</td></tr>\n", | |
"\t<tr><th scope=row>3</th><td>5.18</td><td>ctrl</td></tr>\n", | |
"\t<tr><th scope=row>4</th><td>6.11</td><td>ctrl</td></tr>\n", | |
"\t<tr><th scope=row>5</th><td>4.50</td><td>ctrl</td></tr>\n", | |
"\t<tr><th scope=row>6</th><td>4.61</td><td>ctrl</td></tr>\n", | |
"</tbody>\n", | |
"</table>\n" | |
], | |
"text/latex": [ | |
"A data.frame: 6 × 2\n", | |
"\\begin{tabular}{r|ll}\n", | |
" & weight & group\\\\\n", | |
" & <dbl> & <fct>\\\\\n", | |
"\\hline\n", | |
"\t1 & 4.17 & ctrl\\\\\n", | |
"\t2 & 5.58 & ctrl\\\\\n", | |
"\t3 & 5.18 & ctrl\\\\\n", | |
"\t4 & 6.11 & ctrl\\\\\n", | |
"\t5 & 4.50 & ctrl\\\\\n", | |
"\t6 & 4.61 & ctrl\\\\\n", | |
"\\end{tabular}\n" | |
], | |
"text/markdown": [ | |
"\n", | |
"A data.frame: 6 × 2\n", | |
"\n", | |
"| <!--/--> | weight <dbl> | group <fct> |\n", | |
"|---|---|---|\n", | |
"| 1 | 4.17 | ctrl |\n", | |
"| 2 | 5.58 | ctrl |\n", | |
"| 3 | 5.18 | ctrl |\n", | |
"| 4 | 6.11 | ctrl |\n", | |
"| 5 | 4.50 | ctrl |\n", | |
"| 6 | 4.61 | ctrl |\n", | |
"\n" | |
], | |
"text/plain": [ | |
" weight group\n", | |
"1 4.17 ctrl \n", | |
"2 5.58 ctrl \n", | |
"3 5.18 ctrl \n", | |
"4 6.11 ctrl \n", | |
"5 4.50 ctrl \n", | |
"6 4.61 ctrl " | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"head(PlantGrowth)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<table class=\"dataframe\">\n", | |
"<caption>A data.frame: 6 × 3</caption>\n", | |
"<thead>\n", | |
"\t<tr><th></th><th scope=col>weight</th><th scope=col>group</th><th scope=col>tech</th></tr>\n", | |
"\t<tr><th></th><th scope=col><dbl></th><th scope=col><fct></th><th scope=col><fct></th></tr>\n", | |
"</thead>\n", | |
"<tbody>\n", | |
"\t<tr><th scope=row>1</th><td>4.17</td><td>ctrl</td><td>tech1</td></tr>\n", | |
"\t<tr><th scope=row>2</th><td>5.58</td><td>ctrl</td><td>tech2</td></tr>\n", | |
"\t<tr><th scope=row>3</th><td>5.18</td><td>ctrl</td><td>tech1</td></tr>\n", | |
"\t<tr><th scope=row>4</th><td>6.11</td><td>ctrl</td><td>tech2</td></tr>\n", | |
"\t<tr><th scope=row>5</th><td>4.50</td><td>ctrl</td><td>tech1</td></tr>\n", | |
"\t<tr><th scope=row>6</th><td>4.61</td><td>ctrl</td><td>tech2</td></tr>\n", | |
"</tbody>\n", | |
"</table>\n" | |
], | |
"text/latex": [ | |
"A data.frame: 6 × 3\n", | |
"\\begin{tabular}{r|lll}\n", | |
" & weight & group & tech\\\\\n", | |
" & <dbl> & <fct> & <fct>\\\\\n", | |
"\\hline\n", | |
"\t1 & 4.17 & ctrl & tech1\\\\\n", | |
"\t2 & 5.58 & ctrl & tech2\\\\\n", | |
"\t3 & 5.18 & ctrl & tech1\\\\\n", | |
"\t4 & 6.11 & ctrl & tech2\\\\\n", | |
"\t5 & 4.50 & ctrl & tech1\\\\\n", | |
"\t6 & 4.61 & ctrl & tech2\\\\\n", | |
"\\end{tabular}\n" | |
], | |
"text/markdown": [ | |
"\n", | |
"A data.frame: 6 × 3\n", | |
"\n", | |
"| <!--/--> | weight <dbl> | group <fct> | tech <fct> |\n", | |
"|---|---|---|---|\n", | |
"| 1 | 4.17 | ctrl | tech1 |\n", | |
"| 2 | 5.58 | ctrl | tech2 |\n", | |
"| 3 | 5.18 | ctrl | tech1 |\n", | |
"| 4 | 6.11 | ctrl | tech2 |\n", | |
"| 5 | 4.50 | ctrl | tech1 |\n", | |
"| 6 | 4.61 | ctrl | tech2 |\n", | |
"\n" | |
], | |
"text/plain": [ | |
" weight group tech \n", | |
"1 4.17 ctrl tech1\n", | |
"2 5.58 ctrl tech2\n", | |
"3 5.18 ctrl tech1\n", | |
"4 6.11 ctrl tech2\n", | |
"5 4.50 ctrl tech1\n", | |
"6 4.61 ctrl tech2" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"tech <- rep(c('tech1', 'tech2'),nrow(PlantGrowth)/2)\n", | |
"PlantGrowth$tech <- as.factor(tech)\n", | |
"head(PlantGrowth)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"'data.frame':\t30 obs. of 3 variables:\n", | |
" $ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...\n", | |
" $ group : Factor w/ 3 levels \"ctrl\",\"trt1\",..: 1 1 1 1 1 1 1 1 1 1 ...\n", | |
" $ tech : Factor w/ 2 levels \"tech1\",\"tech2\": 1 2 1 2 1 2 1 2 1 2 ...\n" | |
] | |
} | |
], | |
"source": [ | |
"str(PlantGrowth)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"\n", | |
"Call:\n", | |
"lm(formula = weight ~ group + tech, data = PlantGrowth)\n", | |
"\n", | |
"Residuals:\n", | |
" Min 1Q Median 3Q Max \n", | |
"-0.9710 -0.3765 -0.0330 0.2100 1.2600 \n", | |
"\n", | |
"Coefficients:\n", | |
" Estimate Std. Error t value Pr(>|t|) \n", | |
"(Intercept) 5.1410 0.2280 22.550 <2e-16 ***\n", | |
"grouptrt1 -0.3710 0.2792 -1.329 0.1955 \n", | |
"grouptrt2 0.4940 0.2792 1.769 0.0886 . \n", | |
"techtech2 -0.2180 0.2280 -0.956 0.3478 \n", | |
"---\n", | |
"Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n", | |
"\n", | |
"Residual standard error: 0.6244 on 26 degrees of freedom\n", | |
"Multiple R-squared: 0.2891,\tAdjusted R-squared: 0.2071 \n", | |
"F-statistic: 3.525 on 3 and 26 DF, p-value: 0.02881\n" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"fit <- lm(weight ~ group + tech, data=PlantGrowth)\n", | |
"summary(fit)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 20, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<style>\n", | |
".dl-inline {width: auto; margin:0; padding: 0}\n", | |
".dl-inline>dt, .dl-inline>dd {float: none; width: auto; display: inline-block}\n", | |
".dl-inline>dt::after {content: \":\\0020\"; padding-right: .5ex}\n", | |
".dl-inline>dt:not(:first-of-type) {padding-left: .5ex}\n", | |
"</style><dl class=dl-inline><dt>(Intercept)</dt><dd>5.141</dd><dt>grouptrt1</dt><dd>-0.371</dd><dt>grouptrt2</dt><dd>0.494</dd><dt>techtech2</dt><dd>-0.218</dd></dl>\n" | |
], | |
"text/latex": [ | |
"\\begin{description*}\n", | |
"\\item[(Intercept)] 5.141\n", | |
"\\item[grouptrt1] -0.371\n", | |
"\\item[grouptrt2] 0.494\n", | |
"\\item[techtech2] -0.218\n", | |
"\\end{description*}\n" | |
], | |
"text/markdown": [ | |
"(Intercept)\n", | |
": 5.141grouptrt1\n", | |
": -0.371grouptrt2\n", | |
": 0.494techtech2\n", | |
": -0.218\n", | |
"\n" | |
], | |
"text/plain": [ | |
"(Intercept) grouptrt1 grouptrt2 techtech2 \n", | |
" 5.141 -0.371 0.494 -0.218 " | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"coefficients(fit)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
" group emmean SE df lower.CL upper.CL\n", | |
" ctrl 5.03 0.197 26 4.63 5.44\n", | |
" trt1 4.66 0.197 26 4.26 5.07\n", | |
" trt2 5.53 0.197 26 5.12 5.93\n", | |
"\n", | |
"Results are averaged over the levels of: tech \n", | |
"Confidence level used: 0.95 " | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"emmeans(fit, \"group\")" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### My question is, if I knew the coefficient of tech1, then ignoring both of tech(1,2) coefficients would be a way of estimating the effect of group? \n", | |
"\n", | |
"#### Is there a nice way of removing the unwanted effect? " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "R", | |
"language": "R", | |
"name": "ir" | |
}, | |
"language_info": { | |
"codemirror_mode": "r", | |
"file_extension": ".r", | |
"mimetype": "text/x-r-source", | |
"name": "R", | |
"pygments_lexer": "r", | |
"version": "4.0.4" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment