Created
March 23, 2023 19:02
-
-
Save bbdaniels/39c5497fe62ed87f5d780395ba4fdc07 to your computer and use it in GitHub Desktop.
Debiasing regression with controls
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// Part 1: De-biasing a parameter estimate using controls | |
// Develop some data generating process for data X’s and for outcome Y, with some (potentially multi-armed) treatment variable and treatment effect. Like last week, you should strongly consider "simulating" data along the lines of your group project. | |
// This DGP should include strata groups and continuous covariates, as well as random noise. Make sure that the strata groups affect the outcome Y and are of different sizes, and make the probability that an individual unit receives treatment vary across strata groups. You will want to create the strata groups first, then use a command like expand or merge to add them to an individual-level data set. | |
cap prog drop runregs | |
prog def runregs, rclass | |
syntax anything // sample size goes here (divided by ten) | |
clear | |
set obs 4 | |
gen block = _n | |
expand block*`anything' | |
gen cov_conf = rnormal() | |
gen cov_nocy = rnormal() | |
gen cov_nocx = rnormal() | |
gen treat = (block/(4) + cov_conf + cov_nocx + rnormal()) > 0 | |
gen y = block + cov_conf /// block and cov_conf are confounders | |
+ 4*cov_nocy + 0.5*treat /// <-- true effect = 0.5 | |
+ 2*rnormal() // random noise | |
return scalar n = `c(N)' | |
reg y treat | |
return scalar b_nocov = _b[treat] | |
reg y treat i.block | |
return scalar b_block = _b[treat] | |
reg y treat i.block cov_conf | |
return scalar b_unbias = _b[treat] | |
reg y treat cov_nocy cov_nocx | |
return scalar b_bias1 = _b[treat] | |
reg y treat cov_nocy cov_nocx cov_conf | |
return scalar b_bias2 = _b[treat] | |
end | |
tempfile sims all | |
clear | |
save `all' , emptyok | |
qui foreach i in 10 20 40 80 160 320 640 { | |
simulate nocov=r(b_nocov) block=r(b_block) /// | |
bias1=r(b_bias1) bias2=r(b_bias2) /// | |
unbias=r(b_unbias) n=r(n) /// | |
, r(500) saving(`sims' , replace) /// | |
: runregs `i' | |
use `all' , clear | |
append using `sims' | |
save `all' , replace | |
} | |
use `all' , clear | |
tw /// | |
(lpolyci nocov n) /// | |
(lpolyci block n) /// | |
(lpolyci bias1 n) /// | |
(lpolyci bias2 n) /// | |
(lpolyci unbias n) /// | |
, legend(on c(1) pos(3) /// | |
order(2 "Naive" 3 "Strata" 4 "X Cov Only" /// | |
5 "X + Conf" 6 "Correct")) /// | |
yscale(r(0)) ylab(#6 0.5 "True") ytit("Coefficient Estimates") /// | |
xscale(log) xlab(100 200 400 800 1600 3200 6400) xtit("Observations") | |
// Make sure that at least one of the continuous covariates also affects both the outcome and the likelihood of receiving treatment (a "confounder"). Make sure that another one of the covariates affects the outcome but not the treatment. Make sure that another one affects the treatment but not the outcome. (What do these do?) | |
// Construct at least five different regression models with combinations of these covariates and strata fixed effects. (Type h fvvarlist for information on using fixed effects in regression.) Run these regressions at different sample sizes, using a program like last week. Collect as many regression runs as you think you need for each, and produce figures and tables comparing the biasedness and convergence of the models as N grows. Can you produce a figure showing the mean and variance of beta for different regression models, as a function of N? Can you visually compare these to the "true" parameter value? | |
// Fully describe your results in your README.md file, including figures and tables as appropriate. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment