In certain scenarios, we want to estimate a model's parameters for each observation on a sample that excludes that observation. This can be achieved by re-estimating the model on every leave-one-out sample, but doing so is very inefficient. If we instead estimate the model once on the full sample, the coefficient estimates are biased for this purpose, because each observation helps determine the coefficients that are applied to it. Thankfully, the Jackknife method corrects for this bias and produces the (Jackknifed) coefficient estimates for each observation.
Let's start with some variable definitions to help with the explanation.
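Variable | Definition |
---|---|
$b_{(i)}$ | the parameter estimates after deleting the $i$th observation |
$s^2_{(i)}$ | the variance estimate after deleting the $i$th observation |
$X_{(i)}$ | the $X$ matrix without the $i$th observation |
$\hat{y}_{(i)}$ | the $i$th value predicted without using the $i$th observation |
$r_i = y_i - \hat{y}_i$ | the $i$th residual |
$h_i = x_i (X'X)^{-1} x_i'$ | the $i$th diagonal of the projection matrix for the predictor space, also called the hat matrix |
$\text{RSTUDENT} = \frac{r_i}{s_{(i)}\sqrt{1-h_i}}$ | studentized residual |
$(X'X)_{jj}$ | the $(j,j)$th element of $(X'X)^{-1}$ |
$\text{DFBETAS}_j = \frac{b_j - b_{(i)j}}{s_{(i)}\sqrt{(X'X)_{jj}}}$ | the scaled measures of the change in the $j$th parameter estimate calculated by deleting the $i$th observation |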
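Compute the coefficient estimates with the $i$th observation excluded from the sample, i.e. $b_{(i)}$.

From the table above, we can get that the $j$th Jackknifed coefficient estimate is

$$
b_{(i)j} = b_j - \text{DFBETAS}_j \cdot s_{(i)} \sqrt{(X'X)_{jj}}, \qquad s_{(i)} = \frac{r_i}{\text{RSTUDENT}\,\sqrt{1-h_i}}.
$$

Hence,

$$
b_{(i)j} = b_j - \text{DFBETAS}_j \cdot \frac{r_i}{\text{RSTUDENT}\,\sqrt{1-h_i}} \cdot \sqrt{(X'X)_{jj}}.
$$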
The good thing is that `PROC REG` produces the coefficient estimate $b_j$, and the `INFLUENCE` and `I` options produce the remaining statistics, which are just enough to compute $b_{(i)j}$.
Variable | Option in PROC REG or MODEL statement | Name in the output dataset |
---|---|---|
$b_j$ | `Outest=` option in PROC REG | `<jthVariable>` |
$r_i$ | `OutputStatistics=` from `INFLUENCE` option in MODEL statement | `Residual` |
$\text{RSTUDENT}$ | `OutputStatistics=` from `INFLUENCE` option in MODEL statement | `RStudent` |
$h_i$ | `OutputStatistics=` from `INFLUENCE` option in MODEL statement | `HatDiagonal` |
$\text{DFBETAS}_j$ | `OutputStatistics=` from `INFLUENCE` option in MODEL statement | `DFB_<jthVariable>` |
$(X'X)_{jj}$ | `InvXPX=` from `I` option in MODEL statement | `<jthVariable>` |
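To see how these pieces combine, here is a worked example with made-up values (not taken from any dataset): $b_j = 0.50$, $\text{DFBETAS}_j = 0.20$, $r_i = 0.03$, $\text{RSTUDENT} = 1.5$, $h_i = 0.19$ and $(X'X)_{jj} = 4$. It first backs out $s_{(i)}$ from the studentized residual and then undoes the DFBETAS scaling:

$$
\begin{aligned}
s_{(i)} &= \frac{r_i}{\text{RSTUDENT}\,\sqrt{1-h_i}} = \frac{0.03}{1.5 \times \sqrt{0.81}} = \frac{0.03}{1.35} \approx 0.0222 \\
b_{(i)j} &= b_j - \text{DFBETAS}_j \cdot s_{(i)} \cdot \sqrt{(X'X)_{jj}} \approx 0.50 - 0.20 \times 0.0222 \times 2 \approx 0.4911
\end{aligned}
$$

So deleting this hypothetical observation would move the $j$th coefficient from 0.50 to about 0.4911, without re-running the regression.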
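Suppose we want to calculate firm-level discretionary accruals for each year using the Jones (1991) model and the Kothari et al. (2005) model. For a firm in a given industry-year, the model coefficients should be estimated on that industry-year sample with the firm itself excluded, which is what the Jackknifed estimates provide.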
Below is an example `PROC REG` that produces three datasets named `work.params`, `work.outstats` and `work.xpxinv`, which contain sufficient statistics to compute the Jackknifed estimates and thus the predicted accruals.
```sas
ods listing close;
proc reg data=work.funda edf outest=work.params;
    /* industry-year regression */
    by fyear sic2;
    /* id is necessary for later matching Jackknifed coefficients to firm-year */
    id key;
    /* Jones Model */
    Jones: model tac = inv_at_l drev ppe / noint influence i;
    /* Kothari Model with ROA */
    Kothari: model tac = inv_at_l drevadj ppe roa / noint influence i;
    /* capture the INFLUENCE diagnostics and the inverse of X'X as datasets */
    ods output OutputStatistics=work.outstats InvXPX=work.xpxinv;
run;
ods listing;
```
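Once each firm-year row carries its own diagnostics together with the full-sample coefficient and the matching diagonal element of $(X'X)^{-1}$, the Jackknifed coefficient is a short DATA step calculation. The following is a minimal sketch rather than the full code: the merge of the three datasets is omitted, the dataset `work.merged_demo` and the columns `b_j`, `DFB_j` and `xpxinv_jj` are hypothetical stand-ins for the `<jthVariable>` columns, and the numbers are made up for illustration.

```sas
/* Hypothetical merged input: one row per observation for a single coefficient j.
   All values are made up for illustration only. */
data work.merged_demo;
    input key b_j Residual RStudent HatDiagonal DFB_j xpxinv_jj;
    datalines;
1 0.50 0.03 1.5 0.19 0.20 4
2 0.50 0.00 0.0 0.10 0.00 4
;
run;

data work.jackknifed_demo;
    set work.merged_demo;
    /* a zero residual means deleting the row leaves the coefficient unchanged */
    if RStudent = 0 then b_jack = b_j;
    else do;
        /* back out s_(i) from the studentized residual */
        s_i = Residual / (RStudent * sqrt(1 - HatDiagonal));
        /* undo the DFBETAS scaling to get the leave-one-out coefficient */
        b_jack = b_j - DFB_j * s_i * sqrt(xpxinv_jj);
    end;
run;

proc print data=work.jackknifed_demo noobs;
    var key b_j s_i b_jack;
run;
```

For the first row, `s_i` is 0.03 / 1.35 and `b_jack` comes out at about 0.4911. The second row shows the guard for a zero studentized residual, where the division would otherwise be 0/0. In a real run the same calculation would be repeated for every regressor in each model, for example with arrays.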
The full code for computing five measures of firm-year discretionary accruals can be found on my blog.