* Lorena G. Barberia
* Adapted from Andy Philips (2013)
* 31/10/2019
* Lab Class 10 - Clarify
* Bootstrapping and Clarify
* In this lab, we will briefly explore how bootstrapping and Clarify work.
*
* -----------------------------------------------------------------------

* This .do file is based on generated data
* Data Generating Process:
clear
set seed 345
set obs 120
gen e1 = rnormal()
gen x1 = rnormal()
gen x2 = rnormal()
sum e1 x1 x2
gen y = 2*x1 + 3*x2 + e1
kdensity y

* Now let's examine the regression results for the DGP
regress y x1 x2
estimates store m1_original

* Let's examine the residuals and fitted values
predict uhat_m1_original, resid
predict yhat_m1_original, xb
sum uhat_m1_original yhat_m1_original

* BOOTSTRAPPING --------------------------------------------------------------
/* Bootstrapped standard errors for a statistic can be obtained as follows:
1. write a program (only if you are bootstrapping a custom statistic;
   a sketch follows the example below)
2. load in the data
3. drop missing values (Stata will not flag them for you)
4. drop unneeded variables (this speeds up the bootstrap)
5. set the seed
6. run bootstrap */

* First drop any missing obs
foreach var in y x1 x2 {
	drop if `var' == .
}
reg y x1 x2
bootstrap, reps(1000): regress y x1 x2
estimates store m2_bootstrap
predict uhat_m2_bootstrap, resid
predict yhat_m2_bootstrap, xb
sum uhat_m1_original uhat_m2_bootstrap yhat_m1_original yhat_m2_bootstrap
* if needed: ssc install coefplot
coefplot m1_original m2_bootstrap, drop(_cons) xline(0)
/* Can you identify the difference produced by the bootstrapping? */
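* A minimal sketch of step 1 in the list above: bootstrapping a user-written
* (rclass) statistic. The program name ratio_x1x2 and the choice of statistic
* (the ratio of the two slopes) are illustrative assumptions, not part of the
* original lab.
capture program drop ratio_x1x2
program define ratio_x1x2, rclass
	regress y x1 x2
	return scalar ratio = _b[x1]/_b[x2]
end
bootstrap ratio = r(ratio), reps(1000) seed(345): ratio_x1x2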
* Jackknife --------------------------------------------------------------
reg y x1 x2, vce(jackknife)
estimates store m3_jacknife
predict uhat_m3_jacknife, resid
predict yhat_m3_jacknife, xb
sum uhat_m1_original uhat_m2_bootstrap uhat_m3_jacknife yhat_m1_original yhat_m2_bootstrap yhat_m3_jacknife
coefplot m1_original m2_bootstrap m3_jacknife, drop(_cons) xline(0)
/* Can you identify the difference produced by the jackknife? */

* Clarify --------------------------------------------------------------
* To install Clarify (run once):
* net from http://gking.harvard.edu/clarify
* net install clarify
/* Clarify can do some of the things we have done so far more quickly; look at
the help file for more information on the commands we will use in this lab */
estsimp regress y x1 x2
* Note the labels for the new variables b1, b2, b3, b4:
* b1 and b2 are the simulated coefficients on x1 and x2
* b3 is the simulated parameter for the constant
* b4 is sigma^2
sum
* How do the standard deviations of b1-b3 compare with the original standard
* errors of the coefficients?
* What about the coefficients themselves? What differences do you recognize?

* Now let's compare these outputs with the uhat above
gen uhat_clarify = sqrt(b4)
sum uhat_m1_original uhat_m2_bootstrap uhat_m3_jacknife uhat_clarify

* Let's compare predicted values of y with pv and ev
drop b1 b2 b3 b4
estsimp regress y x1 x2
setx mean
simqi, pv
simqi, ev

* Let's compare with the original yhat and the yhat under bootstrap and jackknife.
* Note that these are standard deviations, and we would need to calculate the
* standard errors to compare them directly.
sum yhat_m1_original yhat_m2_bootstrap yhat_m3_jacknife

* Questions
* 1. How do the estimates from the bootstrap model compare to the original results? Why?
* 2. How do the jackknife estimates compare with the original regression estimates? Why?
* 3. How do the Clarify estimates compare with the original regression estimates? Why?
* 4. How do the standard errors from Clarify compare to the original, bootstrap, and jackknife results? Why?
* 5. Why do the results of the Clarify simqi command differ between expected values (ev option) and predicted values (pv option)?
* 6. How does running Clarify relate to the robustness tests we ran in the last lab? What are the differences?
* 7. What are the advantages and disadvantages of each of the methods discussed in this lab?
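* -----------------------------------------------------------------------
* Optional sketch for question 5, using core Stata only (no Clarify). This is
* a minimal illustration under simplifying assumptions: the betas are drawn
* from a multivariate normal centered on the estimates, and sigma is held
* fixed at the root MSE rather than simulated (unlike Clarify). The sim_*
* variable names are illustrative, not part of the lab.
* Expected values reflect only estimation uncertainty in the betas; predicted
* values add a fresh error draw, so their spread is wider.
preserve
quietly regress y x1 x2
matrix b = e(b)
matrix V = e(V)
scalar rmse = e(rmse)
sum x1, meanonly
scalar x1bar = r(mean)
sum x2, meanonly
scalar x2bar = r(mean)
clear
set seed 345
set obs 1000
drawnorm sim_b1 sim_b2 sim_cons, means(b) cov(V)
* systematic component evaluated at the means of x1 and x2 (the "ev" analogue)
gen sim_ev = sim_b1*x1bar + sim_b2*x2bar + sim_cons
* add fundamental uncertainty: one error draw per simulation (the "pv" analogue)
gen sim_pv = sim_ev + rnormal(0, rmse)
sum sim_ev sim_pv
restore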