* Lab 12
* Cross-Sectional, Time-Series, and Time-Series Cross-Sectional Approaches to Studying the Incumbency Advantage
* Lorena G. Barberia

clear
use "C:\Users\Lorena Barberia\Dropbox\2019 USP Methods II\Labs\Lab Class 12 - Economic Voting with Time Series\JOPrepfile1.dta" 

* The purpose of this lab is to illustrate three different research designs using the same data to test the hypothesis of the incumbency advantage.
* The baseline specification is the regression that Jacobson (2015) ran to construct Figure 1.
* We are going to run three distinct analyses:

* 1) a cross-sectional model of the incumbency advantage in a single election across all districts
* 2) a time-series model of the incumbency advantage in a single district across elections
* 3) a pooled time-series cross-sectional model of the incumbency advantage across districts and across elections

* Part I.  A Cross-Sectional Test in a Single Election

* We are going to examine the average effect in a single election year across districts first.
tab year
keep if year==2000

eststo cs: reg dv dvp ptynow inc3 

* a) What is the interpretation here of the lag of the dependent variable?
* b) Why are there only 305 observations if there are 435 districts?
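* A minimal sketch for exploring question b). The commands below count the rows
* kept for 2000, tabulate missing values in the model variables, and count the
* rows Stata actually used in the regression stored as cs. They only describe
* the estimation sample; they do not by themselves explain why rows are missing.
count
misstable summarize dv dvp ptynow inc3
estimates restore cs
count if e(sample)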

* Let's add state fixed effects and re-estimate the model.

eststo csfe: reg dv dvp ptynow inc3 i.state

* c) What is the interpretation here of the lag of the dependent variable once state fixed effects are included?

esttab cs csfe
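* One possible way to make the comparison table easier to read, shown only as a
* sketch: report standard errors instead of the default t statistics and keep
* only the substantive coefficients, so the state dummies do not crowd the table.
esttab cs csfe, se keep(dvp ptynow inc3 _cons)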

* Part II. A Univariate Time Series Test 
* We are going to focus our analysis on a single district across time.

clear
use "C:\Users\Lorena Barberia\Dropbox\2019 USP Methods II\Labs\Lab Class 12 - Economic Voting with Time Series\JOPrepfile1.dta" 
tab stcd 

* Let's pick district 701 and then tell Stata that we are working with time-series data.

keep if stcd==701
tsset year

* Let's look at our dv across elections in this district.
 
tsline dv, yline(50)

* We are going to examine the average effect in a single district across years.

eststo ts701: reg dv dvp ptynow inc3 

* a) What is the interpretation here of the lag of the dependent variable?

* b) Based on the analysis below, what can we conclude about serial correlation? 
predict uhatts, resid
tsline uhatts

* Let's install the user-written package that provides ofrtplot, which we use next.
findit gr0009

* Here is the same analysis with more elegant code and a nicer figure.
qui reg dv dvp ptynow inc3 
ofrtplot 

* Does it look like we have a pattern that suggests we may have serial correlation?

* We could also run some formal tests, but these do not work well because we have gaps in the data.
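
* A hedged sketch of one such formal test, the Breusch-Godfrey test. It assumes
* the elections in this file are spaced two years apart, so the time variable is
* redeclared with delta(2); that spacing is an assumption, not something stated
* in the lab. tsreport lists any remaining gaps. Because the series has gaps,
* treat the test result as illustrative only.
tsset year, delta(2)
tsreport
qui reg dv dvp ptynow inc3
estat bgodfrey, lags(1)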

* Part III. A Pooled Time-Series Cross-Sectional (TSCS) Test

clear
use "C:\Users\Lorena Barberia\Dropbox\2019 USP Methods II\Labs\Lab Class 12 - Economic Voting with Time Series\JOPrepfile1.dta" 
xtset stcd year


* Let's look at our dv across elections for a select number of districts
xtline dv if stcd<=200

* Let's run two TSCS models. In the second, we will use standard errors clustered at the district level.

eststo tscs: reg dv dvp ptynow inc3 
predict uhattscs1, resid

eststo tscsclusteredse: reg dv dvp ptynow inc3, vce(cluster stcd)
predict uhattscs2, resid

esttab tscs tscsclusteredse

* a) What is the interpretation here of the lag of the dependent variable?

* b) What patterns (if any) do we see in the residuals?

xtline uhattscs1 uhattscs2 if stcd<=200
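
* A quick check worth running here, offered as a sketch: clustering changes only
* the standard errors, not the coefficients, so the two residual series should
* be identical and the two lines in each panel of the plot should overlap.
compare uhattscs1 uhattscs2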