First of all, thank you for this wonderful resource!
I am confused by the Stata event study code and think it might not be entirely correct. For reference, here it is:
use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear
* create the lag/lead for treated states
* fill in control obs with 0
* This allows for the interaction between `treat` and `time_to_treat` to occur for each state.
* Otherwise, there may be some NAs and the estimations will be off.
g time_to_treat = year - _nfd
replace time_to_treat = 0 if missing(_nfd)
* this will determine the difference
* btw controls and treated states
g treat = !missing(_nfd)
* Stata won't allow factors with negative values, so let's shift
* time-to-treat to start at 0, keeping track of where the true -1 is
summ time_to_treat
g shifted_ttt = time_to_treat - r(min)
summ shifted_ttt if time_to_treat == -1
local true_neg1 = r(mean)
* Regress on our interaction terms with FEs for group and year,
* clustering at the group (state) level
* use ib# to specify our reference group
reghdfe asmrs ib`true_neg1'.shifted_ttt pcinc asmrh cases, a(stfips year) vce(cluster stfips)
My problem stems from the line:
replace time_to_treat = 0 if missing(_nfd)
This sets time_to_treat to 0 in every year for states that are never treated, which is the same value treated states have in their treatment year. Tabulating time_to_treat gives the following:
time_to_treat Freq. Percent Cum.
-21 1 0.06 0.06
-20 2 0.12 0.19
-19 2 0.12 0.31
-18 2 0.12 0.43
-17 2 0.12 0.56
-16 3 0.19 0.74
-15 3 0.19 0.93
-14 3 0.19 1.11
-13 6 0.37 1.48
-12 7 0.43 1.92
-11 9 0.56 2.47
-10 12 0.74 3.22
-9 22 1.36 4.58
-8 25 1.55 6.12
-7 32 1.98 8.10
-6 34 2.10 10.20
-5 36 2.23 12.43
-4 36 2.23 14.66
-3 36 2.23 16.88
-2 36 2.23 19.11
-1 36 2.23 21.34
0 465 28.76 50.09
1 36 2.23 52.32
2 36 2.23 54.55
3 36 2.23 56.77
4 36 2.23 59.00
5 36 2.23 61.22
6 36 2.23 63.45
7 36 2.23 65.68
8 36 2.23 67.90
9 36 2.23 70.13
10 36 2.23 72.36
11 36 2.23 74.58
12 35 2.16 76.75
13 34 2.10 78.85
14 34 2.10 80.95
15 34 2.10 83.06
16 34 2.10 85.16
17 33 2.04 87.20
18 33 2.04 89.24
19 33 2.04 91.28
20 30 1.86 93.14
21 29 1.79 94.93
22 27 1.67 96.60
23 24 1.48 98.08
24 14 0.87 98.95
25 11 0.68 99.63
26 4 0.25 99.88
27 2 0.12 100.00
Total 1,617 100.00
It's possible that, because time_to_treat does not vary across years within control units, the state (stfips) fixed effects "take care" of this. But I can't reason intuitively about what is really happening when 0 stands for both never-treated states and treated states in their treatment year.
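To make the double meaning of 0 concrete, the zeros can be split by treatment status. This is only a minimal sketch that reuses the variable definitions from the code quoted above; it shows how many of the 465 zeros come from never-treated states and how many from treated states in their treatment year, and it does not test whether the fixed effects absorb the difference.

```stata
use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear

* Same construction as in the code quoted above
g time_to_treat = year - _nfd
replace time_to_treat = 0 if missing(_nfd)
g treat = !missing(_nfd)

* Split the zeros: treat == 0 rows are never-treated states,
* treat == 1 rows are treated states in their treatment year
tab treat if time_to_treat == 0

* Never-treated states should show no within-state variation in time_to_treat
bysort stfips: egen sd_ttt = sd(time_to_treat)
summ sd_ttt if !treat
```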
I would recommend setting time_to_treat for never-treated states to 100, or to the maximum plus 100, to avoid this confusion. The specific values don't matter, since the variable is only used as a set of indicators anyway.
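The suggestion could look roughly like the sketch below. It is an illustration under stated assumptions, not a tested replacement for the page's code: the local `never` is a name introduced here, and the offset of 100 is arbitrary. I would expect the indicator for the placeholder level to be dropped as collinear with the state fixed effects, since it is constant within never-treated states, but I have not verified that the remaining coefficients match the original output.

```stata
use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear

g treat = !missing(_nfd)
g time_to_treat = year - _nfd

* Assumption: code never-treated states with a value outside the observed
* range instead of 0 (summ ignores the missing values for controls)
summ time_to_treat
local never = r(max) + 100
replace time_to_treat = `never' if missing(_nfd)

* Sanity check: the placeholder is used only by never-treated states
assert treat == 0 if time_to_treat == `never'
assert time_to_treat != `never' if treat == 1

* Shift so that the factor variable has no negative values,
* keeping track of where the true -1 is
summ time_to_treat
g shifted_ttt = time_to_treat - r(min)
summ shifted_ttt if time_to_treat == -1
local true_neg1 = r(mean)

* Same regression as before, now with a separate level for never-treated states
reghdfe asmrs ib`true_neg1'.shifted_ttt pcinc asmrh cases, a(stfips year) vce(cluster stfips)
```

Comparing the coefficients from this run with those from the original code would show directly whether the fixed effects were already handling the shared 0.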