Skip to content

Event study in Stata  #145

@pdeffebach

Description

@pdeffebach

First of all, thank you for this wonderful resource!

I am confused by the Stata event study code, and think it might not be totally correct. For reference, here it is

use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear

* create the lag/lead for treated states
* fill in control obs with 0
* This allows for the interaction between `treat` and `time_to_treat` to occur for each state.
* Otherwise, there may be some NAs and the estimations will be off.
g time_to_treat = year - _nfd
replace time_to_treat = 0 if missing(_nfd)
* this will determine the difference
* btw controls and treated states
g treat = !missing(_nfd)

* Stata won't allow factors with negative values, so let's shift
* time-to-treat to start at 0, keeping track of where the true -1 is
summ time_to_treat
g shifted_ttt = time_to_treat - r(min)
summ shifted_ttt if time_to_treat == -1
local true_neg1 = r(mean)

* Regress on our interaction terms with FEs for group and year,
* clustering at the group (state) level
* use ib# to specify our reference group
reghdfe asmrs ib`true_neg1'.shifted_ttt pcinc asmrh cases, a(stfips year) vce(cluster stfips)

My problem stems from the line

replace time_to_treat = 0 if missing(_nfd)

This means that states which are not treated are given 0, meaning they are treated in that year. This gives the following


time_to_tre	
at	Freq.	Percent	Cum.
			
-21	1	0.06	0.06
-20	2	0.12	0.19
-19	2	0.12	0.31
-18	2	0.12	0.43
-17	2	0.12	0.56
-16	3	0.19	0.74
-15	3	0.19	0.93
-14	3	0.19	1.11
-13	6	0.37	1.48
-12	7	0.43	1.92
-11	9	0.56	2.47
-10	12	0.74	3.22
-9	22	1.36	4.58
-8	25	1.55	6.12
-7	32	1.98	8.10
-6	34	2.10	10.20
-5	36	2.23	12.43
-4	36	2.23	14.66
-3	36	2.23	16.88
-2	36	2.23	19.11
-1	36	2.23	21.34
0	465	28.76	50.09
1	36	2.23	52.32
2	36	2.23	54.55
3	36	2.23	56.77
4	36	2.23	59.00
5	36	2.23	61.22
6	36	2.23	63.45
7	36	2.23	65.68
8	36	2.23	67.90
9	36	2.23	70.13
10	36	2.23	72.36
11	36	2.23	74.58
12	35	2.16	76.75
13	34	2.10	78.85
14	34	2.10	80.95
15	34	2.10	83.06
16	34	2.10	85.16
17	33	2.04	87.20
18	33	2.04	89.24
19	33	2.04	91.28
20	30	1.86	93.14
21	29	1.79	94.93
22	27	1.67	96.60
23	24	1.48	98.08
24	14	0.87	98.95
25	11	0.68	99.63
26	4	0.25	99.88
27	2	0.12	100.00
			
Total	1,617	100.00

It's possible that because in control units, time_to_treat does not vary across years, the state (stfips) fixed effects "take care" of this. But I can't intuitively reason about what's really happening given 0 stands for both untreated and treated, but year 0.

I would recommend making the time_to_treat variable 100 or the maximum plus 100, to avoid this confusion. The values don't matter since they are used as fixed effects anyways.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions