Getting Started With Stacked DID

A side-by-side comparison of R and Stata code

Authors

Coady Wing, Alex Hollingsworth, and Seth Freedman

Quick Start

The stacked DID estimator we describe in our paper is easy to implement using standard regression packages and commands in both Stata and R. The main task is to build the stacked data set and use sample shares to compute the weights.

We wrote two functions – available in both Stata and R – to automate this process. Each function is a bit longer and more repetitive than it strictly needs to be, but we’ve opted for this more verbose style in the tutorial to keep things as clear as possible. We will also provide a faster, tidier single function for both R and Stata that handles more complex settings.

This tutorial shows you how to implement a stacked DID regression using these functions. The mission is simply to show you how to make a stacked data set, how to construct the weights, and how to estimate the regression. We focus on the example from the paper, which estimates the effects of the ACA Medicaid expansion on uninsurance rates using data from the American Community Survey downloaded from IPUMS.

Load the data

To get started, load the data set acs1860_unins_2008_2021.csv. It’s a long-form panel with one row for every state-year from 2008 to 2021. That’s 51 states \(\times\) 14 years = 714 total observations.

The key outcome variable is unins, which measures the fraction of people ages 18 to 60 who do not have health insurance. The adopt_year variable records the year in which the state first adopted the ACA Medicaid expansion. States that have not yet adopted the expansion are coded as NA.

pacman::p_load(Statamarkdown, tidyverse, ggthemes, rio, 
                geomtextpath, gghighlight, data.table,
                collapse, fixest, modelsummary)

dtc = fread("data/acs1860_unins_2008_2021.csv")
import delimited "data/acs1860_unins_2008_2021.csv", clear
(encoding automatically selected: ISO-8859-1)
(5 vars, 714 obs)
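As a quick sanity check before building anything, it can help to confirm the panel dimensions in R (a minimal sketch, using the statefip and year columns that appear in the summaries below):

# Confirm the panel is 51 states x 14 years = 714 state-year rows
dtc[, uniqueN(statefip)]  # expect 51
dtc[, uniqueN(year)]      # expect 14
nrow(dtc)                 # expect 714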

The create_sub_exp() function

We use a function called create_sub_exp() to construct a sub-experimental data set for a specified adoption year. The code that defines the function is below in both Stata and R.

The inputs and outputs of create_sub_exp() are:

  • inputs:
    • dataset: underlying panel dataset
    • timeID: name of the variable measuring calendar time
    • groupID: name of the variable indicating group
    • adoptionTime: name of the variable containing the treatment adoption time for each unit; the variable is set to NA for those that never adopt
    • focalAdoptionTime: user specified number indicating the focal adoption time for this sub-experiment
    • kappa_pre: user specified number of pre-treatment periods
    • kappa_post: user specified number of post-treatment periods
  • outputs: A data.table consisting of the treated units that adopt in the focal adoption time, the clean controls that do not adopt until after \(focalAdoptionTime + \kappa_{post}\) (including the never-adopters), and only the calendar time periods that fall within \(focalAdoptionTime - \kappa_{pre}\) and \(focalAdoptionTime + \kappa_{post}\). The output table has one row for each state-year observation in the sub-experimental data set. It also includes a feasible variable indicating whether a given observation is “allowed” given our trimming restrictions; for example, with data from 2008 to 2021, \(\kappa_{pre} = 3\), and \(\kappa_{post} = 2\), the 2020 and 2021 adoption events are infeasible because their post-treatment windows extend beyond 2021.

The code to create the create_sub_exp() function in Stata and R is below.

create_sub_exp = function(dataset, timeID, groupID, adoptionTime, focalAdoptionTime, kappa_pre, kappa_post){
  
  # Copy dataset 
  dt_temp = copy(dataset)

  # Determine earliest and latest time in the data. 
        # Used for feasibility check later
  minTime = dt_temp[, fmin(get(timeID))]
  maxTime = dt_temp[, fmax(get(timeID))]
  
  # Include only the treated group and the clean controls that adopt more than kappa_post periods after the focal adoption time (or never adopt).
  dt_temp = dt_temp[get(adoptionTime) == focalAdoptionTime | get(adoptionTime) > focalAdoptionTime + kappa_post | is.na(get(adoptionTime))]
  
  # Limit to time periods inside the event window defined by the kappas
  dt_temp = dt_temp[get(timeID) %in% (focalAdoptionTime - kappa_pre):(focalAdoptionTime + kappa_post)]
  
  # Make treatment group dummy
  dt_temp[, treat := 0]
  dt_temp[get(adoptionTime) == focalAdoptionTime, treat := 1] 
  
  # Make a post variable
  dt_temp[, post := fifelse(get(timeID) >= focalAdoptionTime, 1, 0)]
  
  # Make event time variable
  dt_temp[, event_time := get(timeID) - focalAdoptionTime]
  
  # Create a feasible variable
  dt_temp[, feasible := fifelse(focalAdoptionTime - kappa_pre >= minTime & focalAdoptionTime + kappa_post <= maxTime, 1, 0)]
  
  # Make a sub experiment ID.
  dt_temp[, sub_exp := focalAdoptionTime]
  
  return(dt_temp)
} 
/* Create sub-experiment data for stack */

* clear programs
capture program drop _all

* start new program
program create_sub_exp
syntax, ///
    timeID(string) ///
    groupID(string) ///
    adoptionTime(string) ///
    focalAdoptionTime(int) ///
    kappa_pre(numlist) ///
    kappa_post(numlist)
    * Suppress output
    qui{
        * Save dataset in memory, so we can call this function multiple times. 
        preserve

        * Determine earliest and latest time in the data. 
            * Used for feasibility check later
        sum `timeID'
        local minTime = r(min)
        local maxTime = r(max)

        
        *variable to label the sub-experiment: set for units treated in focalAdoptionTime
        gen sub_exp = `focalAdoptionTime' if `adoptionTime' == `focalAdoptionTime'
        
        *Now fill in this variable for states with adoptionTime > focalAdoptionTime + kappa_post
        *note: this will include the never-treated, because adopt_year is ., which Stata counts as infinity
        replace sub_exp = `focalAdoptionTime' if `adoptionTime' > `focalAdoptionTime' + `kappa_post'
        
        *Keep only treated and clean controls
        keep if sub_exp != .
        
        *gen treat variable in subexperiment
        gen treat = `adoptionTime' == `focalAdoptionTime'
        
        *gen event_time
        gen event_time = year - sub_exp
        
        *gen post variable
        gen post = event_time >= 0
        
        *trim based on kappas: -kappa_pre <= event_time <= kappa_post
        keep if inrange(event_time, -`kappa_pre', `kappa_post')
        
        *gen feasible flag
        gen feasible = 0 
        replace feasible = 1 if !missing(`adoptionTime')
        replace feasible = 0 if `adoptionTime' < `minTime' + `kappa_pre' 
        replace feasible = 0 if `adoptionTime' > `maxTime' - `kappa_post' 
        drop if `adoptionTime' < `minTime' + `kappa_pre' 

        * Save dataset
        compress
        save temp/subexp`focalAdoptionTime', replace
        restore
    }
end

Making sub-experimental data sets

Let’s use create_sub_exp() to build a single sub-experimental data set for the 2014 adoption event.

# Run this function with focal year 2014
subexp2014 = create_sub_exp(
              dataset = dtc,
              timeID = "year",
              groupID = "statefips", 
              adoptionTime = "adopt_year", 
              focalAdoptionTime = 2014,
              kappa_pre = 3,
              kappa_post = 2)

# Summarize
datasummary(All(subexp2014) ~ N + Mean + SD + Min + Max,
            data = subexp2014,
            output = 'markdown')
|            |   N |    Mean |    SD |     Min |     Max |
|:-----------|----:|--------:|------:|--------:|--------:|
| statefip   | 276 |   29.63 | 15.80 |    1.00 |   56.00 |
| year       | 276 | 2013.50 |  1.71 | 2011.00 | 2016.00 |
| adopt_year | 210 | 2015.20 |  2.43 | 2014.00 | 2021.00 |
| unins      | 276 |    0.17 |  0.06 |    0.04 |    0.32 |
| treat      | 276 |    0.61 |  0.49 |    0.00 |    1.00 |
| post       | 276 |    0.50 |  0.50 |    0.00 |    1.00 |
| event_time | 276 |   -0.50 |  1.71 |   -3.00 |    2.00 |
| feasible   | 276 |    1.00 |  0.00 |    1.00 |    1.00 |
| sub_exp    | 276 | 2014.00 |  0.00 | 2014.00 | 2014.00 |
* Save dataset
preserve

* Run this function with focal year 2014
create_sub_exp, ///
    timeID(year) ///
    groupID(statefip) ///
    adoptionTime(adopt_year) ///
    focalAdoptionTime(2014) ///
    kappa_pre(3) ///
    kappa_post(2)

* Open temp dataset created with function
use temp/subexp2014.dta, clear

* Summarize
sum statefip year adopt_year unins  treat  post event_time feasible sub_exp 

* Restore dataset
restore
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    statefip |        276    29.63043    15.80337          1         56
        year |        276      2013.5    1.710927       2011       2016
  adopt_year |        210      2015.2    2.429484       2014       2021
       unins |        276    .1658948    .0646393    .037867   .3239669
       treat |        276    .6086957    .4889288          0          1
-------------+---------------------------------------------------------
        post |        276          .5    .5009083          0          1
  event_time |        276         -.5    1.710927         -3          2
    feasible |        276    .6521739    .4771457          0          1
     sub_exp |        276        2014           0       2014       2014

Build the stack of sub-experiments

In practice, we aren’t usually interested in creating a single sub-experimental dataset. We want to create a family of sub-experimental datasets – one for each adoption event.

The next step is to use a loop to run the create_sub_exp() function on each adoption year. This will produce a collection of sub-experimental datasets.

Then we can append the individual sub-experiments into one large dataset that we will call stacked_dtc. With the stacked data set in hand, we remove the sub-experiments that are not feasible. This imposes the inclusion criteria required for a compositionally balanced event study design.

# create the sub-experimental data sets
events = dtc[is.na(adopt_year) == FALSE, funique(adopt_year)]

# make a list to store the sub experiments in.
sub_experiments = list()

# Loop over the events and make a data set for each one
for (j in events) {
  sub_name = paste0("sub_",j) 
  sub_experiments[[sub_name]] = create_sub_exp(
              dataset = dtc,
              timeID = "year",
              groupID = "statefips", 
              adoptionTime = "adopt_year", 
              focalAdoptionTime = j,
              kappa_pre = 3,
              kappa_post = 2)
}

# Vertically concatenate the sub-experiments
stackfull = rbindlist(sub_experiments)

# Remove the sub-experiments that are not feasible
stacked_dtc = stackfull[feasible == 1]

# Summarize
datasummary(All(stacked_dtc) ~ N + Mean + SD + Min + Max,
            data = stacked_dtc,
            output = 'markdown')
|            |   N |    Mean |    SD |     Min |     Max |
|:-----------|----:|--------:|------:|--------:|--------:|
| statefip   | 600 |   30.88 | 15.96 |    1.00 |   56.00 |
| year       | 600 | 2014.76 |  2.37 | 2011.00 | 2021.00 |
| adopt_year | 336 | 2016.55 |  2.88 | 2014.00 | 2021.00 |
| unins      | 600 |    0.17 |  0.06 |    0.04 |    0.32 |
| treat      | 600 |    0.35 |  0.48 |    0.00 |    1.00 |
| post       | 600 |    0.50 |  0.50 |    0.00 |    1.00 |
| event_time | 600 |   -0.50 |  1.71 |   -3.00 |    2.00 |
| feasible   | 600 |    1.00 |  0.00 |    1.00 |    1.00 |
| sub_exp    | 600 | 2015.26 |  1.64 | 2014.00 | 2019.00 |
# Treated, control, and total count by stack
stacked_dtc[event_time==0, 
            .(N_treated = fsum(treat), 
              N_control = fsum(1-treat), 
              N_total = .N
              ), 
            by = sub_exp][order(sub_exp)]
   sub_exp N_treated N_control N_total
1:    2014        28        18      46
2:    2015         3        18      21
3:    2016         2        18      20
4:    2019         2        11      13

//create the sub-experimental data sets

levelsof adopt_year, local(alist)
di "`alist'"
qui{
// Loop over the events and make a data set for each one
foreach j of numlist `alist' { 
  // Preserve dataset
  preserve

  // run function
  create_sub_exp, ///
    timeID(year) ///
    groupID(statefip) ///
    adoptionTime(adopt_year) ///
    focalAdoptionTime(`j') ///
    kappa_pre(3) ///
    kappa_post(2)

  // restore dataset
  restore
}

// Append the stacks together, but only from feasible stacks

// Determine earliest and latest time in the data.
// Used for feasibility check later
sum year
local minTime = r(min)
local maxTime = r(max)
local kappa_pre = 3
local kappa_post = 2

gen feasible_year = adopt_year
replace feasible_year = . if adopt_year < `minTime' + `kappa_pre' 
replace feasible_year = . if adopt_year > `maxTime' - `kappa_post' 
sum feasible_year

local minadopt = r(min)
levelsof feasible_year, local(alist)
clear
foreach j of numlist `alist'  {
    display `j'
    if `j' == `minadopt' use temp/subexp`j', clear
    else append using temp/subexp`j'
}

// Clean up 
* erase temp/subexp`j'.dta
}
* Summarize
sum statefip year adopt_year unins  treat  post event_time feasible sub_exp

* Treated, control, and total count by stack
preserve
keep if event_time == 0
gen N_treated = treat 
gen N_control = 1 - treat 
gen N_total = 1
collapse (sum) N_treated N_control N_total, by(sub_exp)
list sub_exp N_treated N_control N_total in 1/4
restore
2014 2015 2016 2019 2020 2021

2014 2015 2016 2019 2020 2021

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    statefip |        600       30.88    15.96027          1         56
        year |        600     2014.76    2.366093       2011       2021
  adopt_year |        336    2016.554    2.882285       2014       2021
       unins |        600    .1735541    .0556038    .037867   .3239669
       treat |        600         .35    .4773676          0          1
-------------+---------------------------------------------------------
        post |        600          .5    .5004172          0          1
  event_time |        600         -.5     1.70925         -3          2
    feasible |        600         .41    .4922437          0          1
     sub_exp |        600     2015.26    1.636112       2014       2019


(500 observations deleted)

     +-----------------------------------------+
     | sub_exp   N_trea~d   N_cont~l   N_total |
     |-----------------------------------------|
  1. |    2014         28         18        46 |
  2. |    2015          3         18        21 |
  3. |    2016          2         18        20 |
  4. |    2019          2         11        13 |
     +-----------------------------------------+

The summary table at the end of this code snippet shows that the stacked data set consists of 4 sub-experiments: 2014, 2015, 2016, and 2019. There are 28 treated units in the 2014 sub-experiment, 3 treated units in the 2015 sub-experiment, and 2 treated units in the 2016 and 2019 sub-experiments. In contrast, there are 18 clean controls in the 2014, 2015, and 2016 sub-experiments, and 11 clean controls in the 2019 sub-experiment.

The compute_weights() function

We developed a function called compute_weights() to construct the corrective weights used in the weighted stacked DID regressions. compute_weights() takes a stacked data set as input and computes the corrective sample weights for the treated and control groups.

  • inputs:
    • dataset: a stacked data set created using create_sub_exp() and the appending procedure above
    • treatedVar: the name of the variable that indicates whether a unit serves as a treated unit in a given sub-experiment
    • eventTimeVar: the name of the variable that indicates the event time for each sub-experiment
    • subexpVar: the name of the variable that indicates the sub-experiment for each unit
  • outputs:
    • the original data set, with a new column of corrective sample weights called stack_weight
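The function below implements the weights in four steps. In symbols, every treated observation gets a weight of 1, while a control observation in sub-experiment \(s\) at event time \(e\) gets the ratio of its sub-experiment’s treated share to its control share:

\[ w_{s,e} = \frac{N^{treat}_{s,e} \,/\, N^{treat}_{e}}{N^{control}_{s,e} \,/\, N^{control}_{e}} \]

where \(N^{treat}_{s,e}\) is the number of treated observations in sub-experiment \(s\) at event time \(e\), \(N^{treat}_{e}\) is the total number of treated observations at event time \(e\) across all sub-experiments, and the control counts are defined analogously.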

The code that defines the compute_weights() function is shown below. The function is written in R and in Stata.

compute_weights = function(dataset, treatedVar, eventTimeVar, subexpVar) {

  # Create a copy of the underlying dataset
  stack_dt_temp = copy(dataset)

  # Step 1: Compute counts by event time across the whole stack, for treated and control
  stack_dt_temp[, `:=` (stack_n = .N,
                     stack_treat_n = sum(get(treatedVar)),
                     stack_control_n = sum(1 - get(treatedVar))), 
             by = get(eventTimeVar)
             ]  
  # Step 2: Compute counts by sub-experiment and event time
  stack_dt_temp[, `:=` (sub_n = .N,
                     sub_treat_n = sum(get(treatedVar)),
                     sub_control_n = sum(1 - get(treatedVar))
                     ), 
             by = list(get(subexpVar), get(eventTimeVar))
             ]
  
  # Step 3: Compute sub-experiment share of totals
  stack_dt_temp[, sub_share := sub_n / stack_n]
  
  stack_dt_temp[, `:=` (sub_treat_share = sub_treat_n / stack_treat_n,
                     sub_control_share = sub_control_n / stack_control_n
                     )
             ]
  
  # Step 4: Compute weights for treated and control groups
  stack_dt_temp[get(treatedVar) == 1, stack_weight := 1]
  stack_dt_temp[get(treatedVar) == 0, stack_weight := sub_treat_share/sub_control_share]
  
  return(stack_dt_temp)
}  

/* Create Weights */
capture program drop _all
program compute_weights
syntax, ///
    treatedVar(string) ///
    eventTimeVar(string) ///
  groupID(string) ///
    subexpVar(string) 

  // Create weights
  bysort `subexpVar' `groupID': gen counter_treat = _n if `treatedVar' == 1
  egen n_treat_tot = total(counter_treat)
  by `subexpVar': egen n_treat_sub = total(counter_treat) 

  bysort `subexpVar'  `groupID': gen counter_control = _n if `treatedVar' == 0
  egen n_control_tot = total(counter_control)
  by `subexpVar': egen n_control_sub = total(counter_control) 


  gen stack_weight = 1 if `treatedVar' == 1
  replace stack_weight = (n_treat_sub/n_treat_tot)/(n_control_sub/n_control_tot) if `treatedVar' == 0
end

Use compute_weights() to compute the stacked weights

With the stacked data set in hand, we use the compute_weights() function to compute the corrective weight variable.

stacked_dtc2 = compute_weights(
      dataset = stacked_dtc,
      treatedVar = "treat",
      eventTimeVar = "event_time",
      subexpVar = "sub_exp")

# Summarize
stacked_dtc2[event_time==0 & treat==0, 
             .(avg_control_weight = mean(stack_weight)), 
             by = sub_exp][order(sub_exp)]
   sub_exp avg_control_weight
1:    2014          2.8888889
2:    2015          0.3095238
3:    2016          0.2063492
4:    2019          0.3376623
compute_weights, ///
    treatedVar(treat) ///
    eventTimeVar(event_time) ///
  groupID(statefip) ///
    subexpVar(sub_exp) 

* Summarize 
sumup stack_weight if treat == 0 & event_time == 0, by(sub_exp) s(mean) 
(390 missing values generated)
(210 missing values generated)
(390 missing values generated)
(390 real changes made)


sub_exp |      Mean 
--------+-----------
 2014   |   2.888889
 2015   |   .3095238
 2016   |   .2063492
 2019   |   .3376623
--------+-----------
 Total  |          1
--------------------
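These averages line up with the event-time-0 counts reported above: pooling the four sub-experiments gives 35 treated units (28 + 3 + 2 + 2) and 65 clean controls (18 + 18 + 18 + 11). A quick hand-check of the 2014 control weight, treated share over control share:

# Hand-check the 2014 control weight from the event_time == 0 counts
(28 / 35) / (18 / 65)
#> [1] 2.888889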

Estimate the stacked regression

Now that we have a stacked data set and a set of corrective weights, we can estimate the stacked regressions.

# Fit the event study model, using the weights, clustering at the state level.
weight_stack = feols(unins ~ i(event_time, treat, ref = -1) | treat + event_time, 
                              data = stacked_dtc2, 
                              cluster = stacked_dtc2$statefip,
                              weights = stacked_dtc2$stack_weight)

// Create dummy variables for event-time
char event_time[omit] -1
xi i.event_time

// Run regression
qui reghdfe unins i.treat##i._I* [aw = stack_weight], cluster(statefip) absorb(treat event_time)
est sto weight_stack
i.event_time      _Ievent_tim_1-6     (_Ievent_tim_3 for event_t~e==-1 omitted)

Show the results

# display results
etable(weight_stack)
                               weight_stack
Dependent Var.:                       unins
                                           
treat x event_time = -3    -0.0010 (0.0037)
treat x event_time = -2    -0.0030 (0.0030)
treat x event_time = 0  -0.0163*** (0.0039)
treat x event_time = 1  -0.0239*** (0.0065)
treat x event_time = 2  -0.0255*** (0.0071)
Fixed-Effects:          -------------------
treat                                   Yes
event_time                              Yes
_______________________ ___________________
S.E.: Clustered                 by: cluster
Observations                            600
R2                                  0.42775
Within R2                           0.01288
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
// Show results
esttab weight_stack, keep(1.treat#1*) se
                      (1)   
                    unins   
----------------------------
1.treat#1.~1     -0.00102   
                (0.00368)   

1.treat#1.~2     -0.00303   
                (0.00299)   

1.treat#1.~4      -0.0163***
                (0.00393)   

1.treat#1.~5      -0.0239***
                (0.00645)   

1.treat#1.~6      -0.0255***
                (0.00707)   
----------------------------
N                     600   
----------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

The results in the table show the coefficients on the interaction terms from the weighted stacked event study regression. The coefficients on the pre-event periods are small and not statistically different from zero by conventional measures. The coefficients on event time periods 0, 1, and 2 are estimates of the trimmed aggregate ATT parameter for the first three years after the expansion is implemented. The estimates imply that expanding the Medicaid program reduced the uninsurance rate by 1.6 percentage points in the year of adoption, 2.4 percentage points at 1 year after adoption, and 2.5 percentage points at 2 years after adoption.

Compute the average post-treatment effect

The stacked event study regression estimates the average treatment effect for each year after the treatment is implemented. We can use these estimates to compute the average post-treatment effect over the first three years after the treatment is implemented. We do this by forming the simple average of the coefficients on the interaction terms for the first three post-treatment years. Because this is a linear combination of coefficients, we can use the lincom command in Stata or the marginaleffects package in R to compute a standard error for the average post-treatment effect.

# Compute the average post-treatment effect using the hypotheses() function.
hypotheses(weight_stack, 
           "(
           `event_time::0:treat` + 
           `event_time::1:treat` + 
           `event_time::2:treat`
           )/3 = 0", 
           df = 50)

                                                                                                                              Term
 (\n           `event_time::0:treat` + \n           `event_time::1:treat` + \n           `event_time::2:treat`\n           )/3 = 0
 Estimate Std. Error     t Pr(>|t|)   2.5 %  97.5 % Df
  -0.0219    0.00563 -3.89   <0.001 -0.0332 -0.0106 50

Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high, df 
lincom (1.treat#1._Ievent_tim_4 + 1.treat#1._Ievent_tim_5 + 1.treat#1._Ievent_tim_6)/3
 ( 1)  .3333333*1.treat#1._Ievent_tim_4 + .3333333*1.treat#1._Ievent_tim_5 +
       .3333333*1.treat#1._Ievent_tim_6 = 0

------------------------------------------------------------------------------
       unins | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.0218778   .0056281    -3.89   0.000    -.0331822   -.0105733
------------------------------------------------------------------------------

The results show that the average effect is about -2.2 percentage points across the three post-treatment periods, with a standard error of about 0.006. This point estimate is easy to verify by taking the simple average of the three post-period event study coefficients: \((-0.01627 - 0.023864 - 0.025500)/3 \approx -0.0219\).
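In R, the same average can also be recovered directly from the fitted model. A minimal sketch (the coefficient names are the same ones referenced in the hypotheses() call above):

# Average the three post-period interaction coefficients
post_terms = c("event_time::0:treat", "event_time::1:treat", "event_time::2:treat")
mean(coef(weight_stack)[post_terms])
#> [1] -0.02188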

Note: Thank you to Shyam Raman for the help getting side-by-side code working and for quarto formatting magic!