As a quick refresher, I’ll first describe the randomization inference procedure in words and then show you how to implement it, assuming you’ve already created a panel for your synthetic control. I’m using the notation from the synthdid package here.

### Weight Selection Problem
The synthetic control weights \(w = (w_2, \dots, w_{N_0+1})'\) solve
\[
\min_{w}
\quad
\sqrt{
\left( X_1 - X_0 w \right)'
V
\left( X_1 - X_0 w \right)
}
\]
subject to
\[
w_j \ge 0 \quad \text{for all } j
\]
and
\[
\sum_{j=2}^{N_0+1} w_j = 1
\]
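where \(X_1\) is the vector of pre-treatment characteristics of the treated unit, \(X_0\) is the matrix holding the same characteristics for the \(N_0\) donor units (one column per donor), and \(V\) is a positive semi-definite weighting matrix.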
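To make the constrained problem concrete, here is a minimal sketch in plain Python. It assumes \(V\) is diagonal and uses projected gradient descent onto the simplex, which is a stand-in for the quadratic-programming routine a package would normally use. The function names `project_to_simplex` and `synth_weights` are hypothetical, not part of any library. Because the square root is monotone, minimizing the quadratic form inside it gives the same weights.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:  # keep the threshold from the last k that passes
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def synth_weights(X1, X0, v_diag, n_iter=5000):
    """Approximate the weights by projected gradient descent.

    X1:     list of K predictor values for the treated unit
    X0:     K x N0 nested list, one column per donor
    v_diag: K diagonal entries of V (assumption: V is diagonal)
    """
    K, N0 = len(X0), len(X0[0])
    w = [1.0 / N0] * N0  # start from equal weights
    # Upper bound on the gradient's Lipschitz constant (trace of 2 X0' V X0).
    lipschitz = 2.0 * sum(v_diag[k] * sum(x * x for x in X0[k]) for k in range(K))
    if lipschitz == 0.0:
        return w
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        resid = [X1[k] - sum(X0[k][j] * w[j] for j in range(N0)) for k in range(K)]
        grad = [-2.0 * sum(X0[k][j] * v_diag[k] * resid[k] for k in range(K))
                for j in range(N0)]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(N0)])
    return w


# Toy example: the treated unit is an exact 20/50/30 blend of three donors.
X0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
X1 = [0.2, 0.5, 0.3]
print(synth_weights(X1, X0, v_diag=[1.0, 1.0, 1.0]))
```

The projection step is what enforces both constraints at once: after every gradient update the weights are pulled back to the nearest point that is non-negative and sums to one.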
### Treatment Effect

The estimated treatment effect at time \(t\) is
\[
\hat{\tau}_{1t}
=
Y_{1t}
-
\hat{Y}_{1t}^{\text{synth}}
\]
The average post-treatment effect, with \(T_0\) pre-treatment periods and \(T_1 = T - T_0\) post-treatment periods, is
\[
\hat{\tau}
=
\frac{1}{T_1}
\sum_{t=T_0+1}^{T}
\left(
Y_{1t}
-
\hat{Y}_{1t}^{\text{synth}}
\right)
\]

### Inference

The idea behind the randomization inference procedure is to assign treatment to each donor unit in turn and re-run the same synthetic control procedure. Importantly, the quantity we take from each run is not the average treatment effect but the root mean squared prediction error (RMSPE).

We can think of the RMSPE as a numerical summary of how far the treated unit (and, in turn, each donor unit when we pretend it is treated) deviates from its synthetic control in the pre- and post-treatment periods. We want a close pre-treatment fit, so the pre-treatment RMSPE should be small. If the treatment had a real effect, we would also expect a large post-treatment deviation from the synthetic control. Taking the ratio of the post-treatment to the pre-treatment RMSPE gives us our RMSPE ratio, which we then rank across units. The closer the treated unit’s ratio is to the top of that ranking, the less plausible it is that its post-treatment deviation arose by chance.
Formally, for each unit \(i\), define the synthetic control gap
\[
\hat{\tau}_{it}
=
Y_{it}
-
\hat{Y}_{it}^{\text{synth}}.
\]
We then compute the pre-treatment root mean squared prediction error:
\[
\text{Pre-RMSPE}_i
=
\sqrt{
\frac{1}{T_0}
\sum_{t=1}^{T_0}
\hat{\tau}_{it}^2
}.
\]
Similarly, we compute the post-treatment RMSPE:
\[
\text{Post-RMSPE}_i
=
\sqrt{
\frac{1}{T_1}
\sum_{t=T_0+1}^{T}
\hat{\tau}_{it}^2
}.
\]
The object of interest is the ratio
\[
R_i
=
\frac{\text{Post-RMSPE}_i}{\text{Pre-RMSPE}_i}.
\]
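As a small worked example of the gap, the average post-treatment effect, and the ratio \(R_i\), here is a sketch in plain Python. The helper names (`rmspe`, `average_post_gap`, `rmspe_ratio`) and the toy series are made up for illustration; the synthetic series is assumed to have been fitted already.

```python
from math import sqrt


def rmspe(gaps):
    """Root mean squared prediction error of a list of gaps."""
    return sqrt(sum(g * g for g in gaps) / len(gaps))


def average_post_gap(observed, synthetic, T0):
    """Average of the post-treatment gaps (the first T0 periods are pre-treatment)."""
    gaps = [y - y_hat for y, y_hat in zip(observed, synthetic)]
    return sum(gaps[T0:]) / len(gaps[T0:])


def rmspe_ratio(observed, synthetic, T0):
    """Post-RMSPE divided by Pre-RMSPE for one unit."""
    gaps = [y - y_hat for y, y_hat in zip(observed, synthetic)]
    return rmspe(gaps[T0:]) / rmspe(gaps[:T0])


# Toy series: three pre-treatment periods, two post-treatment periods.
observed = [10.0, 11.0, 12.0, 16.0, 17.0]
synthetic = [9.0, 12.0, 11.0, 13.0, 13.0]

print(average_post_gap(observed, synthetic, T0=3))  # gaps 3 and 4, so 3.5
print(rmspe_ratio(observed, synthetic, T0=3))       # sqrt(12.5) / 1
```

Note that a perfect pre-treatment fit makes the denominator zero, so a real implementation would need to decide how to handle that case.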
We compute this ratio for the true treated unit and then for each donor unit after iteratively assigning it treatment and re-running the synthetic control procedure.
Let \(R_1\) denote the ratio for the actual treated unit and let \(\{R_j\}_{j=2}^{N_0+1}\) denote the ratios for each placebo unit.
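The placebo loop itself can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: `donor_average` is a deliberately crude stand-in for the real synthetic control fit (equal weights on every donor), the actually treated unit is kept out of each placebo's donor pool, and the panel is a made-up dictionary of outcome series.

```python
from math import sqrt


def rmspe_ratio(observed, synthetic, T0):
    """Post-RMSPE / Pre-RMSPE, with the first T0 periods as pre-treatment."""
    gaps = [y - s for y, s in zip(observed, synthetic)]
    pre = sqrt(sum(g * g for g in gaps[:T0]) / T0)
    post = sqrt(sum(g * g for g in gaps[T0:]) / (len(gaps) - T0))
    return post / pre


def donor_average(panel, treated_id, donor_ids, T0):
    """Stand-in fit (NOT a synthetic control): equal-weight donor average.

    T0 is unused here; a real fit would use it to pick the pre-treatment window.
    """
    T = len(panel[treated_id])
    return [sum(panel[d][t] for d in donor_ids) / len(donor_ids) for t in range(T)]


def all_ratios(panel, treated_id, T0, fit=donor_average):
    """Return {unit: R_i}, treating each unit in turn as if it were treated."""
    donors = [u for u in panel if u != treated_id]
    ratios = {treated_id: rmspe_ratio(panel[treated_id],
                                      fit(panel, treated_id, donors, T0), T0)}
    for placebo in donors:
        # Assumption: the real treated unit never enters a placebo's donor pool.
        pool = [d for d in donors if d != placebo]
        ratios[placebo] = rmspe_ratio(panel[placebo],
                                      fit(panel, placebo, pool, T0), T0)
    return ratios


# Hypothetical panel: unit -> outcome series, first three periods pre-treatment.
panel = {
    "treated": [1.0, 2.0, 3.0, 9.0, 10.0],
    "a": [1.0, 2.5, 3.0, 4.0, 5.5],
    "b": [1.5, 2.0, 3.5, 4.0, 5.0],
    "c": [0.5, 2.0, 2.5, 4.5, 5.0],
}
for unit, ratio in all_ratios(panel, "treated", T0=3).items():
    print(unit, round(ratio, 3))
```

Passing the fit in as a function keeps the loop independent of how the weights are estimated, so the stand-in can be swapped for whatever synthetic control routine you already use.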
The randomization-based p-value is the share of all units, the treated unit included, whose ratio is at least as large as the treated unit’s:
\[
p
=
\frac{
1 + \sum_{j=2}^{N_0+1}
\mathbf{1}
\{ R_j \ge R_1 \}
}{
1 + N_0
}.
\]
Intuitively, if the treated unit’s post-treatment deviation is unusually large relative to its pre-treatment fit, compared to what we see when we pretend each donor was treated, then the p-value will be small.
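The p-value formula translates almost directly into code. In this sketch the function name `randomization_p_value` and the example ratios are hypothetical; the ratios are assumed to have been computed already for the treated unit and each placebo.

```python
def randomization_p_value(r_treated, r_placebos):
    """Share of units (treated unit included) with a ratio >= the treated unit's."""
    at_least_as_large = sum(1 for r in r_placebos if r >= r_treated)
    return (1 + at_least_as_large) / (1 + len(r_placebos))


# Treated ratio far above all three placebo ratios.
print(randomization_p_value(50.2, [1.37, 0.40, 0.63]))  # 0.25
```

With \(N_0\) donors the smallest attainable p-value is \(1 / (1 + N_0)\), so with only three donors, as here, it can never fall below 0.25 no matter how extreme the treated unit's ratio is.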