Synthetic Control Placebo Inference in Synthdid

Published

February 24, 2026

1 Setup

The synthdid package put out by David Hirshberg is my go-to when it comes to running all things synthetic control. I find it very helpful for both synthetic controls (SC) and synthetic difference-in-differences (SDiD), and even comparing to a base TWFE specification. In a recent project I’ve been working on, typical DiD specifications seem not to capture pre-trends well and I’m getting noisy results. Part of the reason for this is that I am expecting some heterogeneity in results by-state. Instead of trying to run state-specific results, I found myself (with some advising by Ed Rubin) running synthetic controls. This has the advantage of a better pre-period match as well as being better able to visualize the trends in treated states.

As I began to run this, I realized that the synthdid package does not have a built-in way (and I double checked with Kyu Matuzawa) to run inference on syntetic control results. It does have built-in the Arkhangelsky et al. procedure which returns a standard error using the placebo method they derive, but it can only be used for the SDiD specification. See for yourself here:

library(synthdid)

data('california_prop99')
setup = panel.matrices(california_prop99)
tau.sc = sc_estimate(setup$Y, setup$N0, setup$T0)

se = sqrt(vcov(tau.sc, method='placebo'))
se

        [,1]
[1,] 11.4213

t_stat <- as.numeric(tau.sc) / se
p_val  <- 2 * (1 - pnorm(abs(t_stat)))
p_val

           [,1]
[1,] 0.08583067

The p-value I obtain here is off from the original estimate using random inference in the original Abadie et al. 2010 paper which obtains a p-value of \(\frac{1}{39} \sim 0.026\). The standard error calculated requires a distributional assumption I think many of us are unwilling to make, or maybe that is just me worrying about what the referees would say were I to implement the above exercise. This would probably be a conservative way to obtain p-values anyhow, but still, I wanted to ensure that I can obtain the p-values using the random inference procedure described by Abadie Diamond and Hainmueller, 2010.

2 Review

As a quick refresher for what we’re doing here, I’ll write in words what the random inference procedure we need to do is, then show you how to implement it with the assumption you’ve created a panel for your synthetic control already. I’m using the notation from the synthdid package in this case.

We observe one treated unit and \(N_0\) donor units over \(T\) periods. Let \(T_0\) denote the number of pre-treatment periods and \(T_1 = T - T_0\) the number of post-treatment periods.

Let:

\(Y_{1t}\) denote the outcome for the treated unit
\(Y_{jt}\) denote the outcome for donor unit \(j = 2, \dots, N_0+1\)

2.1 Pre-treatment Outcomes

The treated unit’s pre-treatment outcome vector is

\[ X_1 = \begin{pmatrix} Y_{11} \\ \vdots \\ Y_{1T_0} \end{pmatrix} \]

The donor matrix of pre-treatment outcomes is

\[ X_0 = \begin{pmatrix} Y_{21} & \dots & Y_{N_0+1,1} \\ \vdots & \ddots & \vdots \\ Y_{2T_0} & \dots & Y_{N_0+1,T_0} \end{pmatrix} \]

2.2 Weight Selection Problem

The synthetic control weights \(w = (w_2, \dots, w_{N_0+1})'\) solve

\[ \min_{w} \quad \sqrt{ \left( X_1 - X_0 w \right)' V \left( X_1 - X_0 w \right) } \]

subject to

\[ w_j \ge 0 \quad \text{for all } j \]

and

\[ \sum_{j=2}^{N_0+1} w_j = 1 \]

where \(V\) is a positive semi-definite weighting matrix.

2.3 Synthetic Control Path

Given optimal weights \(\hat{w}\), the synthetic outcome for any period \(t\) is

\[ \hat{Y}_{1t}^{\text{synth}} = \sum_{j=2}^{N_0+1} \hat{w}_j Y_{jt} \]

2.4 Treatment Effect

The treatment effect at time \(t\) is

\[ \hat{\tau}_{1t} = Y_{1t} - \hat{Y}_{1t}^{\text{synth}} \]

The average post-treatment effect is

\[ \hat{\tau} = \frac{1}{T_1} \sum_{t=T_0+1}^{T} \left( Y_{1t} - \hat{Y}_{1t}^{\text{synth}} \right) \] ### Inference

The idea behind the randomized inference procedure is to then assign treatment iteratively over each of the donor units and run the same synthetic control procedure. But, importantly, we do this to obtain not the average treatment effect but the mean-square prediction-error.

We can think of this as a numerical representation of how far the treated unit (and, in turn, each iterated donor unit when we pretend it is treated) varies from its synthetic control in the pre- and post-treatment periods. We want a nice pre-treatment fit. If the treatment we are estimating is non-random, we would also expect a large deviance from the post-treatment synthetic control. Taking the ratio of these gets us our RMSPE ratio which we then rank. The higher the rank, the more likely it is that our treated unit did not see deviation at random.

Formally, for each unit \(i\), define the synthetic control gap

\[ \hat{\tau}_{it} = Y_{it} - \hat{Y}_{it}^{\text{synth}}. \]

We then compute the pre-treatment root mean squared prediction error:

\[ \text{Pre-RMSPE}_i = \sqrt{ \frac{1}{T_0} \sum_{t=1}^{T_0} \hat{\tau}_{it}^2 }. \]

Similarly, we compute the post-treatment RMSPE:

\[ \text{Post-RMSPE}_i = \sqrt{ \frac{1}{T_1} \sum_{t=T_0+1}^{T} \hat{\tau}_{it}^2 }. \]

The object of interest is the ratio

\[ R_i = \frac{\text{Post-RMSPE}_i}{\text{Pre-RMSPE}_i}. \]

We compute this ratio for the true treated unit and then for each donor unit after iteratively assigning it treatment and re-running the synthetic control procedure.

Let \(R_1\) denote the ratio for the actual treated unit and let \(\{R_j\}_{j=2}^{N_0+1}\) denote the ratios for each placebo unit.

The randomization-based p-value is simply the rank of the treated unit’s ratio among all possible assignments:

\[ p = \frac{ 1 + \sum_{j=2}^{N_0+1} \mathbf{1} \{ R_j \ge R_1 \} }{ 1 + N_0 }. \]

Intuitively, if the treated unit’s post-treatment deviation is unusually large relative to its pre-treatment fit — compared to what we see when we pretend each donor was treated — then the p-value will be small.

3 Implementing in synthdid

3.1 Randomization inference (Abadie-style in-space placebos)

Once you have your synthetic control estimate, the key idea for inference is to re-run the exact same synthetic control procedure while pretending each donor unit was treated. This gives us a placebo distribution. If the true treated unit looks unusually “extreme” relative to those placebos, we interpret that as evidence the effect is unlikely to be due to chance.

The subtle but important point is that we don’t rank the raw ATT. Instead, following Abadie et al. (2010), we rank a fit-adjusted statistic based on prediction error:

Compute the synthetic control gap for unit (i): [ {it} = Y{it} - _{it}^{}. ]
Turn that into a pre-treatment RMSPE and post-treatment RMSPE: [ _i = , _i = . ]
Take the ratio: [ R_i = . ]

Intuition: we want a good pre-period fit (small denominator), and if treatment mattered we should see a bigger post-period gap (large numerator). The ratio is a simple way to combine both.

To get a p-value, we compute (R_1) for the treated unit, then compute (R_j) for each donor unit (j) when we pretend it was treated. The p-value is just the rank:

[ p = . ]

3.2 Done in synthdid

Below is a minimal implementation that assumes you already have a setup object from panel.matrices().

# Assumes you already have: setup <- panel.matrices(...)
# setup$Y is (N x T), rownames(setup$Y) are unit names, setup$T0 is pre-period length.
# This loops over EVERY unit, treats it as "treated" by moving it to the last row,
# runs sc_estimate, computes pre/post RMSE and MSPE ratios, and ranks California.

library(synthdid)

sc_placebo_inference <- function(setup, stat = c("mspe", "rmse")) {
  stat <- match.arg(stat)

  Y0 <- setup$Y
  T0 <- setup$T0
  TT <- ncol(Y0)
  units <- rownames(Y0)
  N <- nrow(Y0)

  out <- data.frame(
    unit = units,
    att = NA_real_,
    pre_mspe = NA_real_,
    post_mspe = NA_real_,
    pre_rmse = NA_real_,
    post_rmse = NA_real_,
    ratio_rmse = NA_real_,
    ratio_mspe = NA_real_
  )

  for (i in seq_len(N)) {
    # reorder Y so unit i is last (treated)
    ord <- c(setdiff(seq_len(N), i), i)
    Y <- Y0[ord, , drop = FALSE]

    run <- synthdid::sc_estimate(Y, N - 1, T0)
    out$att[i] <- as.numeric(run)

    # Use the full donor weight vector (not summary(run)$controls display output).
    w_full <- attr(run, "weights")$omega

    # synthetic path and gaps
    y_synth <- as.numeric(t(w_full) %*% Y[1:(N - 1), , drop = FALSE])
    y_tr <- as.numeric(Y[N, ])
    gap     <- y_tr - y_synth

    pre_mspe <- mean(gap[1:T0]^2, na.rm = TRUE)
    post_mspe <- mean(gap[(T0 + 1):TT]^2, na.rm = TRUE)
    pre_rmse <- sqrt(pre_mspe)
    post_rmse <- sqrt(post_mspe)

    out$pre_mspe[i] <- pre_mspe
    out$post_mspe[i] <- post_mspe
    out$pre_rmse[i]  <- pre_rmse
    out$post_rmse[i] <- post_rmse
    out$ratio_rmse[i] <- post_rmse / pre_rmse
    out$ratio_mspe[i] <- post_mspe / pre_mspe
  }

  ratio_vec <- if (stat == "mspe") out$ratio_mspe else out$ratio_rmse
  treated_unit <- rownames(setup$Y)[setup$N0 + 1]
  R_treated <- ratio_vec[out$unit == treated_unit]
  R_placebo <- ratio_vec[out$unit != treated_unit]
  p_val <- (1 + sum(R_placebo >= R_treated, na.rm = TRUE)) /
    (1 + sum(is.finite(R_placebo)))

  table_ranked <- out[order(ratio_vec, decreasing = TRUE), ]
  table_ranked$rank <- seq_len(nrow(table_ranked))
  rank_treated <- table_ranked$rank[table_ranked$unit == treated_unit]

  list(
    treated_unit = treated_unit,
    statistic = stat,
    R_treated = R_treated,
    rank_treated = rank_treated,
    p_val = p_val,
    table = table_ranked
  )
}

# Example:
setup <- panel.matrices(california_prop99)
res <- sc_placebo_inference(setup, stat = "mspe")
res[c("treated_unit", "statistic", "R_treated", "rank_treated", "p_val")]

$treated_unit
[1] "California"

$statistic
[1] "mspe"

$R_treated
[1] 154.94

$rank_treated
[1] 3

$p_val
[1] 0.07692308

The way the code is written is that the very last row in the panel object corresponds to the treated unit. So all we have to do is iterate the observations to the bottom. In other words, take the first observation, move to the bottom, and so on. Then we can use that to calculate the RMSE as we go along.

There are a few differences in the synthdid code that resulted in some differences in the ATT and p-value of the estimate I wrote here. Namely, there is something about ridge-type regularization that makes it so the SC doesn’t produce the same exact weights, and also I don’t include controls here which are in the original paper.

We can confirm this when using the tidysynth package, which has always felt fairly unintuitive to me, but is now making me feel as I look at it that I should’ve used that. Regardless, I feel good about writing a nice little piece on synthetic controls and knowing I can run this manually if computers were ever not a thing.

require(tidysynth)

Loading required package: tidysynth

Warning: package 'tidysynth' was built under R version 4.4.3

data("smoking")

smoking_out <-
  smoking %>%

  synthetic_control(
    outcome = cigsale,
    unit    = state,
    time    = year,
    i_unit  = "California",
    i_time  = 1988,
    generate_placebos = TRUE
  ) %>%

  # Outcome-only matching:
  # use the pre-treatment outcome path (or a subset of pre-years) as predictors.
  # Here I include every pre-treatment year up through 1988.
  generate_predictor(
    time_window = 1970:1988,
    cigsale_pre = mean(cigsale, na.rm = T)
  ) %>%

  generate_weights(
    optimization_window = 1970:1988,
    margin_ipop = .02, sigf_ipop = 7, bound_ipop = 6
  ) %>%

  generate_control()

  head(smoking_out %>% grab_significance())

# A tibble: 6 × 8
  unit_name     type    pre_mspe post_mspe mspe_ratio  rank fishers_exact_pvalue
  <chr>         <chr>      <dbl>     <dbl>      <dbl> <int>                <dbl>
1 West Virginia Donor      13.6      351.        25.8     1               0.0256
2 Illinois      Donor       9.36     202.        21.6     2               0.0513
3 California    Treated    61.4     1004.        16.3     3               0.0769
4 Missouri      Donor      23.4      324.        13.8     4               0.103 
5 Indiana       Donor      34.8      453.        13.0     5               0.128 
6 South Dakota  Donor       6.92      70.2       10.1     6               0.154 
# ℹ 1 more variable: z_score <dbl>

If you read this, thanks!