# Simulating and Estimating Spillovers in Stata

This notebook simulates and then estimates spillover effects where potential outcomes are described by the function $$y(\tau_{i,t}, v(\tau_{-i,t}))$$

for firm-level treatment $\tau_{i,t}$. Outcomes depend on peer firms' treatment status through the scalar function $v(\tau_{-i,t})$, where $\tau_{-i,t}$ denotes the treatments of all firms other than the focal firm. The scalar function was first devised by Hong and Raudenbush (2006 JASA) to address the dimensionality of $\tau_{-i,t}$: with $N$ firms, the unrestricted function $y(\tau_{i,t},\tau_{-i,t})$ has $2^N$ potential outcomes (two own-treatment states times $2^{N-1}$ peer configurations). The scalar function $v(\cdot)$ is specified as $$v(\tau_{-i,t}) := \frac{1}{|C|}\sum_{i' \in C} \tau_{i',t}$$ for $C := \{i'|\text{Ind. }i'=\text{Ind. }i\}$, that is, the share of treated firms in firm $i$'s industry. This follows Ferracci, Jolivet, and van den Berg (2014 REStat). As shorthand, we will write $v(\tau_{-i,t})$ as $\rho_{j,t}$, where $j$ indexes the industry.

We begin by constructing a panel with $40$ industries, $50$ firms per industry, and $40$ quarters of data. A handful of true $\beta$ parameters are also specified.
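The panel dimensions above can be sketched as follows. This is a minimal illustration, not the original notebook cell: the variable names (`ind`, `firm`, `qtr`), the seed, and the four $\beta$ values are all assumptions made for the example.

```stata
clear all
set seed 12345                      // arbitrary seed, for reproducibility only

* 40 industries x 50 firms x 40 quarters = 80,000 firm-quarters
set obs 40
gen int ind = _n                    // industry id j
expand 50                           // 50 firms per industry
gen long firm = _n                  // unique firm id i
expand 40                           // 40 quarters per firm
bysort firm: gen int qtr = _n       // quarter t
xtset firm qtr

* Illustrative "true" parameters (assumed values, not the author's)
scalar b0 = 1
scalar b1 = 2
scalar b2 = 1
scalar b3 = -1

count
```

`expand` duplicates rows, so assigning ids immediately after each `expand` keeps the industry, firm, and quarter levels nested correctly.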

### Exogenous Treatment Example

Suppose first that treatment assignment is random and that spillover effects are linear. This implies a model $$y_{i,t} = \beta_0 + \beta_1\tau_{i,t} + \beta_2\tau_{i,t}\rho_{j,t} + \beta_3(1-\tau_{i,t})\rho_{j,t} + \epsilon_{i,t}$$

When treatment is exogenous and spillover effects are linear in $\rho$, treatment effects are easy to estimate.
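A sketch of how the linear spillover model might be simulated and estimated. The parameter values, the treatment probabilities, and the names `tau_rho` and `ctrl_rho` are assumptions for illustration. Treatment probability is drawn at random for each industry-quarter (an assumed design) so that $\rho_{j,t}$ has enough variation to identify $\beta_2$ and $\beta_3$.

```stata
clear all
set seed 12345
set obs 40
gen int ind = _n
expand 50
gen long firm = _n
expand 40
bysort firm: gen int qtr = _n

* Exogenous assignment: a random treatment probability per industry-quarter
bysort ind qtr: gen double p = runiform(0.2, 0.8) if _n == 1
by ind qtr: replace p = p[1]
gen byte tau = runiform() < p

* rho_{j,t}: share of treated firms in the industry-quarter
bysort ind qtr: egen double rho = mean(tau)

* Outcome with assumed betas (b0, b1, b2, b3) = (1, 2, 1, -1)
gen double y = 1 + 2*tau + 1*tau*rho - 1*(1 - tau)*rho + rnormal()

* Spillover regressors for treated and untreated firms, then OLS
gen double tau_rho  = tau * rho
gen double ctrl_rho = (1 - tau) * rho
regress y tau tau_rho ctrl_rho, vce(cluster ind)
```

With 80,000 firm-quarters the coefficients on `tau`, `tau_rho`, and `ctrl_rho` should land close to the assumed values of $\beta_1$, $\beta_2$, and $\beta_3$.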

The Stable Unit Treatment Value Assumption (SUTVA) imposes the condition that $y(\tau_{i,t},\tau_{-i,t})=y(\tau_{i,t})$ so that there are no spillover effects. This implies the model $$y_{i,t} = \theta_0 + \theta_1\tau_{i,t} + u_{i,t}$$

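Theoretically, $\hat{\theta}_1 \approx \hat{\beta}_1 + (\hat{\beta}_2-\hat{\beta}_3)\bar{\rho}$, where $\bar{\rho}$ is the mean of $\rho_{j,t}$, so the SUTVA estimate misses $\beta_1$ by roughly $(\beta_2-\beta_3)\bar{\rho}$. Within the simulated sample, the observed bias is close to the asymptotic bias.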
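The relationship between $\theta_1$ and the $\beta$ parameters follows from conditional means of the linear spillover model. A short derivation, assuming random assignment so that $E[\epsilon_{i,t}|\tau_{i,t}]=0$ and $\rho_{j,t}$ is (approximately) mean-independent of $\tau_{i,t}$:

$$
\begin{aligned}
E[y_{i,t}\,|\,\tau_{i,t}=1] &= \beta_0 + \beta_1 + \beta_2\,E[\rho_{j,t}] \\
E[y_{i,t}\,|\,\tau_{i,t}=0] &= \beta_0 + \beta_3\,E[\rho_{j,t}] \\
\theta_1 = E[y_{i,t}\,|\,\tau_{i,t}=1] - E[y_{i,t}\,|\,\tau_{i,t}=0] &= \beta_1 + (\beta_2-\beta_3)\,E[\rho_{j,t}]
\end{aligned}
$$

The difference in means therefore recovers $\beta_1$ only when $\beta_2=\beta_3$ or when no peers are treated.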

### Endogenous Treatment Sample

Now suppose that treatment assignment is endogenous. In particular, there is an industry-level covariate $w$ with which a firm-level covariate $x$ is correlated. We allow the firm-level covariate $x$ to predict treatment status.

For simplicity, we won't allow $x$ to affect outcomes directly, though adding this feature wouldn't really change anything.
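One way such an assignment mechanism could be simulated is sketched below. The loading of $x$ on $w$ (0.8), the logistic index, and the choice to hold $w$ and $x$ fixed over time are all assumptions for illustration. The original simulation may differ.

```stata
clear all
set seed 12345
set obs 40
gen int ind = _n
gen double w = rnormal()             // industry-level covariate
expand 50
gen long firm = _n
gen double x = 0.8*w + rnormal()     // firm-level covariate, correlated with w
expand 40
bysort firm: gen int qtr = _n

* Treatment probability depends on x through an assumed logistic index
gen byte tau = runiform() < invlogit(-0.5 + x)

* Treated share by industry-quarter now varies systematically with w
bysort ind qtr: egen double rho = mean(tau)

pwcorr x w tau rho
```

Because $x$ loads on $w$, industries with high $w$ end up with higher treated shares, which is what makes $w$ a predictor of $\rho$ at the industry-quarter level.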

As one would expect, the model that we ran before gives misleading estimates when endogenous treatment assignment is not addressed.

The simplest approach is to use regression adjustment and include $x$ as a covariate.

Now suppose that spillover effects are nonlinear in $\rho$.

In this scenario, regression adjustment alone isn't sufficient.

Begin by computing likelihoods of treatment.

Ideally, this would be done on an industry-by-industry or industry-quarter by industry-quarter basis. To save some computation time, I'll run it all in one go. This is fine because the treatment propensity simulation above does not use industry-specific parameters to calculate treatment likelihoods.

Impose overlap with a maxima-minima rule for estimated treatment probabilities.
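The propensity and overlap steps can be sketched together. This example uses a simplified cross-section rather than the full panel, and it reads the maxima-minima rule as the common min-max rule: keep observations whose estimated probability lies between the larger of the two group minima and the smaller of the two group maxima. The assignment rule and the names `ps`, `cs_lo`, `cs_hi`, and `overlap` are assumptions.

```stata
clear all
set seed 12345
set obs 2000
gen double x = rnormal()
gen byte tau = runiform() < invlogit(-0.5 + x)   // assumed assignment rule

* Pooled logit of treatment on x, then fitted treatment probabilities
logit tau x
predict double ps, pr

* Range of estimated probabilities within each treatment arm
quietly summarize ps if tau == 1
scalar min1 = r(min)
scalar max1 = r(max)
quietly summarize ps if tau == 0
scalar min0 = r(min)
scalar max0 = r(max)

* Common support: [max of the minima, min of the maxima]
scalar cs_lo = max(scalar(min1), scalar(min0))
scalar cs_hi = min(scalar(max1), scalar(max0))
gen byte overlap = inrange(ps, scalar(cs_lo), scalar(cs_hi))

tabulate tau overlap
```

Observations with `overlap == 0` have estimated probabilities that are never observed in the opposite arm. Later steps would be restricted to `overlap == 1`.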

Now, within a particular industry-quarter, the parameter $\rho$ is fixed. This allows us to freely estimate $E[y(\tau=1,\rho)|\rho]$ and $E[y(\tau=0,\rho)|\rho]$. We will include $x$ as a covariate in the industry-quarter by industry-quarter regressions so that the expected outcomes we estimate are IPWRA (inverse-probability-weighted regression adjustment) estimators and thus doubly robust.
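A rough sketch of the cell-by-cell IPWRA step, under stated assumptions: a small panel (10 industries, 4 quarters), an invented nonlinear spillover function, and hypothetical names (`ipw`, `cell`, `y1hat`, `y0hat`). Within each industry-quarter and treatment arm, it fits an inverse-probability-weighted regression of $y$ on $x$, predicts for every firm in the cell, and averages the predictions.

```stata
clear all
set seed 12345
set obs 10
gen int ind = _n
gen double w = rnormal()
expand 50
gen long firm = _n
gen double x = 0.8*w + rnormal()
expand 4
bysort firm: gen int qtr = _n
gen byte tau = runiform() < invlogit(x)
bysort ind qtr: egen double rho = mean(tau)
* Assumed nonlinear spillovers; x affects y only through treatment
gen double y = 1 + 2*tau + tau*rho^2 - (1 - tau)*sqrt(rho) + rnormal()

* Inverse-probability weights from a pooled logit
quietly logit tau x
predict double ps, pr
gen double ipw = cond(tau == 1, 1/ps, 1/(1 - ps))

* Weighted outcome regression per arm within each industry-quarter
egen long cell = group(ind qtr)
gen double y1hat = .
gen double y0hat = .
quietly levelsof cell, local(cells)
foreach c of local cells {
    forvalues a = 0/1 {
        capture regress y x if cell == `c' & tau == `a' [pweight = ipw]
        if _rc == 0 {
            quietly predict double tmp if cell == `c', xb
            quietly replace y`a'hat = tmp if cell == `c'
            drop tmp
        }
    }
}

* One row per industry-quarter: estimated E[y(1,rho)|rho] and E[y(0,rho)|rho]
collapse (mean) y1hat y0hat rho w, by(ind qtr)
list in 1/5
```

The `capture` guards against cells with too few firms in one arm. Those cells are left missing rather than stopping the loop. Stata's built-in `teffects ipwra` implements the same kind of estimator on a pooled sample. The explicit loop is used here only because the estimates are needed cell by cell.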

To estimate the dose response functions $E[y(1,\rho)]$ and $E[y(0,\rho)]$ over industry-quarter level data, one could use either $\rho$ itself or a standardized form of $\rho$ as the continuous treatment variable. In the former case, because $\rho$ is bounded over $[0,1]$ and likely to be skewed, special care needs to be taken in specifying the link function used in the dose response estimation. It turns out to be more effective to standardize $\rho$ (done below) and then un-standardize it post-estimation, because the usual Gaussian distribution with identity link function performs well on the standardized variable.
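The standardize-then-undo step amounts to storing the mean and standard deviation of $\rho$ before transforming it. A minimal sketch, in which the beta-distributed placeholder for $\rho$ and the names `rho_std` and `rho_back` are assumptions:

```stata
clear all
set seed 12345
set obs 1600                       // one row per industry-quarter (40 x 40)
gen double rho = rbeta(2, 5)       // placeholder skewed shares in (0,1)

* Standardize, keeping the moments needed to reverse the transformation
quietly summarize rho
scalar rho_mean = r(mean)
scalar rho_sd   = r(sd)
gen double rho_std = (rho - scalar(rho_mean)) / scalar(rho_sd)

* (dose response estimation on rho_std would happen here)

* Map standardized values back to the treated-share scale
gen double rho_back = rho_std * scalar(rho_sd) + scalar(rho_mean)
summarize rho rho_std rho_back
```

Any grid of standardized treatment levels used for the dose response can be mapped back to treated shares with the same two scalars.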

We can now estimate a dose response function on $E[y(0,\rho)]$ with covariate $w$. Recall that $x$ predicts firm-level treatment and was correlated with $w$, so $w$ should predict industry-quarter level treatment $\rho$. To speed things along, I've specified flag(0) to turn off covariate balancing tests (since we know that, by construction, $w$ is sufficient to balance over $\rho$). We do the same for $E[y(1,\rho)]$. The estimated effects for $\tau=0$ and for $\tau=1$ firms line up very well with their true values. The only failure is at the tail end of the distribution for $\rho$. Given that the dose response function is estimated nonparametrically, failures near the edge of the distribution are expected.