Econometrics

Is this an endogeneity/simultaneity problem?

  • March 6, 2021

I would like to know if the logic in these two situations is correct.

Situation 1: Let’s say we have a continuous dependent variable, $ y_1 $ , that then has a causal impact on an unobserved variable, $ \rho $ . This unobserved variable then has a causal impact on a variable, $ y_2 $ , which has a causal impact on $ y_1 $ . We want to estimate $ \frac{\partial y_1}{\partial y_2} $ . So, we have a set of structural equations as follows:

$$ y_1 = f(y_2, \mathbf{x_1})+ e_1 $$ $$ \rho = f(y_1, \mathbf{x_2})+ e_2 $$ $$ y_2 = f(\rho, \mathbf{x_3})+ e_3 $$

where the $ \mathbf{x_i} $ terms are exogenous and the $ e_i $ terms are errors. By substituting the second equation into the third, we can see that we would have simultaneity and our estimates of the impact of $ y_2 $ on $ y_1 $ would be biased *if we could not control for $ \rho $ or some proxy for it in our estimate of the structural equation for $ y_1 $* .

I am particularly unsure about this last part in italics. Could we use the proxy for $ \rho $ in the estimate of the structural equation for $ y_1 $ , or would we have to do 2SLS, with the proxy for $ \rho $ being included in the first stage but excluded in the second?

Situation 2: Let’s say we have a percentage dependent variable, $ s_1 $ . Let’s say the complement of $ s_1 $ is made up of two other percentages, $ s_2 $ and $ s_3 $ . Furthermore, let’s say that $ s_3 $ has a causal impact on an unobserved factor, $ \rho $ , and that $ \rho $ has a causal impact on $ y_2 $ , which has a causal impact on $ s_1 $ . We want to estimate $ \frac{\partial s_1}{\partial y_2} $ . Thus, we have the following structural equations:

$$ s_1 = f(y_2, \mathbf{x_1})+ e_1 $$ $$ \rho = f(s_3, \mathbf{x_2})+ e_2 $$ $$ y_2 = f(\rho, \mathbf{x_3})+ e_3 $$

Let’s now say that $ s_2 $ is more or less constant across observations. Thus, there is generally an inverse relation between $ s_1 $ and $ s_3 $ . This implies that we can rewrite $ s_3 $ in the second structural equation in terms of $ s_1 $ :

$$ \rho = f(1 - (s_1 + \bar{s}_2), \mathbf{x_2})+ e_2 $$

Then, just as in Situation 1, by substituting this equation into the structural equation for $ y_2 $ , we can see there would be simultaneity and our estimates of the impact of $ y_2 $ on $ s_1 $ would be biased (once again, I am unsure about whether controlling for $ \rho $ or a proxy for it in our estimate of the structural equation for $ s_1 $ would solve this).

Let us consider Situation 1.

Let us assume that $ \rho $ is observed. If the approach does not work when $ \rho $ is observed, there is no reason why using a proxy for $ \rho $ as an instrument should work when $ \rho $ is not observed.

$ \rho $ is endogenous so we can’t just include it as a regressor in an equation. Thus, let us consider IV estimation.

Obvious instruments for the first equation are $ \mathbf{x}_1 $ , $ \mathbf{x}_2 $ , and $ \mathbf{x}_3 $ . Can we use $ \rho $ as an extra instrument? It depends on whether $ \rho $ is relevant and whether $ \rho $ is exogenous in the first equation (i.e., the $ y_1 $ equation). First, $ \rho $ is relevant as it is correlated with $ y_2 $ (unless the last equation is degenerate). Next, is it exogenous?

To check this, let’s do the math with the following simple model (without intercepts and exogenous regressors, for simplicity): $$ y_1=\alpha y_2 + e_1,\;\; \rho = \beta y_1+e_2,\;\; y_2=\gamma \rho + e_3. $$ Write them in matrix form: $$ \begin{pmatrix} 1 & 0 & -\alpha\\ -\beta & 1 & 0\\ 0 & -\gamma & 1 \end{pmatrix} \begin{pmatrix} y_1\\ \rho\\ y_2 \end{pmatrix} = \begin{pmatrix} e_1\\ e_2\\ e_3 \end{pmatrix} . $$ Using Cramer’s rule, we get $$ \rho = (\beta e_1 + e_2 + \alpha \beta e_3 ) / (1-\alpha\beta\gamma). $$ $ \rho $ is correlated with $ e_1 $ unless $ \beta=0 $ , so $ \rho $ can’t be used as an instrument.
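A rough numerical check of the claim above can be sketched in Python (standard library only). The parameter values, sample size, seed, and helper names are illustrative assumptions, and the errors are assumed to be iid standard normal. The sketch compares the sample moment of $ \rho e_1 $ with $ \beta/(1-\alpha\beta\gamma) $ , the value implied by the reduced form for $ \rho $ , and computes the simple IV estimate of $ \alpha $ that uses $ \rho $ as the instrument for $ y_2 $ .

```python
import random


def simulate(alpha, beta, gamma, n=200_000, seed=0):
    """Draw iid N(0, 1) errors and solve the three-equation system.

    Returns a list of (y1, rho, y2, e1) tuples.
    """
    rng = random.Random(seed)
    det = 1.0 - alpha * beta * gamma  # determinant of the coefficient matrix
    rows = []
    for _ in range(n):
        e1, e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        rho = (beta * e1 + e2 + alpha * beta * e3) / det  # reduced form for rho
        y2 = gamma * rho + e3  # third structural equation
        y1 = alpha * y2 + e1   # first structural equation
        rows.append((y1, rho, y2, e1))
    return rows


def mean_product(rows, i, j):
    """Sample second moment E(a*b); all variables have mean zero here."""
    return sum(r[i] * r[j] for r in rows) / len(rows)


# Illustrative (assumed) parameter values.
alpha, beta, gamma = 0.5, 0.8, 0.3
rows = simulate(alpha, beta, gamma)

cov_rho_e1 = mean_product(rows, 1, 3)        # sample E(rho * e1)
theory = beta / (1 - alpha * beta * gamma)   # implied by the reduced form

# Simple IV estimate of alpha with rho as the instrument for y2.
iv_alpha = mean_product(rows, 1, 0) / mean_product(rows, 1, 2)

print("sample E(rho*e1):", cov_rho_e1, " theory:", theory)
print("IV estimate of alpha:", iv_alpha, " true alpha:", alpha)
```

If the sample moment is close to its theoretical value and clearly away from zero, the IV estimate should also sit away from the true $ \alpha $ , which is what failure of the exogeneity condition looks like in practice.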

Edit

What happens if $ y_1 $ is regressed on $ y_2 $ and $ \rho $ ? Will $ \alpha $ then be consistently estimated? For simplicity, suppose that $ e_1, e_2, e_3 $ are iid standard normal. Then the OLS estimator vector converges in probability to $$ \begin{bmatrix} \alpha\\ 0 \end{bmatrix} + \begin{bmatrix} E(y_2^2) & E(y_2 \rho)\\ E(y_2\rho) & E(\rho^2) \end{bmatrix} ^{-1} \begin{bmatrix} E(y_2 e_1)\\ E(\rho e_1) \end{bmatrix} . $$ Letting $ e=(e_1,e_2,e_3)' $ , write $ \rho = e'g $ and $ y_2 = e'h $ for some $ 3\times 1 $ nonrandom vectors $ g $ and $ h $ . Then the second term is $$ \begin{pmatrix} h'h & h'g\\ g'h & g'g \end{pmatrix} ^{-1} \begin{pmatrix} h_1\\ g_1 \end{pmatrix} . $$ The first element of this vector is the inverse of the determinant times $ g'gh_1 - g'h g_1 = g'(gh_1 - hg_1) $ . Let us calculate it. We have already obtained $ g $ . We now need $ h $ . Since $ y_2 = \gamma\rho + e_3 $ , $$ y_2 = (\beta\gamma e_1 + \gamma e_2 + e_3) / (1-\alpha\beta\gamma). $$

Both $ g $ and $ h $ carry the common factor $ 1/(1-\alpha\beta\gamma) $ , which is nonzero whenever the system has a solution. Dropping it, we can work with $ g=(\beta, 1, \alpha\beta)' $ and $ h=(\beta\gamma, \gamma, 1)' $ . Then $$ gh_1 - hg_1 = [0, 0, -\beta (1-\alpha \beta\gamma) ]' \text{ times a constant}. $$ Thus, $$ g'(gh_1 - hg_1) = -\alpha \beta^2 (1-\alpha\beta\gamma) \text{ times another constant}. $$ The OLS regression of $ y_1 $ on $ y_2 $ and $ \rho $ gives an inconsistent estimator in general, but gives a consistent estimator of $ \alpha $ if $ \alpha=0 $ or $ \beta=0 $ . In particular, if $ \alpha=0 $ , then the OLS estimator is consistent. That is, we can regress $ y_1 $ on $ y_2 $ and $ \rho $ if we want to test $ H_0: \alpha=0 $ . This is unexpected. Not sure if my algebra is correct.
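A quick simulation can help check the algebra. The sketch below (Python, standard library only) regresses $ y_1 $ on $ y_2 $ and $ \rho $ without an intercept, assuming iid standard normal errors; the function name, parameter values, sample size, and seed are illustrative assumptions. Under that normality assumption, the probability limit of the coefficient on $ y_2 $ appears to simplify to $ \alpha/(1+\beta^2) $ . That closed form is not stated in the answer, and the code uses it only as a comparison value.

```python
import random


def ols_y1_on_y2_rho(alpha, beta, gamma, n=200_000, seed=1):
    """No-intercept OLS of y1 on (y2, rho) using simulated data.

    Returns (coefficient on y2, coefficient on rho).
    """
    rng = random.Random(seed)
    det_sys = 1.0 - alpha * beta * gamma  # must be nonzero for a solution
    s22 = s2r = srr = s2y = sry = 0.0
    for _ in range(n):
        e1, e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        rho = (beta * e1 + e2 + alpha * beta * e3) / det_sys  # reduced form
        y2 = gamma * rho + e3
        y1 = alpha * y2 + e1
        # Accumulate the cross products needed for the 2x2 normal equations.
        s22 += y2 * y2
        s2r += y2 * rho
        srr += rho * rho
        s2y += y2 * y1
        sry += rho * y1
    det = s22 * srr - s2r * s2r
    a_hat = (srr * s2y - s2r * sry) / det  # coefficient on y2
    b_hat = (s22 * sry - s2r * s2y) / det  # coefficient on rho
    return a_hat, b_hat


# Case 1: alpha != 0 and beta != 0, so OLS should miss the true alpha.
a_hat, b_hat = ols_y1_on_y2_rho(alpha=0.5, beta=0.8, gamma=0.3)
print("estimate:", a_hat, " true alpha: 0.5",
      " conjectured limit:", 0.5 / (1 + 0.8 ** 2))

# Case 2: alpha = 0, where the estimate should be close to zero.
a_zero, _ = ols_y1_on_y2_rho(alpha=0.0, beta=0.8, gamma=0.3)
print("estimate under alpha = 0:", a_zero)
```

With $ \alpha \neq 0 $ and $ \beta \neq 0 $ the estimate should settle away from the true $ \alpha $ , while with $ \alpha = 0 $ it should be near zero, in line with the conclusion that the regression can still be used to test $ H_0: \alpha=0 $ .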

Source: https://economics.stackexchange.com/questions/42889