Quant-Trading-Strategies

Regression coefficient and basic trading strategy

  • September 27, 2016

This question might be very basic, but I still couldn’t find a satisfying answer anywhere. I want to analyse the effect of a repeated event (a data release) on the price of a specific asset (I have daily data) through regression (with a GARCH model for volatility). I have already run an event study, but I would like to see a different approach.

So I run a linear regression explaining $ R_{t+1} $ with several regressors: the lagged return $ R_{t} $, an exogenous variable $ E_{t} $, and the z-score of the released data $ Z_{t} $ (equal to 0 on non-release days). For the sake of the argument, suppose that the regression coefficients are all significant.

My question is: how can I use these results to build a basic trading strategy? So far I could think of three approaches, but I would like to know whether some or all of them are wrong, useless, or good:

  • Using the coefficient of the lagged return to build basic trend following (if the coefficient is > 0) or mean reversion (if it is < 0).
  • Using the coefficient of the z-score of the data to determine whether a positive value has a positive or negative effect on the return. Then, for future releases, if the coefficient is positive, go long when $ Z_{t} $ is positive and short when it is negative (and the reverse if the coefficient is negative).
  • Using all the coefficients to later on forecast the value of $ R_{t+1} $ and invest accordingly.

Are these ways of interpreting regression results for investment purposes flawed or correct? Are there others?
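As a rough illustration of how the three approaches would translate into trading signals, here is a minimal sketch in Python on synthetic data. Everything in it is an assumption made for illustration: the simulated coefficients, the release frequency, the variable names, and plain OLS in place of a GARCH-based estimation. The P&L figures it prints are in-sample, so they say nothing about out-of-sample performance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic inputs (illustrative assumptions, not real market data).
E = rng.normal(size=n)                                        # exogenous variable E_t
Z = np.where(rng.random(n) < 0.10, rng.normal(size=n), 0.0)   # z-score, 0 on non-release days

# Simulate daily returns from assumed "true" coefficients.
R = np.zeros(n)
noise = rng.normal(scale=0.01, size=n)
for t in range(n - 1):
    R[t + 1] = 0.05 * R[t] + 0.001 * E[t] + 0.01 * Z[t] + noise[t + 1]

# OLS of R_{t+1} on a constant, R_t, E_t and Z_t.
X = np.column_stack([np.ones(n - 1), R[:-1], E[:-1], Z[:-1]])
y = R[1:]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b_const, b_lag, b_exo, b_z = beta

# Approach 1: trend following if b_lag > 0, mean reversion if b_lag < 0.
signal_lag = np.sign(b_lag * R[:-1])
# Approach 2: trade only on release days, in the direction implied by b_z.
signal_z = np.sign(b_z * Z[:-1])
# Approach 3: trade on the sign of the full-model forecast of R_{t+1}.
signal_full = np.sign(X @ beta)

# In-sample average daily P&L of a unit position (+1 long, -1 short, 0 flat).
for name, signal in [("lagged", signal_lag), ("z-score", signal_z), ("full model", signal_full)]:
    print(f"{name:>10}: mean daily P&L = {np.mean(signal * y):.6f}")
```

A more honest comparison would estimate the coefficients on one sample and evaluate the signals on a later one.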

I will give an answer focusing on the econometric aspect of the question. (I could be missing some basic ideas in finance, though.)

If each regressor is roughly uncorrelated with any linear combination of the remaining regressors, then each of the strategies could work and seems reasonable within the given model. However, focusing on the effect of a single variable is less efficient than using all the effects together, so the last approach should dominate the first two. (There is one caveat, though: overfitting, if you are using OLS estimation without regularization/shrinkage.)
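To make the shrinkage caveat concrete, here is a minimal sketch of ridge regression in closed form, written in Python. It is only one possible form of regularization; the function name `ridge_fit`, the penalty value, and the simulated data are assumptions for illustration, and in practice the penalty would be chosen by something like cross-validation.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression; the intercept is left unpenalized by centering."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return intercept, coef

# Small, noisy sample (synthetic): the setting where OLS tends to overfit.
rng = np.random.default_rng(1)
n, k = 60, 3
X = rng.normal(size=(n, k))
y = X @ np.array([0.05, 0.0, 0.02]) + rng.normal(size=n)

_, coef_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reproduces OLS
_, coef_ridge = ridge_fit(X, y, lam=20.0)  # lam > 0 shrinks the slopes toward zero

print("OLS slopes:  ", coef_ols)
print("ridge slopes:", coef_ridge)
```

The shrunken coefficients give more conservative forecasts, which is usually what you want when the estimated effects are small relative to the noise.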

If the regressors are (somewhat highly) correlated with some linear combination of the other regressors, the first two approaches could be dominated by estimating simple regressions that omit all variables but the regressor of interest. While this may seem strange in the context of explanatory modelling (why would you omit relevant regressors and thus get inconsistent estimates?), it actually works in predictive modelling; see e.g. F. X. Diebold’s blog post on the so-called “predictive consistency”: “Causality and T-Consistency vs. Correlation and P-Consistency”. In any case, these approaches would still be dominated by the third approach (using predictions from the full model), though the same caveat about overfitting applies.
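A small simulation can illustrate the point about omitted regressors. In this Python sketch, two correlated regressors both drive `y`; predicting from `x1` alone works better with the slope from a simple regression of `y` on `x1` than with the `x1` coefficient taken from the full model, and the full model beats both. The data-generating process and all names are assumptions for illustration, the variables are zero-mean by construction so intercepts are omitted, and the comparison is in-sample only.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Two correlated regressors (synthetic).
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

# Full model: y on x1 and x2.
X_full = np.column_stack([x1, x2])
b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Simple regression: y on x1 only (slope through the origin).
b_simple = (x1 @ y) / (x1 @ x1)

def mse(pred):
    """Mean squared prediction error against y."""
    return float(np.mean((y - pred) ** 2))

mse_full = mse(X_full @ b_full)      # third approach: use every regressor
mse_partial = mse(b_full[0] * x1)    # single-variable rule with the full-model coefficient
mse_simple = mse(b_simple * x1)      # single-variable rule with the simple-regression slope

print(f"full-model coef on x1: {b_full[0]:.3f}, simple-regression slope: {b_simple:.3f}")
print(f"MSE full: {mse_full:.3f}, partial: {mse_partial:.3f}, simple: {mse_simple:.3f}")
```

The simple-regression slope is a biased estimate of the effect of `x1`, but it absorbs the part of `x2` that `x1` can proxy for, which is exactly what helps when `x1` is the only variable the rule uses.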

You could use a GARCH model as a filter: fit it (maybe with an AR/MA/ARIMA part for the mean) and, in a second step, build a regression model between the obtained residuals and the Z(t) variable.

Do you think that the impact of Z(t) on the residuals of the GARCH model is additive or multiplicative?

err(t) = e(t)*sigma(t) + Z(t)

or

err(t) = e(t)*sigma(t)*Z(t)

or maybe

err(t) = e(t)*sigma(t) + Z(t)*sigma(t)?

The third case looks the most suitable to me, and you can deal with it using the above approach.
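Dividing the third specification by sigma(t) gives err(t)/sigma(t) = e(t) + Z(t), so the standardized residuals can be regressed directly on Z(t). The Python sketch below illustrates that two-step idea under strong assumptions: the GARCH(1,1) parameters are taken as known rather than estimated by maximum likelihood, the data are simulated, the mean of the returns is taken to be zero, and the coefficient `gamma` on Z(t) is a hypothetical addition that does not appear in the equations above.

```python
import numpy as np

def garch_variance(resid, omega, alpha, beta):
    """GARCH(1,1) filter: sigma2[t] = omega + alpha * resid[t-1]**2 + beta * sigma2[t-1]."""
    sigma2 = np.empty(len(resid))
    sigma2[0] = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(1, len(resid)):
        sigma2[t] = omega + alpha * resid[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(2)
n = 3000
omega, alpha, beta, gamma = 1e-6, 0.05, 0.90, 0.5  # assumed values, not estimates

e = rng.normal(size=n)                                        # standardized shock e(t)
Z = np.where(rng.random(n) < 0.10, rng.normal(size=n), 0.0)   # z-score, 0 on non-release days

# Simulate err(t) = e(t)*sigma(t) + gamma*Z(t)*sigma(t).
sigma2_true = np.empty(n)
err = np.empty(n)
sigma2_true[0] = omega / (1.0 - alpha - beta)
err[0] = np.sqrt(sigma2_true[0]) * (e[0] + gamma * Z[0])
for t in range(1, n):
    sigma2_true[t] = omega + alpha * err[t - 1] ** 2 + beta * sigma2_true[t - 1]
    err[t] = np.sqrt(sigma2_true[t]) * (e[t] + gamma * Z[t])

# Step 1: filter the residuals to obtain sigma(t).
sigma_hat = np.sqrt(garch_variance(err, omega, alpha, beta))

# Step 2: regress standardized residuals on Z(t) (no intercept).
std_resid = err / sigma_hat
gamma_hat = (Z @ std_resid) / (Z @ Z)
print(f"estimated gamma: {gamma_hat:.3f}")
```

With real data the GARCH parameters would have to be estimated first, and the estimation error from that step carries over into the second-step regression.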

Source: https://quant.stackexchange.com/questions/28032