Conditional Correlation
Nick Wade
Northfield Information Services
Asian Research Seminar
November 2009

Overview
- Correlation holds a pivotal place in our analysis of data, and the construction of forecasting models for return and risk
- Review the literature on correlation stability with a particular focus on turbulent markets
- Backtrack:
  - Review assumptions underlying correlation
  - Explore role in regression, factor analysis, and cluster analysis
  - Discuss alternate distance measures and adjustments
  - Show recent advances in regression, factor analysis, and cluster analysis that avoid non-stationarity issues
- Return to the thread:
  - Show how a simple regime-switching model allowing conditional correlation can be incorporated into the linear multifactor risk model

Correlation breakdown
- A “correlation breakdown” describes the tendency of correlations to move towards 1 as markets melt down, removing the benefit of diversification just when it’s needed the most.
- It would also have stark implications for the mutual fund industry, whose business model is at least in part based on providing access to diversification

To open…
Increasing attention is being paid to the issue of correlations varying over time:
- (stocks) De Santis, G. and B. Gerard (1997), International asset pricing and portfolio diversification with time-varying risk, Journal of Finance, 52, 1881-1912.
- (stocks) Longin, F. and B. Solnik (2001), Extreme correlation of international equity markets, Journal of Finance, LVI(2), 646-676.
- (bonds) Hunter, D.M. and D.P. Simon (2005), A conditional assessment of the relationships between the major world bond markets, European Financial Management, 11(4), 463-482.
- (bonds) Solnik, B., C. Boucrelle and Y. Le Fur (1996), International market correlation and volatility, Financial Analysts Journal, 52(5), 17-34.
- Markov switching model: Chesnay, F. and Jondeau, E., “Does Correlation Between Stock Returns really increase during turbulent periods?”, Bank of France research paper.
To date little explored – however, implied correlation also seems useful, and more powerful than historical correlation in forecasting (we saw the same result with volatility):
- Campa, J.M. and P.H.K. Chang (1998), The forecasting ability of correlations implied in foreign exchange options, Journal of International Money and Finance, 17, 855-880.

Correlation stability
- One of the first… Kaplanis (1988): STABLE
- Tang (1995), Ratner (1992), Sheedy (1997): STABLE – although crash of 1987 regarded as an “anomaly”
- Bertero and Mayer (1989), King and Wadhwani (1990) and Lee and Kim (1993): correlation has increased, but STABLE
- Not quite so stable?
  - Erb et al (1994) – increases in bear markets
  - Longin and Solnik (1995) – increases in periods of high volatility
  - Longin and Solnik (2001) – increases in bear markets

Turbulence in the market
Kritzman (2009):
- Correlation of US and foreign stocks when both markets’ returns are one standard deviation above their mean: -17%
- Correlation of US and foreign stocks when both markets’ returns are one standard deviation below their mean: +76%
- “Conditional correlations are essential for constructing properly diversified portfolios”
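
The kind of conditional correlation quoted from Kritzman can be sketched as follows. This is a minimal illustration and not Kritzman's own procedure: the function name, the use of sample z-scores to define "one standard deviation", and the minimum of three joint observations are all assumptions.

```python
import numpy as np

def conditional_correlation(x, y, threshold=1.0, tail="lower"):
    """Correlation of x and y using only the periods in which BOTH series
    are more than `threshold` standard deviations beyond their own mean.
    (Hypothetical helper; the threshold convention is an assumption.)"""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    if tail == "lower":
        mask = (zx < -threshold) & (zy < -threshold)
    else:
        mask = (zx > threshold) & (zy > threshold)
    if mask.sum() < 3:
        return float("nan")  # too few joint tail observations to say anything
    return float(np.corrcoef(x[mask], y[mask])[0, 1])
```

Calling it once with `tail="lower"` and once with `tail="upper"` on the same pair of return series gives the down-market and up-market figures to compare.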

Correlation regimes
If it’s not stable, how about Markov switching models?
- Ramchand and Susmel (1998), Chesnay and Jondeau (2001) – correlation, conditioned on market regime, increases in periods with high volatility
- Ang and Bekaert (1999) – evidence for two regimes: a high vol/high corr, and a low vol/low corr.

Backtrack
Before we get too far down the track, let’s look at the assumptions underlying correlation and see what the implications are

Correlation – a definition
- Correlation measures the strength and direction of a linear relationship between two variables
- Difference from the mean in the numerator
- Standard deviation in the denominator
- Not new… dates from the 1880s…
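
Spelled out, "difference from the mean in the numerator, standard deviation in the denominator" is the usual Pearson coefficient. A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def pearson_correlation(x, y):
    """Pearson correlation computed directly from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # differences from the mean (numerator)
    dy = y - y.mean()
    # The 1/(n-1) factors in the covariance and the two standard deviations cancel.
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))
```

Both means and both standard deviations are estimated once over the whole sample, which is where the stationarity assumptions come from.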

Correlation – limitations & assumptions
- Linear – fits only the linear part of the relationship
- Stationary – susceptible to trend in mean in either series
- Stationary – assumes the volatility of each series is unchanging over time
- No concept of higher moments
- Sensitive to the underlying distribution
- Sensitive to the presence of estimation error in the observations
- An incomplete measure of the relationship between two series

Correlation in regression
Stating the obvious, but…
- Correlation is at the heart of regression analysis.
- Remember to use something like Durbin-Watson to test for serial correlation
- Typical situation: huge R^2, low DW (positive s.c.)
- If DW < 1: problematic positive serial correlation
- If you think the residuals are non-Normal, DW blows up. Use the Breusch-Godfrey test instead.
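
For reference, the Durbin-Watson statistic is the sum of squared first differences of the residuals divided by their sum of squares: values near 2 suggest no first-order serial correlation, and values near 0 suggest strong positive serial correlation. A minimal sketch with hypothetical helper names:

```python
import numpy as np

def ols_residuals(X, y):
    """Residuals from an ordinary least-squares fit of y on X (X includes any constant)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))
```

Typical use is `durbin_watson(ols_residuals(X, y))` right after fitting the regression.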

Correlation in factor analysis
- (Principal Components Analysis is a sub-class of the family of Factor Analysis techniques)
- Correlation is a key part of factor analysis
- PCA uses the eigenvectors of the covariance matrix, and hence is affected by anything that impacts the volatility or correlation of the series
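
To make the dependence explicit, here is a minimal sketch of PCA as an eigen-decomposition of the sample covariance matrix. The function name and the rows-as-periods layout are assumptions.

```python
import numpy as np

def pca_from_covariance(returns):
    """Eigenvalues and eigenvectors of the sample covariance matrix,
    sorted from the largest eigenvalue (first principal component) down.
    `returns` has one row per period and one column per series."""
    X = np.asarray(returns, dtype=float)
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: the matrix is symmetric
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]
```

Everything downstream is a function of `cov`, so a shift in volatility or correlation inside the sample changes the components.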

Correlation in cluster analysis
- Cluster analysis is closely related to factor analysis
- Cluster analysis assigns members to a group depending on rules and a distance measure
- For example “complete linkage” cluster analysis adds a new member to the group whose least-related member is most highly related to the new member.
- Thus, depending on the choice of distance measure [correlation is the most common choice], cluster analysis may also be affected by any problems with correlation…

One classification system
Jorge Luis Borges, “Other Inquisitions” 1937-1952
Animals are divided into: a) those that belong to the Emperor, b) embalmed ones, c) those that are trained, d) suckling pigs, e) mermaids, f) fabulous ones, g) stray dogs, h) those that are included in this classification, i) those that tremble as if they were mad, j) innumerable ones, k) those drawn with a very fine camel's hair brush, l) others, m) those that have just broken a flower vase, n) those that resemble flies from a distance.

Thoughts
- Non-stationary volatility (ARCH, GARCH, etc)
  - We spend an heroic amount of time trying to forecast non-stationary volatility
  - But we often just ignore it when we calculate correlation, or perform regression analysis, or run factor analysis (or PCA)
- Non-stationary mean (Trend)
  - We often build models to capture the alpha in momentum, reversals, and other manifestations of a non-stationary mean
  - But we often ignore those when we calculate correlation, or perform regression analysis, or run factor analysis
- Read the fine print…

Paralysis
Yes, I know, if you read all the fine print and believed that non-stationarity in the data rendered all the techniques useless you’d never do any analysis and would eventually get fired for surfing the internet all day for months and months looking for The Right Approach.
Fries with that?

More thoughts…
- What happens ex-post when we analyze data?
  - IR, IC
  - Is my risk model broken?
- Measures such as IR and IC are also affected by non-stationarity
  - IC varies over time

Nonstationarity again…
Huber 2001
- Manager has 2.33% forecast tracking error and -6.3% realized return. 3-sigma event or a broken risk model?
- Risk model on target [ex-ante and ex-post SD both 2.xx]
- Then why? Unfortunately -50bps per month alpha trend
- Typical measures of risk are centered on the trend and thus ignore the risk of being consistently bad (Or good! He could have had +6% return…)
- Extending this idea, Qian and Hua (2004) define “strategy risk” as the standard deviation of the manager’s IC over time, and thus “forecast true active risk”:
  Forecast Active Risk = std(IC) * Breadth^(1/2) * Forecast Tracking Error
Huber, Gerard. “Tracking Error and Active Management”, Northfield Conference Proceedings, 2001, http://www.northinfo.com/documents/164.pdf

An aside… Sharpe Ratio
- Trend effect: in the case of declining markets, the fund with the higher total risk exhibits a higher (less negative) Sharpe ratio…
- Another great way to stuff up your Sharpe ratio:
  - Imagine you have a model that only works sometimes [no, no, I know your models ALWAYS work…]
  - Be really really good sometimes… and in cash the rest of the time [being responsible]
  - => your mean return is low, and your best months are the ones contributing all your volatility.
  - Another manager doing the same thing with less skill can have a better Sharpe ratio

Linearity affects correlation

Outlier dependence on correlation
- The presence of outliers is problematic for correlation (think about regression)
- Use Mahalanobis distance as a test for outliers

Correlation examples

Alternatives
- We are looking for a measure of similarity, or shared behavior, or difference, or distance between two series
- Ideally we want one with as few restrictions as possible, i.e. non-linear, robust to errors, not dependent on a particular distribution, and so on.

Related measures, adjustments and alternatives
- Rank correlation
- Disattenuation
- Total correlation / mutual information
- Cohesion/Coherence
- Mahalanobis distance
- Euclidean Distance (special case of MD)
Masochists can read 74 pages on similarity measures in: Sneath PHA & Sokal RR (1973) Numerical Taxonomy. Freeman, San Francisco.

Rank correlation
No linearity requirement

$\rho = 1 - \frac{6 \sum_i d_i^2}{n(n^2 - 1)}$

where:
- $d_i = x_i - y_i$ = the difference between the ranks of corresponding values $X_i$ and $Y_i$, and
- $n$ = the number of values in each data set (same for both sets).
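
With d_i and n defined as above, Spearman's rank correlation is rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)). A minimal sketch of that calculation follows; the function name is hypothetical, and the shortcut formula assumes there are no tied values.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation via the d_i shortcut formula.
    Assumes no ties: tied values would need average ranks instead."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    rank_x = np.argsort(np.argsort(x))  # 0-based ranks; the offset cancels in d
    rank_y = np.argsort(np.argsort(y))
    d = rank_x - rank_y
    return float(1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1)))
```

Because only ranks enter, any monotonic relationship (for example y = x^3) scores 1, which is the sense in which there is no linearity requirement.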

Disattenuation (what??)
- Somewhat complicated, but essentially just an adjustment to correlation for the presence of estimation error in the underlying series
- Tends to adjust correlations upward
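
The classical correction for attenuation divides the observed correlation by the square root of the product of the two series' reliabilities. A minimal sketch, assuming reliability estimates between 0 and 1 are available from somewhere; the function name is hypothetical.

```python
import math

def disattenuate(r_xy, reliability_x, reliability_y):
    """Correction for attenuation: r_xy / sqrt(r_xx * r_yy).
    Reliabilities are the assumed share of each series' variance that is signal."""
    if not (0.0 < reliability_x <= 1.0 and 0.0 < reliability_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(reliability_x * reliability_y)
```

Since the denominator is at most 1, the corrected value is never smaller in magnitude than the observed one, and with poor reliability estimates it can exceed 1.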

Mutual information / Total correlation
Total Correlation [Watanabe (1960)] expresses the amount of redundancy or dependency existing among a set of variables.

Cohesion/coherence
The spectral coherence is a statistic that can be used to examine the relation between two signals or data sets. It is commonly used to estimate the power transfer between input and output of a linear system. The squared coherence between two signals x(t) and y(t) is a real-valued function that is defined as:

$C_{xy} = \frac{|G_{xy}|^2}{G_{xx} G_{yy}}$

where $G_{xy}$ is the cross-spectral density between x and y, and $G_{xx}$ and $G_{yy}$ are the autospectral densities of x and y respectively. The magnitude of the spectral density is denoted as |G|.

Mahalanobis distance
- The squared MD is one example of a “Bregman divergence”, a group of distance measures.
- Clustering: classifies the test point as belonging to that class for which the Mahalanobis distance is minimal. This is equivalent to selecting the class with the maximum likelihood.
- Regression: Mahalanobis distance and leverage are often used to detect outliers, especially in the development of linear regression models. A point that has a greater Mahalanobis distance from the rest of the sample population of points is said to have higher leverage since it has a greater influence on the slope or coefficients of the regression equation. Mahalanobis distance is also used to determine multivariate outliers. A point can be a multivariate outlier even if it is not a univariate outlier on any variable.
- Factor Analysis: recent research couples Mahalanobis distance with Factor Analysis and uses MD to determine whether a new observation is an outlier or a member of the existing factor set. [Zhang 2003]
- MD depends on covariance (S^-1 is the inverse of the covariance matrix), so it is exposed to the same stationarity issues that affect correlation; however, as described above, it can help us reduce correlation’s outlier dependence.
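
For a point x, sample mean m and covariance matrix S, the distance is sqrt((x - m)' S^-1 (x - m)). A minimal sketch of using it to flag multivariate outliers, with a hypothetical function name and the mean and covariance estimated from the same sample:

```python
import numpy as np

def mahalanobis_distances(X):
    """Mahalanobis distance of every row of X from the sample mean,
    using the inverse of the sample covariance matrix (S^-1)."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Row-wise quadratic form diff_i' S^-1 diff_i
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, S_inv, diff))
```

A point such as (+2, -2) in two strongly positively correlated series is unremarkable on either axis but gets a very large distance, which is the "multivariate but not univariate outlier" case.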

Mahalanobis in action
Borrowed from Kritzman: Skulls, financial turbulence, and the implications for risk management. July 2009

Regression with nonstationary data
- Techniques have been developed specifically to allow time-varying sensitivities
- FLS (flexible least-squares)
  - FLS is primarily a descriptive tool that allows us to gauge the potential for time-evolution of exposures
  - Minimize both the sum of squared errors and the sum of squared dynamic errors (changes in the coefficient estimates from one period to the next)
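
One way to see what FLS does is to write it as a single stacked least-squares problem: minimize sum_t (y_t - x_t' b_t)^2 + mu * sum_t |b_(t+1) - b_t|^2. The sketch below solves that directly with a dense solver. It is an illustration only, not the recursive algorithm of Kalaba and Tesfatsion; the function name and the single scalar penalty `mu` are assumptions.

```python
import numpy as np

def flexible_least_squares(X, y, mu=1.0):
    """Time-varying coefficients b_t minimizing measurement error plus
    mu times squared period-to-period coefficient changes.
    Returns an array of shape (T, K): one coefficient vector per period."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, K = X.shape
    # Measurement block: y_t = x_t' b_t + error
    A_meas = np.zeros((T, T * K))
    for t in range(T):
        A_meas[t, t * K:(t + 1) * K] = X[t]
    # Dynamic block: sqrt(mu) * (b_{t+1} - b_t) should be close to zero
    A_dyn = np.zeros(((T - 1) * K, T * K))
    for t in range(T - 1):
        rows = slice(t * K, (t + 1) * K)
        A_dyn[rows, t * K:(t + 1) * K] = -np.sqrt(mu) * np.eye(K)
        A_dyn[rows, (t + 1) * K:(t + 2) * K] = np.sqrt(mu) * np.eye(K)
    A = np.vstack([A_meas, A_dyn])
    b = np.concatenate([y, np.zeros((T - 1) * K)])
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta.reshape(T, K)
```

A very large `mu` pushes the path towards constant coefficients, while a small `mu` lets the coefficients track the data period by period, so scanning over `mu` is the descriptive use. The dense matrix grows with T*K, so this form only suits short samples.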

FLS example
- An example from Clayton and MacKinnon (2001)
- The coefficient apparently exhibits structural shift in 1992

Factor analysis with nonstationary data
- Dahlhaus, R. (1997). Fitting Time Series Models to Nonstationary Processes. Annals of Statistics, Vol. 25, 1-37.
- Del Negro and Otrok (2008): Dynamic Factor Models with Time-Varying Parameters: Measuring Changes in International Business Cycles (Federal Reserve Bank New York)
- Eichler, M., Motta, G., and von Sachs, R. (2008). Fitting dynamic factor models to non-stationary time series. METEOR research memoranda RM/09/002, Maastricht University.
- Stock and Watson (2007): Forecasting in dynamic factor models subject to structural instability (Harvard).
There are techniques available, and they are being applied to financial series.

Cluster analysis with nonstationary data
- Guedalia, London, Werman; “An on-line agglomerative clustering method for nonstationary data”, Neural Computation, February 15, 1999, Vol. 11, No. 2, Pages 521-54
- C. Aggarwal, J. Han, J. Wang, and P. S. Yu, On Demand Classification of Data Streams, Proc. 2004 Int. Conf. on Knowledge Discovery and Data Mining (KDD'04), Seattle, WA, Aug. 2004.
- G. Widmer and M. Kubat, “Learning in the Presence of Concept Drift and Hidden Contexts”, Machine Learning, Vol. 23, No. 1, pp. 69-101, 1996.
Again, there are techniques available to conquer the problem

Clustering – artificial immune systems
- Non-stationary clustering is also related to the development of artificial immune systems
- Modeling evolving data sets: you can think of each data point as “born” with a set of factors (immunity) and subsequently developing immunity to new effects as they appear.
- You can decide whether/how much memory there should be of new and non-current factors.
- Each new observation could be one of three things: a member of an existing cluster, the first member of a new cluster, or an outlier.

The story so far
- A long time ago (in a galaxy far far away?) correlation was stable.
- Recent evidence tends to suggest that correlation is very much regime-dependent, and the consensus seems to be that two regimes are sufficient.
- Correlation can be upset by outliers, trend, noise, and non-linearity
- Correlation is the “default” choice in regression analysis, factor analysis, and cluster analysis – the core of our toolkit
- More recent techniques, alternative measures, and adjustments exist to combat these effects.

What’s left
- The linear multi-factor risk model
- Adjusting the model for non-stationarity
- Making the model “conditional” by allowing separate regimes
- Lunch

The linear multi-factor risk model
- Relationship between R and F is linear ∀F
- The Exposures E are the correlations between R and F.
- There are N common factor sources of return
- The distribution of F is stationary, Normal, i.i.d. ∀F
- (Implicitly also the volatility of R and F is stationary)

Effect on risk model estimation
If not properly addressed during estimation:
- Time-Series (macro model): effect will be observed in exposures and correlations
- Cross-sectional (fundamental): effect observed in correlation (covariance) matrix
- Statistical: effect observed in factor loadings (from FA), factor returns (from regression analysis) and in covariance matrix. Ouch!

Limitations of the model
- Symmetric – the response is the same whether a factor increases or decreases
- Linear – the response is the same for any size move in a factor
- Stationary – we are assuming that volatility and mean are stationary for all the factors

Adjustments for non-stationarity
(WARNING: MARKETING PLUG: all currently used in the Northfield risk model family)
- Exponential weighting
- Conditional variances
- Parkinson range measure
- Cross-sectional dispersion-inferred time-series variance adjustments
- Implied volatility
- Serial-correlation adjustments in short-horizon models
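
Two of these adjustments are easy to illustrate generically. The sketch below shows an exponentially weighted covariance matrix and the Parkinson high-low range estimator of variance. These are textbook forms and not Northfield's implementation; the function names and the `decay` value are assumptions.

```python
import numpy as np

def ewma_covariance(returns, decay=0.97):
    """Exponentially weighted covariance: the most recent row gets the
    largest weight, and each older row is down-weighted by `decay`.
    (`decay` is a hypothetical setting; decay=1.0 gives equal weights.)"""
    R = np.asarray(returns, dtype=float)
    T = R.shape[0]
    w = decay ** np.arange(T - 1, -1, -1)
    w = w / w.sum()
    D = R - w @ R  # deviations from the weighted mean
    return (D * w[:, None]).T @ D

def parkinson_variance(high, low):
    """Parkinson range estimator: mean(ln(H/L)^2) / (4 ln 2), per period."""
    hl = np.log(np.asarray(high, dtype=float) / np.asarray(low, dtype=float))
    return float(np.mean(hl ** 2) / (4.0 * np.log(2.0)))
```

Both let recent or intra-period information move the variance estimate faster than an equally weighted close-to-close sample would.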

Is a single correlation adequate?
- Recent evidence suggests that correlations change over time, and particularly during market turmoil
- If true, this impacts our ability to diversify just when we need it most
  - This means our risk analysis may over-estimate the benefits of diversification when reporting our risk
  - Optimized portfolios may be exposed to higher risk than we thought
- One simple correction would be to have two sets of risk model factors, variances, and correlations
  - Estimate “normal” and “turbulent” volatilities and correlations for factors
  - Scale exposures to account for the probability that each regime will be visible within the risk forecast horizon

Implementation
- Use Viterbi’s algorithm (Viterbi 1967) to detect states.
- Use Jennrich tests (Jennrich 1970) to decide whether correlation differences between states are significant
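
As an illustration of the state-detection step, here is a minimal Viterbi decoder for a hidden Markov model with Gaussian returns in each regime. It assumes the regime means, volatilities, transition matrix and starting probabilities are already known (in practice they would have to be estimated, for example with Baum-Welch). The function name and all parameter values in the usage note are hypothetical.

```python
import numpy as np

def viterbi_gaussian(obs, means, stds, trans, start):
    """Most likely state sequence for a Gaussian-emission HMM.
    trans[i, j] is the probability of moving from state i to state j."""
    obs = np.asarray(obs, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    log_trans = np.log(np.asarray(trans, dtype=float))
    # Gaussian log-density of every observation under every state: shape (T, S)
    log_emit = (-0.5 * np.log(2.0 * np.pi * stds ** 2)
                - 0.5 * ((obs[:, None] - means) / stds) ** 2)
    T, S = log_emit.shape
    score = np.log(np.asarray(start, dtype=float)) + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans  # cand[i, j]: best path ending in i, then i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards from the best final state
    path = np.zeros(T, dtype=int)
    path[-1] = score.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With state 0 as "normal" (say volatility 1) and state 1 as "turbulent" (say volatility 5), the returned path labels each period, and the factor returns can then be split by label to estimate the two sets of correlations.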

Implementation 2
- Take a factor model
- First pass: “clone” the factor set and assume all factors apply in both regimes
- Estimate factor variances and correlations separately for the two regimes
- Weight the security exposures to the factors based on the probability of each regime:
  - E.g. Regime 1 90%, Regime 2 10%
  - Single correlation model: Market exposure for security X = 1.25
  - In the two-regime model the exposure to the “Market Normal” factor is 0.90*1.25, and the exposure to the “Market Turbulent” factor is 0.10*1.25
- Set correlations between the two models equal to zero across models (i.e. Correlation (“Market Normal”, “Market Turbulent”) = 0)
- Run risk analysis / optimization as usual
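
The steps above can be sketched as a cloned factor set with probability-weighted exposures and a block-diagonal covariance matrix. This covers common-factor variance only (no specific risk). The function name is hypothetical, and apart from the 1.25 exposure and the 90%/10% split from the slide, any inputs used to try it are assumptions.

```python
import numpy as np

def two_regime_factor_variance(exposures, cov_normal, cov_turbulent, p_turbulent):
    """Factor variance of a security or portfolio under the two-regime model."""
    e = np.asarray(exposures, dtype=float)
    k = e.size
    # Clone the factor set and weight exposures by the regime probabilities
    e2 = np.concatenate([(1.0 - p_turbulent) * e, p_turbulent * e])
    # Block-diagonal covariance: zero correlation between the two regimes' factors
    cov = np.zeros((2 * k, 2 * k))
    cov[:k, :k] = np.asarray(cov_normal, dtype=float)
    cov[k:, k:] = np.asarray(cov_turbulent, dtype=float)
    return float(e2 @ cov @ e2)
```

For a single Market factor with exposure 1.25, a 10% turbulent probability, and made-up factor variances of 0.04 (normal) and 0.16 (turbulent), the exposures become 1.125 and 0.125, exactly as in the worked numbers on the slide.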

Limitations
- Still linear – no accommodation of asymmetric correlation (i.e. UP ≠ DOWN)
- Same set of factors used
- Opportunity to estimate completely different betas for each regime – small-sample problems?
- Pessimistic? If the probability assigned to the turbulent regime is high, risk estimates and optimal portfolios will be very conservative – perhaps too conservative?

Conclusions
- Correlation lies at the heart of our favorite tools for data analysis (and risk model estimation)
- A clear understanding of its behavior is a requirement for good analysis
- Recent developments aid the incorporation of time-varying correlation and/or non-stationary processes.
- Recent research strongly indicates that correlation is regime dependent
- We can incorporate multiple regimes into standard risk models simply, but linearity/symmetry assumptions remain
- Overall conclusion: read more papers?

References
- Ang, A. and Bekaert, G. (1999) ‘International Asset Allocation with time-varying Correlations’, working paper, Graduate School of Business, Stanford University and NBER.
- Banerjee, A., Merugu, S., Dhillon, I.S. and Ghosh, J. (2005) ‘Clustering with Bregman divergences’, Journal of Machine Learning Research, Vol. 6, pp.1705–1749. http://jmlr.csail.mit.edu/papers/v6/banerjee05b.html
- Bertero, E. and Mayer, C. (1989) ‘Structure and Performance: Global Interdependence of Stock Markets around the Crash of October 1987’, London, Centre for Economic Policy Research.
- Chesnay, F. and Jondeau, E. (2001) ‘Does Correlation between Stock Returns really increase during turbulent Periods?’, Economic Notes by Banca Monte dei Paschi di Siena SpA, Vol. 30, No. 1, pp.53–80.
- Clayton, J. and MacKinnon, G. (2001) ‘The Time-Varying Nature of the Link Between REIT, Real Estate and Financial Asset Returns’, Journal of Real Estate Portfolio Management, January-March issue.
- Erb, C.B., Harvey, C.R. and Viskanta, T.E. (1994) ‘Forecasting international Equity Correlations’, Financial Analysts Journal, pp.32–45.
- Jakulin, A. and Bratko, I. (2003a) ‘Analyzing Attribute Dependencies’, in N. Lavrač, D. Gamberger, L. Todorovski and H. Blockeel, eds, Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, Springer, Cavtat-Dubrovnik, Croatia, pp.229–240.
- Jennrich, R. (1970) ‘An Asymptotic χ2 Test for the Equality of Two Correlation Matrices’, Journal of the American Statistical Association, Vol. 65, No. 330.
- Kalaba, R. and Tesfatsion, L. (1989) ‘Time-varying linear regression via flexible least squares’, International Journal on Computers and Mathematics with Applications, Vol. 17, pp.1215–1245.
- Kaplanis, E. (1988) ‘Stability and Forecasting of the Comovement Measures of International Stock Market Returns’, Journal of International Money and Finance, Vol. 7, pp.63–75.
- Lee, S.B. and Kim, K.J. (1993) ‘Does the October 1987 Crash strengthen the co-Movements among national Stock Markets?’, Review of Financial Economics, Vol. 3, No. 1, pp.89–102.
- Longin, F. and Solnik, B. (1995) ‘Is the Correlation in International Equity Returns constant: 1960–1990?’, Journal of International Money and Finance, Vol. 14, No. 1, pp.3–26.
- Longin, F. and Solnik, B. (2001) ‘Extreme Correlation of International Equity Markets’, The Journal of Finance, Vol. 56, No. 2.
- Mahalanobis, P.C. (1936) ‘On the generalised distance in statistics’, Proceedings of the National Institute of Sciences of India, Vol. 2, No. 1, pp.49–55. http://ir.isical.ac.in/dspace/handle/1/1268. Retrieved 2008-11-05.
- Nemenman, I. (2004) Information theory, multivariate dependence, and genetic network inference.
- Osborne, J.W. (2003) ‘Effect sizes and the disattenuation of correlation and regression coefficients: lessons from educational psychology’, Practical Assessment, Research & Evaluation, 8(11).
- Qian, E. and Hua, R. (2004) ‘Active Risk and the Information Ratio’, Journal of Investment Management, Third Quarter 2004.
- Ramchand, L. and Susmel, R. (1998) ‘Volatility and Cross Correlation across major Stock Markets’, Journal of Empirical Finance, Vol. 5, No. 4, pp.397–416.
- Ratner, M. (1992) ‘Portfolio Diversification and the inter-temporal Stability of International Indices’, Global Finance Journal, Vol. 3, pp.67–78.
- Sharpe, W.F. (1998) ‘Morningstar’s Risk-adjusted Ratings’, Financial Analysts Journal, July/August 1998, pp.21–33.
- Sheedy, E. (1997) ‘Is Correlation constant after all? (A Study of multivariate Risk Estimation for International Equities)’, working paper.
- Sneath, P.H.A. and Sokal, R.R. (1973) Numerical Taxonomy. Freeman, San Francisco.
- Tang, G. (1995) ‘Intertemporal Stability in International Stock Market Relationships: A Revisit’, The Quarterly Review of Economics and Finance, Vol. 35 (Special), pp.579–593.
- Viterbi, A. (1967) ‘Error Bounds for convolutional Codes and an asymptotically Optimum Decoding Algorithm’, IEEE Transactions on Information Theory, Vol. 13, No. 2, pp.260–269.
- Watanabe, S. (1960) ‘Information theoretical analysis of multivariate correlation’, IBM Journal of Research and Development, Vol. 4, pp.66–82.
- Yule, G.U. (1926) ‘Why do we sometimes get nonsense-correlations between time series?’, Journal of the Royal Statistical Society, Vol. 89, pp.1–69.

Conditional Correlation 2009

  • 1. Conditional CorrelationNick WadeNorthfield Information ServicesAsian Research SeminarNovember 2009
  • 2. OverviewCorrelation holds a pivotal place in our analysis of data, and the construction of forecasting models for return and riskReview the literature on correlation stability with a particular focus on turbulent marketsBacktrack:Review assumptions underlying correlationExplore role in regression, factor analysis, and cluster analysisDiscuss alternate distance measures and adjustmentsShow recent advances in regression, factor analysis, and cluster analysis that avoid non-stationarity issuesReturn to the thread:Show how a simple regime-switching model allowing conditional correlation can be incorporated into the linear multifactor risk model
  • 3. Correlation breakdownThe concept of a “correlation breakdown” describing the tendency of correlations to move towards 1 as markets melt down reducing all the benefit of diversification just when it’s needed the most.It would also have stark implications for the mutual fund industry, whose business model is at least in part based on providing access to diversification
  • 4. To open…Increasing attention is being paid to the issue of correlations varying over time:(stocks) De Santis, G. and B. Gerard (1997), International asset pricing and portfolio diversification with time-varying risk, Journal of Finance, 52, 1881-1912.(stocks) Longin, F. and B. Solnik (2001), Extreme correlation of international equity markets, Journal of Finance, LVI(2), 646-676.(bonds) Hunter, D.M. and D.P. Simon (2005), A conditional assessment of the relationships between the major world bond markets, European Financial Management, 11(4), 463-482.(bonds) Solnik, B., C. Boucrelle and Y.L. Fur (1996), International market correlation and volatility, Financial Analysts Journal, 52(5), 17-34.Markov switching model: Chesnay, F and Jondeau, E “Does Correlation Between Stock Returns really increase during turbulent periods?” Bank of France research paper.To date little explored – however, Implied Correlation also seems useful, and more powerful than historical correlation in forecasting (we saw the same result with volatility):Campa, J.M. and P.H.K. Chang (1998), The forecasting ability of correlations implied in foreign exchange options, Journal of International Money and Finance, 17, 855-880.
  • 5. Correlation stabilityOne of the first… Kaplanis (1988): STABLETang (1995), Ratner (1992), Sheedy (1997): STABLE – although crash of 1987 regarded as an “anomaly”Bertero and Mayer (1989), King and Wadwhani (1990) and Lee and Kim (1993): correlation has increased, but STABLENot quite so stable? Erb et al (1994) – increases in bear marketsLongin and Solnick (1995) – increases in periods of high volatilityLongin and Solnick (2001) – increases in bear markets
  • 6. Turbulence in the marketKritzman (2009): Correlation of US and foreign stocks when both markets’ returns are one standard deviation above their mean: -17%Correlation of US and foreign stocks when both markets’ returns are one standard deviation below their mean: +76%“Conditional correlations are essential for constructing properly diversified portfolios”
  • 7. Correlation regimesIf it’s not stable, how about Markov switching models?Ramchand and Susmel (1998), Chesnay and Jondreau (2001) – correlation, conditioned on market regime, increases in periods with high volatilityAng and Bekaert (1999) – evidence for two regimes; a high vol/high corr, and a low vol/low corr.
  • 8. backtrackBefore we get too far down the track, lets look at the assumptions underlying correlation and see what the implications are
  • 9. Correlation – a definitionCorrelation measures the strength and direction of a linear relationship between two variablesDifference from the mean in the numeratorStandard deviation in the denominatorNot new… dates from the 1880’s…
  • 10. Correlation – limitations & assumptionsLinear – fits only the linear part of the relationshipStationary – susceptible to trend in mean in either seriesStationary – assumes the volatility of each series is unchanging over timeNo concept of higher momentsSensitive to the underlying distributionSensitive to the presence of estimation error in the observationsAn incomplete measure of the relationship between two series
  • 11. Correlation in regressionStating the obvious, but…Correlation is at the heart of regression analysis:Remember to use something like Durbin-Watson to test for serial correlationTypical situation: huge R^2, low DW (positive s.c.)If DW<1 problematic positive serial correlationIf you think the residuals are non-Normal, DW blows up. Use Breusch-Godfrey test instead.
  • 12. Correlation in factor analysis(Principal Components Analysis is a sub-class of the family of Factor Analysis techniques)Correlation is a key part of factor analysisPCA uses the eigenvectors of the covariance matrix, and hence is affected by anything that impacts the volatility or correlation of the series
  • 13. Correlation in cluster analysisCluster analysis is closely related to factor analysisCluster analysis assigns members to a group depending on rules and a distance measureFor example “complete linkage” cluster analysis adds a new member to the group whose least-related member is most highly related to the new member. Thus, depending on the choice of distance measure [correlation is the most common choice], cluster analysis may also be affected by any problems with correlation…
  • 14. One classification systemJorge Luis Borges, “Other Inquisitions” 1937-1952Animals are divided into: a)those that belong to the Emperor, b) embalmed ones, c) those that are trained, d) suckling pigs, e) mermaids, f) fabulous ones, g) stray dogs, h) those that are included in this classification, i) those that tremble as if they were mad, j) innumerable ones, k) those drawn with a very fine camel's hair brush, l) others, m) those that have just broken a flower vase, n) those that resemble flies from a distance.
  • 15. Thoughts: Non-stationary volatility (ARCH, GARCH, etc): We spend an heroic amount of time trying to forecast non-stationary volatility, but we often just ignore it when we calculate correlation, or perform regression analysis, or run factor analysis (or PCA). Non-stationary mean (trend): We often build models to capture the alpha in momentum, reversals, and other manifestations of a non-stationary mean, but we often ignore those when we calculate correlation, or perform regression analysis, or run factor analysis. Read the fine print…
  • 16. paralysis: Yes, I know, if you read all the fine print and believed that non-stationarity in the data rendered all the techniques useless you’d never do any analysis and would eventually get fired for surfing the internet all day for months and months looking for The Right Approach. Fries with that?
  • 17. More thoughts… What happens ex-post when we analyze data? IR, IC. Is my risk model broken? Measures such as IR and IC are also affected by non-stationarity. IC varies over time.
  • 18. Nonstationarity again… Huber 2001: Manager has 2.33% forecast tracking error and -6.3% realized return. 3-sigma event or a broken risk model? Risk model on target [ex-ante and ex-post SD both 2.xx]. Then why? Unfortunately, a -50bps per month alpha trend. Typical measures of risk are centered on the trend and thus ignore the risk of being consistently bad (Or good! He could have had +6% return…). Extending this idea, Qian and Hua (2004) define “strategy risk” as the standard deviation of the manager’s IC over time, and thus “forecast true active risk”: Forecast Active Risk = std(IC) * Breadth^(1/2) * Forecast Tracking Error. Huber, Gerard. “Tracking Error and Active Management”, Northfield Conference Proceedings, 2001, http://www.northinfo.com/documents/164.pdf
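A minimal sketch of the two ideas on this slide, using made-up numbers: the Qian and Hua scaling std(IC) * Breadth^(1/2) * Forecast Tracking Error, and the way a standard deviation centered on the mean hides a persistent negative trend. The IC history, the breadth of 100, and the monthly active returns are hypothetical; only the 2.33% tracking error and the -50bps monthly trend come from the slide.

```python
import numpy as np

def forecast_active_risk(ic_series, breadth, forecast_te):
    """std(IC) * sqrt(breadth) * forecast tracking error."""
    return np.std(ic_series, ddof=1) * np.sqrt(breadth) * forecast_te

# Hypothetical IC history and breadth; 2.33 (% per year) is the slide's forecast TE.
ic_history = np.array([-0.05, 0.05, 0.15])
adjusted_te = forecast_active_risk(ic_history, breadth=100, forecast_te=2.33)

# Hypothetical monthly active returns (in %) scattered around a -0.5 trend.
monthly_active = np.array([-0.5 + 0.6, -0.5 - 0.6] * 6)
te_about_mean = np.std(monthly_active, ddof=0) * np.sqrt(12)             # centered on the trend
risk_about_zero = np.sqrt(np.mean(monthly_active ** 2)) * np.sqrt(12)   # measured from zero
total_active = monthly_active.sum()

print(f"forecast active risk: {adjusted_te:.2f}")
print(f"TE about the mean: {te_about_mean:.2f}, dispersion about zero: {risk_about_zero:.2f}")
print(f"sum of monthly active returns: {total_active:.1f}")
```

In this toy series the tracking error measured around the mean looks unremarkable while the year's active return is strongly negative, which is the effect the slide attributes to the trend.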
  • 19. An aside… Sharpe Ratio: Trend effect: in the case of declining markets, the fund with the higher total risk exhibits a higher (less negative) Sharpe ratio… Another great way to stuff up your Sharpe ratio: Imagine you have a model that only works sometimes [no, no, I know your models ALWAYS work…]. Be really really good sometimes… and in cash the rest of the time [being responsible] => your mean return is low, and your best months are the ones contributing all your volatility. Another manager doing the same thing with less skill can have a better Sharpe ratio.
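The trend effect is simple arithmetic, sketched here with hypothetical figures: two funds with the same negative excess return and different volatilities.

```python
def sharpe_ratio(excess_return, volatility):
    """Excess return per unit of total risk."""
    return excess_return / volatility

# Hypothetical funds in a declining market: same -10% excess return, different risk.
low_risk_fund = sharpe_ratio(-0.10, 0.10)
high_risk_fund = sharpe_ratio(-0.10, 0.20)
print(low_risk_fund, high_risk_fund)
```

The riskier fund ends up with the higher (less negative) ratio, even though it delivered the same loss with more volatility.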
  • 21. Outlier dependence on correlation: The presence of outliers is problematic for correlation (think about regression). Use Mahalanobis distance as a test for outliers.
  • 23. alternatives: We are looking for a measure of similarity, or shared behavior, or difference, or distance between two series. Ideally we want one with as few restrictions as possible, i.e. non-linear, robust to errors, not dependent on a particular distribution, and so on.
  • 24. Related measures, adjustments and alternatives: Rank correlation; Disattenuation; Total correlation / mutual information; Cohesion/Coherence; Mahalanobis distance; Euclidean distance (special case of MD). Masochists can read 74 pages on similarity measures in: Sneath PHA & Sokal RR (1973) Numerical Taxonomy. Freeman, San Francisco.
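  • 25. Rank correlation: No linearity requirement. Spearman's rank correlation coefficient is $\rho = 1 - \frac{6 \sum_i d_i^2}{n(n^2 - 1)}$, where $d_i = x_i - y_i$ is the difference between the ranks of corresponding values $X_i$ and $Y_i$, and $n$ is the number of values in each data set (same for both sets).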
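To make the "no linearity requirement" point concrete, the sketch below computes Spearman's rank correlation from the rank differences d_i and compares it with the ordinary (Pearson) correlation on a monotone but non-linear relationship. The data and the helper name `spearman_rho` are illustrative assumptions, and the formula as coded assumes no tied values.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation from rank differences d_i (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x)) + 1
    rank_y = np.argsort(np.argsort(y)) + 1
    d = rank_x - rank_y
    n = len(x)
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))

# Hypothetical monotone, strongly non-linear relationship.
x = np.linspace(0.0, 5.0, 20)
y = np.exp(x)

rho_rank = spearman_rho(x, y)
rho_pearson = np.corrcoef(x, y)[0, 1]
print(f"rank correlation: {rho_rank:.3f}, linear correlation: {rho_pearson:.3f}")
```

The rank correlation reports a perfect monotone relationship, while the linear correlation is pulled below one by the curvature.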
  • 26. Disattenuation (what??): Somewhat complicated, but essentially just an adjustment to correlation for the presence of estimation error in the underlying series. The adjustment tends to raise correlations.
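One common form of this adjustment is the classical correction for attenuation, which divides the observed correlation by the square root of the product of the two series' reliabilities. The sketch below assumes that form; the reliabilities and the observed correlation are hypothetical, and in practice the reliabilities have to be estimated.

```python
import math

def disattenuate(r_xy, reliability_x, reliability_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0.0 < reliability_x <= 1.0 and 0.0 < reliability_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(reliability_x * reliability_y)

# Hypothetical observed correlation of 0.40 between two noisily measured series.
corrected = disattenuate(0.40, reliability_x=0.80, reliability_y=0.50)
print(f"observed 0.40 -> corrected {corrected:.3f}")
```

Because reliabilities are at most one, the corrected value is never smaller in magnitude than the observed one.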
  • 27. Mutual information: Mutual information / Total correlation. Total Correlation [Watanabe (1960)] expresses the amount of redundancy or dependency existing among a set of variables.
  • 28. Cohesion/coherence: The spectral coherence is a statistic that can be used to examine the relation between two signals or data sets. It is commonly used to estimate the power transfer between input and output of a linear system. The squared coherence between two signals x(t) and y(t) is a real-valued function defined as $C_{xy}(f) = \frac{|G_{xy}(f)|^2}{G_{xx}(f)\, G_{yy}(f)}$, where $G_{xy}$ is the cross-spectral density between x and y, and $G_{xx}$ and $G_{yy}$ are the autospectral densities of x and y respectively. The magnitude of the spectral density is denoted as |G|.
  • 29. Mahalanobis distance: MD is one example of a “Bregman divergence”, a group of distance measures. Clustering: classifies the test point as belonging to the class for which the Mahalanobis distance is minimal. This is equivalent to selecting the class with the maximum likelihood. Regression: Mahalanobis distance and leverage are often used to detect outliers, especially in the development of linear regression models. A point that has a greater Mahalanobis distance from the rest of the sample population of points is said to have higher leverage since it has a greater influence on the slope or coefficients of the regression equation. Mahalanobis distance is also used to determine multivariate outliers. A point can be a multivariate outlier even if it is not a univariate outlier on any variable. Factor Analysis: recent research couples Mahalanobis distance with Factor Analysis and uses MD to determine whether a new observation is an outlier or a member of the existing factor set. [Zhang 2003] MD depends on covariance (S^-1 is the inverse of the covariance matrix), so it is exposed to the same stationarity issues that affect correlation; however, as described above, it can help us reduce correlation’s outlier dependence.
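The claim that a point can be a multivariate outlier without being a univariate outlier can be sketched as follows. The distance used is sqrt((x - mean)' S^-1 (x - mean)) with S the sample covariance matrix; the simulated data, the 0.9 correlation, and the planted point (2, -2) are assumptions for illustration.

```python
import numpy as np

def mahalanobis_distances(X):
    """Distance of each row from the sample mean, scaled by the inverse sample covariance."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    s_inv = np.linalg.inv(np.cov(X, rowvar=False))
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, s_inv, diff))

# Hypothetical sample: two highly correlated series plus one point that breaks the pattern.
rng = np.random.default_rng(42)
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
sample = rng.multivariate_normal([0.0, 0.0], cov, size=500)
data = np.vstack([sample, [2.0, -2.0]])

md = mahalanobis_distances(data)
z = np.abs((data - data.mean(axis=0)) / data.std(axis=0))

print(f"planted point: z-scores {np.round(z[-1], 2)}, Mahalanobis distance {md[-1]:.2f}")
print(f"largest distance belongs to row {int(np.argmax(md))} of {len(data) - 1}")
```

Each coordinate of the planted point is only about two standard deviations from its mean, yet its Mahalanobis distance stands out because the point contradicts the correlation structure.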
  • 30. Mahalanobis in action: Borrowed from Kritzman: Skulls, financial turbulence, and the implications for risk management. July 2009
  • 31. Regression with nonstationary data: Techniques have been developed specifically to allow time-varying sensitivities. FLS (flexible least-squares): FLS is primarily a descriptive tool that allows us to gauge the potential for time-evolution of exposures. Minimize both the sum of squared errors and the sum of squared dynamic errors (changes in the coefficient estimates from one period to the next).
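A minimal sketch of the flexible least-squares idea, assuming the usual cost function: the sum of squared measurement errors plus a weight mu times the sum of squared period-to-period coefficient changes. It is solved here as one stacked least-squares problem rather than with the recursive algorithm of Kalaba and Tesfatsion; the function name `fls`, the simulated break, and the values of mu are illustrative assumptions.

```python
import numpy as np

def fls(y, X, mu):
    """Return a (T, K) array of time-varying coefficients minimising
    sum_t (y_t - x_t' b_t)^2 + mu * sum_t ||b_{t+1} - b_t||^2."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    T, K = X.shape
    # Measurement block: row t applies x_t to the coefficients b_t only.
    M = np.zeros((T, T * K))
    for t in range(T):
        M[t, t * K:(t + 1) * K] = X[t]
    # Dynamic block: rows encode b_{t+1} - b_t for each coefficient.
    D = np.zeros(((T - 1) * K, T * K))
    for t in range(T - 1):
        for k in range(K):
            D[t * K + k, t * K + k] = -1.0
            D[t * K + k, (t + 1) * K + k] = 1.0
    A = np.vstack([M, np.sqrt(mu) * D])
    b = np.concatenate([y, np.zeros((T - 1) * K)])
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta.reshape(T, K)

# Hypothetical series whose true exposure shifts from 1.0 to 2.0 halfway through.
rng = np.random.default_rng(1)
T = 100
x = rng.uniform(0.5, 1.5, size=T)
true_beta = np.where(np.arange(T) < 50, 1.0, 2.0)
y = true_beta * x + rng.normal(scale=0.05, size=T)

beta_flexible = fls(y, x[:, None], mu=5.0)  # allows the exposure to evolve
beta_rigid = fls(y, x[:, None], mu=1e8)     # heavy penalty: close to a constant coefficient
print(np.round(beta_flexible[[10, 45, 55, 90], 0], 2))
```

A small mu lets the estimated exposure track the shift, while a very large mu forces the path towards a single constant coefficient, so running the fit over a range of mu values gives a feel for how much time variation the data support.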
  • 32. FLS example: An example from Clayton and MacKinnon (2001). The coefficient apparently exhibits a structural shift in 1992.
  • 33. Factor analysis with nonstationary data: Dahlhaus, R. (1997). Fitting Time Series Models to Nonstationary Processes. Annals of Statistics, Vol. 25, 1-37. Del Negro and Otrok (2008): Dynamic Factor Models with Time-Varying Parameters: Measuring Changes in International Business Cycles (Federal Reserve Bank New York). Eichler, M., Motta, G., and von Sachs, R. (2008). Fitting dynamic factor models to non-stationary time series. METEOR research memoranda RM/09/002, Maastricht University. Stock and Watson (2007): Forecasting in dynamic factor models subject to structural instability (Harvard). There are techniques available, and they are being applied to financial series.
  • 34. Cluster analysis with nonstationary data: Guedalia, London, Werman; “An on-line agglomerative clustering method for nonstationary data”, Neural Computation, February 15, 1999, Vol. 11, No. 2, Pages 521-54. C. Aggarwal, J. Han, J. Wang, and P. S. Yu, On Demand Classification of Data Streams, Proc. 2004 Int. Conf. on Knowledge Discovery and Data Mining (KDD'04), Seattle, WA, Aug. 2004. G. Widmer and M. Kubat, “Learning in the Presence of Concept Drift and Hidden Contexts”, Machine Learning, Vol. 23, No. 1, pp. 69-101, 1996. Again, there are techniques available to conquer the problem.
  • 35. Clustering – artificial immune systems: Non-stationary clustering is also related to the development of artificial immune systems. When modeling evolving data sets, you can think of data as “born” with a set of factors (immunity) and subsequently developing immunity to new effects as they appear. You can decide whether, and how much, memory there should be of new and non-current factors. Each new observation could be one of three things: a member of an existing cluster, the first member of a new cluster, or an outlier.
  • 36. The story so far: A long time ago (in a galaxy far far away?) correlation was stable. Recent evidence tends to suggest that correlation is very much regime-dependent, and the consensus seems to be that two regimes are sufficient. Correlation can be upset by outliers, trend, noise, and non-linearity. Correlation is the “default” choice in regression analysis, factor analysis, and cluster analysis – the core of our toolkit. More recent techniques, alternative measures, and adjustments exist to combat these effects.
  • 37. What’s left: The linear multi-factor risk model; Adjusting the model for non-stationarity; Making the model “conditional” by allowing separate regimes; Lunch
  • 38. The linear multi-factor risk model: Relationship between R and F is linear ∀F. The Exposures E are the correlations between R and F. There are N common factor sources of return. The distribution of F is stationary, Normal, i.i.d. ∀F. (Implicitly also the volatility of R and F is stationary.)
  • 39. Effect on risk model estimation: If not properly addressed during estimation: Time-Series (macro model): effect will be observed in exposures and correlations. Cross-sectional (fundamental): effect observed in correlation (covariance) matrix. Statistical: effect observed in factor loadings (from FA), factor returns (from regression analysis) and in covariance matrix. Ouch!
  • 40. Limitations of the model: Symmetric – the response is the same whether a factor increases or decreases. Linear – the response per unit of factor move is the same for any size move in a factor. Stationary – we are assuming that volatility and mean are stationary for all the factors.
  • 41. Adjustments for non-stationarity (WARNING: MARKETING PLUG: all currently used in the Northfield risk model family): Exponential weighting; Conditional variances; Parkinson range measure; Cross-sectional dispersion-inferred time-series variance adjustments; Implied volatility; Serial-correlation adjustments in short-horizon models.
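Two of the adjustments listed here have compact textbook forms, sketched below: an exponentially weighted variance and the Parkinson high-low range estimator, variance = mean(ln(H/L)^2) / (4 ln 2). These are generic versions under stated assumptions (zero mean return, a half-life parameter chosen for illustration) and are not a description of how the Northfield models implement them.

```python
import numpy as np

def ew_variance(returns, halflife):
    """Exponentially weighted variance about zero; the latest observation gets the largest weight."""
    r = np.asarray(returns, dtype=float)
    decay = 0.5 ** (1.0 / halflife)
    weights = decay ** np.arange(len(r) - 1, -1, -1)
    weights /= weights.sum()
    return np.sum(weights * r ** 2)

def parkinson_variance(high, low):
    """Parkinson range estimator: mean(ln(H/L)^2) / (4 ln 2), per period."""
    log_range = np.log(np.asarray(high, dtype=float) / np.asarray(low, dtype=float))
    return np.mean(log_range ** 2) / (4.0 * np.log(2.0))

# Hypothetical inputs: a quiet return followed by a large one, and three daily high/low pairs.
ewv = ew_variance([0.00, 0.02], halflife=1)
pkv = parkinson_variance(high=[101.0, 102.5, 104.0], low=[99.0, 100.0, 100.5])
print(f"EW variance: {ewv:.6f}, Parkinson variance: {pkv:.6f}")
```

Both estimators respond faster to a change in volatility than an equally weighted close-to-close variance, which is the reason they appear on a list of non-stationarity adjustments.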
  • 42. Is A Single correlation Adequate? Recent evidence suggests that correlations change over time, and particularly during market turmoil. If true, this impacts our ability to diversify just when we need it most. This means our risk analysis may over-estimate the benefits of diversification when reporting our risk. Optimized portfolios may be exposed to higher risk than we thought. One simple correction would be to have two sets of risk model factors, variances, and correlations: estimate “normal” and “turbulent” volatilities and correlations for factors, and scale exposures to account for the probability that each regime will be visible within the risk forecast horizon.
  • 43. implementation: Use Viterbi’s algorithm (Viterbi 1967) to detect states. Use Jennrich tests (Jennrich 1970) to decide whether correlation differences between states are significant.
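A minimal sketch of state detection with the Viterbi algorithm, assuming a two-state hidden Markov model with Gaussian returns whose parameters are already known. In practice those parameters would have to be estimated first (for instance with Baum-Welch); the volatilities, transition probabilities, and return series below are hypothetical.

```python
import numpy as np

def viterbi_gaussian(returns, means, vols, trans, start):
    """Most likely state path for an HMM with Gaussian emissions (log-space Viterbi)."""
    r = np.asarray(returns, dtype=float)
    means = np.asarray(means, dtype=float)
    vols = np.asarray(vols, dtype=float)
    log_trans = np.log(np.asarray(trans, dtype=float))
    T, S = len(r), len(means)
    # Log density of each observation under each state's normal distribution.
    log_emit = -0.5 * np.log(2.0 * np.pi * vols ** 2) - 0.5 * ((r[:, None] - means) / vols) ** 2
    score = np.log(np.asarray(start, dtype=float)) + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans  # cand[i, j]: best score ending in i, then moving to j
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(S)] + log_emit[t]
    # Trace the best path backwards from the most likely final state.
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Hypothetical returns (%): calm, then turbulent, then calm again.
returns = [0.1, -0.1, 0.2, -0.2, 0.1, 3.0, -2.5, 3.5, -3.0, 2.8, 0.1, -0.2, 0.1, 0.0, -0.1]
path = viterbi_gaussian(
    returns,
    means=[0.0, 0.0],
    vols=[0.5, 3.0],                       # state 0 = "normal", state 1 = "turbulent"
    trans=[[0.95, 0.05], [0.10, 0.90]],
    start=[0.9, 0.1],
)
print(path.tolist())
```

Once each period is labeled, factor correlations can be estimated separately for the two sets of periods and compared with a test such as Jennrich's.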
  • 44. Implementation 2: Take a factor model. First pass: “clone” the factor set and assume all factors apply in both regimes. Estimate factor variances and correlations separately for the two regimes. Weight the security exposures to the factors based on the probability of each regime, e.g. Regime 1 90%, Regime 2 10%. In the single correlation model, Market exposure for security X = 1.25. In the two-regime model the exposure to the “Market Normal” factor is 0.90*1.25, and the exposure to the “Market Turbulent” factor is 0.10*1.25. Set correlations between the two regimes’ factor sets equal to zero (i.e. Correlation (“Market Normal”, “Market Turbulent”) = 0). Run risk analysis / optimization as usual.
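The cloning step can be sketched as stacking probability-weighted exposures and building a block-diagonal factor covariance matrix, so that correlations between the two regimes' factors are zero. The exposure of 1.25 and the 90%/10% regime probabilities come from the slide; the factor volatilities and the function name are hypothetical.

```python
import numpy as np

def two_regime_model(exposures, cov_normal, cov_turbulent, p_turbulent):
    """Clone the factor set: weight exposures by regime probability, zero cross-regime covariance."""
    e = np.asarray(exposures, dtype=float)
    k = len(e)
    p_normal = 1.0 - p_turbulent
    stacked = np.concatenate([p_normal * e, p_turbulent * e])
    cov = np.zeros((2 * k, 2 * k))
    cov[:k, :k] = cov_normal      # "normal" regime block
    cov[k:, k:] = cov_turbulent   # "turbulent" regime block
    return stacked, cov

exposures = np.array([1.25])            # Market exposure for security X (from the slide)
cov_normal = np.array([[0.15 ** 2]])    # hypothetical "Market Normal" variance
cov_turbulent = np.array([[0.40 ** 2]]) # hypothetical "Market Turbulent" variance

stacked, cov = two_regime_model(exposures, cov_normal, cov_turbulent, p_turbulent=0.10)
factor_variance = stacked @ cov @ stacked
print("stacked exposures:", stacked)
print(f"factor variance: {factor_variance:.6f}")
```

The stacked exposures and block covariance matrix can then be passed to an existing risk or optimization routine unchanged, which is what makes this first-pass approach simple to bolt on.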
  • 45. limitations: Still linear – no accommodation of asymmetric correlation (i.e. UP ≠ DOWN). Same set of factors used. Opportunity to estimate completely different betas for each regime – small-sample problems? Pessimistic? If the probability assigned to the turbulent regime is high, risk estimates and optimal portfolios will be very conservative – perhaps too conservative?
  • 46. conclusions: Correlation lies at the heart of our favorite tools for data analysis (and risk model estimation). A clear understanding of its behavior is a requirement for good analysis. Recent developments aid the incorporation of time-varying correlation and/or non-stationary processes. Recent research strongly indicates that correlation is regime dependent. We can incorporate multiple regimes into standard risk models simply, but linearity/symmetry assumptions remain. Overall conclusion: read more papers?
  • 47. references: Ang, A. and Bekaert, G. (1999) ‘International Asset Allocation with time-varying Correlations’, working paper, Graduate School of Business, Stanford University and NBER. Banerjee, Arindam; Merugu, Srujana; Dhillon, Inderjit S.; Ghosh, Joydeep (2005). "Clustering with Bregman divergences". Journal of Machine Learning Research 6: 1705–1749. http://jmlr.csail.mit.edu/papers/v6/banerjee05b.html. Bertero, E. and Mayer, C. (1989) ‘Structure and Performance: Global Interdependence of Stock Markets around the Crash of October 1987’, London, Centre for Economic Policy Research. Chesnay, F. and Jondeau, E. (2001) ‘Does Correlation between Stock Returns really increase during turbulent Periods?’, Economic Notes by Banca Monte dei Paschi di Siena SpA, Vol. 30, No. 1, pp.53–80. Jim Clayton and Greg MacKinnon (2001), "The Time-Varying Nature of the Link Between REIT, Real Estate and Financial Asset Returns", Journal of Real Estate Portfolio Management, January-March Issue. Erb, C.B., Harvey, C.R. and Viskanta, T.E. (1994) ‘Forecasting international Equity Correlations’, Financial Analysts Journal, pp.32–45. Jakulin A & Bratko I (2003a). Analyzing Attribute Dependencies, in N Lavrač, D Gamberger, L Todorovski & H Blockeel, eds, Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, Springer, Cavtat-Dubrovnik, Croatia, pp. 229-240. Jennrich R. (1970) ‘An Asymptotic χ2 Test for the Equality of Two Correlation Matrices’, Journal of the American Statistical Association, Vol. 65, No. 330.
  • 48. references: R. Kalaba, L. Tesfatsion. Time-varying linear regression via flexible least squares. International Journal on Computers and Mathematics with Applications, 1989, Vol. 17, pp. 1215-1245. Kaplanis, E. (1988) ‘Stability and Forecasting of the Comovement Measures of International Stock Market Returns’, Journal of International Money and Finance, Vol. 7, pp.63–75. Lee, S.B. and Kim, K.J. (1993) ‘Does the October 1987 Crash strengthen the co-Movements among national Stock Markets?’, Review of Financial Economics, Vol. 3, No. 1, pp.89–102. Longin, F. and Solnik, B. (1995) ‘Is the Correlation in International Equity Returns constant: 1960–1990?’, Journal of International Money and Finance, Vol. 14, No. 1, pp.3–26. Longin, F. and Solnik, B. (2001) ‘Extreme Correlation of International Equity Markets’, The Journal of Finance, Vol. 56, No. 2. Mahalanobis, P C (1936). "On the generalised distance in statistics". Proceedings of the National Institute of Sciences of India 2 (1): 49–55. http://ir.isical.ac.in/dspace/handle/1/1268. Retrieved 2008-11-05. Nemenman I (2004). Information theory, multivariate dependence, and genetic network inference.
  • 49. references: Osborne, Jason W. (2003). Effect sizes and the disattenuation of correlation and regression coefficients: lessons from educational psychology. Practical Assessment, Research & Evaluation, 8(11). Qian, Edward and Ronald Hua. “Active Risk and the Information Ratio”, Journal of Investment Management, Third Quarter 2004. Ramchand, L. and Susmel, R. (1998) ‘Volatility and Cross Correlation across major Stock Markets’, Journal of Empirical Finance, Vol. 5, No. 4, pp.397–416. Ratner, M. (1992) ‘Portfolio Diversification and the inter-temporal Stability of International Indices’, Global Finance Journal, Vol. 3, pp.67–78. Sheedy, E. (1997) ‘Is Correlation constant after all? (A Study of multivariate Risk Estimation for International Equities)’, working paper. Sneath PHA & Sokal RR (1973) Numerical Taxonomy. Freeman, San Francisco. Tang, G.: ‘Intertemporal Stability in International Stock Market Relationships: A Revisit’, The Quarterly Review of Economics and Finance, Vol. 35 (Special), pp.579–593. Sharpe W. F., "Morningstar’s Risk-adjusted Ratings", Financial Analysts Journal, July/August 1998, p. 21-33. Viterbi, A. (1967) ‘Error Bounds for convolutional Codes and an asymptotically Optimum Decoding Algorithm’, IEEE Transactions on Information Theory, Vol. 13, No. 2, pp.260–269. Watanabe S (1960). Information theoretical analysis of multivariate correlation, IBM Journal of Research and Development 4, 66-82. Yule, G.U. (1926). Why do we sometimes get nonsense-correlations between time series? Journal of the Royal Statistical Society 89, pp. 1–69.