The Bayesian Approach to Default Risk: A Guide
Michael Jacobs, Jr., Credit Risk Analysis Division, Office of the Comptroller of the Currency
Nicholas M. Kiefer, Cornell University, Departments of Economics and Statistical Science
March 2010
Forthcoming: "Rethinking Risk Measurement & Reporting", Risk Books, Ed. Klaus Böcker
The views expressed herein are those of the authors and do not necessarily represent the views of the Office of the Comptroller of the Currency or the Department of the Treasury.
Outline
- Introduction
- Elicitation of Expert Information
- Statistical Models for Defaults
- Elicitation: Example
- Inference
- Conclusions & Directions for Future Research
Introduction
- All competent statistical analyses involve subjective inputs, the importance of which is often minimized in a quest for objectivity
- Justification of these inputs is an important part of supervisory expectations for model validation (OCC 2000, BCBS 2009b)
- But to appear objective, estimation typically proceeds ignoring qualitative information on parameters
- However, subject-matter experts typically have information about parameter values and model specification
- The Bayesian approach allows formal incorporation of this information, combining "hard" & "soft" data using the rules of probability
- Another advantage is the availability of powerful computational techniques such as Markov Chain Monte Carlo (MCMC)
- A difficulty in Bayesian analysis is the elicitation & representation of expert information in the form of a probability distribution
Introduction (continued)
- While it may not matter in "large" samples, expert information is of value when data are scarce, costly, or unreliable
- Herein we illustrate the practical steps in the Bayesian analysis of PD estimation for a group of homogeneous assets
- This is required for determining minimum regulatory capital under the Basel II (B2) framework (BCBS, 2006)
- It also has implications for BCBS (2009a), which stresses the continuing importance of quantitative risk management
- We focus first on the elicitation & representation of expert information, then on Bayesian inference in nested simple models of default
- As we do not know in advance whether default occurs, we model this uncertain event with a probability distribution
- We assert that uncertainty about the default probability should be modeled the same way as uncertainty about defaults - i.e., represented in a probability distribution
Introduction (concluded)
- There is information available to model the PD distribution - the fact that loans are made shows that risk assessment occurs!
- This information should be organized & used in the analysis in a sensible, transparent and structured way
- We first discuss the process for eliciting expert information, and later show a particular example using the maximum entropy approach
- We present a sequence of simple models of default generating likelihood functions (within the class of generalized linear mixed models): the binomial model, the 2-factor ASRM of B2, and an extension with an autocorrelated systematic risk factor
- We sketch the Markov Chain Monte Carlo approach to calculating the posterior distribution, combining data and expert information coherently using the rules of probability
- We illustrate all of these steps using annual Moody's corporate Ba default rates for the period 1999-2009
Elicitation of Expert Information
- General definition: a structured algorithm for transforming an expert's beliefs about an uncertain phenomenon into a probability distribution
- Here, it is a method for specifying a prior distribution of the unknown parameters governing a model of credit risk (e.g., PD, LGD)
- While our focus is inference regarding unknown parameters, the problem arises in almost all scientific applications involving complex systems
- The expert may be an experienced statistician or a somewhat quantitatively oriented risk specialist (e.g., a loan officer or portfolio manager)
- The situation also arises where decision-making under uncertainty needs to be expressed as such in order to maximize an objective
- Useful framework: identify a model developer or econometrician as a facilitator who transforms the soft data into probabilistic form
- The facilitator should be multi-faceted, having business knowledge and strong communication skills as well
Elicitation of Expert Information (continued)
- In setting criteria for the quality of an elicitation, we distinguish between the quality of an expert's knowledge & the quality of the elicitation
- This is by no means a straightforward task, even for beliefs regarding only a single event or hypothesis (e.g., PD)
- We seek assessments of probabilities, but it is possible that the expert is not familiar with their meaning or cannot think in these terms
- Even if the expert is comfortable with probabilities, it is challenging to accurately assess numerical probabilities of a rare event
- In eliciting a distribution for a continuous parameter it is not practical to try to elicit an infinite collection of probabilities
- Practically, an expert can make only a finite (usually limited) number of statements of belief (quantiles or modes)
- Given these formidable difficulties, an observer may question whether it is worth the effort to even attempt this!
Elicitation of Expert Information (continued)
- Often a sensible objective is to measure the salient features of the expert's opinion - exact details may not be of highest relevance
- There is a similarity to the specification of a likelihood function: an infinite number of probabilities expressed as a function of a small parameter set
- Even if the business decision is sensitive to the exact shape, it may be that another metric is of paramount importance
- Elicitation promotes careful consideration by both the expert and the facilitator regarding the meaning of the parameters
- Two benefits: it results in an analysis that is closer to the subject of the model & gives rise to a meaningful posterior distribution
- A natural interpretation is as part of the statistical modeling process - this is a step in the hierarchy & the usual rules apply
- A stylized representation has 4 stages: preparation, specific summaries, vetting of the distribution & overall assessment
Elicitation of Expert Information (concluded)
- Non-technical considerations include the quality of the expert and the quality of the arguments made
- While the choice of an expert must be justified, one is usually not hard to identify: individuals making risk management decisions
- It is useful to have a summary of what the expert knowledge is based on, and to be wary of any conflicts of interest
- It is important that, if needed, training be offered on whatever concepts will be required in the elicitation (a "dry run")
- The elicitation should be well documented: set out all questions & responses, and the process of fitting the distribution
- The documentation requirements here fit well into supervisory expectations with respect to developmental evidence for models
- Further discussion can be found in Kadane and Wolfson (1998), Garthwaite et al. (2005) and O'Hagan et al. (2006)
Statistical Models for Defaults
- The simplest probability model for defaults in a homogeneous portfolio segment is the binomial, which assumes independence across assets & time, with common default probability $\theta$
- As in Basel 2 IRB and the rating agencies, this is marginal with respect to conditions (e.g., through taking long-term averages)
- Suppose the value of the i-th asset in time t is $v_{it} = \varepsilon_{it}$, where $\varepsilon_{it} \sim \text{i.i.d. } N(0,1)$, and default occurs if the asset value falls below a common predetermined threshold $\gamma$, so that the default indicator is $d_{it} = \mathbf{1}(v_{it} < \gamma)$
- It follows that default on asset i is distributed Bernoulli: $\theta = \Pr(d_{it} = 1) = \Phi(\gamma)$
- Denote the defaults in the data $D = \{d_{it}\}$ and the total count of defaults $r = \sum_{i,t} d_{it}$ out of n firm-years
Statistical Models for Defaults (continued)
- The distribution of the data and of the default count is $p(D \mid \theta) = \theta^r (1-\theta)^{n-r}$ and $p(r \mid \theta) = \binom{n}{r} \theta^r (1-\theta)^{n-r}$
- This is Model 1, which underlies rating agency estimation of default rates, where the MLE is simply $\hat{\theta} = r/n$ (see the R sketch below)
- Basel II guidance suggests there may be heterogeneity due to systematic temporal changes in asset characteristics or to changing macroeconomic conditions, giving rise to our Model 2: $v_{it} = \sqrt{\rho}\, x_t + \sqrt{1-\rho}\, \varepsilon_{it}$
- Where $x_t \sim N(0,1)$ is a common time-specific shock or systematic factor and $\rho$ is the asset value correlation
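As a minimal numerical companion (not part of the original deck), the sketch below evaluates the Model 1 binomial log-likelihood in R, the language used later for the MCMC computations; the counts r = 24 and n = 2642 are taken from the data slide later in the deck.

```r
# Model 1: binomial likelihood for the pooled default count.
loglik_m1 <- function(theta, r = 24, n = 2642) {
  dbinom(r, size = n, prob = theta, log = TRUE)
}

theta_hat <- 24 / 2642   # MLE r/n = 0.00908, the slide's empirical rate
loglik_m1(theta_hat)
```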
Statistical Models for Defaults (continued)
- We have that $v_{it} \sim N(0,1)$, and the conditional (or period-t) default probability is (4): $\theta_t = \Pr(v_{it} < \gamma \mid x_t) = \Phi\big((\gamma - \sqrt{\rho}\, x_t)/\sqrt{1-\rho}\big)$
- We can invert this for the distribution function of the year-t default rate, for $A \in (0,1)$ (5): $\Pr(\theta_t \le A) = \Phi\big((\sqrt{1-\rho}\,\Phi^{-1}(A) - \gamma)/\sqrt{\rho}\big)$
- Differentiating with respect to A yields the well-known Vasicek distribution (5.1): $f(A \mid \theta, \rho) = \sqrt{(1-\rho)/\rho}\, \exp\big(\tfrac{1}{2}\Phi^{-1}(A)^2 - \tfrac{1}{2\rho}(\sqrt{1-\rho}\,\Phi^{-1}(A) - \gamma)^2\big)$, with $\gamma = \Phi^{-1}(\theta)$ (coded below)
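Since the deck's equation images did not survive extraction, the following R sketch restates (5) and (5.1) in code; the function names pvasicek/dvasicek are ours, not the authors'.

```r
# Vasicek distribution of the period default rate A, with gamma = qnorm(theta)
pvasicek <- function(A, theta, rho) {                     # CDF, eq. (5)
  pnorm((sqrt(1 - rho) * qnorm(A) - qnorm(theta)) / sqrt(rho))
}
dvasicek <- function(A, theta, rho) {                     # density, eq. (5.1)
  sqrt((1 - rho) / rho) *
    exp(0.5 * qnorm(A)^2 -
        (sqrt(1 - rho) * qnorm(A) - qnorm(theta))^2 / (2 * rho))
}

integrate(dvasicek, 0, 1, theta = 0.01, rho = 0.2)   # sanity check: ~1
```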
Statistical Models for Defaults (continued)
- The conditional distribution of the number of defaults in each period is (6): $p(r_t \mid \theta_t) = \binom{n_t}{r_t} \theta_t^{r_t} (1-\theta_t)^{n_t - r_t}$
- From this we obtain the distribution of defaults conditional on the underlying parameters by integrating over the default rate distribution (6.1): $p(r_t \mid \theta, \rho) = \int_0^1 \binom{n_t}{r_t} A^{r_t} (1-A)^{n_t - r_t} f(A \mid \theta, \rho)\, dA$
- By intertemporal independence we have the data likelihood across all years (7): $p(R \mid \theta, \rho) = \prod_{t=1}^{T} p(r_t \mid \theta, \rho)$ (see the sketch following this slide)
- Model II allows clumping of defaults within time periods, but not correlation across time periods, so the next natural extension lets the systematic risk factor $x_t$ follow an AR(1) process: $x_t = \tau x_{t-1} + \sqrt{1-\tau^2}\, \eta_t$, $\eta_t \sim N(0,1)$
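The Model 2 likelihood (6.1)-(7) involves only one-dimensional integrals, so a hedged sketch can evaluate it with R's integrate, reusing dvasicek from the previous sketch; the per-year vectors r_t and n_t would come from the annual cohort data.

```r
# Model 2 log-likelihood: integrate the binomial over the Vasicek density
# for each year (eq. 6.1), then sum the log contributions (eq. 7).
loglik_m2 <- function(theta, rho, r_t, n_t) {
  sum(mapply(function(r, n) {
    log(integrate(function(A) dbinom(r, size = n, prob = A) *
                    dvasicek(A, theta, rho),
                  lower = 0, upper = 1)$value)
  }, r_t, n_t))
}
```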
Statistical Models for Defaults (concluded)
- The formula for the conditional PD (4) still holds, but we no longer get the Vasicek distribution of the default rate (5), and (6)-(6.1) become the analogous expressions without the Vasicek-distributed default rate: $p(r_t \mid x_t, \theta, \rho) = \binom{n_t}{r_t} \theta_t(x_t)^{r_t} (1-\theta_t(x_t))^{n_t - r_t}$
- Now the unconditional distribution is given by a T-dimensional integration, as the likelihood can no longer be factored period by period (8): $p(R \mid \theta, \rho, \tau) = \int_{\mathbb{R}^T} \prod_{t=1}^{T} p(r_t \mid x_t, \theta, \rho)\, \varphi_T(x_1, \dots, x_T \mid \tau)\, dx$ (a Monte Carlo sketch follows)
- Where $\varphi_T$ is the joint density of a zero-mean random variable following an AR(1) process
- While Model 1 is a very simple example of a Generalized Linear Model (GLM; McCullagh and Nelder, 1989), Models II & III are Generalized Linear Mixed Models (GLMMs), a parametric mixture (McNeil and Wendin, 2007; Kiefer, 2009)
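One simple (if inefficient) way to approximate the T-dimensional integral in (8) is plain Monte Carlo over simulated factor paths; the sketch below is our illustration under that assumption, not the authors' implementation, and the path count M is arbitrary.

```r
# Model 3 log-likelihood by Monte Carlo: simulate stationary AR(1) paths of
# the systematic factor and average the conditional binomial probabilities.
loglik_m3 <- function(theta, rho, tau, r_t, n_t, M = 20000) {
  T_len <- length(r_t)
  gam   <- qnorm(theta)
  vals <- replicate(M, {
    x <- numeric(T_len)
    x[1] <- rnorm(1)                      # stationary N(0,1) start
    for (t in 2:T_len)
      x[t] <- tau * x[t - 1] + sqrt(1 - tau^2) * rnorm(1)
    th <- pnorm((gam - sqrt(rho) * x) / sqrt(1 - rho))   # conditional PDs (4)
    prod(dbinom(r_t, size = n_t, prob = th))
  })
  log(mean(vals))
}
```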
Elicitation: Example
- We asked an expert to consider a portfolio of middle market loans in a bank's portfolio, typically commercial loans to unrated companies (if rated, these would be about Moody's Ba-Baa)
- This is an experienced banking professional in credit portfolio risk management and business analytics, who has seen many portfolios of this type at different institutions
- The expert thought the median value was 0.01 and the minimum 0.0001, that a value above 0.035 would occur with probability less than 10%, and that 0.3 was an absolute upper bound
- Quantiles were assessed by asking the expert to consider the value at which larger or smaller values would be equiprobable, given that the value was less than or greater than the median
- The 0.25 (0.75) quantile was assessed at 0.0075 (0.0125), and he added a 0.99 quantile at 0.2, splitting up the long upper tail from 0.035 to 0.3
Elicitation: Example (continued)
- How should we mathematically express the expert information?
- Commonly we specify a parametric distribution, assuming standard identification properties (i.e., K conditions can determine a K-parameter distribution - see Kiefer, 2010a)
- Disadvantage: there is rarely good guidance other than convenience of functional form, & this can insert extraneous information
- We prefer the nonparametric maximum entropy (ME) approach (Cover & Thomas, 1991), where we choose the probability distribution p maximizing the entropy $H(p) = -\int p(\theta) \ln p(\theta)\, d\theta$ subject to K constraints $\int c_k(\theta)\, p(\theta)\, d\theta = 0$, $k = 1, \dots, K$, along with the requirement that p be a proper probability distribution
Elicitation: Example (continued)
- Our constraints are the values $q_k$ of the quantiles, $\Pr(\theta \le q_k) = \pi_k$, and we can express the solution in terms of the Lagrange multipliers $\lambda_k$, chosen such that the constraints are satisfied; from the first-order conditions: $p_{ME}(\theta) \propto \exp\big(\sum_{k=1}^{K} \lambda_k \mathbf{1}(\theta \le q_k)\big)$
- This is a piecewise uniform distribution, which we smooth with an Epanechnikov kernel, under the assumption that discontinuities are unlikely to reflect the expert's view: $\hat{p}(\theta) = \int K_h(\theta - s)\, p_{ME}(s)\, ds$, with $K(u) = \tfrac{3}{4}(1-u^2)\mathbf{1}(|u| \le 1)$ and $K_h(u) = K(u/h)/h$ (a numerical sketch follows)
- Where h is the bandwidth, chosen such that the expert was satisfied with the final product
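To make the construction concrete, the sketch below builds the piecewise-uniform ME density from the elicited quantiles (treating "above 0.035 with probability less than 10%" as a 0.90 quantile) and smooths it with an Epanechnikov kernel; the bandwidth h = 0.002 is purely illustrative, since in the text h is tuned until the expert approves.

```r
# Elicited quantiles of the PD prior (from the example slide)
q <- c(0.0001, 0.0075, 0.01, 0.0125, 0.035, 0.2, 0.3)  # support points
p <- c(0,      0.25,   0.50, 0.75,   0.90,  0.99, 1)   # cumulative probs

p_me <- function(x) {                    # piecewise-uniform ME density
  dens <- diff(p) / diff(q)              # constant density on each piece
  i <- findInterval(x, q, rightmost.closed = TRUE)
  out <- numeric(length(x))
  ok <- i >= 1 & i < length(q)
  out[ok] <- dens[i[ok]]
  out
}

p_smooth <- function(x, h = 0.002) {     # Epanechnikov-smoothed density
  sapply(x, function(xi)
    integrate(function(s) (0.75 / h) * pmax(1 - ((xi - s) / h)^2, 0) * p_me(s),
              xi - h, xi + h)$value)
}
# Note: near the boundaries mass leaks outside [0.0001, 0.3]; the reflection
# step on the next slide corrects this.

curve(p_smooth(x), 0.0001, 0.05)         # inspect the body of the prior
```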
Elicitation: Example (continued)
- We address the boundary problem - that the kernel-smoothed $\hat{p}$ has larger support than $p_{ME}$ - using the reflection technique (Schuster, 1985): on support $[a,b]$, $\tilde{p}(\theta) = \hat{p}(\theta) + \hat{p}(2a - \theta) + \hat{p}(2b - \theta)$
- For the asset correlation $\rho$ in Models 2 & 3, B2 recommends a value of about 0.20 for this segment, so with little expert information on this parameter we choose a Beta(12.6, 50.4) prior centered at 0.20
- With even less guidance on the autocorrelation $\tau$ in Model 3, beyond an indication from the asset pricing literature that it is likely to be positive, we chose a uniform prior on [-1,1], with the B2 value of 0 as its mean
Elicitation: Example (continued)
Elicitation: Example (concluded)
Inference: Bayesian Framework
- Let us write the likelihood function of the data generically: $L(\theta; R) = p(R \mid \theta)$
- The joint distribution of the data R and the parameter under the prior p is: $p(R, \theta) = p(R \mid \theta)\, p(\theta)$
- The marginal (predictive) distribution of R is: $m(R) = \int p(R \mid \theta)\, p(\theta)\, d\theta$
- Finally, we obtain the posterior (conditional) distribution of the parameter as: $p(\theta \mid R) = p(R \mid \theta)\, p(\theta) / m(R)$ (combined in the sketch below)
- Perhaps take a summary statistic like $\theta^* = E[\theta \mid R]$, the posterior expectation, for B2 or other purposes, which is (asymptotically) optimal under (bowl-shaped) quadratic loss
- Computationally, high-dimensional numerical integration may be hard and inference a problem; therefore we turn to simulation techniques
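Putting the pieces together for Model 1, here is a hedged sketch of the unnormalized log posterior: the binomial likelihood times the smoothed ME prior from the elicitation sketch above (p_smooth is our earlier illustrative construction, not the authors' exact prior).

```r
# Unnormalized log posterior for Model 1: log-likelihood + log prior.
# p_smooth is the smoothed maximum-entropy prior sketched earlier.
log_post_m1 <- function(theta, r = 24, n = 2642) {
  if (theta <= 0 || theta >= 1) return(-Inf)
  pr <- p_smooth(theta)
  if (pr <= 0) return(-Inf)               # outside the prior's support
  dbinom(r, size = n, prob = theta, log = TRUE) + log(pr)
}
```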
Inference: Computation by Markov Chain Monte Carlo
- MCMC methods are a wide class of procedures for sampling from a distribution when the normalizing constant is unknown
- In the simplest case, the Metropolis method, we sample from our posterior distribution, which is known only up to a constant: $p(\theta \mid R) \propto p(R \mid \theta)\, p(\theta)$
- We construct a Markov chain which has this as its stationary distribution, starting with a proposal distribution: $\theta' = \theta^{(m)} + \varepsilon$, $\varepsilon \sim N(0, \Sigma)$
- The new parameter depends upon the old one stochastically, and the diagonal covariance matrix $\Sigma$ of the normal error is chosen specially to make the algorithm work well
- We draw from this distribution and accept the new draw according to the ratio of joint likelihoods of the data and the parameter; the long-run fraction accepted is known as the acceptance rate
Inference: Computation by MCMC (concluded)
- Note that $p(\theta \mid R) \propto p(R \mid \theta)\, p(\theta)$, and therefore the acceptance probability is easy to calculate, as the unknown constant m(R) cancels: $\alpha(\theta, \theta') = \min\big(1,\; p(R \mid \theta')\, p(\theta') / (p(R \mid \theta)\, p(\theta))\big)$
- The resulting sample is a Markov chain with this equilibrium distribution, & moments calculated from it approximate those of the target
- We use the mcmc package (Geyer, 2009) in the R programming language (R Development Core Team, 2009); a sample run is sketched below
- The package takes into account that the draws are not independent when computing standard errors and confidence bounds
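A minimal sketch of how such a run might look with the mcmc package's metrop sampler on the Model 1 log posterior above; the nbatch, scale, seed and burn-in choices here are ours (the speaker notes mention tuning toward a ~22-25% acceptance rate, with runs of 10-40K draws after a 5K burn-in).

```r
library(mcmc)

set.seed(42)
out <- metrop(log_post_m1, initial = 24 / 2642, nbatch = 20000, scale = 0.002)

out$accept                          # acceptance rate; tune via `scale`
draws <- out$batch[-(1:5000), 1]    # discard burn-in
mean(draws)                         # posterior mean of theta
quantile(draws, c(0.025, 0.975))    # posterior interval
```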
Inference: Data
- We construct a segment of upper-tier high-yield corporate bonds of firms rated Ba by Moody's Investors Service
- We use the Moody's Default Risk Service™ (DRS™) database (release date 1-8-2010)
- These are restricted to U.S.-domiciled, non-financial and non-sovereign entities
- Default rates were computed for annual cohorts of firms starting in January 1999 and running through January 2009
- We use the Moody's adjustment for withdrawals (i.e., remove ½ of the withdrawn firms from the beginning count)
- In total there are 2642 firm-years of data and 24 defaults, for an overall empirical default rate of 0.00908
Inference: Data (continued)
Inference: Empirical Results
- PD estimates in the 2- & 3-parameter models are only very slightly higher than in the 1-parameter model
- There is a higher degree of variability in the AVC estimate ρ, relative to its mean, than in the PD
Inference: Empirical Results (continued)
- The relatively low estimate of ρ is consistent with various previous calibrations of structural credit models to default (as opposed to equity) data
- But the prior mean (0.2) is well outside the posterior 95% confidence interval for the AVC ρ - why?
- θ = 0.01 & ρ = 0.2 in the Vasicek distribution imply an intertemporal standard deviation of default rates of 0.015
- But with ρ = 0.077, the posterior mean, the implied standard deviation is 0.008, which better matches that in our sample of 0.0063
- This aspect of the data is moving the posterior to the left of the prior
- There is evidence that the autocorrelation parameter τ of the systematic risk factor may be mildly positive
- Estimates of stressed regulatory capital are 6.53% (6.7%) for Model 1 (Models 2 & 3), and the mark-up over the base ranges from 21% to 25% (the mapping is restated in R below)
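The stressed-capital figures come from the B2 supervisory mapping of the conditional PD (4) evaluated at a 99.9th-percentile systematic shock; a hedged restatement in R follows (the slide's 6.53% / 6.7% numbers also reflect inputs such as the exact posterior parameter values, so this sketch will not reproduce them exactly).

```r
# Basel II stressed PD: conditional PD (4) at the 99.9th-percentile shock.
stressed_pd <- function(theta, rho, alpha = 0.999) {
  pnorm((qnorm(theta) + sqrt(rho) * qnorm(alpha)) / sqrt(1 - rho))
}

stressed_pd(0.01, 0.077)   # e.g., roughly posterior-mean inputs
```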
Inference: Empirical Results (continued)
Empirical Results (continued)
Empirical Results (concluded)
Conclusion & Directions for Future Research
- Modeling the data distribution & expert information statistically increases the range of applicability of econometric methods
- We have gone through the steps of a formal Bayesian analysis for PD, required under B2 for many institutions worldwide
- We concluded with posterior distributions for the parameters of a nested sequence of models, with summary statistics
- The mean PD is a natural estimator for minimum regulatory capital requirements, but such distributions have many uses, e.g., stressing IRB models, economic capital or credit pricing
- More general models provide insight into the extent to which default rates over time are predictable & the extent to which risk calculations should look ahead over a number of years
- Analysis of LGD or economic capital using Bayesian methods (jointly with PD?) would be useful (there is substantial experience to build on here)
- Many other possible analyses could build on these methods


Editor's Notes

• #4: It is clear that model specification (the likelihood), definition of the parameters / quantities of interest, specification of the parameter space, and identification of the relevant data all require judgment, are subject to criticism & require justification. E.g., PD in (0,1) defines the parameter space, & for a particular rating/segment one may have an idea of the location of the rate. A simple example in the case of estimating default rates is sketched in Kiefer (2007). MCMC & related methods are widely discussed in the econometrics literature & have been applied in the default estimation setting. But applications typically specify a convenient prior that adds minimal information and misses the true power of the Bayesian approach. This requires thought & effort, rather than mere computational power, and is therefore not commonly done.
• #5: Data information typically overwhelms a nondogmatic prior (which becomes irrelevant asymptotically), so econometricians often justify ignoring prior information on this basis. E.g., data may not be available in quantity for low-default assets, new products, or after structural economic changes, casting doubt on the relevance of historical data. BCBS 2009 was issued in response to the credit crisis.
• #6: The first models are consistent with the ASRM underlying B2, & the third adds temporal correlation in asset values by accounting for autocorrelation. The latter 2 models represent simple examples of GLMMs (now prevalent in the credit risk literature). This generalizes B2 and is perhaps in line with the validation expectations of BCBS (2009b) & industry best practice.
• #7: The prior distribution will be combined with the data likelihood through Bayes' Theorem to derive the posterior distribution of the risk measure. E.g., other applications: any situation with no or very limited data (models to describe, understand, or predict complex behavior). Model developers in conjunction with experts typically propose sensitivities of model parameters to obtain output where there is uncertainty regarding the inputs' true values. As in our application, this highlights the importance of having a coherent approach to representing that uncertainty.
• #8: The quality of an elicitation = the accuracy with which expert knowledge is translated into probabilistic form. An elicitation is done well if the distribution derived is an accurate representation of the expert's knowledge, no matter what the quality of that knowledge. But a good facilitator asking probing questions may also be able to determine whether the expert really has valuable information. E.g. of the difficulty: defaults for a set of obligors, especially if highly rated & we lack historical performance data on a portfolio of similar credits. Note the symmetry - we characterize the uncertainty regarding the unknown probability governing the distribution of defaults, the PD itself, in terms of probabilities. The answer to why it is worth the trouble: use of elicitation as part of business decision making earns automatic buy-in (the Use Test).
• #9: E.g., in forming a prior for the distribution of the default rate, a general sense of where it is centered (5 bps, 1% or 10%?) & the degree to which the tail is elongated may be enough. On specifying the likelihood, the point is starkly made in the normal case: a whole set of probabilities is specified as a function of a mean / variance (hardly credible as an exact description of a real data distribution, yet its usefulness is proven in countless applications). When statisticians write down a likelihood function, it is just informed opinion regarding a data generating process conditional on a parameter set. E.g., an alternative metric: the implied regulatory capital or the expected utility of the supervisor, which may be quite robust to the details of expert opinion. E.g., closer to the application: instead focus on a set of plausible observed default rates over a set horizon w.r.t. obligors of a particular credit quality. E.g., a meaningful posterior: it produces a PD estimate not only for a compliance exercise, but a complete predictive distribution of default rates for other risk management purposes (credit decisions, account management, portfolio stress testing). Preparation covers identifying the expert, training the expert, and identifying what aspects of the problem to elicit. In practice there may be overlap between preparation & summaries: the choice of data to elicit often follows the choice of distributional form (e.g., with a simple parametric distribution like a beta prior for PD, a few quantiles may be OK, vs. a nonparametric kernel density that may require more information). Elicitation is almost always iterative - the assessment of adequacy possibly returns for more summaries.
• #10: E.g., quality of the expert: experience in related risk-management situations & education. E.g., choice of expert: responsibility for the relevant portfolio in a successful financial institution. E.g., quality of the arguments: convincing to other experts? Reasoning from partially similar portfolios or economic conditions? E.g., basis of expert knowledge: the state of the credit cycle, industry conditions of the obligors, or average features of the portfolio that are drivers of default risk. E.g., conflict of interest: a credit executive whose bonus is a function of the regulatory capital charge on the portfolio. E.g., documentation: the code for any moment matching or smoothing performed.
• #11: B2 requires an annual default probability estimated over a sample long enough to cover a full economic cycle. Many discussions of the inference issue focus on the binomial model & the associated frequentist estimates.
• #12: This is true because, for fixed n, the data depend on defaults only through r, by the sufficiency principle. This is the model underlying rating agency estimates of default rates & banks with expert rating systems for wholesale B2.
• #13: Get the distribution of v_it from the normality of x. The conditional default probability follows from standardizing v_it. We are interested in estimating the marginal default probability - we need the distribution of the year-t default rate. Note that this theta_t is different from the function for the conditional default probability above. This is the B2 formula for the stressed PD that goes into the regulatory capital expression.
• #15: Now theta_t is the conditional PD function, NOT the random year-t default rate. The likelihood contribution in any year depends on the previous year - no intertemporal independence.
• #17: We choose the distribution that adds the minimum information (or maximum entropy = disorder) subject to K constraints (including that p is a proper probability distribution). H is a widely used measure of the information in an observation or experiment.
• #22: The joint distribution of data & parameters follows from the product rule, the marginal of R from the definition of a marginal distribution, & the posterior distribution of the parameters from the definition of a conditional distribution. Taken together this is known as "Bayes' Rule", but it is simply a set of basic rules of probability. The posterior expectation has nice theoretical properties (under very general conditions - normality is not needed!) with respect to predicting a likely value of the parameter. For some models numerical integration of the likelihood is not feasible - it is in our cases, but even then problems with the accuracy of the integration can lead to problems in making inference on the parameters. Simulation techniques like MCMC are the answer.
• #24: Experience with these methods is useful to gain understanding. The software provides validation guidance/warnings, available online. Generally an acceptance rate of ~25% is good (see Roberts, Gelman, and Gilks (1997)), tuned by adjusting the variances of epsilon. Scaling the proposal distribution allowed us acceptance rates of 22-25%. There is no way to prove convergence, but nonconvergence is often obvious from time-series plots. Long runs are better than short: M samples (Model 1: 10K; Models 2-3: 40K) with > 5K burn-in.
• #32: E.g., uses of the posterior distribution: use the entire distribution of the PD to price credit & set in-house capital levels (economic capital). It is also useful for stressing IRB models (plug a high-quantile PD into the regulatory capital formula to model an adverse scenario).