5th International Summer School
Achievements and Applications of Contemporary Informatics,
Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 3-15, 2010




    A Classification Problem of Credit Risk Rating
              Investigated and Solved by
           Optimization of the ROC Curve

                         Gerhard-Wilhelm Weber *
                   Kasırga Yıldırak and Efsun Kürüm

    Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey

    * Faculty of Economics, Management and Law, University of Siegen, Germany
      Center for Research on Optimization and Control, University of Aveiro, Portugal
      Universiti Teknologi Malaysia, Skudai, Malaysia
Outline

•   Main Problem from Credit Default

•   Logistic Regression and Performance Evaluation

•   Cut-Off Values and Thresholds

•   Classification and Optimization

•   Nonlinear Regression

•   Numerical Results

•   Outlook and Conclusion
Main Problem from Credit Default

     Should a credit application be approved or rejected?

Solution

     Learn about the default probability of the applicant.
Logistic Regression

\[
  \log \frac{P(Y = 1 \mid X = x_l)}{P(Y = 0 \mid X = x_l)}
    = \beta_0 + \beta_1 x_{l1} + \beta_2 x_{l2} + \cdots + \beta_p x_{lp}
  \qquad (l = 1, 2, \ldots, N)
\]
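As a minimal sketch of how the fitted log-odds translate into a default probability, the snippet below inverts the logit. The function name `default_probability`, the coefficients and the ratio values are made up for illustration and are not taken from the study.

```python
import math

def default_probability(beta0, betas, x):
    """Invert the logit: P(Y=1 | X=x) = 1 / (1 + exp(-(beta0 + sum_k beta_k * x_k)))."""
    log_odds = beta0 + sum(b * xk for b, xk in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients and financial ratios (illustration only).
beta0, betas = -2.0, [1.5, -0.8]
x_l = [0.4, 1.1]

p = default_probability(beta0, betas, x_l)
print(round(p, 4))
```

Taking the logarithm of `p / (1 - p)` recovers the linear predictor, which is the relation the logistic regression slide states.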
Goal

Our study is based on one of the Basel II criteria, which recommends
that the bank divide corporate firms into 8 rating grades,
one of them being the default class.

We have two problems to solve here:
 • To distinguish the defaults from the non-defaults.
 • To put the non-default firms in an order based on their credit quality
   and classify them into (sub)classes.
Data

 • The data were collected by a bank from firms operating in the
   manufacturing sector in Turkey.
 • They cover the period between 2001 and 2006.
 • Originally, there are 54 qualitative variables and 36 quantitative variables.
 • The quantitative variables are derived from the balance sheets
   submitted by the firms' accountants.
   Essentially, they are the well-known financial ratios.
 • The data set covers 3150 firms, of which 92 are in the state of default.
   As the number of defaults is small, in order to overcome possible
   statistical problems, we downsize the number to 551,
   keeping all the default cases in the set.
We evaluate performance of the model

[Figure: distributions of the test result value for non-default cases and default cases, separated by a cut-off value, and the resulting ROC curve (TPF, sensitivity, against FPF, 1-specificity).]
Model outcome versus truth

                                   truth
                            d                   n

  model        dı     True Positive       False Positive
  outcome             Fraction (TPF)      Fraction (FPF)

               nı     False Negative      True Negative
                      Fraction (FNF)      Fraction (TNF)

  total                     1                   1
Definitions



 • sensitivity (TPF) := P( Dı | D)
 • specificity        := P( NDı | ND )
 • 1-specificity (FPF) := P( Dı | ND )

 • points (TPF, FPF) constitute the ROC curve
 • c := cut-off value
 • c takes values between −∞ and ∞


 • TPF(c) := P( z>c | D )
 • FPF(c) := P( z>c | ND )
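The definitions of TPF(c) and FPF(c) can be sketched empirically as follows. The helper `tpf_fpf`, the scores `z` and the default flags are hypothetical and serve only to illustrate how sweeping the cut-off c produces points of the ROC curve.

```python
def tpf_fpf(scores, is_default, c):
    """Empirical TPF(c) = P(z > c | D) and FPF(c) = P(z > c | ND)."""
    d = [z for z, y in zip(scores, is_default) if y]
    nd = [z for z, y in zip(scores, is_default) if not y]
    tpf = sum(z > c for z in d) / len(d)
    fpf = sum(z > c for z in nd) / len(nd)
    return tpf, fpf

# Hypothetical scores z (illustration only); True marks an actual default.
scores     = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
is_default = [True, True, False, True, False, False, True, False]

# Sweeping the cut-off c traces the empirical ROC curve as (FPF, TPF) points.
roc = [tpf_fpf(scores, is_default, c)[::-1] for c in (1.0, 0.65, 0.35, 0.0)]
print(roc)  # [(0.0, 0.0), (0.25, 0.5), (0.5, 0.75), (1.0, 1.0)]
```

Lowering c moves the point from (0, 0) towards (1, 1), since more cases of both kinds exceed the cut-off.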
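normal-deviate axes

[Figure: ROC curve on (FPF, TPF) axes and on normal-deviate axes, Normal Deviate (TPF) against Normal Deviate (FPF).]

\[
  \mathrm{FPF}(c_i) := \Phi(c_i), \qquad
  \mathrm{TPF}(c_i) := \Phi(a + b\,c_i),
\]
\[
  a := \frac{\mu_n - \mu_s}{\sigma_s}, \qquad
  b := \frac{\sigma_n}{\sigma_s}.
\]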
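A small sketch of why normal-deviate axes are convenient: assuming the reading FPF(c) = Φ(c) and TPF(c) = Φ(a + b c) with a and b as defined on the slide, the ROC curve becomes the straight line y = a + b x after applying Φ^{-1} to both axes. The means and standard deviations below are hypothetical, and Φ is taken from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

PHI = NormalDist()  # standard normal: PHI.cdf is Phi, PHI.inv_cdf is Phi^{-1}

# Hypothetical distribution parameters (illustration only).
mu_n, sigma_n = 0.0, 1.0
mu_s, sigma_s = -1.2, 0.8

a = (mu_n - mu_s) / sigma_s
b = sigma_n / sigma_s

def fpf(c):
    return PHI.cdf(c)

def tpf(c):
    return PHI.cdf(a + b * c)

# On normal-deviate axes the ROC points lie on the line y = a + b * x.
for c in (-2.0, -1.0, 0.0, 1.0):
    x = PHI.inv_cdf(fpf(c))
    y = PHI.inv_cdf(tpf(c))
    print(f"c={c:+.1f}  x={x:+.4f}  y={y:+.4f}  a+b*x={a + b * x:+.4f}")
```

The intercept of that line is a and its slope is b, which is why the two parameters suffice to describe the whole curve in this model.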
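Classification

[Figure: example with cut-off values c that split the score axis, from actually non-default cases to actually default cases, into class I, class II, class III, class IV and class V.]

   To assess the discriminative power of such a model,
   we calculate the Area Under the (ROC) Curve:

\[
  \mathrm{AUC} := \int_{-\infty}^{\infty} \Phi(a + b\,c)\, d\Phi(c).
\]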
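For orientation, the AUC integral of this two-parameter model has a known closed form: with independent standard normal variables Z1 and Z2, the integral of Φ(a + b c) dΦ(c) equals P(Z2 ≤ a + b Z1) = Φ(a / sqrt(1 + b²)). The sketch below checks this against a plain midpoint rule after the substitution t = Φ(c). The function names are illustrative, the plus sign inside Φ is the reading assumed here, and the snippet does not attempt to reproduce the AUC figures reported later in the slides.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()

def auc_closed_form(a, b):
    # P(Z2 <= a + b*Z1) for independent standard normals Z1, Z2
    return PHI.cdf(a / sqrt(1.0 + b * b))

def auc_midpoint(a, b, m=20000):
    # midpoint rule for the integral of Phi(a + b * Phi^{-1}(t)) over t in (0, 1)
    h = 1.0 / m
    return h * sum(PHI.cdf(a + b * PHI.inv_cdf((k + 0.5) * h)) for k in range(m))

a, b = 1.0, 0.95  # illustrative values
print(auc_closed_form(a, b), auc_midpoint(a, b))
```

For a = 0 and b = 1 both expressions give 0.5, the AUC of a model without discriminative power.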
relationship between thresholds and cut-off values

[Figure: example ROC curve (TPF against FPF) with thresholds t0, t1, t2, t3, t4, t5 on the FPF axis; R = 5.]

\[
  \Phi(c) = t \iff c = \Phi^{-1}(t)
\]
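The correspondence between thresholds t and cut-off values c can be sketched with the standard normal quantile function. The threshold values are the initial ones listed in the numerical results of this talk; using `statistics.NormalDist` for Φ and Φ^{-1} is a choice made for this illustration.

```python
from statistics import NormalDist

PHI = NormalDist()

# Initial interior thresholds from the numerical results;
# t0 = 0 and tR = 1 correspond to c = -infinity and c = +infinity.
thresholds = [0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35]

cutoffs = [PHI.inv_cdf(t) for t in thresholds]   # c = Phi^{-1}(t)
recovered = [PHI.cdf(c) for c in cutoffs]        # t = Phi(c)

for t, c in zip(thresholds, cutoffs):
    print(f"t = {t:<7}  c = {c:+.4f}")
```

Because Φ is strictly increasing, ordered thresholds map to ordered cut-off values, so the classes keep their order under the transformation.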
Optimization in Credit Default

    Problem:

    To obtain, simultaneously, the thresholds and the parameters a and b
    that maximize the AUC,
    while balancing the sizes of the classes (regularization)
    and guaranteeing a good accuracy.
Optimization Problem

\[
  \max_{a,\,b,\,\tau} \;
    \alpha_1 \int_0^1 \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt
    \;-\; \alpha_2 \sum_{i=0}^{R-1} \frac{i}{n}\,(t_{i+1} - t_i)^2
\]
\[
  \text{subject to} \quad
    \int_{t_i}^{t_{i+1}} \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt \;\ge\; \delta_i
    \qquad (i = 0, 1, \ldots, R-1),
\]
\[
  \tau := (t_1, t_2, \ldots, t_{R-1})^T, \qquad t_0 = 0, \quad t_R = 1.
\]
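Over the ROC Curve

[Figure: ROC curve (TPF against FPF) with thresholds t0, t1, t2, t3, t4, t5; the area under the curve is AUC, the area over the curve is 1-AUC.]

\[
  \mathrm{AOC} := \int_0^1 \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt
\]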
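New Version of the Optimization Problem

\[
  \min_{a,\,b,\,\tau} \;
    \alpha_2 \sum_{i=0}^{R-1} \frac{i}{n}\,(t_{i+1} - t_i)^2
    \;+\; \alpha_1 \int_0^1 \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt
\]
\[
  \text{subject to} \quad
    \int_{t_j}^{t_{j+1}} \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt
      \;\le\; t_{j+1} - t_j - \delta_j
    \qquad (j = 0, 1, \ldots, R-1).
\]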
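The step from the maximization form to this minimization form can be written out as a short derivation. It assumes that the original constraint is the lower bound on the integral of Φ over each class interval, and that the thresholds satisfy t_0 = 0 and t_R = 1; both are readings of the slides rather than statements taken from them verbatim.

```latex
% Area over the curve is the complement of the area under the curve on [0, 1]:
\mathrm{AOC}
  = \int_0^1 \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt
  = 1 - \mathrm{AUC}.

% On each class interval, the lower bound on the integral of Phi turns into
% an upper bound on the integral of (1 - Phi):
\int_{t_j}^{t_{j+1}} \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt
  = (t_{j+1} - t_j) - \int_{t_j}^{t_{j+1}} \Phi(a + b\,\Phi^{-1}(t))\,dt
  \le t_{j+1} - t_j - \delta_j .
```

Maximizing the weighted AUC minus the regularization term is therefore the same as minimizing the regularization term plus the weighted AOC, up to the constant α1.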
Regression in Credit Default

    Optimization problem:

    To obtain, simultaneously, the thresholds and the parameters a and b
    that maximize the AUC,
    while balancing the sizes of the classes (regularization)
    and guaranteeing a good accuracy.

     • discretization of the integral
     • nonlinear regression problem
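Discretization of the Integral

    Riemann-Stieltjes integral

\[ \mathrm{AUC} = \int_{-\infty}^{\infty} \Phi(a + b\,c)\, d\Phi(c) \]

    Riemann integral

\[ \mathrm{AUC} = \int_0^1 \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt \]

    Discretization

\[ \mathrm{AUC} \approx \sum_{k=1}^{R} \Phi\bigl(a + b\,\Phi^{-1}(t_k)\bigr)\,\Delta t_k \]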
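The discretization step can be sketched as a right-endpoint sum over the thresholds. Two points are assumptions of this sketch, since the slide does not spell them out: Δt_k is taken as t_k − t_{k−1}, and the grid is closed with t_0 = 0 and t_R = 1. The coarse grid reuses the initial thresholds from the numerical results; the parameter values and the fine grid are illustrative.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()

def roc(t, a, b):
    """ROC curve TPF = Phi(a + b * Phi^{-1}(t)), with its limits at t = 0 and t = 1 (b > 0)."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return PHI.cdf(a + b * PHI.inv_cdf(t))

def auc_riemann(a, b, interior):
    """Right-endpoint sum over k = 1..R of roc(t_k) * (t_k - t_{k-1})."""
    t = [0.0] + list(interior) + [1.0]  # t_0 = 0, t_R = 1
    return sum(roc(t[k], a, b) * (t[k] - t[k - 1]) for k in range(1, len(t)))

a, b = 1.0, 0.95
coarse = [0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35]  # R = 8
fine = [k / 1000 for k in range(1, 1000)]                    # R = 1000, uniform grid

exact = PHI.cdf(a / sqrt(1 + b * b))  # closed form of the integral
print(exact, auc_riemann(a, b, coarse), auc_riemann(a, b, fine))
```

Since the ROC curve is increasing, a right-endpoint sum never falls below the integral, and with only a few unevenly spaced thresholds the gap can be noticeable. This is one reason the thresholds themselves matter in the optimization.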
Optimization Problem with Penalty Parameters

 If any one of these constraints is violated, we introduce penalty
 parameters. As a penalty parameter is increased, the iterates are forced
 towards the feasible set of the optimization problem.

\[
  \Pi_\Theta(a, b, \tau) :=
    \alpha_2 \sum_{i=0}^{R-1} \frac{i}{n}\,(t_{i+1} - t_i)^2
    + \alpha_1 \int_0^1 \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt
    + \alpha_3 \sum_{j=0}^{R-1} \theta_j
        \Bigl( \delta_j - \int_{t_j}^{t_{j+1}} \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt \Bigr)
\]
\[
  \Theta := (\theta_1, \theta_2, \ldots, \theta_{R-1})^T, \qquad
  \theta_j \ge 0 \quad (j = 0, 1, \ldots, R-1).
\]
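The effect of increasing a penalty parameter can be illustrated on a deliberately simple one-dimensional toy problem, which is not the credit-rating problem of the slides: minimize (x − 2)² subject to x ≤ 1. The quadratic penalty used here and the name `penalized_minimizer` are choices made for this sketch only.

```python
def penalized_minimizer(theta):
    """Minimizer of (x - 2)^2 + theta * max(0, x - 1)^2, a toy penalized problem.

    The constrained problem is  min (x - 2)^2  subject to  x <= 1  (solution x = 1).
    For x > 1 the stationarity condition 2(x - 2) + 2*theta*(x - 1) = 0
    gives x = (2 + theta) / (1 + theta).
    """
    return (2.0 + theta) / (1.0 + theta)

for theta in (0.0, 1.0, 10.0, 100.0, 1000.0):
    x = penalized_minimizer(theta)
    print(f"theta = {theta:7.1f}   x = {x:.6f}   violation = {max(0.0, x - 1.0):.6f}")
```

With theta = 0 the unconstrained minimizer x = 2 violates the constraint; as theta grows, the minimizer approaches the feasible set from outside and the violation shrinks towards zero.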
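Optimization Problem further discretized

\[
  \Pi_\Theta(a, b, \tau) \approx
    \alpha_2 \sum_{i=0}^{R-1} \frac{i}{n}\,(t_{i+1} - t_i)^2
    + \alpha_1 \sum_{j=1}^{R} \Bigl( \bigl(1 - \Phi(a + b\,\Phi^{-1}(t_j))\bigr)\,\Delta t_j \Bigr)^2
    + \alpha_3 \sum_{j=0}^{R-1} \theta_j \sum_{\nu=0}^{n_j}
        \Bigl( \Phi\bigl(a + b\,\Phi^{-1}(\eta_j^{\nu})\bigr) - \frac{\delta_j}{t_{j+1} - t_j} \Bigr)^2 \Delta\eta_j^{\nu}
\]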
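Nonlinear Regression

\[
  \min_{\beta} \; f(\beta) = \sum_{j=1}^{N} \bigl( d_j - g(x_j, \beta) \bigr)^2
                          =: \sum_{j=1}^{N} f_j^2(\beta)
\]
\[
  F(\beta) := \bigl( f_1(\beta), \ldots, f_N(\beta) \bigr)^T
\]
\[
  \min_{\beta} \; f(\beta) = F^T(\beta)\,F(\beta)
\]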
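Nonlinear Regression

 • Gauss-Newton method :

\[
  \nabla F(\beta)\,\nabla F^T(\beta)\, q = -\nabla F(\beta)\,F(\beta),
  \qquad \beta_{k+1} := \beta_k + q_k
\]

 • Levenberg-Marquardt method :

\[
  \bigl( \nabla F(\beta)\,\nabla F^T(\beta) + \lambda I_p \bigr)\, q = -\nabla F(\beta)\,F(\beta),
  \qquad \lambda \ge 0
\]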
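Both iterations can be sketched in a few lines for a two-parameter model. Everything specific here is an assumption made for illustration: the parameter vector is called beta, the model is g(x, beta) = beta[0] * exp(beta[1] * x), the data are synthetic and noise-free, and the damping value is fixed rather than adapted. The residual convention f_j = d_j − g(x_j, beta) follows the least-squares formulation of the preceding slide, and the step solves (∇F ∇F^T + λ I_p) q = −∇F F, where λ = 0 gives the Gauss-Newton step.

```python
import math

def residuals(beta, xs, ds):
    """F(beta) = (f_1, ..., f_N)^T with f_j = d_j - g(x_j, beta)."""
    return [d - beta[0] * math.exp(beta[1] * x) for x, d in zip(xs, ds)]

def grad_residuals(beta, xs):
    """The p x N matrix grad F(beta): row i holds d f_j / d beta_i for j = 1..N."""
    e = [math.exp(beta[1] * x) for x in xs]
    return [[-ej for ej in e],
            [-beta[0] * x * ej for x, ej in zip(xs, e)]]

def lm_step(beta, xs, ds, lam):
    """Solve (grad F grad F^T + lam * I_p) q = -grad F F for p = 2 by Cramer's rule."""
    F = residuals(beta, xs, ds)
    G = grad_residuals(beta, xs)
    a11 = sum(g * g for g in G[0]) + lam
    a12 = sum(g0 * g1 for g0, g1 in zip(G[0], G[1]))
    a22 = sum(g * g for g in G[1]) + lam
    r1 = -sum(g * f for g, f in zip(G[0], F))
    r2 = -sum(g * f for g, f in zip(G[1], F))
    det = a11 * a22 - a12 * a12
    return [(a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det]

def fit(beta0, xs, ds, lam, iters=50):
    beta = list(beta0)
    for _ in range(iters):
        q = lm_step(beta, xs, ds, lam)
        beta = [bk + qk for bk, qk in zip(beta, q)]  # beta_{k+1} := beta_k + q_k
    return beta

# Synthetic, noise-free data generated from beta = (2.0, -1.0) (illustration only).
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ds = [2.0 * math.exp(-1.0 * x) for x in xs]

beta_gn = fit([1.5, -0.7], xs, ds, lam=0.0)    # Gauss-Newton
beta_lm = fit([1.5, -0.7], xs, ds, lam=0.01)   # Levenberg-Marquardt, fixed damping
print(beta_gn, beta_lm)
```

The damping term λ I_p keeps the linear system well conditioned when ∇F ∇F^T is close to singular, at the price of shorter steps. Practical implementations usually adjust λ between iterations, which this sketch omits.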
Nonlinear Regression

alternative solution

\[
  \min_{t,\,q} \; t
\]
\[
  \text{subject to} \quad
    \Bigl\| \bigl( \nabla F(\beta)\,\nabla F^T(\beta) + \lambda I_p \bigr)\, q
            - \bigl( -\nabla F(\beta)\,F(\beta) \bigr) \Bigr\|_2 \le t,
    \qquad t \ge 0,
\]
\[
  \| L q \|_2 \le M
\]

conic quadratic programming

interior point methods
Numerical Results

Initial Parameters

    a       b       Threshold values (t)
    1       0.95    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35
    1.5     0.85    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35
    0.80    0.95    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35
    2       0.70    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35

Optimization Results

    a       b       Threshold values (t)                                      AUC
    0.9999  0.9501  0.0004  0.0020  0.0032  0.012  0.03537  0.09  0.3400      0.8447
    1.4999  0.8501  0.0003  0.0017  0.0036  0.011  0.03537  0.10  0.3500      0.9167
    0.7999  0.9501  0.0004  0.0018  0.0032  0.011  0.03400  0.10  0.3300      0.8138
    2.0001  0.7001  0.0004  0.0020  0.0031  0.012  0.03343  0.11  0.3400      0.9671
Numerical Results

Accuracy Error in Each Class

    I       II      III     IV      V       VI      VII     VIII
    0.0000  0.0000  0.0000  0.0001  0.0001  0.0010  0.0010  0.0075
    0.0000  0.0000  0.0000  0.0001  0.0001  0.0010  0.0018  0.0094
    0.0000  0.0000  0.0000  0.0000  0.0001  0.0002  0.0018  0.0059
    0.0000  0.0000  0.0000  0.0001  0.0001  0.0006  0.0018  0.0075

Number of Firms in Each Class

    I       II      III     IV      V       VI      VII     VIII
    4       56      27      133     115     102     129     61
    2       42      52      120     119     111     120     61
    4       43      40      129     114     116     120     61
    4       56      24      136     106     129     111     61

Number of firms in each class at the beginning: 10, 26, 58, 106, 134, 121, 111, 61
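A quick arithmetic check of the class-size table: the snippet below sums each row and prints the share of firms per class. The counts are copied from the table; the variable names are illustrative.

```python
# Class counts (I..VIII) copied from the table of firms in each class.
initial = [10, 26, 58, 106, 134, 121, 111, 61]
optimized = [
    [4, 56, 27, 133, 115, 102, 129, 61],
    [2, 42, 52, 120, 119, 111, 120, 61],
    [4, 43, 40, 129, 114, 116, 120, 61],
    [4, 56, 24, 136, 106, 129, 111, 61],
]

total = sum(initial)
for row in optimized:
    shares = [round(n_k / sum(row), 3) for n_k in row]
    print(sum(row), shares)
print(total)  # 627
```

Every row distributes the same 627 firms, and class VIII keeps 61 firms in all four runs, so the optimization only moves firms among classes I to VII.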
Generalized Additive Models

More Related Content

PDF
Valencia 9 (poster)
PDF
Quantization
PDF
quantization
PDF
Computation of the marginal likelihood
PDF
Future Value and Present Value --- Paper (2006)
PDF
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
PDF
F0543645
PPT
Cognitive radio
Valencia 9 (poster)
Quantization
quantization
Computation of the marginal likelihood
Future Value and Present Value --- Paper (2006)
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
F0543645
Cognitive radio

What's hot (11)

PDF
On Foundations of Parameter Estimation for Generalized Partial Linear Models ...
PDF
PAWL - GPU meeting @ Warwick
PDF
Anomaly Detection Using Projective Markov Models
PDF
Incomplete-Market Equilibrium with Unhedgeable Fundamentals and Heterogeneous...
PDF
Performance Maximization of Managed Funds
PDF
UT Austin - Portugal Lectures on Portfolio Choice
PDF
Pages from ludvigson methodslecture 2
PDF
Hedging, Arbitrage, and Optimality with Superlinear Frictions
PDF
Lecture on nk [compatibility mode]
PDF
Transaction Costs Made Tractable
PDF
Dynamic Trading Volume
On Foundations of Parameter Estimation for Generalized Partial Linear Models ...
PAWL - GPU meeting @ Warwick
Anomaly Detection Using Projective Markov Models
Incomplete-Market Equilibrium with Unhedgeable Fundamentals and Heterogeneous...
Performance Maximization of Managed Funds
UT Austin - Portugal Lectures on Portfolio Choice
Pages from ludvigson methodslecture 2
Hedging, Arbitrage, and Optimality with Superlinear Frictions
Lecture on nk [compatibility mode]
Transaction Costs Made Tractable
Dynamic Trading Volume
Ad

Viewers also liked (20)

PPTX
PPTX
How to read a receiver operating characteritic (ROC) curve
PPT
Roc Search
PDF
General Introduction to ROC Curves
PDF
Receiver Operating Characteristic (ROC) curve analysis. 19.12
PDF
TransactionBasedAnalytics2010
PPT
Risk Asessment Presentation 19th Sept - Chris Delves
PPTX
AUC: at what cost(s)?
PDF
ID3 Algorithm & ROC Analysis
PPTX
Population Stability Index(PSI) for Big Data World
PPT
05 powerpoint-alessandra young
PPTX
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...
PDF
Measurement errors, Statistical Analysis, Uncertainty
PDF
Credit Scoring
PPT
Credit scoring
PDF
Logistic regression
PDF
Model building in credit card and loan approval
PPTX
Credit Risk Model Building Steps
PDF
Predictive Model for Loan Approval Process using SAS 9.3_M1
How to read a receiver operating characteritic (ROC) curve
Roc Search
General Introduction to ROC Curves
Receiver Operating Characteristic (ROC) curve analysis. 19.12
TransactionBasedAnalytics2010
Risk Asessment Presentation 19th Sept - Chris Delves
AUC: at what cost(s)?
ID3 Algorithm & ROC Analysis
Population Stability Index(PSI) for Big Data World
05 powerpoint-alessandra young
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...
Measurement errors, Statistical Analysis, Uncertainty
Credit Scoring
Credit scoring
Logistic regression
Model building in credit card and loan approval
Credit Risk Model Building Steps
Predictive Model for Loan Approval Process using SAS 9.3_M1
Ad

Similar to A Classification Problem of Credit Risk Rating Investigated and Solved by Optimization of the ROC Curve (20)

PDF
Prediction of Credit Default by Continuous Optimization
PDF
IGARSS2011 FR3.T08.3 BenDavid.pdf
PPT
Quadrature amplitude modulation qam transmitter
PDF
2010 APS_ Broadband Characteristics of A Dome Dipole Antenna
PDF
Approximate Bayesian Computation on GPUs
PPT
Numerical Technique, Initial Conditions, Eos,
PDF
A Mathematica tool for rapid estimation of flow conditions within a compressi...
PDF
Index Determination in DAEs using the Library indexdet and the ADOL-C Package...
PPT
SPM 12 practical course by Volodymyr B. Bogdanov (Kyiv 2015, Day 2)
PDF
Particle filtering
PDF
Ee443 phase locked loop - presentation - schwappach and brandy
PDF
Insiders modeling london-2006
PDF
NCE, GANs & VAEs (and maybe BAC)
PPT
Alba ffs flim 2012
PPTX
Signal Processing Homework Help
PDF
Research Inventy : International Journal of Engineering and Science
PDF
Case Study (All)
PPT
fnCh4.ppt ENGINEERING MATHEMATICS
PDF
Two Curves Upfront
Prediction of Credit Default by Continuous Optimization
IGARSS2011 FR3.T08.3 BenDavid.pdf
Quadrature amplitude modulation qam transmitter
2010 APS_ Broadband Characteristics of A Dome Dipole Antenna
Approximate Bayesian Computation on GPUs
Numerical Technique, Initial Conditions, Eos,
A Mathematica tool for rapid estimation of flow conditions within a compressi...
Index Determination in DAEs using the Library indexdet and the ADOL-C Package...
SPM 12 practical course by Volodymyr B. Bogdanov (Kyiv 2015, Day 2)
Particle filtering
Ee443 phase locked loop - presentation - schwappach and brandy
Insiders modeling london-2006
NCE, GANs & VAEs (and maybe BAC)
Alba ffs flim 2012
Signal Processing Homework Help
Research Inventy : International Journal of Engineering and Science
Case Study (All)
fnCh4.ppt ENGINEERING MATHEMATICS
Two Curves Upfront

More from SSA KPI (20)

PDF
Germany presentation
PDF
Grand challenges in energy
PDF
Engineering role in sustainability
PDF
Consensus and interaction on a long term strategy for sustainable development
PDF
Competences in sustainability in engineering education
PDF
Introducatio SD for enginers
PPT
DAAD-10.11.2011
PDF
Talking with money
PDF
'Green' startup investment
PDF
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
PDF
Dynamics of dice games
PPT
Energy Security Costs
PPT
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
PDF
Advanced energy technology for sustainable development. Part 5
PDF
Advanced energy technology for sustainable development. Part 4
PDF
Advanced energy technology for sustainable development. Part 3
PDF
Advanced energy technology for sustainable development. Part 2
PDF
Advanced energy technology for sustainable development. Part 1
PPT
Fluorescent proteins in current biology
PPTX
Neurotransmitter systems of the brain and their functions
Germany presentation
Grand challenges in energy
Engineering role in sustainability
Consensus and interaction on a long term strategy for sustainable development
Competences in sustainability in engineering education
Introducatio SD for enginers
DAAD-10.11.2011
Talking with money
'Green' startup investment
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
Dynamics of dice games
Energy Security Costs
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 1
Fluorescent proteins in current biology
Neurotransmitter systems of the brain and their functions

Recently uploaded (20)

PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
DOC
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
PDF
Supply Chain Operations Speaking Notes -ICLT Program
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
A GUIDE TO GENETICS FOR UNDERGRADUATE MEDICAL STUDENTS
PDF
Computing-Curriculum for Schools in Ghana
PDF
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
VCE English Exam - Section C Student Revision Booklet
PDF
Classroom Observation Tools for Teachers
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PDF
Weekly quiz Compilation Jan -July 25.pdf
PDF
Microbial disease of the cardiovascular and lymphatic systems
PDF
RMMM.pdf make it easy to upload and study
PPTX
Cell Types and Its function , kingdom of life
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
STATICS OF THE RIGID BODIES Hibbelers.pdf
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
Supply Chain Operations Speaking Notes -ICLT Program
2.FourierTransform-ShortQuestionswithAnswers.pdf
Final Presentation General Medicine 03-08-2024.pptx
Pharmacology of Heart Failure /Pharmacotherapy of CHF
A GUIDE TO GENETICS FOR UNDERGRADUATE MEDICAL STUDENTS
Computing-Curriculum for Schools in Ghana
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
VCE English Exam - Section C Student Revision Booklet
Classroom Observation Tools for Teachers
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
Weekly quiz Compilation Jan -July 25.pdf
Microbial disease of the cardiovascular and lymphatic systems
RMMM.pdf make it easy to upload and study
Cell Types and Its function , kingdom of life
Abdominal Access Techniques with Prof. Dr. R K Mishra
STATICS OF THE RIGID BODIES Hibbelers.pdf
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf

A Classification Problem of Credit Risk Rating Investigated and Solved by Optimization of the ROC Curve

  • 1. 5th International Summer School Achievements and Applications of Contemporary Informatics, Mathematics and Physics National University of Technology of the Ukraine Kiev, Ukraine, August 3-15, 2010 A Classification Problem of Credit Risk Rating Investigated and Solved by Optimization of the ROC Curve Gerhard-Wilhelm Weber * Kasırga Yıldırak and Efsun Kürüm Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey • Faculty of Economics, Management and Law, University of Siegen, Germany Center for Research on Optimization and Control, University of Aveiro, Portugal Universiti Teknologi Malaysia, Skudai, Malaysia
  • 2. Outline • Main Problem from Credit Default • Logistic Regression and Performance Evaluation • Cut-Off Values and Thresholds • Classification and Optimization • Nonlinear Regression • Numerical Results • Outlook and Conclusion
  • 3. Main Problem from Credit Default  Whether a credit application should be consented or rejected. Solution  Learning about the default probability of the applicant.
  • 4. Main Problem from Credit Default  Whether a credit application should be consented or rejected. Solution  Learning about the default probability of the applicant.
  • 5. Logistic Regression P(Y 1 X xl ) log β0 β1 xl1 β2 xl 2 β p xlp P(Y 0X xl ) (l 1, 2,..., N )
  • 6. Goal Our study is based on one of the Basel II criteria which recommend that the bank should divide corporate firms by 8 rating degrees with one of them being the default class. We have two problems to solve here:  To distinguish the defaults from non-defaults.  To put non-default firms in an order based on their credit quality and classify them into (sub) classes.
  • 7. Data  Data have been collected by a bank from the firms operating in the manufacturing sector in Turkey.  They cover the period between 2001 and 2006.  There are 54 qualitative variables and 36 quantitative variables originally.  Data on quantitative variables are formed based on a balance sheet submitted by the firms’ accountants. Essentially, they are the well-known financial ratios.  The data set covers 3150 firms from which 92 are in the state of default. As the number of default is small, in order to overcome the possible statistical problems, we downsize the number to 551, keeping all the default cases in the set.
  • 8. We evaluate performance of the model non-default default cases cases cut-off value ROC curve test result value TPF, sensitivity FPF, 1-specificity
  • 9. Model outcome versus truth truth d n True Positive False Positive Fraction Fraction dı TPF FPF model outcome False Negative True Negative nı Fraction Fraction FNF TNF 1 1 total
  • 10. Definitions • sensitivity (TPF) := P( Dı | D) • specificity := P( NDı | ND ) • 1-specificity (FPF) := P( Dı | ND ) • points (TPF, FPF) constitute the ROC curve • c := cut-off value • c takes values between - and • TPF(c) := P( z>c | D ) • FPF(c) := P( z>c | ND )
  • 11. normal-deviate axes TPF Normal Deviate (TPF) FPF FPF(ci ) : Φ( ci ) TPF (ci ) : Φ(a b ci ) μn - μs σn a: b: σs σs Normal Deviate (FPF)
  • 12. normal-deviate axes TPF t Normal Deviate (TPF) FPF FPF(ci ) : Φ( ci ) TPF (ci ) : Φ(a b ci ) c μn - μs σn a: b: σs σs Normal Deviate (FPF)
  • 13. Classification Ex.: cut-off values actually non-default actually default cases cases c class I class II class III class IV class V To assess discriminative power of such a model, we calculate the Area Under (ROC) Curve: AUC : Φ(a b c) d Φ (c).
  • 14. relationship between thresholds and cut-off values Ex.: TPF FPF t0 t1 t2 t3 t4 t5 R=5 Φ(c) t c Φ 1(t )
  • 15. Optimization in Credit Default Problem: to obtain simultaneously the thresholds and the parameters a and b that maximize the AUC, while balancing the size of the classes (regularization) and guaranteeing good accuracy.
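  • 16. Optimization Problem
      $\max_{a,\,b,\,\tau} \;\; \alpha_1 \int_0^1 \Phi(a + b\,\Phi^{-1}(t))\,dt \;-\; \alpha_2 \sum_{i=0}^{R-1} \Big( \frac{[\ldots]_i}{n} - (t_{i+1} - t_i) \Big)^2$
      subject to $\int_{t_i}^{t_{i+1}} \Phi(a + b\,\Phi^{-1}(t))\,dt \;\ge\; \delta_i \quad (i = 0, 1, \ldots, R-1)$,
      where $\tau := (t_1, t_2, \ldots, t_{R-1})^T$, $t_0 = 0$, $t_R = 1$. (The numerator symbol $[\ldots]_i$ is not legible in the transcript.)
  • 17. Optimization Problem
      Objective as on slide 16, subject to $\int_{t_i}^{t_{i+1}} \Phi(a + b\,\Phi^{-1}(t))\,dt \;\ge\; \delta_i \;\ge\; 0 \quad (i = 0, 1, \ldots, R-1)$ and $t_{i+1} \ge t_i$,
      where $\tau := (t_1, t_2, \ldots, t_{R-1})^T$, $t_0 = 0$, $t_R = 1$.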
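  • 18. Over the ROC Curve
      [Figure: ROC curve (TPF against FPF) with thresholds t_0, t_1, t_2, t_3, t_4, t_5 on the FPF axis; the region under the curve is AUC, the region over it is 1 − AUC.]
      $AOC := \int_0^1 \big(1 - \Phi(a + b\,\Phi^{-1}(t))\big)\,dt$.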
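  • 19. New Version of the Optimization Problem
      $\min_{a,\,b,\,\tau} \;\; \alpha_2 \sum_{i=0}^{R-1} \Big( \frac{[\ldots]_i}{n} - (t_{i+1} - t_i) \Big)^2 \;+\; \alpha_1 \int_0^1 \big(1 - \Phi(a + b\,\Phi^{-1}(t))\big)\,dt$
      subject to $\int_{t_j}^{t_{j+1}} \big(1 - \Phi(a + b\,\Phi^{-1}(t))\big)\,dt \;\le\; t_{j+1} - t_j - \delta_j \quad (j = 0, 1, \ldots, R-1)$.
      (The numerator symbol $[\ldots]_i$ is not legible in the transcript.)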
  • 20. Regression in Credit Default Optimization problem: to obtain simultaneously the thresholds and the parameters a and b that maximize the AUC, while balancing the size of the classes (regularization) and guaranteeing good accuracy. → discretization of the integral → nonlinear regression problem
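  • 21. Discretization of the Integral
      Riemann-Stieltjes integral: $AUC = \int_{-\infty}^{\infty} \Phi(a + b\,c)\, d\Phi(c)$
      Riemann integral: $AUC = \int_0^1 \Phi(a + b\,\Phi^{-1}(t))\,dt$
      Discretization: $AUC \approx \sum_{k=1}^{R} \Phi(a + b\,\Phi^{-1}(t_k))\,\Delta t_k$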
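The discretization step can be checked numerically. The sketch below evaluates the right-endpoint sum Σ Φ(a + b Φ⁻¹(t_k)) Δt_k on a fine uniform grid and compares it with Φ(a / √(1 + b²)), a standard closed form for the area under a binormal ROC curve that is not stated on the slides. The grid, the function names, and the assumption b > 0 are illustrative choices; the values a = 1, b = 0.95 are one of the initial parameter pairs from the numerical results.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def roc(t, a, b):
    """Binormal ROC curve Phi(a + b * Phi^{-1}(t)), with end values for b > 0."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return PHI.cdf(a + b * PHI.inv_cdf(t))

def auc_riemann(a, b, grid):
    """Right-endpoint sum over a grid 0 = t_0 < t_1 < ... < t_R = 1."""
    return sum(roc(grid[k], a, b) * (grid[k] - grid[k - 1])
               for k in range(1, len(grid)))

def auc_closed_form(a, b):
    """Closed form of the integral of Phi(a + b * Phi^{-1}(t)) over [0, 1]."""
    return PHI.cdf(a / sqrt(1.0 + b * b))

a, b = 1.0, 0.95
fine_grid = [k / 2000 for k in range(2001)]
print(auc_riemann(a, b, fine_grid), auc_closed_form(a, b))
```

With only R subintervals, as in the optimization problem, the sum is much coarser than on this fine grid, so the discretized AUC and the exact area can differ noticeably.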
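  • 22. Optimization Problem with Penalty Parameters
      If any one of these constraints is violated, we introduce penalty parameters. As a penalty parameter is increased, the iterates are forced towards the feasible set of the optimization problem.
      $\Pi_\Theta(a, b, \tau) := \alpha_2 \sum_{i=0}^{R-1} \Big( \frac{[\ldots]_i}{n} - (t_{i+1} - t_i) \Big)^2 + \alpha_1 \int_0^1 \big(1 - \Phi(a + b\,\Phi^{-1}(t))\big)\,dt + \alpha_3 \sum_{j=0}^{R-1} \theta_j \Big( \delta_j - \int_{t_j}^{t_{j+1}} \Phi(a + b\,\Phi^{-1}(t))\,dt \Big)$,
      where the term in the last parentheses is abbreviated as $[\ldots]_j(a, b, \tau)$, $\Theta := (\theta_1, \theta_2, \ldots, \theta_{R-1})^T$ and $\theta_j \ge 0$ $(j = 0, 1, \ldots, R-1)$. (Symbols shown as $[\ldots]$ are not legible in the transcript.)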
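The effect described here, iterates being pushed towards the feasible set as the penalty grows, can be seen on a one-dimensional toy problem. This sketch is not the credit-rating problem: it minimizes x² subject to x ≥ 1 and uses a quadratic penalty, a common textbook choice that is not necessarily the form used on the slide. All names are hypothetical.

```python
def penalized(x, theta):
    """Objective x**2 plus a quadratic penalty for violating x >= 1."""
    violation = max(0.0, 1.0 - x)
    return x * x + theta * violation ** 2

def argmin_1d(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

for theta in (1.0, 10.0, 100.0, 1000.0):
    x = argmin_1d(lambda v: penalized(v, theta), -2.0, 2.0)
    # Analytically the minimizer is theta / (1 + theta), which tends to the
    # boundary x = 1 of the feasible set as theta increases.
    print(theta, x)
```

The unconstrained minimizer x = 0 is infeasible; each increase of the penalty parameter moves the penalized minimizer closer to the constraint boundary.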
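  • 23. Optimization Problem further discretized
      $\Pi_\Theta(a, b, \tau) \approx \alpha_2 \sum_{i=0}^{R-1} \Big( \frac{[\ldots]_i}{n} - (t_{i+1} - t_i) \Big)^2 + \alpha_1 \Big( \sum_{j=1}^{R} \big(1 - \Phi(a + b\,\Phi^{-1}(t_j))\big)\,\Delta t_j \Big)^2 + \alpha_3 \sum_{j=0}^{R-1} \theta_j \Big( \delta_j - \sum_{\nu=0}^{n_j} \Phi(a + b\,\Phi^{-1}(\eta_{j\nu}))\,\Delta\eta_{j\nu} \Big)^2$,
      where the points $\eta_{j\nu}$ subdivide the interval from $t_j$ to $t_{j+1}$. (The numerator symbol $[\ldots]_i$ is not legible in the transcript.)
  • 24. Optimization Problem further discretized
      [Same formula as on slide 23.]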
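  • 25. Nonlinear Regression
      $\min_{\beta} \; f(\beta) = \sum_{j=1}^{N} \big( d_j - g(x_j, \beta) \big)^2 =: \sum_{j=1}^{N} f_j^2(\beta)$
      $F(\beta) := \big( f_1(\beta), \ldots, f_N(\beta) \big)^T$
      $\min_{\beta} \; f(\beta) = F^T(\beta)\,F(\beta)$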
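  • 26. Nonlinear Regression
      $\beta_{k+1} := \beta_k + q_k$
      • Gauss-Newton method: $\nabla F(\beta)\,\nabla^T F(\beta)\, q = -\nabla F(\beta)\,F(\beta)$
      • Levenberg-Marquardt method: $\lambda \ge 0$, $\big( \nabla F(\beta)\,\nabla^T F(\beta) + \lambda I_p \big)\, q = -\nabla F(\beta)\,F(\beta)$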
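A minimal sketch of the update β_{k+1} := β_k + q_k for p = 2 parameters, written with the Jacobian J of F (so that J^T J plays the role of ∇F ∇^T F and J^T F that of ∇F F). The model g(x, β) = β₀·exp(β₁·x), the synthetic data, and the simple accept/reject rule for λ are assumptions made for illustration and are not taken from the slides; with λ = 0 the step reduces to Gauss-Newton.

```python
from math import exp

def residuals(beta, xs, ds):
    """F(beta): components f_j = d_j - g(x_j, beta), g = beta0 * exp(beta1 * x)."""
    return [d - beta[0] * exp(beta[1] * x) for x, d in zip(xs, ds)]

def jacobian(beta, xs):
    """Rows are the gradients of f_j with respect to (beta0, beta1)."""
    return [[-exp(beta[1] * x), -beta[0] * x * exp(beta[1] * x)] for x in xs]

def lm_step(beta, xs, ds, lam):
    """Solve (J^T J + lam * I_2) q = -J^T F and return beta + q."""
    F, J = residuals(beta, xs, ds), jacobian(beta, xs)
    a00 = sum(r[0] * r[0] for r in J) + lam
    a01 = sum(r[0] * r[1] for r in J)
    a11 = sum(r[1] * r[1] for r in J) + lam
    g0 = sum(r[0] * f for r, f in zip(J, F))
    g1 = sum(r[1] * f for r, f in zip(J, F))
    det = a00 * a11 - a01 * a01          # 2x2 system solved by Cramer's rule
    q0 = (-g0 * a11 + g1 * a01) / det
    q1 = (-g1 * a00 + g0 * a01) / det
    return [beta[0] + q0, beta[1] + q1]

def fit(beta, xs, ds, lam=1e-2, iters=100):
    """Accept a step if it lowers F^T F (then relax lam), else increase lam."""
    cost = sum(f * f for f in residuals(beta, xs, ds))
    for _ in range(iters):
        trial = lm_step(beta, xs, ds, lam)
        trial_cost = sum(f * f for f in residuals(trial, xs, ds))
        if trial_cost < cost:
            beta, cost, lam = trial, trial_cost, lam / 10.0
        else:
            lam *= 10.0
    return beta

# Synthetic, noise-free data generated from beta = (2.0, 0.5).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ds = [2.0 * exp(0.5 * x) for x in xs]
print(fit([1.5, 0.3], xs, ds))
```

The damping term λ I_p keeps the linear system well conditioned when J^T J is nearly singular, at the price of shorter steps.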
  • 27. Nonlinear Regression: alternative solution
      $\min_{t,\,q} \; t$
      subject to $\big\| \big( \nabla F(\beta)\,\nabla^T F(\beta) + \lambda I_p \big)\, q - \big( -\nabla F(\beta)\,F(\beta) \big) \big\|_2 \le t$, $t \ge 0$, $\| L q \|_2 \le M$
      → conic quadratic programming
  • 28. Nonlinear Regression: alternative solution
      [Same problem as on slide 27.] → conic quadratic programming → interior point methods
  • 29. Numerical Results

      Initial Parameters
      a        b        Threshold values (t)
      1        0.95     0.0006   0.0015   0.0035   0.01    0.035     0.11   0.35
      1.5      0.85     0.0006   0.0015   0.0035   0.01    0.035     0.11   0.35
      0.80     0.95     0.0006   0.0015   0.0035   0.01    0.035     0.11   0.35
      2        0.70     0.0006   0.0015   0.0035   0.01    0.035     0.11   0.35

      Optimization Results
      a        b        Threshold values (t)                                          AUC
      0.9999   0.9501   0.0004   0.0020   0.0032   0.012   0.03537   0.09   0.3400    0.8447
      1.4999   0.8501   0.0003   0.0017   0.0036   0.011   0.03537   0.10   0.3500    0.9167
      0.7999   0.9501   0.0004   0.0018   0.0032   0.011   0.03400   0.10   0.3300    0.8138
      2.0001   0.7001   0.0004   0.0020   0.0031   0.012   0.03343   0.11   0.3400    0.9671
  • 30. Numerical Results

      Accuracy Error in Each Class
      I        II       III      IV       V        VI       VII      VIII
      0.0000   0.0000   0.0000   0.0001   0.0001   0.0010   0.0010   0.0075
      0.0000   0.0000   0.0000   0.0001   0.0001   0.0010   0.0018   0.0094
      0.0000   0.0000   0.0000   0.0000   0.0001   0.0002   0.0018   0.0059
      0.0000   0.0000   0.0000   0.0001   0.0001   0.0006   0.0018   0.0075

      Number of Firms in Each Class
      I        II       III      IV       V        VI       VII      VIII
      4        56       27       133      115      102      129      61
      2        42       52       120      119      111      120      61
      4        43       40       129      114      116      120      61
      4        56       24       136      106      129      111      61

      Number of firms in each class at the beginning: 10, 26, 58, 106, 134, 121, 111, 61