Quantifying Regional Error in Surrogates by Modeling its
Relationship with Sample Density
Ali Mehmani, Souma Chowdhury, Jie Zhang, Weiyang Tong,
and Achille Messac
Syracuse University, Department of Mechanical and Aerospace Engineering
54th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and
Materials Conference
April 8-11, 2013, Boston, Massachusetts
Surrogate model
• Surrogate models are commonly used to provide a tractable and
inexpensive approximation of the actual system behavior in many
routine engineering analysis and design activities.
Broad Research Question
[Figure: structural blade design (ANSYS, Inc.). An expensive model is replaced by an inexpensive surrogate model of lower fidelity. How low?]
How can the level of surrogate accuracy be quantified? A reliable accuracy measure supports:
• further improvement of the surrogate,
• domain exploration,
• assessing the reliability of the optimal design,
• quantifying the uncertainty associated with the surrogate,
• construction of a weighted surrogate model, and
• …
Research Objective
• Develop a reliable method to quantify the surrogate error.
• The method should have the following characteristics:
  - model independent
  - no additional system evaluations
  - local/global error measurement
  - quantifies the error of the actual surrogate
Regional Error Estimation of Surrogate (REES)
Presentation Outline
• Review of surrogate model error measurement methods
• Relation of surrogate accuracy with sample density
• Regional Error Estimation of Surrogate
• Numerical examples: benchmark problems and an engineering
design problem
Surrogate Model Error Measurement Methods
• Error quantification methods can be classified
  - based on their computational expense, into methods that require additional data and methods that use existing data;
  - based on the region of interest, into
    - global error measures (e.g., split sample, cross-validation, Akaike’s information criterion, and bootstrapping), and
    - local or point-wise error measures (e.g., the mean squared error for Kriging and the linear reference model (LRM)).
• Error metrics evaluated on test points, using the actual value and the predicted value at the i-th test point:
  - the mean squared error (MSE), or the root mean squared error (RMSE),
  - the maximum absolute error (MAE), and
  - the relative absolute error (RAE).
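A minimal sketch of these three metrics, assuming their usual textbook definitions (the function names are illustrative only, and the relative error assumes that no actual value is zero):

```python
import math

def rmse(actual, predicted):
    """Root mean squared error over all test points."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def max_abs_error(actual, predicted):
    """Maximum absolute error over all test points."""
    return max(abs(a - p) for a, p in zip(actual, predicted))

def relative_abs_errors(actual, predicted):
    """Point-wise relative absolute error; assumes no actual value is zero."""
    return [abs((a - p) / a) for a, p in zip(actual, predicted)]

# Made-up values, for illustration only.
actual = [2.0, 4.0, 5.0]
predicted = [2.5, 3.0, 5.0]
print(rmse(actual, predicted))
print(max_abs_error(actual, predicted))
print(relative_abs_errors(actual, predicted))
```

The point-wise RAE list is the quantity from which medians and maxima can later be taken.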
• Error metrics based on cross-validation:
  - the prediction sum of squares (PRESS), based on the leave-one-out cross-validation error,
  - the root mean square of PRESS (PRESS_RMS), based on k-fold cross-validation, and
  - the relative absolute error of cross-validation (RAE_CV), based on the leave-one-out approach.
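The leave-one-out idea behind PRESS can be sketched as follows. This is an illustration under stated assumptions: `fit_line` is a stand-in least-squares line rather than one of the surrogates from the study, and the root mean square is taken over leave-one-out errors (k-fold with k equal to the number of points).

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line; stands in for any surrogate fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def loo_errors(xs, ys, fit):
    """Leave each point out once, refit, and record the prediction error there."""
    errors = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(ys[i] - model(xs[i]))
    return errors

def press(errors):
    """Prediction sum of squares."""
    return sum(e ** 2 for e in errors)

def press_rms(errors):
    """Root mean square of the cross-validation errors."""
    return math.sqrt(press(errors) / len(errors))

# Made-up data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1]
errs = loo_errors(xs, ys, fit_line)
print(press(errs), press_rms(errs))
```

Each error comes from a model trained on fewer points than the final one, so the measure does not directly assess the surrogate that is actually used.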
Methodology: Concept
Model accuracy ∝ Available resources
In general, this concept applies to different methodologies:
- surrogate modeling,
- finite element analysis, and
- ...
• Finite Element Analysis (numerical method)
[Figure: coarse mesh (4 solid brick elements), medium mesh (32 solid brick elements), and fine mesh (256 solid brick elements)]
Estimate the total shear force and flexural moment at vertical
sections using finite element analysis.
The finer the mesh, the more precise the stresses, owing to the larger
number of elements.
• Surrogate (mathematical model)
[Figure: surrogates constructed with 3, 7, and 9 training points]
Surrogate accuracy generally improves as the number of training points increases.
The location of the additional points has a strong impact on surrogate accuracy,
and this impact is highly problem and model dependent.
Methodology: REES
• The REES method formulates the variation of error as a
function of the number of training points, using intermediate surrogates.
• This formulation is used to predict the level of error in the
final surrogate.
Step 1: Generation of sample data
The entire set of sample points is represented by $\{X\}$.
Step 2: Identification of sample points inside/outside the region of interest
$\{X\} = \{X_{in}\} + \{X_{out}\}$
$\{X_{in}\}$: inside-region data set (sample points within the user-defined region of interest)
$\{X_{out}\}$: outside-region data set
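Steps 1 and 2 can be sketched as a simple partition of the sample set. The sketch assumes the region of interest is an axis-aligned box, which the slides do not specify; `split_by_region` and the sample values are hypothetical.

```python
def split_by_region(samples, lower, upper):
    """Split sample points into inside-region and outside-region sets.

    Assumption: the region of interest is the box [lower, upper].
    """
    inside, outside = [], []
    for point in samples:
        in_box = all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))
        (inside if in_box else outside).append(point)
    return inside, outside

# Made-up 2-D samples, for illustration only.
samples = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.4), (0.45, 0.6)]
x_in, x_out = split_by_region(samples, lower=(0.4, 0.4), upper=(0.6, 0.6))
print(len(x_in), len(x_out))
```

Every sample lands in exactly one of the two sets, so together they reproduce the full sample set.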
Step 3: Estimation of the variation of the error with sample density
[Figure: inside-region and outside-region points around the user-defined region of interest. In the first, second, and third iterations the points are marked as test points or training points; in the final surrogate all points are training points.]
• The position of the sample points selected as training
points at each iteration is critical to the surrogate accuracy.
• The proposed error measure should be minimally sensitive to
the location of the test points at each iteration.
• Intermediate surrogates are constructed at each iteration
over a sample set comprising all samples outside the
region of interest and heuristic subsets of the samples
inside the region of interest.
• The number of iterations ($N^{it}$) is defined based on
  - the dimension of the problem,
  - the number of inside-region sample points, and
  - the preference of the user.
• The number of sample combinations ($K^{t}$) at each iteration is defined.
• The intermediate subset for each combination at a specific iteration is defined by
  $\{\beta^{k}\} \subset \{X_{in}\}$, $\#\{\beta^{k}\} = n^{t}$, $n^{t-1} < n^{t}$, $k = 1, 2, \ldots, K^{t}$
• The intermediate training points and test points for each combination at each iteration are defined by
  $\{X_{TR}\} = \{X_{out}\} + \{\beta^{k}\}$
  $\{X_{TE}\} = \{X\} - \{X_{TR}\}$
• The intermediate surrogates $f^{k}$, $k = 1, 2, \ldots, K^{t}$, are constructed for all combinations using the
intermediate training points ($\{X_{TR}\}$) and are
tested over the intermediate test points ($\{X_{TE}\}$).
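The set construction above can be sketched as follows. The slides call the inside-region subsets heuristic without saying how they are drawn, so uniform random sampling is an assumption here, and `intermediate_splits` is a hypothetical helper operating on sample indices.

```python
import random

def intermediate_splits(x_in, x_out, n_t, k_t, seed=0):
    """Yield (training, test) index sets for k_t combinations at one iteration.

    n_t inside-region points join the training set; the remaining
    inside-region points become the test points.
    """
    rng = random.Random(seed)
    everything = set(x_in) | set(x_out)
    for _ in range(k_t):
        beta = rng.sample(x_in, n_t)       # assumed: uniform random subset of X_in
        train = set(x_out) | set(beta)     # X_TR = X_out + beta_k
        test = everything - train          # X_TE = X - X_TR
        yield train, test

x_in = list(range(0, 10))     # hypothetical inside-region sample indices
x_out = list(range(10, 15))   # hypothetical outside-region sample indices
for n_t in (3, 5, 7):         # n_t grows from one iteration to the next
    train, test = next(intermediate_splits(x_in, x_out, n_t, k_t=4))
    print(n_t, len(train), len(test))
```

Because all outside-region points are always in the training set, every test point lies inside the region of interest.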
• The median and the maximum errors are estimated for each combination, where
$m^{t}$ is the number of test points in the $t$-th iteration and
$e$ is the RAE value estimated on the intermediate test points.
  - Median error: overall fidelity information
  - Maximum error: minimum fidelity information
The median is a useful measure of central
tendency that is less vulnerable to outliers.
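Written out from the definitions of $m^{t}$ and $e$, the two statistics for combination $k$ at iteration $t$ could take the following form. The symbols $E_{med}^{k}$, $E_{max}^{k}$, $F$ (the actual response), and $x_i$ (the $i$-th intermediate test point) are notation assumed here for illustration.

```latex
e_i = \left| \frac{F(x_i) - f^{k}(x_i)}{F(x_i)} \right|, \qquad i = 1, \ldots, m^{t}

E_{med}^{k} = \mathrm{median}\left(e_1, \ldots, e_{m^{t}}\right), \qquad
E_{max}^{k} = \max\left(e_1, \ldots, e_{m^{t}}\right)
```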
• Probabilistic models are developed using
a lognormal distribution to represent the
median and maximum errors estimated
over all $K^{t}$ combinations at each
iteration.
• The mode of the distribution is selected to
represent the error at each iteration: the mode of the median error
distribution ($Mo_{med}$) and the mode of the maximum error distribution ($Mo_{max}$).
• These values are used to relate the
variation of the surrogate error to the
number of training points (sample
density).
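One way to obtain such a mode is sketched below. The fitting procedure is an assumption: the lognormal parameters are taken as the maximum-likelihood estimates (mean and variance of the log errors), and the mode is the standard lognormal expression exp(mu - sigma^2). The function name and the sample values are hypothetical.

```python
import math

def lognormal_mode(values):
    """Fit a lognormal to positive values and return its mode, exp(mu - sigma^2).

    mu and sigma^2 are the mean and (population) variance of log(values).
    """
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma2 = sum((x - mu) ** 2 for x in logs) / len(logs)
    return math.exp(mu - sigma2)

# Made-up median errors over several combinations, for illustration only.
median_errors = [0.08, 0.10, 0.12, 0.09, 0.30, 0.11]
print(lognormal_mode(median_errors))
```

A single large value (0.30 here) pulls the mean upward more than it pulls the mode, which is one reason a mode is a convenient representative for a skewed error distribution.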
The relation of the error with sample density
• 12-D test problem (Dixon & Price, n = 12)
  - Number of sample points: $\#\{X\} = 550$; number of inside-region sample points: $\#\{X_{in}\} = \#\{X\}$
  - Number of training points at each iteration: $n^{t} = 5t + 50$, $t = 1, 2, \ldots, 70$
  - Number of sample combinations: $K^{t} = 500$
  - First iteration: $\#\{X_{TR}\} = 55$, $\#\{X_{TE}\} = 445$
  - Last iteration: $\#\{X_{TR}\} = 400$, $\#\{X_{TE}\} = 100$
[Figure: estimated mode of median errors ($Mo_{med}$) and estimated mode of maximum errors ($Mo_{max}$), each plotted against the number of training points]
[Figure: REES method (estimated mode of median errors, $Mo_{med}$) compared with normalized k-fold CV (estimated mean of mean errors, $Mean_{mean}$), each plotted against the number of training points]
Step 4: Prediction of regional error in the final surrogate
• The final surrogate model is constructed using the full set of training data.
• Regression models are applied to relate
  - the statistical mode of the median error distribution ($Mo_{med}$),
  - the statistical mode of the maximum error distribution ($Mo_{max}$), and
  - the absolute maximum error ($ABS_{max}$)
  at each iteration to the number of inside-region training points ($n^{t}$).
• These regression models are called the variation of error with sample density
(VESD).
• The regression models are used to predict the level of
error in the final surrogate within the region of interest.
Modeling the Variation of Regional Error with Training Point Density
• In this study, three types of regression functions are used to represent
the variation of regional error with respect to the number of inside-region training points:
  - exponential regression model,
  - multiplicative regression model, and
  - linear regression model.
• The choice of these functions assumes a smooth, monotonic decrease of the
regional error with the training point density within that region.
• The root mean squared error metric is used to select the best-fit regression
model.
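The model selection can be sketched as below. The functional forms are assumptions, since the slide formulas are not reproduced here: exponential as a·exp(b·n), multiplicative as a·n^b, and linear as a + b·n. The exponential and multiplicative forms are fitted by least squares in log space, which is a simplification; `fit_vesd` and the data are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def fit_vesd(n_train, errors):
    """Fit three candidate forms; return (name, predictor) with the lowest RMSE."""
    log_n = [math.log(n) for n in n_train]
    log_e = [math.log(e) for e in errors]   # requires strictly positive errors
    candidates = {}
    a, b = linear_fit(n_train, errors)
    candidates["linear"] = lambda n, a=a, b=b: a + b * n
    a, b = linear_fit(n_train, log_e)
    candidates["exponential"] = lambda n, a=a, b=b: math.exp(a + b * n)
    a, b = linear_fit(log_n, log_e)
    candidates["multiplicative"] = lambda n, a=a, b=b: math.exp(a) * n ** b

    def rmse(model):
        return math.sqrt(sum((model(n) - e) ** 2 for n, e in zip(n_train, errors)) / len(errors))

    best = min(candidates, key=lambda name: rmse(candidates[name]))
    return best, candidates[best]

n_train = [10, 15, 20, 25]                              # inside-region training points per iteration
mode_med = [0.5 * math.exp(-0.1 * n) for n in n_train]  # synthetic errors, not results from the study
name, model = fit_vesd(n_train, mode_med)
print(name, model(30))                                  # 30: hypothetical size of the full inside-region set
```

Evaluating the selected model at the number of inside-region points used by the final surrogate gives the predicted regional error.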
Numerical Examples
• The effectiveness of the REES method is explored for applications with
  - Kriging,
  - Radial Basis Functions (RBF),
  - Extended Radial Basis Functions (E-RBF), and
  - Quadratic Response Surface (QRS).
• To evaluate the practical and numerical efficiency of the REES method,
three benchmark problems and an engineering design problem are tested.
• The error evaluated using REES and the relative absolute error given by
leave-one-out cross-validation ($RAE_{cv}$) are compared with the actual
error, evaluated as the relative absolute error on additional test
points ($RAE_{actual}$).
Results and Discussion
[Figure: VESD regression models ($VESD_{med}$) within the region of interest of surrogate models constructed for the Branin-Hoo function. Vertical axis: median of RAEs; horizontal axis: number of inside-region training points. Shown are the distribution of median errors, the mode of the median error distribution, and the predicted mode of the median error in the final surrogate.]
[Table: type and coefficients of $VESD_{med}$ for the Kriging, RBF, E-RBF, and QRS surrogates constructed for the Branin-Hoo function]
[Figure: VESD regression models within the region of interest of surrogate models constructed for the Branin-Hoo function, predicting the mode of the maximum error ($Mo_{max}$) and the absolute maximum error ($ABS_{max}$). Vertical axis: maximum of RAEs; horizontal axis: number of inside-region training points. Shown are the distribution of maximum errors, the mode of the maximum error distribution, the absolute maximum error, and the predicted mode of the maximum error and predicted absolute maximum error in the final surrogate.]
[Table: type and coefficients of $VESD_{max}$ and $VESD_{ABS}$ for the Kriging, RBF, E-RBF, and QRS surrogates constructed for the Branin-Hoo function]
Wind Farm Power Generation
Surrogates are developed using Kriging, RBF, E-RBF, and QRS to
represent the power generation of an array-like wind farm.
[Figure: VESD regression models for the different surrogates in the wind farm power generation problem, showing iterations 1 to 4 and the predicted error]
[Figure: comparison of error measures, based on the predicted mode of median errors, the median of RAEs evaluated on test points, and the median of relative absolute errors of cross-validation. The closer to one, the better the corresponding error measure.]
Concluding Remarks
• We developed a new method to quantify surrogate error based on the
hypothesis that:
“The accuracy of the approximation model is related to the amount
of available resources”
• This relationship can be reliably quantified when the error measure is
less sensitive to the sample locations or the type of application.
It is not possible using any existing methods.
• The REES method addresses this issue.
• The preliminary results on benchmark and wind farm power generation
problems indicate that, in the majority of cases, the REES method is more
accurate than other measures.
Future Work
• Exploring the scope for improving the method.
• Implementing the proposed error measurement in
surrogate development.
Acknowledgement
• I would like to acknowledge my research adviser,
Prof. Achille Messac, and my co-adviser, Prof.
Souma Chowdhury, for their immense help and
support in this research.
• Support from the NSF Awards is also acknowledged.
Thank you
Questions and Comments
[Figure: $VESD_{med}$ for the Branin-Hoo function (median of RAEs versus number of inside-region training points) shown alongside the k-fold CV estimate ($Mean_{mean}$ versus number of inside-region training points)]

