Small Sample Analysis and Algorithms
for Multivariate Functions
Mac Hyman
Tulane University
Joint work with Lin Li, Jeremy Dewar, and Mu Tian (SUNY),
SAMSI WG5, May 7, 2018
The Problem: Accurate integration of multivariate functions
Goal: To estimate the integral I = ∫_Ω f(x) dx,
where f : Ω → R, Ω ⊂ R^d.
We are focused on situations where:
There are few samples (n < a few thousand), and the
effective dimension is relatively small, x ∈ R^d (d < 50);
Function evaluations (samples) f (x) are (very) expensive, such as a
large-scale simulation, and additional samples may not be obtainable;
Little a priori information about f(x) is available; and
We might not have control over the sample locations, which can be
far from a desired distribution (e.g. missing data).
Identify new sample locations to minimize MSE based on existing
information.
The Problem: Accurate integration of multivariate functions
Goal: To estimate the integral I = ∫_Ω f(x) dx,
where f : Ω → R, Ω ⊂ R^d.
Four approaches that work pretty well in practice.
How well do they work in theory?
1. Detrending using covariates
2. Voronoi Weighted Quadrature
3. Surrogate Model Quadrature
4. Adaptive Sampling Based on Kriging SE Estimates
Detrending before Integrating
1. Detrending using covariates
Detrending first approximates the underlying function with an easily
integrated surrogate model (covariate).
The integral is then estimated by the exact integral of surrogate +
an approximation of the residual.
For example, f(x) can be approximated by a linear combination of
simple basis functions, such as Legendre polynomials, p(x) = Σ_{i=1}^{t} β_i ψ_i(x),
which can be integrated exactly. Then define
I(f) = ∫_Ω f(x) dx                                        (1)
     = ∫_Ω p(x) dx + ∫_Ω [f(x) − p(x)] dx.                (2)
The goal is to pick p(x) to minimize the residual ∫_Ω [f(x) − p(x)] dx.
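To make equations (1)–(2) concrete, here is a minimal one-dimensional sketch of the detrending estimator, assuming a least-squares Legendre fit computed with NumPy; the test function, sample size, and degree are illustrative choices, not the authors' setup.

```python
import numpy as np
from numpy.polynomial import legendre

# Minimal 1D detrending sketch (illustrative assumptions, not the authors' code).
rng = np.random.default_rng(0)
f = lambda x: np.cos(3 * x) + 0.5 * x**2        # stand-in for an expensive model
n, degree = 200, 5
x = rng.uniform(0.0, 1.0, n)                    # given sample locations
y = f(x)

# Least-squares Legendre fit p(x) on [0, 1] (map samples to [-1, 1] for the basis).
coef = legendre.legfit(2.0 * x - 1.0, y, degree)

# Exact integral of p over [0, 1]: only the P_0 term survives, so it equals coef[0].
I_p = coef[0]

# Monte Carlo estimate of the residual integral; for a least-squares fit through
# the same points this term is essentially zero (see the weighted-quadrature slide).
residual = np.mean(y - legendre.legval(2.0 * x - 1.0, coef))

I_hat = I_p + residual
print(I_hat, "vs. exact", np.sin(3.0) / 3.0 + 1.0 / 6.0)
```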
Detrending before Integrating
The error for the detrended integral is proportional to the
standard deviation of the residual p(x) − f (x), not f (x)
The residual errors are the only errors in the integration approximation
I(f) = ∫_Ω f(x) dx                                        (3)
≈ Î(f) = ∫_Ω p(x) dx + (1/n) Σ_i [f(x_i) − p(x_i)]        (4)
PMC error bound: ||e_n|| = O( σ(f − p) / √n ), and
QMC error bound: ||e_n|| ≤ O( (1/n) V[f − p] (log n)^(d−1) )
1. The error bounds are based on σ(f − p) and V [f − p] instead of σ(f )
and V [f ]. The least squares fit reduces these quantities.
2. The convergence rates are the same; the constants are reduced.
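A quick empirical check of point 1, assuming a plain degree-2 monomial least-squares fit (an illustrative basis, not the Legendre dictionary used in the slides): at the sample points, σ(f − p) is much smaller than σ(f) for the 6D test integrand used below.

```python
import numpy as np

# Compare sigma(f) with sigma(f - p) for a least-squares detrending fit p
# (degree-2 monomial basis; an illustrative assumption).
rng = np.random.default_rng(0)
d, n = 6, 600
X = rng.uniform(size=(n, d))
y = np.prod(np.cos(np.arange(1, d + 1) * X), axis=1)     # 6D test integrand

# Dictionary: 1, x_i, and x_i * x_j for i <= j.
cols = [np.ones(n)] + [X[:, i] for i in range(d)] \
     + [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
A = np.column_stack(cols)
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

print("sigma(f)     =", y.std())
print("sigma(f - p) =", resid.std())
```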
Quintic detrending reduces MC and QMC errors by a factor of 100
Error Î(f) − I(f) distributions for I(f) = ∫_{[0,1]^6} Π_i cos(i x_i) dx
Error Distributions (6D, 600 points) for PMC (top) and LDS QMC
(bottom) for detrending with a cubic and quintic, K = 3, 5, polynomial.
The x-axis bounds are 10 times smaller for the LDS/QMC samples.
Detrending reduces the error constant, not the convergence rates
Detrending doesn’t change the convergence rates of PMC
(O(n^(−1/2))) and QMC (O(n^(−1))) for ∫_{[0,1]^6} Π_i cos(i x_i) dx
Errors for PMC (upper lines −−) and QMC (lower lines − · −) for
constant K = 0 (left), cubic K = 3 (center), and quintic K = 5 (right).
Detrending reduces the error constant, not the convergence rates
Mean errors for ∫_{[0,1]^5} Π_i cos(i x_i) dx with detrending
Detrending errors: degrees K = 0, 1, 2, 3, 4, 5 for 500–4000 samples.
Convergence rates don’t change, but the constant is reduced by a factor of about 1000.
Curse of Dimensionality for polynomial detrending
High degree polynomials in high dimensions are quickly
constrained by the Curse of Dimensionality
For the least squares coefficients to be identifiable, the number of samples
must be ≥ the number of coefficients in the detrending function.
Degree \ Dimension      1      2      3      4      5       10          20
 0                      1      1      1      1      1        1           1
 1                      2      3      4      5      6       11          21
 2                      3      6     10     15     21       66         231
 3                      4     10     20     35     56      286       1,771
 4                      5     15     35     70    126    1,001      10,626
 5                      6     21     56    126    252    3,003      53,130
10                     11     66    286  1,001  3,003  184,756  30,045,015
The mixed variable terms in multivariate polynomials create an explosion
in the number of terms as a function of the degree and dimension.
For example, a 5th degree polynomial in 20 dimensions has 53,130 terms.
The complexity of this approach grows linearly in the number of
basis functions, and as O(n^3) as the number of samples increases.
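The table can be checked directly: the number of coefficients of a polynomial of total degree K in d variables is the binomial coefficient C(d + K, K).

```python
from math import comb

# Number of monomials of total degree <= K in d variables is C(d + K, K);
# this reproduces the table above (e.g., degree 5 in 20 dimensions -> 53,130).
for K in (0, 1, 2, 3, 4, 5, 10):
    print(K, [comb(d + K, K) for d in (1, 2, 3, 4, 5, 10, 20)])
```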
Sparse model selection
The Lp sparse penalty regularization method
p(x) = Σ_{i=1}^{t} β_i ψ_i(x)                                   (5)
β = argmin { (1/2) ||Aβ − f||_2^2 + (λ/p) ||β||_p }             (6)
This system can be solved using a cyclic coordinate descent algorithm, or a
factored iteratively reweighted least-squares (IRLS) method that solves a linear
system of n (= number of samples) equations on each iteration.
If the function f (x) varies along some directions more than others, then
sparse subset selection extracts the appropriate basis functions based on
the effective dimension of the active subspaces.
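The slides solve the Lp-penalized problem (6) by coordinate descent or IRLS. The sketch below covers only the L1 special case, using scikit-learn's Lasso (coordinate descent) on a degree-3 monomial dictionary; the dictionary, penalty weight, and solver are stand-in assumptions, not the authors' Legendre basis or code.

```python
import numpy as np
from itertools import combinations_with_replacement
from sklearn.linear_model import Lasso

# L1 (p = 1) sparse detrending sketch on a degree-3 monomial dictionary.
rng = np.random.default_rng(1)
d, n, degree = 6, 600, 3
X = rng.uniform(size=(n, d))                              # samples in [0,1]^6
y = np.prod(np.cos(np.arange(1, d + 1) * X), axis=1)      # 6D test integrand

# Dictionary A: one column per monomial of total degree <= 3.
terms = [c for k in range(degree + 1)
         for c in combinations_with_replacement(range(d), k)]
A = np.column_stack([np.prod(X[:, list(c)], axis=1) if c else np.ones(n)
                     for c in terms])

fit = Lasso(alpha=1e-3, fit_intercept=False, max_iter=50_000).fit(A, y)
print(f"kept {np.count_nonzero(fit.coef_)} of {len(terms)} dictionary terms")
```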
Sparse model selection
Sparse subset detrending allows high degree polynomial
dictionaries for sparse sample distributions.
∫_{[0,1]^6} Π_i cos(i x_i) dx; errors for PMC (top) and QMC (bottom), degrees
K = 0, 3, 5. The K = 5 fits keep 35% (PMC) or 29% (QMC) of the terms.
Least squares detrending = a weighted quadrature rule
Least squares detrending is equivalent to using a weighted
quadrature rule
The integral of a least squares detrending fit through the data points can
be represented as a weighted quadrature rule:
I(f) = ∫_Ω f(x) dx = Σ_i ∫_{Ω_i} f(x) dx = Σ_i w_i f̄_i ≈ Σ_i ŵ_i f(x_i),
where f̄_i is the mean of f over the Voronoi volume (w_i) of Ω_i around x_i, and
ŵ_i ≈ w_i. The error depends on (f̄_i − f(x_i)) and (w_i − ŵ_i).
When the sample points have low discrepancy, then wi ≈ ˆwi = 1/n is a
good approximation.
Can this be improved if we replace the weights with a better approximation
of the Voronoi volume?
Voronoi Weighted Quadrature
2. Voronoi Weighted Quadrature
The Voronoi weighted quadrature rule is defined as
I_n(f) = Σ_{i=1}^{n} w_i f(x_i),
where w_i is the Voronoi volume associated with the sample x_i, i.e., the volume of
the region that is closer to x_i than to any other sample point.
• The Voronoi weighted quadrature rule, I_n(f), is exact if f(x) is piecewise
constant over each Voronoi volume.
• Solving for the exact Voronoi volumes is expensive in high dimensions
and suffers from the curse of dimensionality.
• Solution: Use LDS to approximate these volumes.
Voronoi Weighted Quadrature
Voronoi Weighted Quadrature
The weights for the Voronoi quadrature rule
I_n(f) = Σ_i w_i f(x_i)
can be approximated using nearest neighbors of a dense reference LDS.
Step 1: Generate a dense LDS, {x̂_j}, with N_LDS points.
Step 2: Compute the distance from each LDS point to the original sample set.
Step 3: Define W_i as the number of LDS points closest to x_i.
Step 4: Rescale these counts to define the weights w_i = W_i / N_LDS
(and normalize by the domain volume, if needed).
The weights w_i converge to the Voronoi volumes as O(1/N_LDS).
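A minimal sketch of Steps 1–4, assuming a scrambled Sobol sequence for the dense LDS and SciPy's cKDTree for the nearest-neighbor counts; neither choice is prescribed by the slides.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import qmc

# Approximate Voronoi weights with a dense LDS and nearest-neighbor counting.
rng = np.random.default_rng(2)
d, n, n_lds = 3, 100, 2**14
x = rng.uniform(size=(n, d))                              # given samples in [0,1]^3
f = lambda z: np.prod(np.cos(np.arange(1, d + 1) * z), axis=1)

lds = qmc.Sobol(d, scramble=True, seed=0).random(n_lds)   # Step 1: dense LDS
nearest = cKDTree(x).query(lds)[1]                        # Step 2: nearest sample index
counts = np.bincount(nearest, minlength=n)                # Step 3: LDS points per sample
w = counts / n_lds                                        # Step 4: weights ~ Voronoi volumes

print("Voronoi weighted estimate:", np.sum(w * f(x)))
```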
Voronoi Weighted Quadrature
Voronoi Volumes Estimated by Low Discrepancy Sample
The fraction of LDS samples nearest to each sample is used to estimate
the Voronoi volume for the sample as a fraction of the domain volume.
This also works when the samples live in an irregular, blob-shaped region.
Voronoi Weighted Quadrature
Simple 3D example
[Figure: log10(mean error) vs. log10(number of samples), trig integration error in 3D; left panel: MC error and MC Voronoi error; right panel: LDS error and LDS Voronoi error.]
In 3D, the Voronoi weights are much more effective in reducing the errors
when the original sample is iid MC than when it is an LDS QMC sample.
I_n(f) = Σ_i ŵ_i f(x_i), where ŵ_i is an estimate of the Voronoi volume of x_i.
Voronoi Weighted Quadrature
Simple 6D example
[Figure: log10(mean error) vs. log10(number of samples), trig integration error in 6D; left panel: MC error and MC Voronoi error; right panel: LDS error and LDS Voronoi error.]
In 6D, the Voronoi weighted quadrature reduces the errors for iid MC
samples. The approach is not effective for LDS in higher dimensions.
We are looking for ideas to explain why the Voronoi weighted quadrature
approach is less effective for LDS in higher dimensions.
Surrogate Model Quadrature
3. Surrogate Model Quadrature
Interpolate samples to a dense LDS, and use standard QMC quadrature
on the surrogate points.
Step 1: Generate a dense LDS sample, {ˆxj }, with NLDS points.
Step 2: Use kriging to approximate ˆf (ˆxj ) at the LDS points.
Step 3: Estimate the integral by
I(f) ≈ (1/N_LDS) Σ_j f̂(x̂_j)
We use the DACE kriging package with a quadratic polynomial basis
based on distance-weighted least-squares with radial Gaussian weights.
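A sketch of the surrogate model quadrature, with scikit-learn's GaussianProcessRegressor standing in for the DACE kriging package used in the slides; the kernel and its length scale are assumptions, not the authors' quadratic-trend, Gaussian-correlation setup.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Kriging surrogate evaluated on a dense LDS, then averaged (QMC quadrature).
rng = np.random.default_rng(3)
d, n, n_lds = 3, 100, 2**13
x = rng.uniform(size=(n, d))                              # original expensive samples
f = lambda z: np.prod(np.cos(np.arange(1, d + 1) * z), axis=1)
y = f(x)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3),
                              normalize_y=True).fit(x, y)  # Step 2: kriging fit
lds = qmc.Sobol(d, scramble=True, seed=0).random(n_lds)    # Step 1: dense LDS
print("surrogate QMC estimate:", gp.predict(lds).mean())   # Step 3: average over LDS
```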
Surrogate Model Quadrature
Simple 3D example
[Figure: log10(mean error) vs. log10(number of samples), trig integration error in 3D; left panel: MC error and MC SLDS error; right panel: LDS error and LDS SLDS error.]
In 3D, the surrogate data points are effective in reducing the errors for
both iid MC and LDS samples.
Surrogate Model Quadrature
Simple 6D example
[Figure: log10(mean error) vs. log10(number of samples), trig integration error in 6D; left panel: MC error and MC SLDS error; right panel: LDS error and LDS SLDS error.]
In 6D, the surrogate data points are effective in reducing the errors for
both iid MC and LDS samples.
Comparing Voronoi and Surrogate Model Quadrature
The surrogate quadrature is consistently better than the
Voronoi quadrature
[Figure: log10(mean error) vs. log10(number of samples), trig integration error in 3D; left panel: MC error, MC Voronoi error, and MC SLDS error; right panel: LDS error, LDS Voronoi error, and LDS SLDS error.]
Both methods reduce the errors in this 3D problem.
Comparing Voronoi and Surrogate Model Quadrature
The surrogate quadrature is consistently better than the
Voronoi quadrature
[Figure: log10(mean error) vs. log10(number of samples), trig integration error in 6D; left panel: MC error, MC Voronoi error, and MC SLDS error; right panel: LDS error, LDS Voronoi error, and LDS SLDS error.]
• In this 6D problem, both methods reduce the error when
the original sample is not an LDS.
• When the original sample is an LDS, the Voronoi quadrature doesn’t
improve the accuracy, while the surrogate model continues to be effective.
Adaptive Sampling Quadrature
4. Adaptive Sampling Based on Kriging SE Estimates
Instead of adding new samples to ’fill in the holes’ of the existing
distribution, use kriging error estimates to guide future samples.
Iterate until converged, or max number of function values is reached:
Step 1: Generate a dense LDS sample, {ˆxj }, with NLDS points.
Step 2: Use kriging to approximate the function, f̂(x̂_j), and estimate the
standard errors, SE_j, at the LDS points.
Step 3: If max{SE_j} > tolerance, evaluate the function at the LDS point with the
largest SE_j, add it to the sample set, and return to Step 2.
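A sketch of the adaptive loop, again with scikit-learn's Gaussian process standing in for the DACE kriging model (an assumption); the standard errors at the LDS points come from predict(..., return_std=True).

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Adaptive sampling: refit the surrogate, then evaluate f where the kriging SE is largest.
rng = np.random.default_rng(4)
d, n0, n_lds, budget, tol = 2, 10, 2**10, 20, 1e-3
f = lambda z: np.prod(np.cos(np.arange(1, d + 1) * z), axis=1)

x = rng.uniform(size=(n0, d))                             # initial random sample
y = f(x)
lds = qmc.Sobol(d, scramble=True, seed=0).random(n_lds)   # Step 1: dense LDS

for _ in range(budget):
    gp = GaussianProcessRegressor(kernel=RBF(0.3), normalize_y=True).fit(x, y)
    _, se = gp.predict(lds, return_std=True)              # Step 2: kriging SEs
    j = np.argmax(se)
    if se[j] <= tol:                                      # Step 3: convergence test
        break
    x = np.vstack([x, lds[j]])                            # evaluate at the largest SE
    y = np.append(y, f(lds[j][None, :]))

gp = GaussianProcessRegressor(kernel=RBF(0.3), normalize_y=True).fit(x, y)
print(len(x), "samples; surrogate integral estimate:", gp.predict(lds).mean())
```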
Adaptive Sampling Quadrature
Initial Random Sample
[Figure: two panels on [0,1]^2 showing the initial random sample and the kriging standard errors at the LDS points.]
• = current samples
Points with large standard errors are shown as small red circles; points with smaller errors are blue.
The next sample will be evaluated at the point with the largest SE.
Adaptive Sampling Quadrature
First and second adaptive samples are in the corners
[Figure: two panels on [0,1]^2 after the first and second adaptive samples, which land in the corners.]
Points with large standard errors are shown as small red circles; points with smaller errors are blue.
• = current samples   • = largest SE and next sample.
Adaptive Sampling Quadrature
The new samples fill in the holes to reduce the uncertainty
[Figure: two panels on [0,1]^2 after additional adaptive samples have filled in the gaps in the design.]
Points with large standard errors are shown as small red circles; points with smaller errors are blue.
• = current samples   • = largest SE and next sample.
Future Research Questions
Future Research Questions
Continue exploring surrogate models for guiding adaptive sampling.
Develop theory for how many LDS surrogate samples are needed for the
Voronoi weights and surrogate quadrature methods.
Use the surrogate approach to interpolate to a sparse grid, instead of the
LDS, and use higher order quadrature rules.
Combine the surrogate LDS methods with the detrending approaches.
Develop kriging methods that preserve local positivity, monotonicity, and
convexity of the data for both design of experiment surrogate models and
surrogate quadrature methods.