Estimation of the Box-Cox Transformation Parameter and Application to Hydrologic Data
Melanie Wong
Friday, August 29, 2008

Introduction
  • Many statistical tests assume normality
  • Most hydrologic data are highly skewed

Detecting Normality
  • β1 (standardized third moment) = skewness
  • β2 (standardized fourth moment) = kurtosis
  • Characteristics of a normal distribution: β1 = 0, β2 = 3
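To illustrate the two statistics above, here is a minimal Python sketch that computes moment-based skewness and kurtosis for a sample. The function name `skew_kurt` and the use of the simple (biased) moment estimators are assumptions for illustration; the slides do not say which estimator was used.

```python
def skew_kurt(x):
    """Return (skewness, kurtosis) from simple central moments.

    Assumption: biased moment estimators (divide by n), chosen for brevity.
    A normal distribution has skewness 0 and kurtosis 3.
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


if __name__ == "__main__":
    beta1, beta2 = skew_kurt([1.0, 2.0, 3.0, 4.0, 5.0])
    print(beta1, beta2)
```

A sample whose skewness is far from 0 or whose kurtosis is far from 3 is a candidate for transformation.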
Moment Ratio Diagram
Transformation to Normality
  • Hydrologic data are not normal
  • Various transformations available:
      Logarithmic
      Box-Cox
The Box-Cox Transformation
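The formula on this slide did not survive extraction. The standard Box-Cox transformation for positive x is y = (x^λ - 1)/λ when λ ≠ 0 and y = ln(x) when λ = 0. The sketch below implements it together with its inverse; the function names and the tolerance used to detect λ = 0 are illustrative assumptions.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of one positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if abs(lam) < 1e-12:  # treat lam as 0: natural log transform
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value y back to the original scale."""
    if abs(lam) < 1e-12:
        return math.exp(y)
    base = lam * y + 1.0
    if base <= 0:
        raise ValueError("y is outside the range of the transform")
    return base ** (1.0 / lam)


if __name__ == "__main__":
    y = box_cox(4.0, 0.5)
    print(y, inverse_box_cox(y, 0.5))
```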
Transformation Procedure
  1) Decide whether the sample data are normal
  2) Obtain the value of λ
  3) Transform the data
  4) Apply confidence intervals, statistical tests, or tolerance limits
  5) Perform the inverse transformation

Determining the Optimal λ
  • "Snap-to-the-grid" method
    Box and Cox (1964): "... fix one, or possibly a small number, of λ's and go ahead with the detailed estimation..."
  • Distribution-based method
  • λ = -1.0: reciprocal transform
  • λ = -0.5: reciprocal square root transform
  • λ = 0: natural log transform
  • λ = +0.5: square root transform
  • λ = 1.0: no transformation needed
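The "snap-to-the-grid" idea can be sketched as rounding an estimated λ to the nearest of the five conventional values listed above. The helper name `snap_to_grid` is hypothetical, and how the initial estimate is obtained is left open here.

```python
# Conventional grid of λ values from the slide.
GRID = (-1.0, -0.5, 0.0, 0.5, 1.0)


def snap_to_grid(lam_hat, grid=GRID):
    """Return the grid value closest to the estimated λ."""
    return min(grid, key=lambda g: abs(g - lam_hat))


if __name__ == "__main__":
    # The two λ estimates reported in the examples later in the deck.
    for lam_hat in (0.55, 0.04):
        print(lam_hat, "->", snap_to_grid(lam_hat))
```

With the deck's own estimates, 0.55 would snap to the square root transform and 0.04 to the natural log transform.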
Research Goal and Objectives
  • Goal: To develop a better understanding of the Box-Cox transformation so that it can be applied with greater confidence
  • Objectives:
      1) To characterize the sampling variation
      2) To provide a method for estimating the Box-Cox transformation parameter for any set of data

Sampling Variation
  • 10,000 simulations

Second Objective
  • To provide a method for estimating the Box-Cox transformation parameter for any set of data

The Importance of λ
  • Small changes in λ → large changes in sampling variation
  • Need a more precise method to obtain the optimum λ
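One simple way to get a finer estimate of λ than a five-point grid is a dense search for the λ whose transformed sample has skewness closest to zero. This is only a sketch under that assumption, not the procedure developed in the slides, which relates the optimum λ to β1 and β2 through simulation. The function names, search range, and step size are hypothetical.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam near 0 means log."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam


def skewness(y):
    """Moment-based skewness (biased estimator, for brevity)."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n
    m3 = sum((v - mean) ** 3 for v in y) / n
    return m3 / m2 ** 1.5


def best_lambda(x, lo=-2.0, hi=2.0, step=0.01):
    """Return the candidate λ that minimizes |skewness| after transformation."""
    count = int(round((hi - lo) / step))
    candidates = [round(lo + i * step, 10) for i in range(count + 1)]
    return min(candidates,
               key=lambda lam: abs(skewness([box_cox(v, lam) for v in x])))


if __name__ == "__main__":
    # Values that are symmetric on the log scale, so λ near 0 is expected.
    data = [math.exp(v) for v in (-2.0, -1.0, 0.0, 1.0, 2.0)]
    print(best_lambda(data))
```

A criterion based on skewness alone ignores kurtosis; matching β2 = 3 as well would need a combined objective.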
Optimizing λ
  • Procedure: Monte Carlo simulation to identify sampling distributions of β1 and β2 values for different λ values
  • Distribution types: Gamma, exponential, uniform
  • Population sizes: 1000, 500, 200, 100, 50, 10
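The Monte Carlo procedure above can be sketched as follows: draw many samples from a chosen distribution, transform each with a fixed λ, and summarize how β1 and β2 vary across samples. The gamma shape and scale, the seed, and all function names are assumptions made for this sketch; the slides do not give those details.

```python
import math
import random
import statistics


def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam near 0 means log."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam


def skew_kurt(y):
    """Moment-based (skewness, kurtosis), biased estimators."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n
    m3 = sum((v - mean) ** 3 for v in y) / n
    m4 = sum((v - mean) ** 4 for v in y) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def simulate(lam, n, reps=10000, shape=2.0, scale=1.0, seed=0):
    """Sampling distribution of β1 and β2 after transforming gamma samples."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    skews, kurts = [], []
    for _ in range(reps):
        sample = [rng.gammavariate(shape, scale) for _ in range(n)]
        s, k = skew_kurt([box_cox(v, lam) for v in sample])
        skews.append(s)
        kurts.append(k)
    return {
        "skew_mean": statistics.fmean(skews),
        "skew_sd": statistics.stdev(skews),
        "kurt_mean": statistics.fmean(kurts),
        "kurt_sd": statistics.stdev(kurts),
    }


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        print(lam, simulate(lam, n=50, reps=1000))
```

Repeating the run over the listed sizes and over several λ values gives the spread of β1 and β2 for each combination.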
Simulation Results
  [Figure: β1 values vary horizontally (↔); β2 values vary vertically (↕)]
Variation of λ
Confidence Interval Procedure: Logarithmic
  1) Apply the logarithmic transformation to x to obtain y
  2) Use normal theory for confidence intervals on y
  3) Inverse transform the confidence intervals to values of x

Confidence Interval Procedure: Box-Cox
  1) Apply the Box-Cox transformation (BCT) to x to obtain y, using the optimum λ
  2) Use normal theory for confidence intervals on y
  3) Inverse transform the confidence intervals to values of x
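The three steps can be sketched in Python as below. Two assumptions are made that the slides do not confirm: the interval is taken on the mean of y, and a standard normal quantile is used in place of Student's t for simplicity. Setting λ = 0 reproduces the logarithmic procedure. All function names are illustrative.

```python
import math
from statistics import NormalDist, fmean, stdev


def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam near 0 means log."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Inverse of box_cox."""
    return math.exp(y) if abs(lam) < 1e-12 else (lam * y + 1.0) ** (1.0 / lam)


def box_cox_interval(x, lam, level=0.90):
    """Back-transformed normal-theory interval for the centre of x."""
    y = [box_cox(v, lam) for v in x]              # step 1: transform
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # two-sided normal quantile
    half = z * stdev(y) / math.sqrt(len(y))       # step 2: interval on y
    centre = fmean(y)
    return (inverse_box_cox(centre - half, lam),  # step 3: back to x units
            inverse_box_cox(centre + half, lam))


if __name__ == "__main__":
    data = [1.0, 4.0, 9.0, 16.0, 25.0]  # made-up values, not from the slides
    print(box_cox_interval(data, lam=0.5))
    print(box_cox_interval(data, lam=0.0))
```

Because the transform is nonlinear, the back-transformed interval is not symmetric about the sample mean of x.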
Example 1: Monthly Rainfall Measurements
  • 36 monthly rainfall measurements (mm/month) from Lawrenceville, GA
  • β1: 0.56, β2: 2.91

Example 1: After Logarithmic Transformation
  • β1: -1.12, β2: 4.40

Example 1: After Box-Cox Transformation
  • λ: 0.55
  • β1: -0.09, β2: 2.82
Histogram of Rainfall Data
90% Confidence Intervals on Rainfall
  • Box-Cox Transformed:
  • Log Transformed:

Example 2: Drainage Pipe Costs
  • The costs of 70 drainage systems
  • β1: 4.40, β2: 27.36

Example 2: After Logarithmic Transformation
  • β1: -0.167, β2: 4.04

Example 2: After Box-Cox Transformation
  • λ: 0.04
  • β1: 0.00379, β2: 3.96
Histogram of Pipe Cost Data
90% Confidence Intervals on Pipe Cost
  • Box-Cox Transformed:
  • Log Transformed:

Conclusions
  • The Box-Cox transform is more suitable than the logarithmic transform for:
      Normalizing data
      Determining confidence intervals, tolerance limits, outliers, and other tests
  • Sampling distributions of λ were determined
  • Optimum values of λ vs. β1 and β2 were developed
QUESTIONS?
