An Assignment on
TRANSFORMATION
SUBMITTED BY
Mehta Kavish Kirtikumar
2nd Sem. Ph. D.
(1010121013)
BACA, AAU,
Anand - 388110
SUBMITTED TO
Dr. A. D. Kalola
Professor & Head,
Dept. of Agril. Statistics,
BACA, AAU,
Anand - 388110
Ag. Stat 534: Statistical Methods for Crop Protection II
TRANSFORMATION
• What is a transformation?
A transformation converts the original data to a new scale, so that the resulting data set is expected to satisfy the condition of homogeneity of variance.
• It is a mathematical function that is applied to all the observations of a given variable:
$Y^{*} = f(Y)$
where Y is the original variable, Y* is the transformed variable, and f is the mathematical function applied to the data.
• Data transformation is the most appropriate remedial measure for variance heterogeneity where the variance and the mean are functionally related.
• Because a common transformation scale is applied to all observations, the comparative values between treatments are not changed and comparisons between them remain valid.
• The appropriate data transformation depends on the specific type of relationship between the mean and the variance.
WHY IS TRANSFORMATION REQUIRED?
• It is required when the data do not follow a normal distribution.
• It is required to make the mean and the variance independent.
• It is required when the coefficient of variation is high.
• It is required to achieve homogeneity of variance in the data.
LOG TRANSFORMATION
• When the original observation Y is converted to log Y, the conversion is known as log transformation.
• Although logarithms to any base can be used, base 10 is generally the most convenient.
• The logarithm of 0 is undefined, and values below 1 give negative logarithms. If the data contain zeros or such small values, a constant, preferably 1, is added before taking logs, i.e. log(Y + 1).
• When such a constant is added, it is added to all the observations.
• The log transformation is particularly effective in normalizing positively skewed distributions. It is also used to achieve additivity of effects.
• Here, it can be verified that the treatment and replication effects are not additive for the original observations.
• After log transformation, however, the treatment and replication effects are additive.
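The additivity claim can be checked numerically. The sketch below is not the example from the slides: it builds a small hypothetical two-way table in which treatment and replication effects multiply, then computes the interaction residuals (observation minus row mean minus column mean plus grand mean), which are all zero only when the two effects are additive. The effect values and the function name `interaction_residuals` are illustrative assumptions.

```python
import math

def interaction_residuals(table):
    """Return y_ij - row_mean_i - col_mean_j + grand_mean for a two-way table.

    Every residual is zero exactly when row and column effects are additive.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return [[table[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n_cols)]
            for i in range(n_rows)]

# Hypothetical data: each observation is treatment effect x replication effect.
treatment_effects = [10.0, 20.0, 40.0]
replication_effects = [1.0, 2.0]
raw = [[t * r for r in replication_effects] for t in treatment_effects]
logged = [[math.log10(y) for y in row] for row in raw]

max_raw = max(abs(e) for row in interaction_residuals(raw) for e in row)
max_log = max(abs(e) for row in interaction_residuals(logged) for e in row)

print("largest non-additivity, original scale:", round(max_raw, 3))
print("largest non-additivity, log scale:     ", round(max_log, 12))
```

On the original scale the residuals are clearly non-zero, while on the log scale they vanish up to floating-point rounding, because log(t x r) = log t + log r.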
ANGULAR TRANSFORMATION
• The arcsine transformation, also known as the inverse sine transformation (Rao, 1998) or angular transformation (Snedecor and Cochran, 1989), has been open to debate regarding its usefulness in the analysis of proportion data, which tend to be skewed rather than normally distributed, e.g. data obtained from counts and expressed as decimal fractions or percentages.
• The transformation is
$Y = \arcsin\sqrt{p}$
where p is the proportion and Y is the result of the transformation.
• The mechanics of the transformation are greatly facilitated by using a table of the arcsine transformation.
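Instead of a printed table, the angular value Y = arcsin(√p) can be computed directly. The following is a minimal sketch; the function names and the choice to report angles in degrees by default (as printed arcsine tables commonly do) are assumptions, not part of the slides.

```python
import math

def arcsine_transform(p, degrees=True):
    """Angular (arcsine square root) transformation of a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    angle = math.asin(math.sqrt(p))  # radians, between 0 and pi/2
    return math.degrees(angle) if degrees else angle

def back_transform(angle, degrees=True):
    """Return the proportion corresponding to a transformed value: p = sin(Y)^2."""
    radians = math.radians(angle) if degrees else angle
    return math.sin(radians) ** 2

# Hypothetical percentages converted to proportions before transforming.
for percent in (0, 5, 25, 50, 95, 100):
    y = arcsine_transform(percent / 100)
    print(f"{percent:>3}%  ->  {y:6.2f} degrees")
```

Percentages must be divided by 100 first, since the function expects a proportion.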
• Although the arcsine transformation is a useful tool for stabilizing variances and normalizing proportional data, there are several reasons why it can be problematic.
• Equalizing the variance of proportional data with the arcsine transformation requires the number of trials to be equal for each data point. Its efficacy in normalizing proportional data depends on the sample size, n, and it does not perform well at the extreme ends of the distribution (Warton and Hui, 2011; Hardy, 2002).
• Another argument against the arcsine transformation is that it does not confine proportional data between 0 and 1, so extrapolation can produce proportional values that are not biologically sensible (Hardy, 2002).
• In an example provided by Hardy (2002), the arcsine transformation of the relationship between sex ratio data and distance from a pollutant predicted a sex ratio for males greater than 1 as the distance from the pollutant increased.
• An alternative to the arcsine transformation that is becoming more prevalent in biological analyses is logistic regression, an analytical method designed to deal with proportional data (Jaeger, 2008).
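The contrast between the two approaches can be illustrated with a small sketch. It does not reproduce Hardy's data: the intercepts and slopes below are hypothetical fitted coefficients chosen only to show the behaviour. A straight line fitted on the arcsine scale can run past π/2 radians, the angle that corresponds to a proportion of 1, whereas the inverse logit used in logistic regression maps any value of the linear predictor to a proportion strictly between 0 and 1.

```python
import math

def inverse_logit(eta):
    """Map a linear predictor to a proportion, as in logistic regression."""
    return 1.0 / (1.0 + math.exp(-eta))

distances = [0, 25, 50, 75, 100]  # hypothetical distances from the pollutant

# Hypothetical straight-line fit on the arcsine scale (angle in radians).
fitted_angles = [0.8 + 0.01 * d for d in distances]

# Hypothetical straight-line fit on the logit scale.
fitted_props = [inverse_logit(0.4 + 0.05 * d) for d in distances]

upper_limit = math.pi / 2  # arcsin(sqrt(1)): largest angle a proportion can give
for d, angle, prop in zip(distances, fitted_angles, fitted_props):
    status = "outside valid range" if angle > upper_limit else "ok"
    print(f"distance={d:>3}  angle={angle:.2f} ({status})  logistic p={prop:.3f}")
```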
SQUARE ROOT TRANSFORMATION
• If the original observation Y is converted to a new value by taking its square root, the conversion is known as square root transformation.
• It is used for count data. When some of the observed counts are small, say less than 10, the more appropriate transformation is $\sqrt{Y + 0.5}$.
• A transformation of the type $\sqrt{Y} + \sqrt{Y + 1}$ is also used.
• The square root transformation is used when the observations follow a Poisson distribution, for which the variance equals the mean.
• It is useful for data consisting of small whole numbers, e.g.:
No. of infested plants in a plot,
No. of insects caught in traps,
No. of weeds per plot.
• It is also appropriate for percentage data where the range is between 0 and 30% or between 70 and 100%, but not both.
• When the data include 0 percent, $\sqrt{x + 0.5}$ should be used instead of $\sqrt{x}$, where x is the original data.
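Why the square root suits Poisson-type counts can be seen in a small simulation. The sketch below is illustrative only: the Poisson means, the sample size and the seed are arbitrary assumptions, and the counts are generated with Knuth's multiplication method because the Python standard library has no Poisson generator. The variance of the raw counts grows with the mean, while the variance of √(Y + 0.5) stays roughly constant, close to 0.25.

```python
import math
import random
import statistics

def poisson_sample(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sqrt_transform(y, constant=0.0):
    """Square root transformation; constant=0.5 gives sqrt(Y + 0.5)."""
    return math.sqrt(y + constant)

rng = random.Random(534)  # fixed seed, chosen arbitrarily for repeatability
summary = {}
for mean in (10, 20, 50):  # hypothetical Poisson means
    counts = [poisson_sample(mean, rng) for _ in range(5000)]
    roots = [sqrt_transform(c, 0.5) for c in counts]
    summary[mean] = (statistics.variance(counts), statistics.variance(roots))

for mean, (raw_var, root_var) in summary.items():
    print(f"mean={mean:>2}  var(Y)={raw_var:6.2f}  var(sqrt(Y+0.5))={root_var:.3f}")
```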
• The following rules may be useful in choosing the proper transformation scale for percentage data derived from count data.
• Rule-1: For percentage data lying within the range 30 to 70%, no transformation is needed.
• Rule-2: For percentage data lying within the range of either 0 to 30% or 70 to 100%, but not both, the square root transformation should be used.
• Rule-3: For percentage data that do not follow the ranges specified in either Rule-1 or Rule-2, the arcsine transformation should be used.
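The three rules can be written as a small decision function. This is a sketch under stated assumptions: the function name `choose_transformation` is hypothetical, and the cut-offs are keyword arguments so that they can be set to whichever bounds are adopted. The defaults of 30 and 70 for the square root rule are an assumption that makes its ranges meet the 30 to 70% range of Rule-1.

```python
def choose_transformation(percentages, mid_low=30.0, mid_high=70.0,
                          sqrt_low=30.0, sqrt_high=70.0):
    """Suggest a transformation for percentage data derived from counts.

    mid_low/mid_high bound the range needing no transformation (Rule-1);
    sqrt_low/sqrt_high are the assumed cut-offs for the square root rule (Rule-2).
    """
    lowest, highest = min(percentages), max(percentages)
    if not (0.0 <= lowest and highest <= 100.0):
        raise ValueError("percentages must lie between 0 and 100")
    # Rule-1: every value lies in the middle range.
    if mid_low <= lowest and highest <= mid_high:
        return "none"
    # Rule-2: every value lies in the low range, or every value in the high range.
    if highest <= sqrt_low or lowest >= sqrt_high:
        return "square root"
    # Rule-3: anything else.
    return "arcsine"

# Hypothetical data sets.
print(choose_transformation([35, 50, 65]))
print(choose_transformation([2, 10, 25]))
print(choose_transformation([75, 90, 99]))
print(choose_transformation([5, 50, 95]))
```

Rule-1 is checked first, so a data set lying exactly on a shared boundary (for example, all values equal to 30%) is treated as needing no transformation.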
Transformation technique and when it is used
