Data Analysis technique, data collection, data analysis

Data Analysis
• The data, after collection, has to be processed and
analysed in accordance with the outline laid down for the
purpose at the time of developing the research plan
• This is essential for a scientific study and for ensuring that
we have all relevant data for making contemplated
comparisons and analysis
• Processing implies editing, coding, classification and
tabulation of collected data so that they are amenable to
analysis
• The term analysis refers to the computation of certain
measures along with searching for patterns of relationship
that exist among data-groups

Processing Operations
• Editing
• Coding
• Classification
• Tabulation

Editing
• Process of examining the collected raw data to detect
errors and omissions and to correct these when possible
• It involves a careful scrutiny of the completed
questionnaires and/or schedules
• It ensures that the data are accurate, consistent with other
facts gathered, uniformly entered, as complete as
possible and well arranged to facilitate coding
and tabulation
• With regard to points or stages at which editing should be
done, one can talk of field editing and central editing
• Field editing consists in the review of the reporting forms
by the investigator for completing (translating or rewriting)
what the latter has written in abbreviated and/or in illegible
form at the time of recording the respondents’ responses.
This type of editing is necessary in view of the fact that
individual writing styles often can be difficult for others to
decipher

• Central editing should take place when all forms or
schedules have been completed and returned to the office
• Thorough editing by a single editor or a team of editors in
case of a large inquiry
• Editor(s) may correct the obvious errors
• In case of inappropriate or missing replies, the editor can
sometimes determine the proper answer by reviewing the
other information in the schedule; where necessary, the
respondent can be contacted for clarification

Editors must keep in view several points while performing
their work:
• They should be familiar with instructions given to the
interviewers and coders as well as with the editing
instructions supplied
• While crossing out an original entry for one reason or
another, they should just draw a single line on it so that the
same may remain legible
• They must make entries (if any) on the form in some
distinctive colour and that too in a standardised form
• They should initial all answers which they change or
supply
• Editor’s initials and the date of editing should be placed on
each completed form or schedule

Coding
• It refers to the process of assigning numerals or other
symbols to answers so that responses can be put into a
limited number of categories or classes appropriate to the
research problem
• The classes must also be exhaustive (there must be a class
for every data item) and mutually exclusive, which means
that a specific answer can be placed in one and only one
cell of a given category set
• Another rule to be observed is that of unidimensionality by
which is meant that every class is defined in terms of only
one concept
• Through it the several replies may be reduced to a small
number of classes which contain the critical information
required for analysis

Classification
• Most research studies result in a large volume of raw data
which must be reduced into homogeneous groups if we are
to get meaningful relationships
• This fact necessitates classification of data which happens
to be the process of arranging data in groups or classes on
the basis of common characteristics
• Data having a common characteristic are placed in one
class and in this way the entire data get divided into a
number of groups or classes
• Classification can be one of the following two types,
depending upon the nature of the phenomenon involved:
  - According to attributes
  - According to class intervals

Tabulation
• When a mass of data has been assembled, it becomes
necessary for the researcher to arrange the same in some
kind of concise and logical order
• This procedure is referred to as tabulation and thus,
tabulation is the process of summarizing raw data and
displaying the same in compact form (i.e., in the form of
statistical tables) for further analysis
• In a broader sense, tabulation is an orderly arrangement of
data in columns and rows
• Tabulation is essential because of the following reasons:
  - It conserves space and reduces explanatory and
    descriptive statement to a minimum
  - It facilitates the process of comparison
  - It facilitates the summation of items and the detection of
    errors and omissions
  - It provides a basis for various statistical computations

Generally Accepted Principles of Tabulation
• Every table should have a clear, concise and adequate title
so as to make the table intelligible without reference to the
text and this title should always be placed just above the
body of the table
• Every table should be given a distinct number to facilitate
easy reference
• The column headings (captions) and the row headings
(stubs) of the table should be clear and brief
• The units of measurement under each heading or sub-
heading must always be indicated
• Explanatory footnotes, if any, concerning the table should
be placed directly beneath the table, along with the
reference symbols used in the table
• Source or sources from where the data in the table have
been obtained must be indicated just below the table
• Usually the columns are separated from one another by
lines which make the table more readable and attractive

• Lines are always drawn at the top and bottom of the table
and below the captions
• There should be thick lines to separate the data under one
class from the data under another class and the lines
separating the sub-divisions of the classes should be
comparatively thin lines
• The columns may be numbered to facilitate reference
• Those columns whose data are to be compared should be
kept side by side
• Similarly, percentages and/or averages must also be kept
close to the data
• It is generally considered better to approximate figures
before tabulation as the same would reduce unnecessary
details in the table itself
• In order to emphasise the relative significance of certain
categories, different kinds of type, spacing and
indentations may be used

• It is important that all column figures be properly aligned.
Decimal points and (+) or (–) signs should be in perfect
alignment
• Abbreviations should be avoided to the extent possible
and ditto marks should not be used in the table
• Miscellaneous and exceptional items, if any, should be
usually placed in the last row of the table
• Table should be made as logical, clear, accurate and simple
as possible. If the data happen to be very large, they
should not be crowded in a single table for that would
make the table unwieldy and inconvenient
• Total of rows should normally be placed in the extreme
right column and that of columns should be placed at the
bottom
• The arrangement of the categories in a table may be
chronological, geographical, alphabetical or according to
magnitude to facilitate comparison

Elements/ Types of Analysis
• By analysis we mean the computation of certain indices or
measures along with searching for patterns of relationship
that exist among the data groups
• It involves estimating the values of unknown parameters
and testing of hypotheses for drawing inferences
• Analysis may, therefore, be categorized as descriptive
analysis and inferential analysis (Inferential analysis is
often known as statistical analysis)
• Descriptive analysis is largely the study of distributions of
one variable & this sort of analysis may be in respect of
one variable (described as unidimensional analysis), or in
respect of two variables (described as bivariate analysis)
or in respect of more than two variables (described as
multivariate analysis)
• We may as well talk of correlation analysis and causal
analysis

• Correlation analysis studies the joint variation of two or
more variables for determining the amount of correlation
between two or more variables
• Causal analysis is concerned with the study of how one or
more variables affect changes in another variable
• It is thus a study of functional relationships existing
• This analysis can be termed as regression analysis
• Causal analysis is considered relatively more important in
experimental researches
• In modern times, with the availability of computer facilities,
there has been a rapid development of multivariate
analysis which may be defined as “all statistical methods
which simultaneously analyse more than two variables”

Multivariate analysis
• Multiple regression analysis: This analysis is adopted
when the researcher has one dependent variable which is
presumed to be a function of two or more independent
variables
• The objective of this analysis is to make a prediction about
the dependent variable based on its covariance with all the
concerned independent variables
• Multiple discriminant analysis: This analysis is appropriate
when the researcher has a single dependent variable that
cannot be measured, but can be classified into two or more
groups on the basis of some attribute
• The object of this analysis is to predict an entity’s
possibility of belonging to a particular group based on
several predictor variables
• Multivariate analysis of variance (or multi-ANOVA):
Extension of two way ANOVA, wherein the ratio of among
group variance to within group variance is worked out on a
set of variables

• Canonical analysis: This analysis can be used in case of
both measurable and non-measurable variables for the
purpose of simultaneously predicting a set of dependent
variables from their joint covariance with a set of
independent variables
• Inferential analysis is concerned with the various tests of
significance for testing hypotheses in order to determine
with what validity data can be said to indicate some
conclusion or conclusions

Statistics in Research
• The role of statistics in research is to function as a tool in
designing research, analysing its data and drawing
conclusions therefrom
• Clearly the science of statistics cannot be ignored by any
research worker, even though he may not have occasion to
use statistical methods in all their details and ramifications
• The important statistical measures
  - Measures of central tendency or statistical averages
  - Measures of dispersion
  - Measures of asymmetry (skewness)
  - Measures of relationship
  - Other measures

• Measures of central tendency (or statistical averages) tell
us the point about which items have a tendency to cluster
• Mean, median and mode are the most popular averages

Median
• Arrange your numbers in numerical order
• Count how many numbers you have
• If you have an odd number, divide by 2 and round up to get
the position of the median number
• If you have an even number, divide by 2. Go to the number
in that position and average it with the number in the next
higher position to get the median
Mode
To find the mode, or modal value, it is best to put the
numbers in order. Then count how many of each number. A
number that appears most often is the mode.

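Find the mean, median, and mode for the following lists of values:
• 13, 18, 13, 14, 13, 16, 14, 21, 13
  Mean = 135/9 = 15
  Median = 14
  Mode = 13
• 1, 2, 4, 7
  Mean = 14/4 = 3.5
  Median = (2+4)/2 = 3
  Mode: none (no value occurs more than once)
Other averages: geometric mean (G.M.) and harmonic mean (H.M.)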
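
The two lists above can be checked with a few lines of Python. This is only a sketch using the standard library; the helper name `summarize` is an assumption made for illustration, and "no mode" is represented here as an empty list.

```python
from collections import Counter
from statistics import mean, median

def summarize(values):
    """Return (mean, median, modes); modes is [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    # A value only counts as a mode if it occurs more than once
    modes = [v for v, c in counts.items() if c == top] if top > 1 else []
    return mean(values), median(values), modes

first = [13, 18, 13, 14, 13, 16, 14, 21, 13]
second = [1, 2, 4, 7]
print(summarize(first))
print(summarize(second))
```

For the second list the median is the average of the two middle values, which follows the even-count rule stated on the Median slide.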

Measure of Dispersion
• An average can represent a series only as best as a single
figure can
• It fails to give any idea about the scatter of the values in
the series around the true value of average
• In order to measure this scatter, statistical devices called
measures of dispersion are calculated
• Important measures of dispersion are
  - Range
  - Mean deviation
  - Standard deviation
https://geographyfieldwork.com/DataPresentationScatterGraphs.htm#

Range
• It is the simplest possible measure of dispersion and is
defined as the difference between the values of the
extreme items of a series
• Range = Highest value of an item in a series- Lowest value
of an item in a series
• It gives an idea of the variability very quickly, but the
drawback is that range is affected very greatly by
fluctuations of sampling
• Its value is never stable, being based on only two values of
the variable
• As such, range is mostly used as a rough measure of
variability and is not considered as an appropriate measure
in serious research studies

Mean deviation
• It is the average of difference of the values of items from
some average of the series
• In calculating mean deviation we ignore the minus sign of
deviations while taking their total for obtaining the mean
deviation
Standard deviation
• It is most widely used measure of dispersion of a series
and is commonly denoted by the symbol sigma
• Standard deviation is defined as the square-root of the
average of squares of deviations, when such deviations for
the values of individual items in a series are obtained from
the arithmetic average
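
A minimal sketch of the three dispersion measures defined above. The data list is hypothetical, the function name `dispersion` is an assumption, and the standard deviation divides by n because the definition on this slide speaks of the "average of squares of deviations" (a sample estimate would divide by n - 1 instead).

```python
from math import sqrt

def dispersion(values):
    """Return (range, mean deviation about the mean, standard deviation)."""
    n = len(values)
    avg = sum(values) / n
    value_range = max(values) - min(values)
    # Mean deviation: ignore the sign of each deviation before averaging
    mean_dev = sum(abs(v - avg) for v in values) / n
    # Standard deviation: square root of the average squared deviation
    std_dev = sqrt(sum((v - avg) ** 2 for v in values) / n)
    return value_range, mean_dev, std_dev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical series with mean 5
print(dispersion(data))
```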

Measures of Asymmetry
When the distribution of items in a series happens to be
perfectly symmetrical, the curve for the distribution is
bell shaped:
• Such a curve is technically described as a normal curve and
the related distribution as a normal distribution
• Such a curve is a perfectly bell-shaped curve, in which case
the values of the mean (X̄), median (M) and mode (Z) are
just the same and skewness is altogether absent
• If the curve is distorted (whether on the right side or on the
left side), we have asymmetrical distribution which
indicates that there is skewness
• If the curve is distorted on the right side, we have positive
skewness but when the curve is distorted towards left, we
have negative skewness

Skewness is, thus, a measure of asymmetry and shows the
manner in which the items are clustered around the average

Measures of Relationship
• Statistical measures that we used so far are in context of
univariate population i.e., measurement of only one
variable
• If for every measurement of a variable, X, there is a
corresponding value of a second variable, Y, the resulting
pairs of values are called a bivariate population
• Similarly it can be a multi-variable data
• There are several methods of determining the relationship
between variables, but no method can tell us for certain
that a correlation is indicative of causal relationship

Two types of questions in bivariate or multivariate
populations
• Does there exist association or correlation between the two
(or more) variables? If yes, of what degree?
• Is there any cause and effect relationship between the two
variables? If yes, of what degree and in which direction?
The first question is answered by the use of correlation
technique and the second question by the technique of
regression

There are several methods of applying the two techniques,
but the important ones are as under:
In case of bivariate population, correlation can be studied
through
• Cross tabulation
• Charles Spearman’s coefficient of correlation
• Karl Pearson’s coefficient of correlation
whereas cause and effect relationship can be studied through
simple regression equations
In case of multivariate population, correlation can be
studied through
• Coefficient of multiple correlation
• Coefficient of partial correlation
whereas cause and effect relationship can be studied through
multiple regression
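
As an illustration of the two bivariate coefficients named above, the sketch below computes Karl Pearson's coefficient directly from deviations and Spearman's coefficient as Pearson's coefficient applied to ranks. The function names are assumptions, the rank helper assumes there are no tied values, and the seven (x, y) pairs are the ones used in this deck's straight-line example.

```python
from math import sqrt

def pearson(xs, ys):
    """Karl Pearson's coefficient of correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank 1 = smallest value; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5, 6, 7]
ys = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
print(round(pearson(xs, ys), 3), round(spearman(xs, ys), 3))
```

Neither coefficient says anything about cause and effect; both only measure the degree of association.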

Simple Regression Analysis
• Regression is the determination of a statistical relationship
• In simple regression, we have only two variables, one
variable (defined as independent) is the cause of the
behaviour of another one (defined as dependent variable)
• Regression can only interpret what exists physically i.e.,
there must be a physical way in which independent
variable X can affect dependent variable Y
The basic relationship between X and Y is given by
$$\hat{Y} = a + bX$$
where $\hat{Y}$ denotes the estimated value of Y for a given value of X

The generally used method to find the ‘best’ fit that a
straight line of this kind can give is the least-square method

Least Square Curve Fitting method
S. S. Sastry, “Introductory Methods of Numerical Analysis”, 2012, PHI Learning, N. Delhi

A sigmoid function is a mathematical function having a characteristic "S"-shaped curve,
or sigmoid curve. A common example of a sigmoid function is the logistic function,
defined by the formula
$$S(x) = \frac{1}{1 + e^{-x}}$$

Definition
• Curve fitting is the process of constructing a
curve, or mathematical function, that has the
best fit to a series of data points, possibly
subject to constraints.
• It is a statistical technique used to derive
coefficient values for equations that express
the value of one (dependent) variable as a
function of another (independent) variable
https://www2.slideshare.net/shopnohinami/curve-fitting-53775511?from_action=save

What is curve fitting
Curve fitting is the process of constructing a curve, or
mathematical function, that lies closest to a series of data
points. By curve fitting we can mathematically construct the
functional relationship between the observed data and the
parameter values. It is highly effective in the mathematical
modelling of some natural processes.
https://www2.slideshare.net/shopnohinami/curve-fitting-53775511?from_action=save

Interpolation & Curve fitting
• In many application areas, one is faced with the
task of describing data, often measured, with an
analytic function. There are two approaches to
this problem:
• 1. In interpolation, the data is assumed to be
correct and what is desired is some way to
describe what happens between the data
points.
• 2. The other approach is called curve fitting
or regression: one looks for some smooth
curve that "best fits" the data, but does not
necessarily pass through any data points.

Curve fitting
There are two general approaches for curve fitting:
• Least squares regression
Data exhibit a significant degree of scatter. The strategy is to
derive a single curve that represents the general trend of the
data
• Interpolation
Data is very precise. The strategy is to pass a curve or a
series of curves through each of the points.
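
The difference between the two approaches can be sketched as follows. The four data points are hypothetical, and the helper names `linear_interpolate` and `fit_line` are assumptions made for illustration; piecewise-linear interpolation is only one of several possible interpolation schemes.

```python
def linear_interpolate(xs, ys, x):
    """Piecewise-linear interpolation; xs must be sorted and x inside their range."""
    points = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the data range")

def fit_line(xs, ys):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return sy / n - slope * sx / n, slope

xs = [0, 1, 2, 3]
ys = [0.0, 1.2, 1.8, 3.1]  # hypothetical measurements with scatter
intercept, slope = fit_line(xs, ys)
# Interpolation reproduces the data point at x = 1; the regression line does not
print(linear_interpolate(xs, ys, 1), intercept + slope * 1)
```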

General approach for curve fitting

Engineering Applications of Curve Fitting Technique
In engineering, two types of applications are encountered:
• Trend analysis. Predicting values of the dependent variable;
may include extrapolation beyond data points or interpolation
between data points
• Hypothesis testing. Comparing an existing mathematical model
with measured data

Data scatter
[Figure: scatter plots; panel titles: Positive Correlation, Positive Correlation, No Correlation]

Mathematical Background
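• Variance. Representation of spread by the
square of the standard deviation.
$$S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n-1} = \frac{\sum y_i^2 - \left(\sum y_i\right)^2 / n}{n-1}$$
• Coefficient of variation. Has the utility to
quantify the spread of data.
$$\text{c.v.} = \frac{S_y}{\bar{y}} \times 100\%$$
• Mean: $\bar{y} = \frac{\sum y_i}{n}$
• S.D.: $S_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n-1}}$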

Linear Regression: Criteria for a “Best” Fit
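Three candidate criteria, with residual $e_i = y_i - a_0 - a_1 x_i$:
• Minimize the sum of the residuals (inadequate, because
positive and negative errors cancel, e.g. $e_1 = -e_2$):
$$\min \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)$$
• Minimize the sum of the absolute values of the residuals:
$$\min \sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i|$$
• Minimize the maximum residual (minimax):
$$\min \max_{1 \le i \le n} |e_i| = \max_{1 \le i \le n} |y_i - a_0 - a_1 x_i|$$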

Linear curve fitting (Straight line)?
Given a set of data points $(x_i, f(x_i))$, find a curve $g(x)$
that best captures the general trend
• $g(x)$ is the approximating function
• Try to fit a straight line through the data

Linear Regression: Least Squares Fit
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_{i,\text{measured}} - y_{i,\text{model}})^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
$$\min S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
• Yields a unique line for a given set of data.

Linear Regression: Least Squares Fit
$$\min S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
The coefficients a0 and a1 that minimize Sr must
satisfy the following conditions:
$$\frac{\partial S_r}{\partial a_0} = 0, \qquad \frac{\partial S_r}{\partial a_1} = 0$$

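Linear Regression: Determination of a0 and a1
$$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i) = 0$$
$$\frac{\partial S_r}{\partial a_1} = -2 \sum \left[ (y_i - a_0 - a_1 x_i)\, x_i \right] = 0$$
Expanding the sums:
$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$
$$0 = \sum y_i x_i - \sum a_0 x_i - \sum a_1 x_i^2$$
Since $\sum a_0 = n a_0$, these become
$$n a_0 + \left(\sum x_i\right) a_1 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$
2 equations with 2 unknowns, can be solved simultaneously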

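Linear Regression: Determination of a0 and a1
Solving the two equations simultaneously gives
$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$
$$a_0 = \bar{y} - a_1 \bar{x}$$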
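
The closed-form expressions for a1 and a0 translate almost directly into code. The sketch below is one straightforward way to write them; the function name `least_squares_line` is an assumption, and the sample data are the seven (x, y) pairs from this deck's straight-line example.

```python
def least_squares_line(xs, ys):
    """Return (a0, a1) for the least-squares line y = a0 + a1*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope from the solved normal equations, then intercept from the means
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a0 = sum_y / n - a1 * sum_x / n
    return a0, a1

xs = [1, 2, 3, 4, 5, 6, 7]
ys = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
a0, a1 = least_squares_line(xs, ys)
print(f"y = {a0:.7f} + {a1:.7f} x")
```

The denominator is zero only when all x values are equal, in which case no unique line exists.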

Error Quantification of Linear Regression
• Sum of the squares of residuals around
the regression line is Sr
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
• Total sum of the squares around the mean
for the dependent variable, y, is St
$$S_t = \sum (y_i - \bar{y})^2$$

Example
• The table below gives the temperature T in °C
and the resistance R in Ω of a circuit, where
R = a0 + a1 T
• Find the values of a0 and a1
T (°C)   10     20     30     40     50     60
R (Ω)    20.1   20.2   20.4   20.6   20.8   21.0

Solution
T = x_i   R = y_i   x_i² = T²   x_i y_i = TR   g(x_i)
10        20.1      100         201            20.05
20        20.2      400         404            20.24
30        20.4      900         612            20.42
40        20.6      1600        824            20.61
50        20.8      2500        1040           20.80
60        21.0      3600        1260           20.98
Σ = 210   123.1     9100        4341

Normal equations:
• 6a0 + 210a1 = 123.1
• 210a0 + 9100a1 = 4341
Solving gives a0 = 19.867 and a1 = 0.01857, so
• g(T) = 19.867 + 0.01857 T
Least Squares Fit of a Straight Line:
Example
• Fit a straight line to the x and y values in
the following table:
x_i    y_i    x_i y_i   x_i²
1      0.5    0.5       1
2      2.5    5         4
3      2      6         9
4      4      16        16
5      3.5    17.5      25
6      6      36        36
7      5.5    38.5      49
Σ = 28 24     119.5     140
$$\sum x_i = 28, \quad \sum y_i = 24.0, \quad \sum x_i^2 = 140, \quad \sum x_i y_i = 119.5$$
$$\bar{x} = \frac{28}{7} = 4, \qquad \bar{y} = \frac{24}{7} = 3.428571$$

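Example
$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{7 \times 119.5 - 28 \times 24}{7 \times 140 - 28^2} = 0.8392857$$
$$a_0 = \bar{y} - a_1 \bar{x} = 3.428571 - 0.8392857 \times 4 = 0.07142857$$
$$y = 0.07142857 + 0.8392857\, x$$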

Least Squares Fit of a Straight Line: Example
(Error Analysis)
$$S_r = \sum e_i^2 = 2.9911$$
$$S_t = \sum (y_i - \bar{y})^2 = 22.7143$$
$$r^2 = \frac{S_t - S_r}{S_t} = 0.868$$
$$r = \sqrt{r^2} = \sqrt{0.868} = 0.932$$

Example
(Error Analysis)
• The standard deviation (quantifies the
spread around the mean):
$$s_y = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{22.7143}{7-1}} = 1.9457$$
• The standard error of estimate (quantifies the
spread around the regression line):
$$s_{y/x} = \sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{2.9911}{7-2}} = 0.7735$$
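
The error measures for the seven-point example can be recomputed as a check. The sketch below refits the line and then evaluates Sr, St, r², the standard deviation and the standard error of estimate; variable names such as `s_yx` are assumptions chosen to mirror the symbols on the slides.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5, 6, 7]
ys = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
n = len(xs)

# Least-squares line y = a0 + a1*x
sum_x, sum_y = sum(xs), sum(ys)
a1 = (n * sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y) / (
    n * sum(x * x for x in xs) - sum_x ** 2)
a0 = sum_y / n - a1 * sum_x / n

y_mean = sum_y / n
S_t = sum((y - y_mean) ** 2 for y in ys)                       # spread around the mean
S_r = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))      # spread around the line
r2 = (S_t - S_r) / S_t
s_y = sqrt(S_t / (n - 1))
s_yx = sqrt(S_r / (n - 2))
print(round(S_r, 4), round(S_t, 4), round(r2, 3), round(s_y, 4), round(s_yx, 4))
```

Because s_yx is smaller than s_y, the line describes the data better than the mean alone does.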

Linearization of Nonlinear Relationships
• Linear regression assumes that the relationship between the
dependent and independent variables is linear.
• However, a few types of nonlinear functions
can be transformed into linear regression
problems:
  - The exponential equation.
  - The power equation.
  - The saturation-growth-rate equation.

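1. The exponential equation.
$$y = a_1 e^{b_1 x} \quad\Rightarrow\quad \ln y = \ln a_1 + b_1 x$$
This is a straight line in x with y* = ln y, intercept ln a_1 and slope b_1

2. The power equation
$$y = a_2 x^{b_2} \quad\Rightarrow\quad \log y = \log a_2 + b_2 \log x$$
y* = a0 + a1 x*, with y* = log y, x* = log x, a0 = log a_2, a1 = b_2

3. The saturation-growth-rate equation
$$y = a_3 \frac{x}{b_3 + x} \quad\Rightarrow\quad \frac{1}{y} = \frac{1}{a_3} + \frac{b_3}{a_3} \cdot \frac{1}{x}$$
y* = a0 + a1 x*, with y* = 1/y, x* = 1/x, a0 = 1/a_3, a1 = b_3/a_3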

Example
Fit the following equation:
$$y = a_2 x^{b_2}$$
to the data in the following table:
x_i   1     2     3     4     5
y_i   0.5   1.7   3.4   5.7   8.4
Taking logarithms:
$$\log y = \log\left(a_2 x^{b_2}\right) = \log a_2 + b_2 \log x$$
Let $Y^* = \log y$, $X^* = \log x$, $a_0 = \log a_2$, $a_1 = b_2$, so that
$$Y^* = a_0 + a_1 X^*$$

Example
X_i   Y_i     X*_i = log(X)   Y*_i = log(Y)   X*Y*     X*²
1     0.5     0.0000          -0.3010         0.0000   0.0000
2     1.7     0.3010          0.2304          0.0694   0.0906
3     3.4     0.4771          0.5315          0.2536   0.2276
4     5.7     0.6021          0.7559          0.4551   0.3625
5     8.4     0.6990          0.9243          0.6460   0.4886
Sum   19.700  2.079           2.141           1.424    1.169
$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{5 \times 1.424 - 2.079 \times 2.141}{5 \times 1.169 - 2.079^2} = 1.75$$
$$a_0 = \bar{y} - a_1 \bar{x} = 0.4282 - 1.75 \times 0.41584 = -0.300$$
(in these formulas x and y stand for the transformed values X* and Y*)

Linearization of Nonlinear Functions: Example
$$\log y = -0.300 + 1.75 \log x$$
$$y = 10^{-0.300}\, x^{1.75} = 0.5\, x^{1.75}$$
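
The power-equation example can be recomputed by transforming both variables with base-10 logarithms and fitting a straight line to the transformed values. This is a sketch of that procedure; the variable names are illustrative, and unrounded logarithms are used, so the coefficients can differ in the later decimals from a hand calculation with four-figure logs. With these data the slope comes out close to 1.75 and the intercept close to -0.300.

```python
from math import log10

xs = [1, 2, 3, 4, 5]
ys = [0.5, 1.7, 3.4, 5.7, 8.4]

# Transform: X* = log10(x), Y* = log10(y)
X = [log10(x) for x in xs]
Y = [log10(y) for y in ys]

n = len(X)
sum_X, sum_Y = sum(X), sum(Y)
sum_XY = sum(p * q for p, q in zip(X, Y))
sum_X2 = sum(p * p for p in X)

a1 = (n * sum_XY - sum_X * sum_Y) / (n * sum_X2 - sum_X ** 2)  # slope = exponent b2
a0 = sum_Y / n - a1 * sum_X / n                                # intercept = log10(a2)
coeff = 10 ** a0                                               # back-transform to a2
print(f"y = {coeff:.3f} * x^{a1:.3f}")
```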

Polynomial Regression
• Some engineering data is poorly represented by a straight line
• For these cases a curve is better suited to fit the data
• The least squares method can readily be extended to fit
the data to higher order polynomials

(cont’d)
A parabola is preferable

(cont’d)
• A 2nd order polynomial (quadratic) is
defined by:
$$y = a_0 + a_1 x + a_2 x^2 + e$$
• The residuals between the model and the data:
$$e_i = y_i - a_0 - a_1 x_i - a_2 x_i^2$$
• The sum of squares of the residual:
$$S_r = \sum e_i^2 = \sum \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2$$

(cont’d)
• A system of 3x3 equations needs to be solved to
determine the coefficients of the polynomial:
$$\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}$$
• The standard error & the coefficient of
determination:
$$s_{y/x} = \sqrt{\frac{S_r}{n-3}}, \qquad r^2 = \frac{S_t - S_r}{S_t}$$

(cont’d)
General:
The mth-order polynomial:
$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m + e$$
• A system of (m+1)x(m+1) linear equations must be
solved for
determining the coefficients of the mth-order polynomial.
• The standard error:
$$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}$$
• The coefficient of
determination:
$$r^2 = \frac{S_t - S_r}{S_t}$$

Polynomial Regression-
Example
Fit a second order polynomial to
data:
x_i    y_i     x_i²   x_i³   x_i⁴   x_i y_i   x_i² y_i
0      2.1     0      0      0      0         0
1      7.7     1      1      1      7.7       7.7
2      13.6    4      8      16     27.2      54.4
3      27.2    9      27     81     81.6      244.8
4      40.9    16     64     256    163.6     654.4
5      61.1    25     125    625    305.5     1527.5
Σ = 15 152.6   55     225    979    585.6     2488.8
$$\sum x_i = 15, \quad \sum y_i = 152.6, \quad \sum x_i^2 = 55, \quad \sum x_i^3 = 225, \quad \sum x_i^4 = 979$$
$$\sum x_i y_i = 585.6, \qquad \sum x_i^2 y_i = 2488.8$$
$$\bar{x} = \frac{15}{6} = 2.5, \qquad \bar{y} = \frac{152.6}{6} = 25.433$$

2nd order polynomial
Example
$$y = a_0 + a_1 x + a_2 x^2$$
x_i    f_i    x_i²   x_i³   x_i⁴   f_i x_i   f_i x_i²   g(x)
1      4      1      1      1      4         4          4.505
2      11     4      8      16     22        44         10.15
4      19     16     64     256    76        304        19.43
6      26     36     216    1296   156       936        26.03
8      30     64     512    4096   240       1920       29.95
Σ = 21 90     121    801    5665   498       3208

2nd order polynomial Example
5a0 + 21a1 + 121a2 = 90
21a0 + 121a1 + 801a2 = 498
121a0 + 801a1 + 5665a2 = 3208
a0 = -1.81, a1 = 6.65, a2 = -0.335
So the required equation
is g(x) = -1.81 + 6.65x - 0.335x²
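
The normal equations of a polynomial fit can be assembled and solved programmatically. The sketch below uses plain Gaussian elimination with partial pivoting, which is one possible solver and not necessarily how the slide values were obtained; the helper names `solve_linear` and `fit_polynomial` are assumptions. Solved without intermediate rounding, the coefficients come out at roughly -1.82, 6.65 and -0.335.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_polynomial(xs, ys, order):
    """Return [a0, a1, ..., a_order] from the least-squares normal equations."""
    power_sums = [sum(x ** k for x in xs) for k in range(2 * order + 1)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(order + 1)]
    A = [[power_sums[i + j] for j in range(order + 1)] for i in range(order + 1)]
    return solve_linear(A, rhs)

xs = [1, 2, 4, 6, 8]
fs = [4, 11, 19, 26, 30]
a0, a1, a2 = fit_polynomial(xs, fs, 2)
print(round(a0, 3), round(a1, 3), round(a2, 4))
```

Setting `order` to m gives the (m+1)x(m+1) system of the general case.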

Exponential function
x   1     2     3    4     5
y   1.5   4.5   6    8.5   11
Solution
$$y = a e^{bx}$$
$$\ln y = \ln\left(a e^{bx}\right) = \ln a + bx$$
$$Y = a_0 + a_1 X$$
where Y = ln y, a0 = ln a, a1 = b, X = x

X = x_i   y_i    Y = ln y   x_i²   x_i Y_i   g(x)
1         1.5    0.405      1      0.405     2.06
2         4.5    1.504      4      3.008     3.27
3         6      1.791      9      5.373     5.186
4         8.5    2.14       16     8.56      8.22
5         11     2.39       25     11.95     13.03
Σ = 15           8.23       55     29.296

Solution
• 5a0 + 15a1 = 8.23
• 15a0 + 55a1 = 29.296
• a0 = 0.2642; a1 = 0.4606
• a = e^{0.2642} = 1.3024, b = 0.4606
• Required equation: g(x) = 1.3024 e^{0.4606x}
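
The exponential fit can be reproduced by taking natural logarithms of y and fitting a straight line to (x, ln y). This is a sketch with illustrative variable names; because it uses unrounded logarithms, the coefficients differ slightly in the third decimal from the hand calculation above.

```python
from math import exp, log

xs = [1, 2, 3, 4, 5]
ys = [1.5, 4.5, 6, 8.5, 11]

Y = [log(y) for y in ys]  # linearize: ln y = ln a + b*x

n = len(xs)
sum_x, sum_Y = sum(xs), sum(Y)
sum_xY = sum(x * v for x, v in zip(xs, Y))
sum_x2 = sum(x * x for x in xs)

b = (n * sum_xY - sum_x * sum_Y) / (n * sum_x2 - sum_x ** 2)  # slope a1
ln_a = sum_Y / n - b * sum_x / n                              # intercept a0 = ln a
a = exp(ln_a)
print(f"g(x) = {a:.4f} * exp({b:.4f} x)")
```

Note that minimizing the squared error of ln y is not the same as minimizing the squared error of y itself, so the result is a least-squares fit in the transformed variables only.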

Example
• Power function:
x   2   2.5   3    3.5     4
y   7   8.5   11   12.75   15
Solution:
• y = a x^b
• ln y = ln a + b ln x
• Y = a0 + a1 X
• Where Y = ln y, a0 = ln a, X = ln x, a1 = b

Solution
x     y       ln x = X   ln y = Y   X²       XY       g(x)
2     7       0.6931     1.946      0.480    1.3487   6.868
2.5   8.5     0.9163     2.140      0.8396   1.9608   8.813
3     11      1.098      2.397      1.2056   2.6319   10.806
3.5   12.75   1.252      2.545      1.5675   3.1863   12.838
4     15      1.386      2.708      1.9209   3.7532   14.904
Σ             5.3454     11.736     6.0136   12.8809

Solution
• 5a0 + 5.3454a1 = 11.736
• 5.3454a0 + 6.0136a1 = 12.8809
• a0 = 1.1521; a1 = 1.1178
• a = e^{a0} = e^{1.1521} = 3.1648
• b = a1 = 1.1178
• Required equation: g(x) = 3.1648 x^{1.1178}

Polynomial Regression- Example
(cont’d)
x_i    y_i     y_model   e_i²      (y_i - ȳ)²
0      2.1     2.4786    0.14332   544.42889
1      7.7     6.6986    1.00286   314.45929
2      13.6    14.64     1.08158   140.01989
3      27.2    26.303    0.80491   3.12229
4      40.9    41.687    0.61951   239.22809
5      61.1    60.793    0.09439   1272.13489
Σ = 15 152.6             3.74657   2513.39333
• The standard error of estimate:
$$s_{y/x} = \sqrt{\frac{3.74657}{6-3}} = 1.12$$
• The coefficient of determination:
$$r^2 = \frac{2513.39 - 3.74657}{2513.39} = 0.99851, \qquad r = \sqrt{r^2} = 0.99925$$
