Statistical Methods.
Why Statistics. Statistics is used to take the analysis of data one stage beyond what can be achieved with maps and diagrams.  You can gain a primitive insight into patterns at a glance but mathematical manipulation usually gives greater precision. This allows us to discover things which might otherwise go unnoticed.
The need for justification. Justifying mathematical manipulation is vital. Statistics is an aid to analysis and no more. Too often students make statistical calculations in geographical projects without adequate justification. Before using statistics it is essential to ask yourself two questions.
Question 1. Why am I using this technique? In the exam, be absolutely clear about what a statistical test can show and how it shows it.
Question 2. Is the data appropriate to this particular technique? Each technique requires data to be arranged in a particular form. If it is not, the technique cannot be used. If your data is poor in the first place, a complex statistical technique will not help you: “Rubbish in, rubbish out.”
Mean, Mode, Median. These are used when you are faced with a large amount of data, for example the average temperature of a place every day for two years. Such data is far easier to handle when we can summarise it. This is relatively easy to do, and there are three common methods.
1- The Mean. What most people call the average is the mean. You find it by adding all the numbers together and then dividing by the total number of data values. The mean is shown by the symbol $\bar{x}$. The mean is distorted by even one extreme value, which can be a problem. However, it is the most commonly used measure because it can be used for further mathematical processing.
Find the mean of these data values: 3, 4, 4, 4, 6, 6, 9. The values add up to 36 and there are 7 of them, so $\bar{x} = 36 / 7 \approx 5.1$.
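The same calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slide. The extra value of 100 is made up here to show how one extreme value distorts the mean, and the variable names are assumptions rather than part of the original.

```python
from statistics import mean

values = [3, 4, 4, 4, 6, 6, 9]

total = sum(values)        # add all the numbers together
count = len(values)        # total number of data values
x_bar = total / count      # the mean

print(total, count, round(x_bar, 1))   # 36 7 5.1

# The library function gives the same answer as the hand calculation.
assert abs(mean(values) - x_bar) < 1e-12

# Hypothetical extreme value (not in the original data) to show the distortion.
with_extreme = values + [100]
print(mean(with_extreme))              # 17
```

Adding a single value of 100 moves the mean from about 5.1 to 17, even though seven of the eight values are below 10.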
2- The Mode. The mode is simply the most frequently occurring event. If we are using simple numbers then the mode is the most frequently occurring number. If we are looking at data on the nominal scale (grouped into categories) the mode is the most common category. The mode is very quick to calculate, but it cannot be used for further mathematical processing. It is not affected by extreme values.
Find the mode of this data set. 3, 4, 4, 4, 6, 9. Mode (most frequently occurring number)= 4
Find the mode of this nominal data.

Land Use      Hectares
Clover        10
Rye           12
Vegetables    15
Fruit         3
Wheat         29
Barley        18
Pasture       17

Mode (most frequently occurring category) = wheat.
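Both mode examples can be checked with a short Python sketch. It assumes that each land use in the table is paired with the hectares figure listed beside it. The variable names are illustrative only.

```python
from statistics import mode

# Simple numbers: the mode is the most frequently occurring number.
numbers = [3, 4, 4, 4, 6, 9]
number_mode = mode(numbers)
print(number_mode)        # 4

# Nominal data: the mode is the category with the largest count (here, hectares).
hectares_by_land_use = {
    "Clover": 10,
    "Rye": 12,
    "Vegetables": 15,
    "Fruit": 3,
    "Wheat": 29,
    "Barley": 18,
    "Pasture": 17,
}
land_use_mode = max(hectares_by_land_use, key=hectares_by_land_use.get)
print(land_use_mode)      # Wheat
```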
3- The Median. The median is the central value in a series of ranked values. If there is an even number of values, the median is the midpoint between the two centrally placed values. The median is not affected by extreme values but it cannot be used for further mathematical processing.
Find the median of this data set. 3, 4, 4, 4, 6, 9. Median (central value)= 4.
Now find the median of this data set. 3, 4, 4, 6, 6, 9. Median (central value)= 5
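The rule for the median can be written as a small Python function. This is a sketch of the procedure described above, and the function name `slide_median` is made up for this example.

```python
from statistics import median

def slide_median(values):
    """Central value of the ranked data; midpoint of the two central values if the count is even."""
    ranked = sorted(values)
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]                          # odd count: single central value
    return (ranked[middle - 1] + ranked[middle]) / 2   # even count: midpoint of the two central values

print(slide_median([3, 4, 4, 4, 6, 9]))   # 4.0
print(slide_median([3, 4, 4, 6, 6, 9]))   # 5.0

# The standard library agrees with the hand method.
assert slide_median([3, 4, 4, 6, 6, 9]) == median([3, 4, 4, 6, 6, 9])
```

Both data sets contain six values, so in each case the median is the midpoint between the third and fourth ranked values.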
Spread around the median and mean. The median, mean and mode all give us a summary value for a set of data. On their own, however, they give us no idea of the spread of data around the summary value, which can be misleading.  For example…
I collected the following rainfall data.

Year    Rainfall (mm)
1990    0
1991    0
1992    3
1993    0
1994    97

The mean for this data is 20mm. But that gives an untrue picture of what really happened. There is a great “deviation about the mean”. Deviation can be measured statistically as follows.
Spread around the median: the interquartile range.  The Interquartile range is a measure of the spread of the values around their median.  The greater the spread the higher the interquartile range.
Method.
Stage 1- Place the values in rank order, smallest to largest.
Stage 2- Find the upper quartile. This is found by taking the 25% highest values and finding the mid-point between the lowest of these and the next lowest value.
Stage 3- Find the lower quartile. This is obtained by taking the 25% lowest values and finding the mid-point between the highest of these and the next highest value.
Stage 4- Find the difference between the upper and lower quartiles. This is the interquartile range, a crude index of the spread of the values around the median. The higher the range the greater the spread.
Over to you. Copy out the data on the next slide. Then find the interquartile range, remembering to follow all four stages.
Month        Average temperature
January      4
February     5
March        7
April        9
May          12
June         15
July         17
August       17
September    15
October      11
November     7
December     5
Answer. Ranked, the data looks like this: 4  5  5  7  7  9  11  12  15  15  17  17. Lower quartile = 6. Median = 10. Upper quartile = 15. Interquartile range: (15 - 6) = 9.
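The four stages can be sketched in Python as follows. The function name is made up for this example, and the sketch assumes the number of values is a multiple of four so that "25% of the values" is a whole number. Library routines often use other quartile conventions and may give slightly different quartiles for the same data.

```python
def slide_interquartile_range(values):
    """Lower quartile, upper quartile and interquartile range, using the four stages above."""
    ranked = sorted(values)                  # Stage 1: rank order, smallest to largest
    n = len(ranked)
    if n < 4 or n % 4 != 0:
        # Assumption of this sketch: 25% of the values is a whole number.
        raise ValueError("this sketch expects a multiple of four values")
    k = n // 4                               # how many values make up 25%

    # Stage 2: mid-point between the lowest of the top 25% and the next lowest value
    upper_quartile = (ranked[n - k] + ranked[n - k - 1]) / 2
    # Stage 3: mid-point between the highest of the bottom 25% and the next highest value
    lower_quartile = (ranked[k - 1] + ranked[k]) / 2
    # Stage 4: difference between the upper and lower quartiles
    return lower_quartile, upper_quartile, upper_quartile - lower_quartile

# Average temperatures, January to December
monthly_temperatures = [4, 5, 7, 9, 12, 15, 17, 17, 15, 11, 7, 5]
lower, upper, iqr = slide_interquartile_range(monthly_temperatures)
print(lower, upper, iqr)   # 6.0 15.0 9.0
```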
Spread about the mean: Standard deviation.  If we want to obtain some measure of the spread of our data about its mean we calculate its standard deviation. Two sets of figures can have the same mean but very different standard deviations.
Method.
Stage 1- Tabulate the values ($x$) and their squares ($x^2$). Add these values ($\sum x$ and $\sum x^2$).
Stage 2- Find the mean of all the values of $x$ ($\bar{x}$) and square it ($\bar{x}^2$).
Stage 3- Calculate the formula $\sigma = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$
$\sigma$ = standard deviation. $\sqrt{\ }$ = the square root of. $\sum$ = the sum of. $n$ = the number of values. $\bar{x}$ = the mean of the values.
Over to you. Number of vehicles passing a traffic count point.  Calculate the standard deviation of the following data.
Day    Number of vehicles
1      50
2      75
3      80
4      92
5      60
6      70
7      63
8      42
9      75
10     82
Answer.

x      x²
50     2 500
75     5 625
80     6 400
92     8 464
60     3 600
70     4 900
63     3 969
42     1 764
75     5 625
82     6 724
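Answer. $\sum x = 689$ and $\sum x^2 = 49\,571$. $\bar{x} = 689 / 10 = 68.9$, so $\bar{x}^2 = (68.9)^2 = 4747.2$. $\sigma = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\frac{49\,571}{10} - 4747.2} = \sqrt{209.9} \approx 14.5$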
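The traffic count answer can be checked with a short Python sketch that follows the same three stages. The variable names are illustrative. The formula divides by n, which corresponds to `statistics.pstdev` (the population standard deviation) rather than `statistics.stdev`, which divides by n - 1.

```python
from math import sqrt
from statistics import pstdev

# Number of vehicles on days 1 to 10
vehicles = [50, 75, 80, 92, 60, 70, 63, 42, 75, 82]

# Stage 1: sum the values and their squares
n = len(vehicles)
sum_x = sum(vehicles)
sum_x2 = sum(x * x for x in vehicles)

# Stage 2: find the mean and square it
mean_x = sum_x / n
mean_x_squared = mean_x ** 2

# Stage 3: apply the formula
sigma = sqrt(sum_x2 / n - mean_x_squared)

print(sum_x, sum_x2)       # 689 49571
print(round(sigma, 1))     # 14.5

# The library's population standard deviation matches the hand calculation.
assert abs(sigma - pstdev(vehicles)) < 1e-9
```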
Phew!!!!!! The higher the standard deviation, the greater the spread of data around the mean. The standard deviation is the best of the measures of spread as it takes into account all of the values under consideration.
Homework. Research the following tests of significance to find out their meaning. The Mann-Whitney U test. The Chi-Squared ($\chi^2$) test.
