BEST PRACTICES FOR
STATISTICS
Know what you know and what you don’t know
Have a comparison group
Use validated measures
Have a Data Entry Plan
Get to know your data
If it doesn’t fit, change it
Place your bets before you collect the data
Use the best methods of analysis for your question & your data
Go beyond the p-value
BEST PRACTICES
What is Statistics?
•Study of Data
•Collecting
•Organizing
•Summarizing
•Analyzing
•Presenting
•Storing &
Sharing
Why is it
Important?
•Make sense of the
data
•Explain what
happens and
(possibly) why
•Make sound
decisions
•To know how
close we are to
the truth.
Results
Bias?
Sampling
Error?
Invalid
Measures?
Random
Error?
Other
Factors?
PURPOSE OF STATISTICS
BEST PRACTICE:
KNOW WHAT YOU ALREADY KNOW,
WHAT YOU WANT TO KNOW AND
WHAT YOU DON’T KNOW
How do users differ when
(searching, finding, selecting)
(articles, books, Web sites)?
What are the effects of ___________On ____________?
Which is better at improving
_________?
How are people (finding, selecting, using) _______?
What are factors associated
with ___________?
STARTING WITH YOUR
RESEARCH QUESTION
KINDS OF VARIABLES
Independent
Subjects
Factors
Effects
of…
Dependent
Objects
Outcomes
Effects
on…
Nominal
•Counts by category
•No meaning between the categories (Blue is not better than
Red)
Ordinal
•Ranks
•Scales
•Space between ranks is subjective
Interval
•Integers
•No baseline
•Space between values is equal and objective, but discrete
Ratio
•Interval data with a baseline
•Space between is continuous
LEVELS OF MEASUREMENT (NOIR)
•Counts by Categories
•Ranks
•Scales
Qualitative
•Measurements
•Composite scores
•Simple Counts
Quantitative
ANOTHER WAY
LIKERT-TYPE SCALE?
Arbitrary
Few Levels
Individual
Questions
Ordinal?
Symmetrical
Many Levels
Composite
Score
Interval?
BEST PRACTICE:
HAVE A COMPARISON
GROUP
WAYS OF COMPARING…
Time Periods
Other Libraries
National Surveys
Patron Types
Material Types
•Qualitative
•Comparison
Expected ranks or ratios
•Quantitative
•Correlations
Two variables
•Quantitative or Qualitative
•Paired or Not Paired
Samples or Groups
KINDS OF COMPARISON
BEST PRACTICE:
USE A VALID MEASURE
Are you actually
measuring what you
are trying to measure?
VALIDITY OF MEASURES
USE A TOOL WITH ESTABLISHED VALIDITY
Approaches and Study
Skills Inventory for
Students (ASSIST)
User Engagement Scale (UES)
ESTABLISH VALIDITY OF MEASURES
•Consistency
Reliability
•Common sense
Content or
Face Validity
•Based on theory
Construct
Validity
•Comparison with other
valid measures
Criterion
Validity
BEST PRACTICE:
HAVE A DATA PLAN
GOAL OF DATA COLLECTION IN
STATISTICS
Reliability
Bias
BIAS
Systematic (not random) deviation from the true value
(Statistics.com)
Selection Bias
Measurement
• Observer Bias
• Non-response Bias
Analysis Bias
DATA INPUT
Have a data entry plan
Train the inputters
Use data validation tricks
Double-entry
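A minimal sketch of a double-entry check, assuming two people have keyed the same forms independently into two lists of rows. The field names and values are hypothetical.

```python
# Double-entry check: the same forms are keyed in twice, then the two
# copies are compared cell by cell. Field names and values are hypothetical.

def find_mismatches(entry_a, entry_b):
    """Return (row_number, field, value_a, value_b) for every cell that differs."""
    mismatches = []
    for row_number, (row_a, row_b) in enumerate(zip(entry_a, entry_b), start=1):
        for field in row_a:
            if row_a[field] != row_b.get(field):
                mismatches.append((row_number, field, row_a[field], row_b.get(field)))
    return mismatches

first_pass = [
    {"patron_type": "faculty", "visits": 12},
    {"patron_type": "student", "visits": 7},
]
second_pass = [
    {"patron_type": "faculty", "visits": 12},
    {"patron_type": "student", "visits": 1},  # keying error in the second pass
]

for mismatch in find_mismatches(first_pass, second_pass):
    print(mismatch)
```

Each reported mismatch is then resolved against the original paper or source record.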
BEST PRACTICE:
GET TO KNOW
YOUR DATA
Central
Tendency
Spread
Error
EXPLORATORY DATA ANALYSIS
• Average
• For Quantitative data
• Excel function: =Average(range)
Mean
• Middle
• For Quantitative or Rank data
• Excel function: =Median(range)
Median
• Most common
• Primarily for Qualitative data
• Excel function: =Mode(range)
Mode
MEASURES OF CENTRAL TENDENCY
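For readers working outside Excel, the same three measures can be sketched with Python's standard `statistics` module. The data below are hypothetical.

```python
import statistics

# Hypothetical data: years of service (quantitative) and patron type (qualitative).
years = [1, 2, 2, 3, 5, 8, 13]
patron_types = ["student", "faculty", "student", "staff", "student"]

mean_years = statistics.mean(years)          # counterpart of =AVERAGE(range)
median_years = statistics.median(years)      # counterpart of =MEDIAN(range)
common_type = statistics.mode(patron_types)  # most common value; accepts text categories

print(mean_years, median_years, common_type)
```

The mean is pulled upward by the one long-serving person, while the median is not, which is why the median suits skewed or ranked data.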
SPREAD &
DISTRIBUTION
DISTRIBUTION OR SPREAD OF QUALITATIVE
DATA
Tables
•Counts
•Percentages/Ratios
•Averages of Counts
Excel
•Pivot Tables
PIVOT TABLES IN EXCEL
Select Data
•Highlight table
•Insert->Pivot Table
Select
Variables
•Categories (Row Labels)
•Values
Change
Settings
•Percentage of Grand Total
•Average
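A rough Python counterpart of the pivot table steps above: count the rows in each category, then express each count as a percentage of the grand total. The visit data are hypothetical.

```python
from collections import Counter

# Hypothetical qualitative data: one patron type per recorded visit.
visits = ["student", "faculty", "student", "staff", "student", "faculty"]

counts = Counter(visits)            # counts per category (the row labels)
grand_total = sum(counts.values())
percentages = {category: 100 * n / grand_total for category, n in counts.items()}

# Print categories from most to least common with count and share of the total.
for category, n in counts.most_common():
    print(f"{category:<10}{n:>5}{percentages[category]:>8.1f}%")
```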
DEMONSTRATION OF PIVOT TABLES FOR
SPREAD OF QUALITATIVE DATA
GRAPH & CHART RULES OF THUMB
Trends
Connection
across the X-axis
Categorical
Comparisons
Grouped
Stacked
Relative
Stacked
Categorical
Few
Categories
Differences
are Wide
QUANTITATIVE DISTRIBUTIONS
Stem &
Leaf
Histogram
Distribution
graphs
John W. Tukey
Exploratory Data
Analysis
Examining your data
visually.
Stem & Leaf
Hinges
Box plots
Scatter plots, etc.
EXPLORATORY DATA ANALYSIS
STEM-AND-LEAF
Stem Leaf
0 01112222222222222233333344445556666677788899
1 0000000011122223333356778899
2 00122234444799
3 0245
First
digit(s)
Last
digit
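A minimal sketch of how a stem-and-leaf display is built, assuming non-negative whole numbers: the first digit(s) become the stem and the last digit becomes the leaf. The values are hypothetical, not the workshop's data.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by leading digit(s) (stem); the last digit is the leaf."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in sorted(rows.items())}

# Hypothetical years of service.
years = [0, 1, 1, 2, 5, 10, 10, 13, 17, 22, 24, 30]

for stem, leaves in stem_and_leaf(years).items():
    print(stem, leaves)
```

This sketch skips stems that have no values; a hand-drawn display would normally show them as empty rows.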
Years at UNT
0 5 13
1 6 13
1 6 13
1 6 13
2 6 15
2 6 16
2 7 17
2 7 17
2 7 18
2 8 18
2 8 19
3 11 29
4 11 29
4 12 30
4 12 32
4 12 34
5 12 35
5 13
FROM STEM-AND-LEAF TO HISTOGRAMS
Stem Leaf Count
0 1122223334445555666666677777899 31
1 000011122222222333346677889 27
2 0122234468 10
3 1112355888 11
4 12 2
Range Count
0-9 31
10-19 27
20-29 10
30-39 11
40-49 2
[Figure: Histogram of Years at UNT. X-axis: ranges 0-9, 10-19, 20-29, 30-39, 40-49; y-axis: count, 0 to 40.]
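The step from stems to histogram ranges can be sketched as below: each value is assigned to an equal-width range and the ranges are counted. The values are hypothetical, and the helper name `decade_counts` is made up for this illustration.

```python
from collections import Counter

def decade_counts(values, width=10):
    """Count values in equal-width ranges such as 0-9, 10-19, 20-29."""
    counts = Counter((value // width) * width for value in values)
    return {f"{low}-{low + width - 1}": counts[low] for low in sorted(counts)}

# Hypothetical years of service.
years = [0, 1, 1, 2, 5, 10, 10, 13, 17, 22, 24, 30]

# Text histogram: one '#' per value in the range.
for label, count in decade_counts(years).items():
    print(f"{label:>6} {'#' * count} {count}")
```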
HISTOGRAMS IN EXCEL
•Options
•Add-ins
•Manage Add-ins
Analysis
ToolPak
•Equal Size Ranges
•Ceiling (“more”)
Set ranges
•Data
•Data Analysis
•Histogram
Create
Histogram
•Insert Bar Chart
•Highlight histogram
•Select bars &
Format Selection
•Gap Width=0%
Create Graph
For
Histogram
Range ceilings: 9, 19, 29, 39, 49
DEMONSTRATION OF HISTOGRAM IN EXCEL
SPREAD OF QUANTITATIVE DATA
How variable is the data?
Range
Quantiles
Standard
Deviation
RANGE &
QUARTILES
Box plots
Median
Upper & lower
quartiles
Outliers
PRESENTATION OF
SPREAD
Measure of dispersion of data
Square root of the average squared
deviation from the mean
STANDARD DEVIATION
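The standard deviation can be worked through step by step: take each value's deviation from the mean, square it, average the squares, and take the square root. The sketch below uses hypothetical scores and shows both the sample version (divide by n - 1) and the population version (divide by n).

```python
import math
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]

# Sample SD divides by n - 1; population SD divides by n.
sample_sd = math.sqrt(sum(squared_deviations) / (len(scores) - 1))
population_sd = math.sqrt(sum(squared_deviations) / len(scores))

print(round(sample_sd, 3), population_sd)
```

The sample version is the one to use when the data are a sample drawn from a larger population, which is the usual case in survey work.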
Greater
variation, less
certainty
Lower variation,
more certainty
WHAT DOES THE SD TELL YOU?
•Min(range)
•Max(range)
Range
•PERCENTILE.INC(range, k), where k is between 0 and 1
•QUARTILE.INC(range, {0,1,2,3,4})
Quantiles
•STDEV.S(range)
Standard
Deviation
SPREAD IN EXCEL
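A sketch of the same measures of spread in Python's standard library, on hypothetical scores. The `method="inclusive"` option treats the minimum and maximum as the 0th and 100th percentiles, which should correspond to the `.INC` family of Excel functions.

```python
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = (min(scores), max(scores))  # counterpart of =MIN(range), =MAX(range)

# Three cut points that split the data into four quarters.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

sd = statistics.stdev(scores)            # sample SD, counterpart of =STDEV.S(range)

print(data_range, (q1, q2, q3), round(sd, 3))
```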
NORMAL DISTRIBUTION
SKEWED DISTRIBUTIONS
DEMONSTRATION OF DISTRIBUTIONS
Distribution of the
Population
The “Truth”
N is the # of samples
n is the number of items
in each sample
Watch the cumulative mean & median slowly
converge on the population values
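The demonstration can be approximated with a small simulation. This is a sketch under stated assumptions: the population is a made-up skewed distribution, and N and n are arbitrary choices, not the values used in the session.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical skewed population (the "truth").
population = [random.expovariate(1 / 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

N = 500  # number of samples
n = 30   # number of items in each sample

sample_means = []
for i in range(1, N + 1):
    sample = random.sample(population, n)
    sample_means.append(statistics.mean(sample))
    if i in (1, 10, 100, N):
        # Cumulative mean of the sample means so far, next to the population mean.
        print(i, round(statistics.mean(sample_means), 2), round(true_mean, 2))
```

With only a few samples the cumulative mean can sit well away from the population mean; as N grows it settles close to it.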
Transformation of
data
BEST PRACTICE:
IF IT DOESN’T FIT,
CHANGE IT
WHY TRANSFORM?
[Figures: histogram of Years at UNT (ranges 0-9, 10-19, 20-29, 30-39) and histogram of Log10(Years at UNT).]
Y = a + bx
Log(Y) = Log(a + bx)
1/Y = 1/(a + bx)
HOW TRANSFORMATION WORKS
Evaluate the
distribution of
raw data
Select a
transformation
method
Transform the
data
Normally
Distributed?
Statistically
Test
Transformed
Data
HOW TO BECOME NORMAL
Express the result in the terms
of the transformation
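A minimal sketch of the transform-then-report cycle, assuming a log10 transformation and hypothetical right-skewed data. One way to read "express the result in the terms of the transformation" is to state clearly that a back-transformed log mean is a geometric mean, not the ordinary average.

```python
import math
import statistics

# Hypothetical right-skewed data (all values must be > 0 for a log transform).
years = [1, 2, 2, 3, 4, 5, 8, 12, 20, 35]

log_years = [math.log10(y) for y in years]

mean_raw = statistics.mean(years)
mean_log = statistics.mean(log_years)

# Back-transform so the result is reported in the original units.
# 10 ** mean(log10(y)) is the geometric mean, not the arithmetic mean.
back_transformed = 10 ** mean_log

print(round(mean_raw, 2), round(back_transformed, 2))
```

Data that include zeros, such as people in their first year, cannot be logged directly; how to handle them is a separate decision that this sketch does not make.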
BEST PRACTICE:
PLACE YOUR BETS BEFORE
YOU START
INFERENTIAL STATISTICS
Tests of hypotheses
•Associations
•Expectations
Accounts for uncertainty
•Random error
•Confidence interval
Your
Hypothesis
(H1)
Null
Hypothesis
(H0)
HYPOTHESIS TESTING
EXAMPLE HYPOTHESIS
>=75%* <75%*
*…of journal articles cited by UNT PACS faculty in journal articles
published between 2008 and 2011.
UNT Libraries provides access to…
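One way such a hypothesis about a proportion could be tested is an exact one-sided binomial test against the 75% boundary. This is a sketch, not the analysis used in the session: the sample counts are hypothetical and the helper name `binomial_upper_tail` is made up for this illustration.

```python
from math import comb

def binomial_upper_tail(successes, trials, p_null):
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical sample: 100 cited articles checked, 83 available through the library.
cited = 100
available = 83

# Probability of seeing 83 or more available if the true access rate were exactly 75%.
p_value = binomial_upper_tail(available, cited, 0.75)
print(round(p_value, 4))
```

The significance level that this p-value is compared against should be chosen before the data are collected, in keeping with the best practice above.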
p
Sample Size
Central
Tendency
Spread
Distribution
Significance
Level
HYPOTHESIS TESTING
TESTING HYPOTHESES
BEST PRACTICE:
CHOOSE THE BEST
METHOD FOR YOUR
QUESTION AND DATA
Assumptions
Limitations
Appropriate data type
What the test tests
KNOW THE TESTS
Variable Type
What is being
compared
Independence
of units
Underlying
variance in the
population
Distribution Sample size
Number of
comparison
groups
FACTORS ASSOCIATED WITH CHOICE OF
STATISTICAL METHOD
USE A FLOW CHART
BEST PRACTICE:
GOING BEYOND THE P-VALUE
AND THE P-VALUE SAYS…
Much about the
distributions
More about the
H0 than H1
Little about size
of differences
MORE USEFUL STATISTICS
Effect Sizes
•Tell the real story
Confidence Intervals
•State your certainty
Correlations
•Cohen’s guidelines
for Pearson’s r
Differences from the
mean
•Standardized
•weighted against
the standard
deviation
•Cohen’s d
$d = \frac{\bar{x}_1 - \bar{x}_2}{s}$
EFFECT SIZES OF QUANTITATIVE DATA
Effect Size   r >
Small         .10
Medium        .30
Large         .50
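Cohen's d from this slide can be sketched as follows. The slide gives only s in the denominator; using the pooled standard deviation of the two groups is an assumption made here, and the scores are hypothetical.

```python
import math
import statistics

def cohens_d(group_1, group_2):
    """Difference between two means divided by the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    v1, v2 = statistics.variance(group_1), statistics.variance(group_2)
    # Pooled SD (an assumption; the slide does not say which s is meant).
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(group_1) - statistics.mean(group_2)) / pooled_sd

# Hypothetical test scores for a trained and an untrained group.
trained = [80, 85, 90]
untrained = [70, 75, 80]

print(cohens_d(trained, untrained))
```

Because d is expressed in standard deviation units, it can be compared across studies that used different scales.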
Based on
Contingency
table
• Odds of event A divided by odds of event
B
• Case-control studies
Odds ratio
• Uses probabilities rather than odds
• Experiments, RCTs
Relative risk
EFFECT SIZES OF QUALITATIVE DATA
Test A/B   Yes   No   Total
Yes         10   15      25
No          50   25      75
Totals      60   40     100
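Both effect sizes can be worked out from the table above. The sketch reads the rows as two groups and the columns as the outcome (yes or no); that reading is an assumption, since the slide does not label which is which.

```python
# 2x2 contingency table from the slide, read as rows = groups, columns = outcome.
# That reading is an assumption; the slide does not say which is which.
#            (outcome yes, outcome no)
group_yes = (10, 15)
group_no = (50, 25)

def odds(yes, no):
    """Odds: events divided by non-events."""
    return yes / no

def risk(yes, no):
    """Risk (probability): events divided by all cases."""
    return yes / (yes + no)

odds_ratio = odds(*group_yes) / odds(*group_no)     # (10/15) / (50/25)
relative_risk = risk(*group_yes) / risk(*group_no)  # (10/25) / (50/75)

print(round(odds_ratio, 2), round(relative_risk, 2))
```

The two measures differ noticeably here because the outcome is common; they only approximate each other when the outcome is rare.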
Point estimates
Intervals
Based on
Expressed as:
•Single value
•Mean
•Degree of uncertainty
•Range of certainty around the
point estimate
•Point estimate (e.g. mean)
•Confidence level (usually .95)
•Standard deviation
•The mean score of the students
who had the IL training was 83.5
with a 95% CI of 78.3 to 89.4.
CONFIDENCE INTERVALS
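A sketch of how a confidence interval for a mean is built from the three ingredients listed on this slide: the point estimate, the confidence level, and the standard deviation. It uses the normal approximation; for small samples a t-based interval would be somewhat wider. The scores are hypothetical, not the data behind the example above.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation CI for a mean: mean +/- z * (sd / sqrt(n))."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean - z * standard_error, mean, mean + z * standard_error

# Hypothetical test scores.
scores = [72, 78, 81, 83, 84, 85, 88, 90, 91, 95]

low, mean, high = mean_confidence_interval(scores)
print(f"mean {mean:.1f}, 95% CI {low:.1f} to {high:.1f}")
```

Raising the confidence level or shrinking the sample both widen the interval.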
Noise
Signal
STATISTICAL ANALYSIS
Know what you know and what you don’t know
Have a comparison group
Use validated measures
Have a Data Entry Plan
Get to know your data
If it doesn’t fit, change it
Place your bets before you collect the data
Use the best methods of analysis for your question & your data
Go beyond the p-value
BEST PRACTICES
RESOURCES
Rice Virtual Lab in
Statistics
Excel Tutorials for
Statistical Analysis
Khan Academy -
videos
Basic Research
Methods for
Librarians
Descriptive Statistical
Techniques for
Librarians

Statistics for Librarians, Session 4: Statistics best practices