INTRODUCTION TO
BIOSTATISTICS
DR.S.Shaffi Ahamed
Asst. Professor
Dept. of Family and Comm. Medicine
KKUH
This session covers:
 Background and need to know
Biostatistics
 Origin and development of Biostatistics
 Definition of Statistics and Biostatistics
 Types of data
 Graphical representation of a data
 Frequency distribution of a data
 “Statistics is the science which deals
with collection, classification and
tabulation of numerical facts as the
basis for explanation, description
and comparison of phenomenon”.
------ Lovitt
“BIOSTATISTICS”
 (1) Statistics arising out of biological
sciences, particularly from the fields of
Medicine and public health.
 (2) The methods used in dealing with
statistics in the fields of medicine, biology
and public health for planning,
conducting and analyzing data which
arise in investigations of these branches.
Origin and development of
statistics in Medical Research
 In 1929, a huge paper on the application of
statistics was published in Physiology
Journal by Dunn.
 In 1937, 15 articles on statistical methods
by Austin Bradford Hill were published in
book form.
 In 1948, an RCT of streptomycin for
pulmonary TB was published, in which
Bradford Hill had a key influence.
 The use of statistics in medicine then
grew 8-fold between 1952 and 1982.
Douglas Altman, Ronald Fisher, Karl Pearson, C.R. Rao, Gauss
Basis
Sources of Medical
Uncertainties
1. Intrinsic due to biological,
environmental and sampling factors
2. Natural variation among methods,
observers, instruments etc.
3. Errors in measurement or assessment
or errors in knowledge
4. Incomplete knowledge
Intrinsic variation as a
source of medical
uncertainties
 Biological due to age, gender, heredity, parity, height,
weight, etc. Also due to variation in anatomical,
physiological and biochemical parameters
 Environmental due to nutrition, smoking, pollution,
facilities of water and sanitation, road traffic, legislation,
stress and strains etc.,
 Sampling fluctuations because the entire world cannot
be studied and at least future cases can never be
included
 Chance variation due to factors that are unknown or
too complex to comprehend
Natural variation despite
best care as a source of
uncertainties
 In assessment of any medical parameter
 Due to partial compliance by the patients
 Due to incomplete information in
conditions such as the patient in coma
Medical Errors that cause
Uncertainties
 Carelessness of the providers such as physicians,
surgeons, nursing staff, radiographers and
pharmacists.
 Errors in methods such as in using incorrect quantity or
quality of chemicals and reagents, misinterpretation of
ECG, using inappropriate diagnostic tools,
misrecording of information etc.
 Instrument error due to use of a non-standardized or
faulty instrument, or improper use of the right
instrument.
 Not collecting full information
 Inconsistent response by the patients or other subjects
under evaluation
Incomplete knowledge as a
source of Uncertainties
 Diagnostic, therapeutic and prognostic
uncertainties due to lack of knowledge
 Predictive uncertainties such as the
survival duration of a patient with cancer
 Other uncertainties such as how to
measure positive health
Biostatistics is the
science that helps in
managing medical
uncertainties
Reasons to know about
biostatistics:
 Medicine is becoming increasingly
quantitative.
 The planning, conduct and interpretation
of much of medical research are
becoming increasingly reliant on
statistical methodology.
 Statistics pervades the medical literature.
CLINICAL MEDICINE
 Documentation of medical history of
diseases.
 Planning and conduct of clinical studies.
 Evaluating the merits of different
procedures.
 In providing methods for definition of
“normal” and “abnormal”.
Role of Biostatistics in
patient care
 In increasing awareness regarding diagnostic,
therapeutic and prognostic uncertainties and
providing rules of probability to delineate those
uncertainties
 In providing methods to integrate chances with value
judgments that could be most beneficial to patient
 In providing methods such as sensitivity-specificity
and predictivities that help choose valid tests for
patient assessment
 In providing tools such as scoring system and expert
system that can help reduce epistemic uncertainties
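The sensitivity, specificity and predictive values mentioned above can be sketched from a 2x2 table of test result against true disease status. This is a minimal illustration: the function name `diagnostic_metrics` and the four counts are hypothetical and are not taken from the slides.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity and predictive values from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased subjects who test positive
        "specificity": tn / (tn + fp),  # healthy subjects who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

# Hypothetical counts, for illustration only
metrics = diagnostic_metrics(tp=90, fn=10, fp=20, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group tested.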
PREVENTIVE MEDICINE
 To provide the magnitude of any health
problem in the community.
 To find out the basic factors underlying
the ill-health.
 To evaluate the health programs that
were introduced in the community
(success/failure).
 To introduce and promote health
legislation.
Role of Biostatistics in Health
Planning and Evaluation
 In carrying out a valid and reliable health
situation analysis, including in proper
summarization and interpretation of data.
 In proper evaluation of the achievements
and failures of a health programme
Role of Biostatistics in
Medical Research
 In developing a research design that can
minimize the impact of uncertainties
 In assessing reliability and validity of
tools and instruments to collect the
information
 In proper analysis of data
Example: Evaluation of Penicillin (treatment
A) vs Penicillin & Chloramphenicol
(treatment B) for treating bacterial
pneumonia in children < 2 yrs.
 What is the sample size needed to demonstrate the significance
of one group against the other?
 Is treatment A better than treatment B, or vice versa?
 If so, how much better ?
 What is the normal variation in clinical measurement ? (mild,
moderate & severe) ?
 How reliable and valid is the measurement ? (clinical &
radiological) ?
 What is the magnitude and effect of laboratory and technical
error ?
 How does one interpret abnormal values ?
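The sample size question in the example can be sketched with the usual normal-approximation formula for comparing two proportions. Everything numeric here is an assumption made for illustration: the cure rates of 70% and 85%, the 5% two-sided significance level, the 80% power, and the helper name `n_per_group` do not come from the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_bar = (p1 + p2) / 2                          # pooled proportion under H0
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Hypothetical cure rates: 70% on treatment A, 85% on treatment B
n = n_per_group(0.70, 0.85)
print(f"About {n} children per treatment group")
```

A smaller expected difference between the treatments, or a higher required power, raises the number of children needed.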
WHAT DOES STATISTICS
COVER ?
Planning
Design
Execution (Data collection)
Data Processing
Data analysis
Presentation
Interpretation
Publication
BASIC CONCEPTS
Data: Set of values of one or more variables recorded
on one or more observational units
Categories of data
1. Primary data: observation, questionnaire, record form,
interviews, survey
2. Secondary data: census, medical record, registry
Sources of data
1. Routinely kept records
2. Surveys (census)
3. Experiments
4. External source
TYPES OF DATA
 QUALITATIVE DATA
 DISCRETE QUANTITATIVE
 CONTINUOUS QUANTITATIVE
QUALITATIVE
Nominal
Example: Sex ( M, F)
Exam result (P, F)
Blood Group (A,B, O or AB)
Color of Eyes (blue, green,
brown, black)
ORDINAL
Example:
Response to treatment
(poor, fair, good)
Severity of disease
(mild, moderate, severe)
Income status (low, middle,
high)
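The difference between nominal and ordinal data can be sketched in code: nominal categories are only labels, while ordinal categories carry a rank that can be compared and sorted. The class name `Severity` and the integer codes are assumptions for illustration; the codes express rank only, not distance between categories.

```python
from enum import IntEnum

# Nominal: labels with no inherent order, so a plain set is enough
blood_groups = {"A", "B", "O", "AB"}

# Ordinal: categories with a meaningful rank (codes are arbitrary rank labels)
class Severity(IntEnum):
    MILD = 1
    MODERATE = 2
    SEVERE = 3

patients = [Severity.SEVERE, Severity.MILD, Severity.MODERATE]
ranked = sorted(patients)  # sorting is meaningful for ordinal data
print([s.name for s in ranked])
print(Severity.MILD < Severity.SEVERE)
```

Sorting the blood groups would give only alphabetical order, which has no clinical meaning.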
QUANTITATIVE (DISCRETE)
Example: The no. of family members
The no. of heart beats
The no. of admissions in a day
QUANTITATIVE (CONTINUOUS)
Example: Height, Weight, Age, BP, Serum
Cholesterol and BMI
Discrete data -- Gaps between possible values
(e.g. Number of Children)
Continuous data -- Theoretically,
no gaps between possible values
(e.g. Hb)
CONTINUOUS DATA converted to QUALITATIVE DATA
wt. (in kg): underweight, normal & overweight
Ht. (in cm): short, medium & tall
Hospital length of stay   Number   Percent
1 - 3 days                  5891      43.3
4 - 7 days                  3489      25.6
2 weeks                     2449      18.0
3 weeks                      813       6.0
1 month                      417       3.1
More than 1 month            545       4.0
Total                      13604     100.0
Mean = 7.85 SE = 0.10
Table 1 Distribution of blunt injured patients
according to hospital length of stay
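The Percent column of Table 1 can be recomputed from the Number column, which also serves as a check on the total row. This is a minimal sketch; the variable names are illustrative only.

```python
# Counts copied from the Number column of Table 1
stay_counts = {
    "1 - 3 days": 5891,
    "4 - 7 days": 3489,
    "2 weeks": 2449,
    "3 weeks": 813,
    "1 month": 417,
    "More than 1 month": 545,
}

total = sum(stay_counts.values())  # re-adding the counts checks the total row
percents = {k: round(100 * v / total, 1) for k, v in stay_counts.items()}

for category, count in stay_counts.items():
    print(f"{category:<20}{count:>7}{percents[category]:>8.1f}")
print(f"{'Total':<20}{total:>7}")
```

The six counts add up to 13604, and the percentages computed on that total agree with the Percent column.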
Scale of measurement
Qualitative variable:
A categorical variable
Nominal (classificatory) scale
- gender, marital status, race
Ordinal (ranking) scale
- severity scale, good/better/best
Scale of measurement
Quantitative variable:
A numerical variable: discrete; continuous
Interval scale :
Data are placed in meaningful intervals and order. The units of
measurement are arbitrary.
- Temperature: equal differences are equal (37º C - 36º C = 38º C - 37º C), but there is
no implication of ratio (30º C is not twice as hot as 15º C)
Ratio scale:
Data is presented in frequency distribution in
logical order. A meaningful ratio exists.
- Age, weight, height, pulse rate
- pulse rate of 120 is twice as fast as 60
- person with weight of 80kg is twice as heavy
as the one with weight of 40 kg.
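The contrast between interval and ratio scales can be checked numerically. On the Celsius scale the zero point is arbitrary, so a ratio of readings changes when the unit changes; for weight the ratio is the same in any unit. The helper `celsius_to_kelvin` is introduced here only for illustration.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius reading to the absolute (Kelvin) scale."""
    return c + 273.15

# Interval scale: differences are meaningful, ratios are not
diff_equal = (37 - 36) == (38 - 37)
ratio_celsius = 30 / 15                                       # looks like "twice as hot"
ratio_kelvin = celsius_to_kelvin(30) / celsius_to_kelvin(15)  # about 1.05

# Ratio scale: the ratio survives a change of unit (kg to g)
ratio_kg = 80 / 40
ratio_g = 80_000 / 40_000

print(diff_equal, ratio_celsius, round(ratio_kelvin, 3), ratio_kg, ratio_g)
```

The Celsius ratio of 2 shrinks to about 1.05 on the absolute scale, while the weight ratio stays at 2 in both units.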
Scales of Measure
 Nominal – qualitative classification of
equal value: gender, race, color, city
 Ordinal - qualitative classification
which can be rank ordered:
socioeconomic status of families
 Interval - Numerical or quantitative
data: can be rank ordered and sizes
compared : temperature
 Ratio - Quantitative interval data along
with ratio: time, age.
CLINIMETRICS
Clinimetrics is a science in which
qualities are converted to meaningful
quantities by using a scoring system.
Examples: (1) Apgar score, based on
appearance, pulse, grimace, activity and
respiration, is used for neonatal prognosis.
(2) Smoking Index: no. of cigarettes, duration,
filter or not, whether pipe, cigar, etc.
(3) APACHE (Acute Physiology and Chronic
Health Evaluation) score: to quantify the
severity of the condition of a patient
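A scoring system of this kind can be sketched with the Apgar score, in which each of the five signs is rated 0, 1 or 2 and the ratings are added to give a total from 0 to 10. The function name `apgar_total` and the example ratings are hypothetical, for illustration only.

```python
APGAR_SIGNS = ("appearance", "pulse", "grimace", "activity", "respiration")

def apgar_total(scores):
    """Add the five Apgar component ratings (each 0, 1 or 2)."""
    for sign in APGAR_SIGNS:
        if sign not in scores:
            raise ValueError(f"missing rating for {sign}")
        if scores[sign] not in (0, 1, 2):
            raise ValueError(f"{sign} must be rated 0, 1 or 2")
    return sum(scores[sign] for sign in APGAR_SIGNS)

# Hypothetical ratings for one newborn
newborn = {"appearance": 1, "pulse": 2, "grimace": 2, "activity": 1, "respiration": 2}
print(apgar_total(newborn))
```

The qualitative clinical impressions become a single number that can be recorded, compared and analyzed.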
INVESTIGATION
Data Collection
Data Presentation
  Tabulation
  Diagrams
  Graphs
Descriptive Statistics
  Measures of Location
  Measures of Dispersion
  Measures of Skewness & Kurtosis
Inferential Statistics
  Estimation
    Point estimate
    Interval estimate
  Hypothesis Testing
    Univariate analysis
    Multivariate analysis
Frequency Distributions
 data distribution – pattern of
variability.
 the center of a distribution
 the ranges
 the shapes
 simple frequency distributions
 grouped frequency distributions
 midpoint
Tabulate the hemoglobin values of 30 adult
male patients listed below
Patient No   Hb (g/dl)   Patient No   Hb (g/dl)   Patient No   Hb (g/dl)
 1           12.0        11           11.2        21           14.9
 2           11.9        12           13.6        22           12.2
 3           11.5        13           10.8        23           12.2
 4           14.2        14           12.3        24           11.4
 5           12.3        15           12.3        25           10.7
 6           13.0        16           15.7        26           12.5
 7           10.5        17           12.6        27           11.8
 8           12.8        18            9.1        28           15.1
 9           13.2        19           12.9        29           13.4
10           11.2        20           14.6        30           13.1
Steps for making a
table
Step 1 Find the minimum (9.1) & maximum (15.7)
Step 2 Calculate the difference: 15.7 - 9.1 = 6.6
Step 3 Decide the number and width of
the classes (7 classes): 9.0 - 9.9, 10.0 - 10.9, ...
Step 4 Prepare a dummy table:
Hb (g/dl), Tally marks, No. patients
DUMMY TABLE
Hb (g/dl)      Tally marks   No. patients
9.0 - 9.9
10.0 - 10.9
11.0 - 11.9
12.0 - 12.9
13.0 - 13.9
14.0 - 14.9
15.0 - 15.9
Total

TALLY MARKS TABLE
Hb (g/dl)      Tally marks   No. patients
9.0 - 9.9      l             1
10.0 - 10.9    lll           3
11.0 - 11.9    llll l        6
12.0 - 12.9    llll llll     10
13.0 - 13.9    llll          5
14.0 - 14.9    lll           3
15.0 - 15.9    ll            2
Total                        30
(llll denotes a struck-through group of five.)
Hb (g/dl)      No. of patients
9.0 - 9.9       1
10.0 - 10.9     3
11.0 - 11.9     6
12.0 - 12.9    10
13.0 - 13.9     5
14.0 - 14.9     3
15.0 - 15.9     2
Total          30
Table: Frequency distribution of 30 adult male
patients by Hb
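The tabulation steps can be sketched in code: find the range, then count how many of the 30 Hb values fall in each 1 g/dl class. This is a minimal sketch; taking the whole-number part of each value as its class is an implementation choice made here, not something stated in the slides.

```python
from collections import Counter
from math import floor

# Hb values (g/dl) of the 30 adult male patients, in patient order
hb = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
      11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
      14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

lowest, highest = min(hb), max(hb)         # Step 1: minimum and maximum
value_range = round(highest - lowest, 1)   # Step 2: difference

# Steps 3 and 4: classes 9.0 - 9.9, ..., 15.0 - 15.9 keyed by their lower limit
counts = Counter(floor(x) for x in hb)

for lower in range(floor(lowest), floor(highest) + 1):
    print(f"{lower}.0 - {lower}.9   {counts[lower]:>2}")
print(f"Total         {sum(counts.values()):>2}")
```

The counts agree with the frequency distribution of the 30 patients tabulated above.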
Table: Frequency distribution of adult patients by
Hb and gender
Hb (g/dl)      Male   Female   Total
<9.0              0        2       2
9.0 - 9.9         1        3       4
10.0 - 10.9       3        5       8
11.0 - 11.9       6        8      14
12.0 - 12.9      10        6      16
13.0 - 13.9       5        4       9
14.0 - 14.9       3        2       5
15.0 - 15.9       2        0       2
Total            30       30      60
Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes
Number - Table number for identification in a report
Title - Describes the body of the table: variables, place and
time period (what, how classified, where and when)
Column heading - Variable name, No., percentages (%), etc.
Foot-note(s) - To describe some column/row headings,
special cells, source, etc.
Death rate (/1000 per annum)   No. of divisions
7.0 - 7.9                        4 (3.3)
8.0 - 8.9                       13 (10.8)
9.0 - 9.9                       20 (16.7)
10.0 - 10.9                     27 (22.5)
11.0 - 11.9                     18 (15.0)
12.0 - 12.9                     11 (9.2)
13.0 - 13.9                     11 (9.2)
14.0 - 14.9                      6 (5.0)
15.0 - 15.9                      2 (1.7)
16.0 - 16.9                      4 (3.3)
17.0 - 18.9                      3 (2.5)
19.0 +                           1 (0.8)
Total                          120 (100.0)
Table II. Distribution of 120 (Madras) Corporation divisions
according to annual death rate based on registered deaths in
1975 and 1976
Figures in parentheses indicate percentages
DIAGRAMS/GRAPHS
Discrete data
--- Bar charts (one or two groups)
Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot
Example data
68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12
Histogram
Figure 1 Histogram of ages of 60 subjects
[Figure: x-axis Age, ticks 11.5 to 71.5 in steps of 10; y-axis Frequency, 0 to 20]
Polygon
[Figure: x-axis Age, ticks 11.5 to 71.5 in steps of 10; y-axis Frequency, 0 to 20]
Stem and leaf plot
Stem-and-leaf of Age   N = 60
Leaf Unit = 1.0
   6   1   122269
  19   2   1223344555777788888
 (11)  3   00111226688
  13   4   2223334567999
   5   5   01127
   4   6   3458
   2   7   49
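A stem-and-leaf display like this one can be built by splitting each age into its tens digit (stem) and units digit (leaf). The sketch below uses the 60 example ages and prints the number of leaves in front of each stem; the variable names are illustrative only.

```python
from collections import defaultdict

# The 60 example ages
ages = [68, 63, 42, 27, 30, 36, 28, 32,
        79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31,
        28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31,
        30, 43, 49, 12]

stems = defaultdict(list)
for age in sorted(ages):
    stems[age // 10].append(age % 10)  # stem = tens digit, leaf = units digit

for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in stems[stem])
    print(f"{len(leaves):>3}   {stem}   {leaves}")
```

Unlike a histogram, the display keeps every original value while still showing the shape of the distribution.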
Box plot
[Figure: box plot of Age; axis from 10 to 80]
Descriptive statistics report:
Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
- the skew of the distribution:
positive skew: mean > median & high-score whisker is longer
negative skew: mean < median & low-score whisker is longer
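The quantities a boxplot reports can be computed directly for the 60 example ages. This is a sketch using Python's standard `statistics` module; quartile conventions differ between software packages, so the lower and upper quartiles may not match a given plot exactly.

```python
from statistics import mean, median, quantiles

# The 60 example ages
ages = [68, 63, 42, 27, 30, 36, 28, 32,
        79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31,
        28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31,
        30, 43, 49, 12]

q1, _, q3 = quantiles(ages, n=4)  # default method; other software may differ
summary = {
    "minimum": min(ages),
    "lower quartile": q1,
    "median": median(ages),
    "upper quartile": q3,
    "maximum": max(ages),
    "mean": mean(ages),
}

# Compare mean and median to judge the direction of skew
if summary["mean"] > summary["median"]:
    skew = "positive"
elif summary["mean"] < summary["median"]:
    skew = "negative"
else:
    skew = "none"

for name, value in summary.items():
    print(f"{name}: {value}")
print(f"skew: {skew}")
```

For these ages the mean is larger than the median, which points to positive skew.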
Pie Chart
[Figure: The prevalence of different degrees of hypertension in the population; categories Mild, Moderate, Severe; slices 10%, 20%, 70%]
•Circular diagram – total 100%
•Divided into segments, each
representing a category
•Decide adjacent category
•The size of each slice is
proportional to the amount for that category
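Because the whole circle stands for 100%, each slice's angle is its percentage share of 360 degrees. The sketch below uses the three percentages shown on the chart; which percentage belongs to which category is not recoverable from the slide, so the shares are left unlabeled.

```python
# Percentage shares shown on the pie chart (category pairing not assumed)
shares = [10, 20, 70]

# Each 1% of the total corresponds to 3.6 degrees of the circle
angles = [share * 360 / 100 for share in shares]

for share, angle in zip(shares, angles):
    print(f"{share}% -> {angle} degrees")
print(f"Total: {sum(angles)} degrees")
```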
Bar Graphs
[Figure: The distribution of risk factors among cases with cardiovascular diseases; x-axis Risk factor (Smo, Alc, Chol, DM, HTN, No Exer, F-H); y-axis Number, 0 to 25; bar labels 9, 12, 20, 16, 12, 8, 20]
The height of each bar indicates
frequency
Frequency is on the Y axis
and categories of the variable
on the X axis
The bars should be of equal
width and should not touch the
other bars
Bar chart
[Figure: HIV cases enrollment in USA by gender, 1986 to 1992; x-axis Year; y-axis Enrollment (hundred), 0 to 12; series Men, Women]
Stacked bar chart
[Figure: HIV cases enrollment in USA by gender, 1986 to 1992; x-axis Year; y-axis Enrollment (thousands), 0 to 18; series Women, Men]
Graphic Presentation of
Data
the histogram
(quantitative data)
the bar graph
(qualitative data)
the frequency polygon
(quantitative data)
General rules for designing
graphs
 A graph should have a self-explanatory
legend
 A graph should help the reader to understand
the data
 Axes should be labeled and units of measurement
indicated
 Scales are important. Start with zero (otherwise
show a // break)
 Avoid graphs with a three-dimensional
impression; they may be misleading (the reader
visualizes them less easily)
Any Questions
