WEIGHTING OF DATA
Robert Radics (a) (riradics@ncsu.edu), Sudipta Dasmohapatra (b) (sdasmoh@ncsu.edu), Steve Kelley (c) (sskelley@ncsu.edu)
(a) Graduate Research Assistant, (b) Associate Prof., (c) Department Head; Department of Forest Biomaterials, College of Natural Resources
Data Collection Method
Abstract
Data collected from consumer samples in the IBSS
project were adjusted (weighted) to support inferences
about the populations of the states of NC and TN. This
paper presents lessons learned while weighting the
data with multiple variables to account for differences
between the selected sample and the population.
Goal of Weighting and Raking
Battaglia, M., Hoaglin, D., & Frankel, M. (2013). Practical considerations in raking survey data.
http://magmods.wordpress.com/2011/03/23/magmods-questionnaire-3/
Weighting with More Variables; Raking
Basic Algorithm
Limitations
Weighting with One Variable: Gender
Weighting with Two Variables: Gender, Ethnicity
• Survey instrument
• Sampling: randomly selected consumer email addresses from a third-party consumer database
• Data collection: Fall 2013 in NC and TN
• Pilot test: 34 consumers
• Cover letter
• Completed surveys:
  • 586 in total
  • 376 in NC and 210 in TN
• Response rate = 2%
Sample and Population Demography Data

                          NC respondents %   NC Census %   TN respondents %   TN Census %
n (count)                 376                9,848,000     209                6,496,000
Gender
  Male                    54.0               48.7          45.5               48.8
  Female                  46.0               51.3          54.5               51.2
Education
  College 4 or 4+         66.7               26.8          31.0               23.5
Ethnicity
  White/Caucasian         79.0               71.9          88.5               79
  Black/African-American  10.1               22.0          6.7                17
Age
  18-24                   9.6                10.0          10.3               2.1
  25-44                   26.5               43.1          26.8               26.6
  45-64                   26.9               24.9          26.2               52.1
  65+                     13.3               1.0           12.4               17.8
The sample does not have the same demographic
proportions as the population.
Weighting and raking bring the sample closer to
the population by adjusting the sampling weights
of the cases. At the end of the process, the marginal
totals of the adjusted weights on each characteristic
equal the population totals on the same
characteristic.
            NC sample %   NC Census %   Weight (Census % / Sample %)
n (count)   376           9,848,000
Gender
  Male      54.0          48.7          0.90
  Female    46.0          51.3          1.12

• All male respondents get a weight of 0.90 in statistical analyses.
• All female respondents get a weight of 1.12 in statistical analyses.
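The one-variable weights can be reproduced with a few lines of Python. This is a minimal sketch that uses only the NC gender percentages from the table; the function name `single_variable_weights` is an illustrative assumption, not something from the poster.

```python
# Percentages taken from the NC gender rows of the table.
sample_pct = {"Male": 54.0, "Female": 46.0}
census_pct = {"Male": 48.7, "Female": 51.3}


def single_variable_weights(sample_pct, census_pct):
    """Return weight = census % / sample % for each category."""
    return {cat: census_pct[cat] / sample_pct[cat] for cat in sample_pct}


weights = single_variable_weights(sample_pct, census_pct)

# Applying the weight to each category's sample share recovers the census share.
weighted_pct = {cat: sample_pct[cat] * weights[cat] for cat in sample_pct}

for cat in sample_pct:
    print(f"{cat}: weight {weights[cat]:.2f}, weighted share {weighted_pct[cat]:.1f}%")
```

With a single weighting variable the weighted shares match the census exactly, which is why no iteration is needed in this case.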
                          NC sample %   NC Census %   Weight (Census % / Sample %)
n (count)                 376           9,848,000
Gender
  Male                    54.0          48.7          0.90
  Female                  46.0          51.3          1.12
Ethnicity
  White/Caucasian         79.0          71.9          0.91
  Black/African-American  10.1          22.0          2.18
  Others                  10.9          6.1           0.56

Each respondent gets two weights, one for gender and one for
ethnicity, and the two are multiplied.
Issue: after this multiplication, the weighted gender
proportions no longer match the census.
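The issue can be illustrated with a small sketch. The poster reports only the marginal percentages, so the gender-by-ethnicity cell counts below are a hypothetical cross-tabulation (per 1,000 respondents) chosen to reproduce the NC sample margins; the weights are the Census % / Sample % ratios from the table.

```python
# HYPOTHETICAL cross-tabulation per 1,000 respondents (not from the poster).
# Its margins match the NC sample: 54/46 % gender, 79/10.1/10.9 % ethnicity.
cells = {
    ("Male", "White"): 450, ("Male", "Black"): 40, ("Male", "Other"): 50,
    ("Female", "White"): 340, ("Female", "Black"): 61, ("Female", "Other"): 59,
}

# Single-variable weights: census % / sample %.
gender_w = {"Male": 48.7 / 54.0, "Female": 51.3 / 46.0}
ethnicity_w = {"White": 71.9 / 79.0, "Black": 22.0 / 10.1, "Other": 6.1 / 10.9}

# Multiply the two weights for every cell, as in the two-variable approach.
weighted = {(g, e): n * gender_w[g] * ethnicity_w[e] for (g, e), n in cells.items()}

total = sum(weighted.values())
male_share = 100 * sum(v for (g, _), v in weighted.items() if g == "Male") / total

print(f"Weighted male share: {male_share:.1f}% (census: 48.7%)")
```

In this hypothetical table gender and ethnicity are associated, so the product of the two weights overshoots for some cells and the weighted male share drifts away from the census value. If the two variables were independent in the sample, the product would reproduce both margins.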
Raking is a method of iterative
proportional fitting.
Raking adjusts a set of data so that its
marginal totals match specified control totals
on a specified set of variables.
The name is an analogy to leveling
the soil in a garden by alternately working
with a rake in two perpendicular directions.
• Lack of convergence, or slow convergence.
• Large weights (> 30): a few respondents
represent a large proportion of the
population.
• Small weights (< 0.01): a large proportion of the
sample represents a small proportion of the
population.
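A simple check for the weight limits listed above might look like the following sketch. The thresholds 30 and 0.01 come from the list; the function name `flag_extreme_weights` and the example weights are hypothetical.

```python
def flag_extreme_weights(weights, lower=0.01, upper=30.0):
    """Return the positions of weights below `lower` and above `upper`."""
    too_small = [i for i, w in enumerate(weights) if w < lower]
    too_large = [i for i, w in enumerate(weights) if w > upper]
    return too_small, too_large


# Hypothetical final weights for five respondents (illustration only).
example_weights = [0.90, 1.12, 2.18, 0.005, 42.0]

too_small, too_large = flag_extreme_weights(example_weights)
print("Below 0.01:", too_small)
print("Above 30:", too_large)
```

Flagged cases are candidates for review, for example by collapsing sparse categories before raking again; that remedy is a common practice and is not described on the poster.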
The basic raking algorithm can be described in terms of the individual weights, $w_i$, $i = 1, 2, \ldots, n$. For an
unweighted (i.e., equally weighted) sample, one can simply take the initial weights to be $w_i = 1$
for each $i$. In a cross-classification that has $J$ rows and $K$ columns, we denote the sum of the $w_i$
in cell $(j,k)$ by $w_{jk}$. To indicate further summation, we replace a subscript by a + sign. Thus, the
initial row totals and column totals of the sample weights are $w_{j+}$ and $w_{+k}$, respectively.
Analogously, we denote the corresponding population control totals by $T_{j+}$ and $T_{+k}$.
We write $m_{jk}^{(1)}$ for the sum of the modified weights in cell $(j,k)$ at the end of step 1. If we begin by matching
the control totals for the rows, $T_{j+}$, the initial steps of the algorithm are

$$m_{jk}^{(0)} = w_{jk} \quad (j = 1, \ldots, J;\; k = 1, \ldots, K)$$
$$m_{jk}^{(1)} = m_{jk}^{(0)} \left( T_{j+} / m_{j+}^{(0)} \right)$$
$$m_{jk}^{(2)} = m_{jk}^{(1)} \left( T_{+k} / m_{+k}^{(1)} \right)$$

The adjustment factors, $T_{j+} / m_{j+}^{(0)}$ and $T_{+k} / m_{+k}^{(1)}$, are actually applied to the individual weights, which we could denote by $m_i^{(2)}$, for example.
In the iterative process, an iteration rakes both rows and columns. For iteration $s$ ($s = 0, 1, \ldots$) we
may write

$$m_{jk}^{(2s+1)} = m_{jk}^{(2s)} \left( T_{j+} / m_{j+}^{(2s)} \right)$$
$$m_{jk}^{(2s+2)} = m_{jk}^{(2s+1)} \left( T_{+k} / m_{+k}^{(2s+1)} \right)$$

Raking can also adjust a set of data to control totals on three or more variables.
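The two-variable iteration described above can be sketched in plain Python. This is an illustrative implementation, not the SPSS procedure used in the project. The cell counts are a hypothetical gender-by-ethnicity table per 1,000 respondents whose margins match the NC sample percentages, and the control totals are the NC census percentages scaled to the same base. The function name `rake`, the tolerance, and the iteration cap are assumptions.

```python
def rake(cells, row_targets, col_targets, tol=1e-8, max_iter=100):
    """Rake a J x K table of cell weight sums to row and column control totals.

    Assumes every row and column total is positive and that the two sets of
    control totals add up to the same grand total.
    """
    m = [row[:] for row in cells]  # m^(0) = w
    for iteration in range(1, max_iter + 1):
        # Odd step: scale each row j by T_j+ / m_j+.
        for j, target in enumerate(row_targets):
            row_total = sum(m[j])
            m[j] = [value * target / row_total for value in m[j]]
        # Even step: scale each column k by T_+k / m_+k.
        for k, target in enumerate(col_targets):
            col_total = sum(row[k] for row in m)
            for row in m:
                row[k] *= target / col_total
        # Columns now match exactly; stop once the rows match as well.
        max_gap = max(abs(sum(m[j]) - t) for j, t in enumerate(row_targets))
        if max_gap < tol:
            return m, iteration
    raise RuntimeError("raking did not converge within max_iter iterations")


# HYPOTHETICAL sample cells (rows: Male, Female; columns: White, Black, Other).
cells = [[450.0, 40.0, 50.0],
         [340.0, 61.0, 59.0]]
gender_totals = [487.0, 513.0]           # NC census gender %, per 1,000
ethnicity_totals = [719.0, 220.0, 61.0]  # NC census ethnicity %, per 1,000

raked, n_iter = rake(cells, gender_totals, ethnicity_totals)

# Each respondent in cell (j, k) receives the weight raked[j][k] / cells[j][k].
cell_weights = [[raked[j][k] / cells[j][k] for k in range(3)] for j in range(2)]
print(f"Converged after {n_iter} iterations")
```

Unlike the product of two single-variable weights, the raked table reproduces both the gender and the ethnicity control totals at convergence. The `RuntimeError` branch corresponds to the lack-of-convergence limitation.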
