POP77034
Experimental Methods
for Social Scientists
Dr Noah Buckley
Trinity College Dublin
HT2023
1
The plan for today
• Quick causal inference recap
• Experimental design
• Ethics
• Next week: more experimental design
2
The Fundamental Problem of Causal Inference
• It is impossible to observe any unit we’re interested in (e.g., person, country, firm, school) both when it has and has not been changed by a causal action
• Only in physics and chemistry are units (particles, molecules)
interchangeable (“exchangeable”) enough that we don’t have to worry
about this
• If I give 100 euros to Mary and she gets happier than she was before, we
fundamentally cannot know how happy she would have been if I had not
given her 100 euros
• We can use theory, intuition, anecdote, data to come up with a (very) good
guess
• But we can never be sure
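The logic of the slide can be sketched in code. This is a hypothetical simulation (not from the slides): every unit has two potential outcomes, we only ever observe one of them, yet randomization lets the difference in group means estimate the average effect:

```python
import random

random.seed(42)

# Each unit has two potential outcomes: happiness without the 100 euros (y0)
# and with it (y1). The individual effect y1 - y0 is never observable.
N = 10_000
units = [{"y0": random.gauss(5, 1)} for _ in range(N)]
for u in units:
    u["y1"] = u["y0"] + 0.3  # true individual effect (+0.3), unknowable per unit

# Randomly assign half to treatment; observe only one potential outcome each
treated = set(random.sample(range(N), N // 2))
obs = [(i in treated, units[i]["y1"] if i in treated else units[i]["y0"])
       for i in range(N)]

# Randomization makes the two groups comparable, so the difference in
# observed group means estimates the *average* treatment effect.
mean_t = sum(y for t, y in obs if t) / (N // 2)
mean_c = sum(y for t, y in obs if not t) / (N // 2)
print(round(mean_t - mean_c, 2))  # close to the true average effect of 0.3
```

We never see how happy treated Mary would have been without the money, but averaging over a randomized group sidesteps the fundamental problem.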
Recapping causal inference
https://egap.org/resource/10-things-to-know-about-causal-inference/
Recapping causal inference
That’s experiments in theory, what about in practice?
Real-world implementation of experiments is difficult!
• When you assign a unit (e.g. person) to treatment, they may not actually take
that treatment
• You give them a drug but they don’t take it
• You send them a YouTube video to watch but they don’t watch it, or they
mute it and don’t pay attention
• Same for the control group
• They may go out and find the drug themselves, or stumble on the YouTube video
That’s experiments in theory, what about in practice?
• “Compliers are subjects who will take the treatment if and only if they were
assigned to the treatment group…
• Non-compliers are composed of the three remaining subgroups:
• Always-takers are subjects who will always take the treatment even if they
were assigned to the control group
• Never-takers are subjects who will never take the treatment even if they
were assigned to the treatment group
• Defiers are subjects who will do the opposite of their treatment assignment status” https://en.wikipedia.org/wiki/Local_average_treatment_effect
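These compliance types matter for estimation. A hypothetical simulation (not from the slides) showing the standard intent-to-treat and Wald/LATE logic under noncompliance:

```python
import random

random.seed(1)

# Hypothetical population of compliers, always-takers, and never-takers
# (no defiers). Assignment z is random; actual take-up d depends on type.
N = 20_000
types = random.choices(["complier", "always", "never"],
                       weights=[0.6, 0.2, 0.2], k=N)
assigned = [random.random() < 0.5 for _ in range(N)]

def takes(t, z):
    if t == "always":
        return True
    if t == "never":
        return False
    return z  # compliers follow their assignment

data = []
for t, z in zip(types, assigned):
    d = takes(t, z)
    y = 1.0 * d + random.gauss(0, 1)  # taking treatment raises Y by 1.0
    data.append((z, d, y))

n1 = sum(1 for z, d, y in data if z)
n0 = N - n1
# ITT: effect of *assignment* on the outcome and on take-up
itt_y = (sum(y for z, d, y in data if z) / n1
         - sum(y for z, d, y in data if not z) / n0)
itt_d = (sum(d for z, d, y in data if z) / n1
         - sum(d for z, d, y in data if not z) / n0)
late = itt_y / itt_d  # Wald estimator: effect among compliers
print(round(itt_y, 2), round(itt_d, 2), round(late, 2))
```

The ITT effect on the outcome is diluted by noncompliance; dividing by the share of compliers (the ITT effect on take-up) recovers the local average treatment effect.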
Experimental design
• Experiments have three “main” components:
• Treatment
• Randomization into treatment and control groups
• Measurement of outcome
• Let’s look at each of these components in turn
• Also look at groupings of these that form common ‘types’
of experiments
8
Designing a treatment
Good treatments
• “One hopes that the treatment alters values of the independent variable (e.g., causes subjects to think about campaign finance in terms of free speech) or induces certain beliefs among participants (e.g., how much they will get paid).” (Druckman 2020 p.82)
• The treatment should:
• Be efficacious
• Fit with the theoretical construct the researcher is interested in
• Vitamin D and…beach holiday? Multivitamin? Stern lecture from doctor?
• Support for Putin and…seeing an official arrested for corruption? Watching a Navalny video about regime corruption? Reading a TI report about Russian corruption levels?
• Have a basis in theory
• How will the knowledge gained from the experiment fit in with other things we know about the world?
9
Designing a treatment
Validation, piloting
• “When it comes to evaluating treatments, researchers should not
trust themselves to validate them.
• A crucial step taken in the design of an experiment entails validating
the intervention with a sample that matches the experimental
participants and/or the participants themselves.”
• “One need not test the outcome variables of interest but instead
assess whether participants interpret and react to the intervention
as presumed (e.g., increased anxiety or social trust)
10
Designing a treatment
Validation, piloting 2
• Piloting has the advantage of allowing one to evaluate different approaches before implementing the actual experiment
• Ideally, one pilots on a sample drawn from the same population as
the experiment
• If that is not possible, however, one should carefully think about possible differences between the pilot sample and the experimental sample”
11
Designing a treatment
Manipulation checks
• “In addition to piloting, one can incorporate a manipulation check
into the experiment itself to empirically assess whether respondents
receive and perceive the treatment as intended.”
• Example: experiment on whether seeing a news report from Fox
News leads people to vote for Republicans more than a CNN report
• Manipulation check: ask what the source of the clip was
• Downsides: extra cost, be careful with outcome measurement
12
Measurement and validity
Druckman 2020 p.87, 93
• Experiments are usually* taken to have good internal validity and ‘statistical conclusion validity’
• Good treatment design, measurement, randomization will help ensure the first three of these types of validity
13
External validity and generalizability
Druckman p.94-102
• “External validity means generalizing across 1) samples, 2) settings, 3)
treatments, and 4) outcome measures”
• What is being generalized?
• Existence of an effect? Precise effect size?
• To what are you generalizing?
• What population?
• The answers to these questions depend on the goals of the experiment
14
External validity and realism/naturalism
• Does it matter how realistic your treatment is?
• What is feasible and ethical?
• Example:
• Outcome: voting in an election
• Conceptual treatment: watching advertisements for a candidate
• Practical/actual treatment:
• Have participants watch 30 minutes of the news with advertisements
interspersed?
• Show a series of only advertisements? How many? How many times?
15
Other kinds of treatments
Encouragement design
• Intent-to-treat estimator
• “randomly incentivize subjects recruited via survey to follow one of two Twitter accounts programmed to retweet posts by politically influential users. Subjects were periodically quizzed about the contents of their Twitter feeds and surveyed again to gauge the effect of exposure to counter-attitudinal social media content.” (Guess 2021)
• Shows the trade-off between naturalism and strength of treatment
• “Like the offline world, online environments are crowded and multifaceted, with many competing demands on users’ attention.”
• People just don’t see or pay attention to stuff!
• “at least in an intent-to-treat world, manipulating a single post, ad impression, or account exposure may not in itself be expected to produce measurably large effects.”
16
Assignment 1
17
Ethics
Morton & Williams Chapters 11-13
• Experiments must be ethical!
• Harm or risk to participants
• Changing of important real-world outcomes (e.g., elections)
• Deception
18
Ethics
Morton & Williams
• Benefits vs. risks
• Harms
• Psychological harm
• Invasion of privacy or confidentiality
19
Ethics
Morton & Williams
• Probability and magnitude of harm
• Compare to daily life and routine risks
• Vulnerable subjects
• Prisoners, children, disabled
• When possible, experiments need to get informed consent
• Not always feasible! This may be a foreign concept or may interfere with the
experiment
• “Informed consent has become a mainstay of research with human subjects because it
serves two purposes: (1) it ensures that the subjects are voluntarily participating and
that their autonomy is protected and (2) it provides researchers with legal protections in
case of unexpected events.”
20
Ethics
Morton & Williams Chapter 13
• Deception
• Concerns about contaminating a subject pool
• If you must use deception, you should probably debrief
21
Population and sample
• The population you wish to generalize to may be:
• All adult residents of Ireland
• All adult voters of Ireland
• Residents of Dublin between 18 and 45 years of age
• Or perhaps the population is irrelevant
• Your experiment will need to define a sample of that population on which your treatment will be applied
22
Sampling
Druckman p.109-120
• How homogeneous do you think the treatment effect is?
• If you’re interested in attitudes towards pension reform, your sample may need sufficient young and old people
• Pharmaceuticals and biological sex
• Urban vs. rural residents
• Cost, generalizability, practicality
23
Sampling: Random samples
• Dial random telephone numbers
• Pick names out of a list (phonebook) randomly
• Where do you get the list??
• Not always legal or feasible
24
Druckman p.109-120
Sampling: Convenience samples
• Take whoever is convenient
• or whoever selects into your sample
• Put up posters, send out emails, buy advertisements
• Talk to people on the street
• Cheaper and easier, but sharply limits generalizability
25
Druckman p.109-120
Sampling: Weighting
• “Weighting requires that one obtain descriptive data of the target
population, typically demographics.
• For example, when the population includes all Americans, one can use the U.S. Census…for demographic population figures.
• One then computes weights that account for each respondent’s
probability of being included in the sample
• For example, if the population consists of 50% men but the
sample contains only 40% men, then male sample respondents
will be weighted to count more in computations from the sample
(and women will be counted less)
26
Druckman p.109-120
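The 50%/40% example on the slide translates directly into code. A minimal sketch (the attitude values are made up for illustration):

```python
# Post-stratification weights: weight each respondent by
# (population share of their group) / (sample share of their group).
pop_share = {"men": 0.50, "women": 0.50}      # e.g., from census figures
sample_share = {"men": 0.40, "women": 0.60}   # from the realized sample

weights = {g: pop_share[g] / sample_share[g] for g in pop_share}
print(weights)  # men weighted up (1.25), women weighted down (~0.833)

# Weighted mean of some attitude measured in the sample
# (hypothetical respondents: 2 men, 3 women, matching the 40/60 split)
respondents = [("men", 0.7), ("men", 0.6),
               ("women", 0.4), ("women", 0.5), ("women", 0.3)]
num = sum(weights[g] * y for g, y in respondents)
den = sum(weights[g] for g, y in respondents)
print(round(num / den, 3))  # 0.525, vs. an unweighted mean of 0.5
```

The weighted mean shifts toward the men's answers because men are underrepresented in the sample relative to the population.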
Sampling: Weighting
• Survey researchers commonly use weights, even with many
probability samples, to ensure the accuracy of observational
inferences (e.g., the percentage of men who hold a particular
attitude)” (Druckman 2020 p.117)
• Consider weighting if:
• effects are heterogeneous in a way you can correct for
• you care about the population
• you are interested in precise effect size
27
Druckman p.109-120
Sample size and power
https://egap.org/resource/10-things-to-know-about-statistical-power/
• “Power is the ability to distinguish signal from noise.”
• “If our experiments are highly-powered, we can be confident that if there truly is a treatment effect, we’ll be able to see it.”
• We want to avoid false negatives and false positives
• Example:
• “Now suppose an experiment instead used subjects’ income as an outcome variable.
• Incomes can vary pretty widely – in some places, it is not uncommon for people to
have neighbors that earn two, ten, or one hundred times their daily wages.
• When noise is high, experiments have more trouble.
• A treatment that increased workers’ incomes by 1% would be difficult to detect, because incomes differ by so much in the first place.”
28
Sample size and power
https://egap.org/resource/10-things-to-know-about-statistical-power/
• The three ingredients of statistical power:
• Strength of the treatment
• Background noise
• As the background noise of your outcome variables increases, the power of
your experiment decreases
• To the extent that it is possible, try to select outcome variables that have low
variability
• In practical terms, this means comparing the standard deviation of the outcome variable to the expected treatment effect size
• Sample size
• See link for formula and calculator, but also beware! Power is a slippery thing
29
Sample size and power
https://egap.org/resource/10-things-to-know-about-statistical-power/
• https://www.stat.ubc.ca/~rollin/stats/ssize/n2.html
• https://machinelearningmastery.com/statistical-power-and-power-
analysis-in-python/
• “Statistical power is the probability of a hypothesis test of finding an effect if there is an effect to be found.
• A power analysis can be used to estimate the minimum sample size required for an experiment, given a desired significance level, effect size, and statistical power.”
30
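Beyond the formulas and calculators linked above, power can also be estimated by simulation. A minimal sketch (the function name and parameters are illustrative, not from the slides): simulate many experiments with a known effect and count how often a two-sided z-test rejects.

```python
import math
import random

def estimate_power(effect, sd=1.0, n=100, sims=2000, seed=0):
    """Simulation-based power for a two-arm experiment with n units per arm:
    the share of simulated experiments in which a two-sided z-test at the
    5% level rejects the null of no difference in means."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value
    se = sd * math.sqrt(2 / n)  # standard error of the difference in means
    rejections = 0
    for _ in range(sims):
        treat = [rng.gauss(effect, sd) for _ in range(n)]
        ctrl = [rng.gauss(0.0, sd) for _ in range(n)]
        z = (sum(treat) / n - sum(ctrl) / n) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / sims

print(estimate_power(effect=0.5))   # effect large relative to noise: high power
print(estimate_power(effect=0.05))  # effect tiny relative to noise: low power
```

This makes the slide's three ingredients concrete: power rises with the strength of the treatment (`effect`), falls with background noise (`sd`), and rises with sample size (`n`).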
Randomization
Random assignment to treatment and control groups
• So you’ve got your experimental design, a sample of people to
experiment on
• Now you need to assign people to treatment and control
• Otherwise it wouldn’t be an experiment!
• Simple randomization
• Complete random assignment
• Block and cluster randomization
31
Randomization: Simple random assignment
Druckman 2020, p.109-120
• “Simple random assignment is a term of art, referring to a procedure—a die roll
or coin toss—that gives each subject an identical probability of being assigned
to the treatment group
• The practical drawback of simple random assignment is that when N is small,
random chance can create a treatment group that is larger or smaller than
what the researcher intended.” (FEDAI p.36)
• “A useful special case of simple random assignment is complete random
assignment, where exactly m of N units are assigned to the treatment group with
equal probability.”
• Be careful about defining random: things like birthday may not be completely random in a formal sense
32
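The difference between the two procedures is easy to see in code (a minimal sketch, not from the slides):

```python
import random

random.seed(7)
N = 10

# Simple random assignment: an independent coin flip per subject,
# so the realized treatment-group size varies from draw to draw.
simple = [random.random() < 0.5 for _ in range(N)]
print(sum(simple))  # anywhere from 0 to 10 treated

# Complete random assignment: exactly m of N subjects are treated,
# each subject with equal probability of being among the m.
m = 5
treated_ids = set(random.sample(range(N), m))
complete = [i in treated_ids for i in range(N)]
print(sum(complete))  # always exactly 5 treated
```

With small N, simple assignment can by chance produce a badly lopsided split (even 0 or 10 treated); complete assignment rules that out.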
Block randomization
https://egap.org/resource/10-things-to-know-about-randomization/
• It is possible, when randomizing, to specify the balance of particular
factors you care about between treatment and control groups
• even though it is not possible to specify which particular units are
selected for either group
• For example, you can specify that treatment and control groups
contain equal ratios of men to women
33
Block randomization
https://egap.org/resource/10-things-to-know-about-randomization/
• Why is this desirable?
• Not because our estimate of the average treatment effect would otherwise be biased, but because it could be really noisy.
• Suppose that a random assignment happened to generate a very male treatment group and a very female control group. We would observe a correlation between gender and treatment status. If we were to estimate a treatment effect, that treatment effect would still be unbiased because gender did not cause treatment status.
• However, it would be more difficult to reject the null hypothesis that it was not our treatment but gender that was producing the effect.
• In short, the imbalance produces a noisy estimate, which makes it more difficult for us to be confident in our estimates.
34
Block randomization
https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html
• “Block random assignment (sometimes known as stratified random assignment) is a powerful tool when used well.
• In this design, subjects are sorted into blocks (strata) according to
their pre-treatment covariates, and then complete random
assignment is conducted within each block.
• For example, a researcher might block on gender, assigning
exactly half of the men and exactly half of the women to
treatment.”
35
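The vignette's example (block on gender, then complete random assignment within each block) can be mimicked in plain Python. A hypothetical sketch, not the randomizr API itself:

```python
import random
from collections import defaultdict

random.seed(3)

# Hypothetical subjects: (gender, subject id)
subjects = [("m", i) for i in range(6)] + [("f", i) for i in range(6, 14)]

# Sort subjects into blocks (strata) by the pre-treatment covariate
blocks = defaultdict(list)
for gender, sid in subjects:
    blocks[gender].append(sid)

# Complete random assignment within each block: exactly half treated
assignment = {}
for gender, ids in blocks.items():
    treated = set(random.sample(ids, len(ids) // 2))
    for sid in ids:
        assignment[sid] = sid in treated

# Gender is balanced across arms by construction
for gender, ids in blocks.items():
    print(gender, sum(assignment[sid] for sid in ids), "of", len(ids), "treated")
```

Unlike unblocked assignment, a gender-lopsided split is impossible here: every realization treats exactly half of each block.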
Block randomization
https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html
• “Why block?
• The first reason is to signal to future readers that treatment effect heterogeneity may be of interest: is the treatment effect different for men versus women? Of course, such heterogeneity could be explored if complete random assignment had been used, but blocking on a covariate defends a researcher (somewhat) against claims of data dredging.
• The second reason is to increase precision. If the blocking variables are predictive of the outcome (i.e., they are correlated with the outcome), then blocking may help to decrease sampling variability. It’s important, however, not to overstate these advantages. The gains from a blocked design can often be realized through covariate adjustment alone.”
36
Block randomization
https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html
37
Cluster randomization
https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html
• Assigning units to treatment or control as a cluster
• “Housemates in households: whole households are assigned to treatment or control
• Students in classrooms: whole classrooms are assigned to treatment or control
• Residents in towns or villages: whole communities are assigned to treatment or
control”
• Don’t do this unless you really have to!
• “Clustered assignment decreases the effective sample size of an experiment. In the extreme case when outcomes are perfectly correlated with clusters, the experiment has an effective sample size equal to the number of clusters. When outcomes are perfectly uncorrelated with clusters, the effective sample size is equal to the number of subjects. Almost all cluster-assigned experiments fall somewhere in the middle of these two extremes.”
38
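The classroom example above can be sketched in code (a hypothetical illustration, not from the slides): randomization happens at the cluster level, so every student inherits their classroom's assignment.

```python
import random

random.seed(11)

# Hypothetical data: 6 classrooms of 20 students each
students = [(f"class_{c}", s) for c in range(6) for s in range(20)]
clusters = sorted({c for c, _ in students})

# Assign whole classrooms (clusters) to treatment, not individual students
treated_clusters = set(random.sample(clusters, len(clusters) // 2))
assignment = {(c, s): c in treated_clusters for c, s in students}

# Every student in a classroom shares that classroom's assignment.
# If outcomes were perfectly correlated within classrooms, the effective
# sample size would be 6 (the number of clusters), not 120 students.
print(len(clusters), len(students), len(treated_clusters))
```

This is why cluster assignment is a last resort: you pay for it in effective sample size, and standard errors must account for the clustering.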
Types of experiments
Common groupings of treatment and measurement
• More on this next week
39
Experiment cookbook
Druckman p.234+
• Big picture idea
• Short (i.e., few pages) document on the general topic and why it is
relevant to understanding social, political, and/or economic
phenomena
40
• Detailed literature review
• An exhaustive search of research on the topic, and detailed
descriptions of speci
fi
c studies
• It is here that the researcher should identify specific gaps in existing knowledge.
41
Experiment cookbook
Druckman p.234+
• Research question(s) and outcomes
• Given the identification of a gap in existing work, the next step is to put forth a specific question (or questions) to be addressed
• This includes identifying the precise outcome variable(s) of interest
42
Experiment cookbook
Druckman p.234+
• Theory and hypotheses
• Development of a theory and hypotheses to be tested
• Researchers should take their time to derive concrete and specific predictions
• As part of this step, potential mediators and/or moderators should be specified
• Also, in putting forth predictions, one must be careful to isolate the
comparisons to be used.
43
Experiment cookbook
Druckman p.234+
• Research design
• Discussion of the designs used by others who have addressed
similar questions, and how the proposed design connects with
previous work. In many cases, the ideal strategy is to utilize and
extend prior designs.
• Discussion of how such a design will provide data relevant to the
larger questions.
44
Experiment cookbook
Druckman p.234+
• Research design (cont’d)
• Identifying where the data will come from, which includes:
• Consideration of the sample and any potential biases.
• Detailed measures and where the measures were obtained—that
is, where have they been used in prior studies? The measures
need to clearly connect to the hypotheses, including the
outcome variables and mediators/moderators.
45
Experiment cookbook
Druckman p.234+
• Research design (continued more)
• In many cases, the design may be too practically complex (e.g.,
number of experimental conditions relative to realistic sample size),
and decisions must be made on what can be trimmed without
interfering with the goal of the study.
• For original data collection, pre-tests of stimuli, question wordings,
etc., are critical to ensure the approach has content and construct
validity.
• Issues related to internal and external validity should be discussed.
46
Experiment cookbook
Druckman p.234+
• Data collection document
• If the project involves original data collection, a step-by-step plan
needs to be put forth so as not to later forget such details as
recruitment, implementation, etc.
47
Experiment cookbook
Druckman p.234+
• Data analysis plan
• There needs to be a clear data analysis plan—how exactly will the
data be used to test hypotheses? The researcher should directly
connect the design and measures to the hypotheses.
• This often involves making a table with each measure and how it maps on to specific hypotheses.
48
Experiment cookbook
Druckman p.234+
• Then
• Do the experiment
49
Experiment cookbook
Druckman p.234+
Next time
• More on specific experimental designs
• Take a look at the readings — choose chapters that are interesting to
you
• Assignment 1!
• Due Sunday
50

More Related Content

PPT
Chapter7
PPT
Ch07 Experimental & Quasi-Experimental Designs
PPT
experimental designs
PDF
HEALTHCARE RESEARCH METHODS: Experimental Studies and Qualitative Studies
PDF
Research Methodology / Experimental research design
PPTX
Les7e ppt ada_0103
PPT
Marketing Research Experimental Research
PPTX
The experimental epidemiology seminar. pptx
Chapter7
Ch07 Experimental & Quasi-Experimental Designs
experimental designs
HEALTHCARE RESEARCH METHODS: Experimental Studies and Qualitative Studies
Research Methodology / Experimental research design
Les7e ppt ada_0103
Marketing Research Experimental Research
The experimental epidemiology seminar. pptx

Similar to POP77034 Experimental Methods HT2023 week 2 slides.pdf (20)

PPTX
Experimental design
PPTX
Experimental design
PPTX
Experimental research_Kritika.pptx
DOCX
SAMPLINGFor what population do you want to test the new therap.docx
PPT
experimental types of research in mentioned in research methodology.ppt
PDF
Chapter 3 part1-Design of Experiments
PPTX
Issues in Experimental Design
PPTX
Experimental design
PPTX
week 11b.pptx
PPTX
Experimental Quasi Designs and bacon bites
PDF
Research methods 1
PPT
2. research design ldr 280-2
PPT
2. Research Design-LDR 280 (1).PPT
PPTX
Randomized trials ii dr.wah
DOCX
1) The path length from A to B in the following graph is .docx
PPS
Lesson 10
PPTX
The classic experiment_(and_its_limitations)-1
PPTX
Aqa research methods 1
PPT
Experimental_research
PPT
research design : experiments
Experimental design
Experimental design
Experimental research_Kritika.pptx
SAMPLINGFor what population do you want to test the new therap.docx
experimental types of research in mentioned in research methodology.ppt
Chapter 3 part1-Design of Experiments
Issues in Experimental Design
Experimental design
week 11b.pptx
Experimental Quasi Designs and bacon bites
Research methods 1
2. research design ldr 280-2
2. Research Design-LDR 280 (1).PPT
Randomized trials ii dr.wah
1) The path length from A to B in the following graph is .docx
Lesson 10
The classic experiment_(and_its_limitations)-1
Aqa research methods 1
Experimental_research
research design : experiments
Ad

Recently uploaded (20)

PPTX
Week 4 Term 3 Study Techniques revisited.pptx
PDF
Complications of Minimal Access Surgery at WLH
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
01-Introduction-to-Information-Management.pdf
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PDF
RMMM.pdf make it easy to upload and study
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PPTX
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PDF
Pre independence Education in Inndia.pdf
PPTX
PPH.pptx obstetrics and gynecology in nursing
PDF
Microbial disease of the cardiovascular and lymphatic systems
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PDF
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
PDF
Basic Mud Logging Guide for educational purpose
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Week 4 Term 3 Study Techniques revisited.pptx
Complications of Minimal Access Surgery at WLH
Abdominal Access Techniques with Prof. Dr. R K Mishra
01-Introduction-to-Information-Management.pdf
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
RMMM.pdf make it easy to upload and study
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
Pre independence Education in Inndia.pdf
PPH.pptx obstetrics and gynecology in nursing
Microbial disease of the cardiovascular and lymphatic systems
2.FourierTransform-ShortQuestionswithAnswers.pdf
Pharmacology of Heart Failure /Pharmacotherapy of CHF
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
Basic Mud Logging Guide for educational purpose
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Ad

POP77034 Experimental Methods HT2023 week 2 slides.pdf

  • 1. POP77034 Experimental Methods for Social Scientists Dr Noah Buckley Trinity College Dublin HT2023 1
  • 2. The plan for today • Quick causal inference recap • Experimental design • Ethics • Next week: more experimental design 2
  • 3. The Fundamental Problem of Causal Inference • It is impossible to observe any unit we’re interested in (e.g., person, country, fi rm, school) both when it has and has not been changed by a causal action • Only in physics and chemistry are units (particles, molecules) interchangeable (“exchangeable”) enough that we don’t have to worry about this • If I give 100 euros to Mary and she gets happier than she was before, we fundamentally cannot know how happy she would have been if I had not given her 100 euros • We can use theory, intuition, anecdote, data to come up with a (very) good guess • But we can never be sure
  • 6. That’s experiments in theory, what about in practice? Real-world implementation of experiments is di ffi cult! • When you assign a unit (e.g. person) to treatment, they may not actually take that treatment • You give them a drug but they don’t take it • You send them a YouTube video to watch but they don’t watch it, or they mute it and don’t pay attention • Same for the control group • They may go out and fi nd the drug themselves, or stumble on the YouTube video
  • 7. That’s experiments in theory, what about in practice? • “Compliers are subjects who will take the treatment if and only if they were assigned to the treatment group… • Non-compliers are composed of the three remaining subgroups: • Always-takers are subjects who will always take the treatment even if they were assigned to the control group • Never-takers are subjects who will never take the treatment even if they were assigned to the treatment group • De fi ers are subjects who will do the opposite of their treatment assignment status” https://en.wikipedia.org/wiki/Local_average_treatment_e ff ect
  • 8. Experimental design • Experiments have three “main” components: • Treatment • Randomization into treatment and control groups • Measurement of outcome • Let’s look at each of these components in turn • Also look at groupings of these that form common ‘types’ of experiments 8
  • 9. Designing a treatment Good treatments • “One hopes that the treatment alters values of the independent variable (e.g., causes subjects to think about campaign fi nance in terms of free speech) or induces certain beliefs among participants (e.g., how much they will get paid).” (Druckman 2020 p.82) • The treatment should: • Be e ffi cacious • Fit with the theoretical construct the researcher is interested in • Vitamin D and…beach holiday? Multivitamin? Stern lecture from doctor? • Support for Putin and…seeing an o ffi cial arrested for corruption? Watching a Navalny video about regime corruption? Reading a TI report about Russian corruption levels? • Have a basis in theory • How will the knowledge gained from the experiment fi t in with other things we know about the world? 9
  • 10. Designing a treatment Validation, piloting • “When it comes to evaluating treatments, researchers should not trust themselves to validate them. • A crucial step taken in the design of an experiment entails validating the intervention with a sample that matches the experimental participants and/or the participants themselves.” • “One need not test the outcome variables of interest but instead assess whether participants interpret and react to the intervention as presumed (e.g., increased anxiety or social trust) 10
  • 11. Designing a treatment Validation, piloting 2 • Piloting has the advantage of allowing one to evaluate di ff erent approaches before implementing the actual experiment • Ideally, one pilots on a sample drawn from the same population as the experiment • If that is not possible, however, one should carefully think about possible di ff erences between the pilot sample and the experimental sample” 11
  • 12. Designing a treatment Manipulation checks • “In addition to piloting, one can incorporate a manipulation check into the experiment itself to empirically assess whether respondents receive and perceive the treatment as intended.” • Example: experiment on whether seeing a news report from Fox News leads people to vote for Republicans more than a CNN report • Manipulation check: ask what the source of the clip was • Downsides: extra cost, be careful with outcome measurement 12
  • 13. Measurement and validity Druckman 2020 p.87, 93 • Experiments are usually* taken to have good internal validity and ‘statistical conclusion validity’ • Good treatment design, measurement, randomization will help ensure the fi rst three of these types of validity 13
  • 14. External validity and generalizability Druckman p.94-102 • “External validity means generalizing across 1) samples, 2) settings, 3) treatments, and 4) outcome measures” • What is being generalized? • Existence of an e ff ect? Precise e ff ect size? • To what are you generalizing? • What population? • The answers to these questions depend on the goals of the experiment 14
  • 15. External validity and realism/naturalism • Does it matter how realistic your treatment is? • What is feasible and ethical? • Example: • Outcome: voting in an election • Conceptual treatment: watching advertisements for a candidate • Practical/actual treatment: • Have participants watch 30 minutes of the news with advertisements interspersed? • Show a series of only advertisements? How many? How many times? 15
  • 16. Other kinds of treatments Encouragement design • Intent-to-treat estimator • “randomly incentivize subjects recruited via survey to follow one of two Twitter accounts programmed to retweet posts by politically in fl uential users. Subjects were periodically quizzed about the contents of their Twitter feeds and surveyed again to gauge the e ff ect of exposure to counter-attitudinal social media content.” (Guess 2021) • Shows the trade-o ff between naturalism and strength of treatment • “Like the o ffl ine world, online environments are crowded and multifaceted, with many competing demands on users’ attention.” • People just don’t see or pay attention to stu ff ! • “at least in an intent-to-treat world, manipulating a single post, ad impression, or account exposure may not in itself be expected to produce measurably large e ff ects.” 16
  • 18. Ethics Morton & Williams Chapters 11-13 • Experiments must be ethical! • Harm or risk to participants • Changing of important real- world outcomes (e.g., elections) • Deception 18
  • 19. Ethics Morton & Williams • Bene fi ts vs. risks • Harms • Psychological harm • Invasion of privacy or con fi dentiality 19
  • 20. Ethics Morton & Williams • Probability and magnitude of harm • Compare to daily life and routine risks • Vulnerable subjects • Prisoners, children, disabled • When possible, experiments need to get informed consent • Not always feasible! This may be a foreign concept or may interfere with the experiment • “Informed consent has become a mainstay of research with human subjects because it serves two purposes: (1) it ensures that the subjects are voluntarily participating and that their autonomy is protected and (2) it provides researchers with legal protections in case of unexpected events.” 20
  • 21. Ethics Morton & Williams Chapter 13 • Deception • Concerns about contaminating a subject pool • If you must use deception, you should probably debrief 21
  • 22. Population and sample • The population you wish to generalize to may be: • All adult residents of Ireland • All adult voters of Ireland • Residents of Dublin between 18 and 45 years of age • Or perhaps the population is irrelevant • Your experiment will need to de fi ne a sample of that population on which your treatment will be applied 22
• 23. Sampling Druckman p.109-120 • How homogeneous do you think the treatment is? • If you’re interested in attitudes towards pension reform, your sample may need sufficient young and old people • Pharmaceuticals and biological sex • Urban vs. rural residents • Cost, generalizability, practicality 23
  • 24. Sampling: Random samples • Dial random telephone numbers • Pick names out of a list (phonebook) randomly • Where do you get the list?? • Not always legal or feasible 24 Druckman p.109-120
  • 25. Sampling: Convenience samples • Take whoever is convenient • or whoever selects into your sample • Put up posters, send out emails, buy advertisements • Talk to people on the street • Cheaper and easier, but sharply limits generalizability 25 Druckman p.109-120
• 26. Sampling: Weighting • “Weighting requires that one obtain descriptive data of the target population, typically demographics. • For example, when the population includes all Americans, one can use the U.S. Census…for demographic population figures. • One then computes weights that account for each respondent’s probability of being included in the sample • For example, if the population consists of 50% men but the sample contains only 40% men, then male sample respondents will be weighted to count more in computations from the sample (and women will be counted less) 26 Druckman p.109-120
• 27. Sampling: Weighting • Survey researchers commonly use weights, even with many probability samples, to ensure the accuracy of observational inferences (e.g., the percentage of men who hold a particular attitude)” (Druckman 2020 p.117) • Consider weighting if: • effects are heterogeneous in a way you can correct for • you care about the population • you are interested in precise effect size 27 Druckman p.109-120
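Druckman’s 50%/40% example can be sketched as a quick weight computation. The group labels and sample here are hypothetical; each group’s weight is its population share divided by its sample share:

```python
# Hypothetical example from the slides: the population is 50% men,
# but the realized sample contains only 40% men.
pop_share = {"man": 0.50, "woman": 0.50}
sample = ["man"] * 40 + ["woman"] * 60  # a sample of 100 respondents

# Weight for each group = population share / sample share,
# so under-represented groups count for more in weighted estimates.
n = len(sample)
sample_share = {g: sample.count(g) / n for g in pop_share}
weights = {g: pop_share[g] / sample_share[g] for g in pop_share}

print(round(weights["man"], 3))    # men weighted up: 0.5 / 0.4 = 1.25
print(round(weights["woman"], 3))  # women weighted down: 0.5 / 0.6 ≈ 0.833
```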
• 28. Sample size and power https://egap.org/resource/10-things-to-know-about-statistical-power/ • “Power is the ability to distinguish signal from noise.” • “If our experiments are highly-powered, we can be confident that if there truly is a treatment effect, we’ll be able to see it.” • We want to avoid false negatives and false positives • Example: • “Now suppose an experiment instead used subjects’ income as an outcome variable. • Incomes can vary pretty widely – in some places, it is not uncommon for people to have neighbors that earn two, ten, or one hundred times their daily wages. • When noise is high, experiments have more trouble. • A treatment that increased workers’ incomes by 1% would be difficult to detect, because incomes differ by so much in the first place.” 28
• 29. Sample size and power https://egap.org/resource/10-things-to-know-about-statistical-power/ • The three ingredients of statistical power: • Strength of the treatment • Background noise • As the background noise of your outcome variables increases, the power of your experiment decreases • To the extent that it is possible, try to select outcome variables that have low variability • In practical terms, this means comparing the standard deviation of the outcome variable to the expected treatment effect size • Sample size • See link for formula and calculator, but also beware! Power is a slippery thing 29
• 30. Sample size and power https://egap.org/resource/10-things-to-know-about-statistical-power/ • https://www.stat.ubc.ca/~rollin/stats/ssize/n2.html • https://machinelearningmastery.com/statistical-power-and-power- analysis-in-python/ • “Statistical power is the probability of a hypothesis test of finding an effect if there is an effect to be found. • A power analysis can be used to estimate the minimum sample size required for an experiment, given a desired significance level, effect size, and statistical power.” 30
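As a sketch of what sits behind such calculators, the standard normal-approximation sample-size formula for a two-arm difference-in-means test fits in a few lines. The function name is ours, not from the linked tools:

```python
from statistics import NormalDist
from math import ceil

def sample_size_per_arm(effect, sd, alpha=0.05, power=0.8):
    """Minimum n per arm for a two-sample difference-in-means test,
    using the normal approximation:
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * (sd / effect)^2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * (sd / effect) ** 2)

# An effect half a standard deviation in size needs modest samples...
print(sample_size_per_arm(effect=5, sd=10))   # 63 per arm
# ...but a 1% income bump against highly variable incomes needs far more.
print(sample_size_per_arm(effect=1, sd=100))  # 156978 per arm
```

This makes the income example concrete: the same test needs roughly 63 subjects per arm when the effect is half a standard deviation of the outcome, but orders of magnitude more when the effect is tiny relative to the outcome’s noise.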
  • 31. Randomization Random assignment to treatment and control groups • So you’ve got your experimental design, a sample of people to experiment on • Now you need to assign people to treatment and control • Otherwise it wouldn’t be an experiment! • Simple randomization • Complete simple randomization • Block and cluster randomization 31
• 32. Randomization: Simple random assignment Druckman 2020, p.109-120 • “Simple random assignment is a term of art, referring to a procedure—a die roll or coin toss—that gives each subject an identical probability of being assigned to the treatment group • The practical drawback of simple random assignment is that when N is small, random chance can create a treatment group that is larger or smaller than what the researcher intended.” (FEDAI p.36) • “A useful special case of simple random assignment is complete random assignment, where exactly m of N units are assigned to the treatment group with equal probability.” • Be careful about defining random: things like birthday may not be completely random in a formal sense 32
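The distinction between simple and complete random assignment is easy to see in code; a minimal sketch using only the standard library:

```python
import random
random.seed(42)  # for reproducibility

N = 10

# Simple random assignment: an independent coin toss per subject,
# so the size of the treatment group varies from draw to draw.
simple = [random.random() < 0.5 for _ in range(N)]

# Complete random assignment: exactly m of N units are treated,
# each unit with equal probability of being one of the m.
m = 5
complete = [True] * m + [False] * (N - m)
random.shuffle(complete)

print(sum(simple))    # anywhere from 0 to 10 treated
print(sum(complete))  # always exactly 5 treated
```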
  • 33. Block randomization https://egap.org/resource/10-things-to-know-about-randomization/ • It is possible, when randomizing, to specify the balance of particular factors you care about between treatment and control groups • even though it is not possible to specify which particular units are selected for either group • For example, you can specify that treatment and control groups contain equal ratios of men to women 33
• 34. Block randomization https://egap.org/resource/10-things-to-know-about-randomization/ • Why is this desirable? • Not because our estimate of the average treatment effect would otherwise be biased, but because it could be really noisy. • Suppose that a random assignment happened to generate a very male treatment group and a very female control group. We would observe a correlation between gender and treatment status. If we were to estimate a treatment effect, that treatment effect would still be unbiased because gender did not cause treatment status. • However, it would be more difficult to reject the null hypothesis that it was not our treatment but gender that was producing the effect. • In short, the imbalance produces a noisy estimate, which makes it more difficult for us to be confident in our estimates. 34
• 35. Block randomization https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html • “Block random assignment (sometimes known as stratified random assignment) is a powerful tool when used well. • In this design, subjects are sorted into blocks (strata) according to their pre-treatment covariates, and then complete random assignment is conducted within each block. • For example, a researcher might block on gender, assigning exactly half of the men and exactly half of the women to treatment.” 35
• 36. Block randomization https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html • “Why block? • The first reason is to signal to future readers that treatment effect heterogeneity may be of interest: is the treatment effect different for men versus women? Of course, such heterogeneity could be explored if complete random assignment had been used, but blocking on a covariate defends a researcher (somewhat) against claims of data dredging. • The second reason is to increase precision. If the blocking variables are predictive of the outcome (i.e., they are correlated with the outcome), then blocking may help to decrease sampling variability. It’s important, however, not to overstate these advantages. The gains from a blocked design can often be realized through covariate adjustment alone.” 36
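A minimal sketch of block (stratified) assignment, mirroring the gender example from the vignette; the subject list and helper function are invented for illustration (randomizr itself is an R package with a `block_ra` function):

```python
import random
random.seed(1)

# Hypothetical subject list: (id, gender) pairs, 6 men and 6 women.
subjects = [(i, "man" if i < 6 else "woman") for i in range(12)]

def block_assign(units):
    """Complete random assignment within each block (stratum):
    exactly half of each block goes to treatment."""
    blocks = {}
    for uid, gender in units:
        blocks.setdefault(gender, []).append(uid)
    assignment = {}
    for ids in blocks.values():
        random.shuffle(ids)
        half = len(ids) // 2
        for uid in ids[:half]:
            assignment[uid] = "treatment"
        for uid in ids[half:]:
            assignment[uid] = "control"
    return assignment

assign = block_assign(subjects)
# Exactly half of the men and half of the women are treated, by design.
men_treated = sum(assign[i] == "treatment" for i, g in subjects if g == "man")
print(men_treated)  # 3
```

The guarantee is structural: whatever the random draw, gender cannot be correlated with treatment status, which is exactly the noise-reducing balance described above.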
• 38. Cluster randomization https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html • Assigning units to treatment or control as a cluster • “Housemates in households: whole households are assigned to treatment or control • Students in classrooms: whole classrooms are assigned to treatment or control • Residents in towns or villages: whole communities are assigned to treatment or control” • Don’t do this unless you really have to! • “Clustered assignment decreases the effective sample size of an experiment. In the extreme case when outcomes are perfectly correlated with clusters, the experiment has an effective sample size equal to the number of clusters. When outcomes are perfectly uncorrelated with clusters, the effective sample size is equal to the number of subjects. Almost all cluster-assigned experiments fall somewhere in the middle of these two extremes.” 38
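Cluster assignment can be sketched the same way: randomize at the cluster level and let every member inherit the cluster’s assignment. The household names here are made up:

```python
import random
random.seed(7)

# Hypothetical clusters: households and their members.
households = {
    "h1": ["Anna", "Ben"],
    "h2": ["Cara", "Dan", "Eve"],
    "h3": ["Finn"],
    "h4": ["Gina", "Hugo"],
}

# Complete random assignment over whole households:
# exactly half the clusters are treated.
ids = list(households)
random.shuffle(ids)
treated_clusters = set(ids[: len(ids) // 2])

# Every member inherits their household's assignment.
assignment = {
    person: ("treatment" if h in treated_clusters else "control")
    for h, members in households.items()
    for person in members
}
print(len(treated_clusters))  # 2 of the 4 households treated
```

Note that although eight people appear in the data, only four independent coin flips happened, which is the intuition behind the reduced effective sample size quoted above.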
  • 39. Types of experiments Common groupings of treatment and measurement • More on this next week 39
  • 40. Experiment cookbook Druckman p.234+ • Big picture idea • Short (i.e., few pages) document on the general topic and why it is relevant to understanding social, political, and/or economic phenomena 40
• 41. • Detailed literature review • An exhaustive search of research on the topic, and detailed descriptions of specific studies • It is here that the researcher should identify specific gaps in existing knowledge. 41 Experiment cookbook Druckman p.234+
• 42. • Research question(s) and outcomes • Given the identification of a gap in existing work, the next step is to put forth a specific question (or questions) to be addressed • This includes identifying the precise outcome variable(s) of interest 42 Experiment cookbook Druckman p.234+
• 43. • Theory and hypotheses • Development of a theory and hypotheses to be tested • Researchers should take their time to derive concrete and specific predictions • As part of this step, potential mediators and/or moderators should be specified • Also, in putting forth predictions, one must be careful to isolate the comparisons to be used. 43 Experiment cookbook Druckman p.234+
  • 44. • Research design • Discussion of the designs used by others who have addressed similar questions, and how the proposed design connects with previous work. In many cases, the ideal strategy is to utilize and extend prior designs. • Discussion of how such a design will provide data relevant to the larger questions. 44 Experiment cookbook Druckman p.234+
  • 45. • Research design (cont’d) • Identifying where the data will come from, which includes: • Consideration of the sample and any potential biases. • Detailed measures and where the measures were obtained—that is, where have they been used in prior studies? The measures need to clearly connect to the hypotheses, including the outcome variables and mediators/moderators. 45 Experiment cookbook Druckman p.234+
  • 46. • Research design (continued more) • In many cases, the design may be too practically complex (e.g., number of experimental conditions relative to realistic sample size), and decisions must be made on what can be trimmed without interfering with the goal of the study. • For original data collection, pre-tests of stimuli, question wordings, etc., are critical to ensure the approach has content and construct validity. • Issues related to internal and external validity should be discussed. 46 Experiment cookbook Druckman p.234+
  • 47. • Data collection document • If the project involves original data collection, a step-by-step plan needs to be put forth so as not to later forget such details as recruitment, implementation, etc. 47 Experiment cookbook Druckman p.234+
• 48. • Data analysis plan • There needs to be a clear data analysis plan—how exactly will the data be used to test hypotheses? The researcher should directly connect the design and measures to the hypotheses. • This often involves making a table with each measure and how it maps on to specific hypotheses. 48 Experiment cookbook Druckman p.234+
  • 49. • Then • Do the experiment 49 Experiment cookbook Druckman p.234+
• 50. Next time • More on specific experimental designs • Take a look at the readings — choose chapters that are interesting to you • Assignment 1! • Due Sunday 50