PSYC214 – LEARNING AND BEHAVIOUR
WEEK ONE LECTURE
DOMJAN CHAPTERS 1 AND 2
HOUSEKEEPING
 LIC and Lecturer: Dylan Fuller
 Lectures 1, 11-12
 dylan.fuller@acu.edu.au
 Consultation: via appointment (email)
 First point of contact for administrative needs related to PSYC214
(e.g., extensions, special consideration, other general questions, etc.)
 Lecturer and Tutor: Tom Nicholl
 Lectures 2-10
 tom.nicholl@acu.edu.au
HOUSEKEEPING
 Prerequisites: PSYC100, PSYC101, PSYC104
 You must have completed these units prior to commencing this unit
 If you have not, contact your course coordinator immediately
ABOUT THE UNIT
 Main aims:
 Use models of learning (e.g., classical conditioning, operant conditioning, behavioural
economics) to describe, explain, predict and change human behaviour
 Show how models of learning can be used in everyday life / clinical settings.
ASSESSMENTS
Assessment Task     Due Date                        Weighting
Mid-Semester Exam   During lecture time in Week 7   30%
Lab Report          Monday 7th of October @ 5pm     40%
Final Exam          During examinations period      30%
MAIN TEXTBOOK
 The Principles of Learning and Behaviour, 7th ed (Domjan, 2020).
 Information for this book is in ‘Readings’ under Information and
Resources on Canvas
 You must complete the weekly readings – lectures cannot possibly
cover everything!
TODAY’S LECTURE
 Introduction to the study of learning and behaviour
 History
 Definition
 Methodological approaches
 Elicited behaviour
 Habituation and sensitization
LEARNING AND BEHAVIOUR THEORY – A BRIEF
HISTORY
RENÉ DESCARTES (1596-1650)
 Theories of Learning arguably began with the philosophy of
Descartes
 Previously, people believed that behaviour was due to conscious
choices/free will
 Descartes disagreed! He put forward Cartesian Dualism:
 Involuntary Behaviour: Automatic reactions to external stimuli
(i.e., reflex)
 Voluntary Behaviour: Conscious actions made by people
 Believed that voluntary behaviour was uniquely human (no
voluntary behaviours in animals)
NATIVISM VS EMPIRICISM
 Some philosophers like Descartes believed in Nativism:
 Some knowledge is innate: people are born already knowing it
 E.g., the concept of self
 Other philosophers believed in Empiricism:
 We are born as a ‘clean slate’ with no previous knowledge
 Learn things as we go
 Predictability of behaviours was also an issue of contention
THOMAS HOBBES (1588-1679)
 Believed that human behaviour was guided in a predictable
manner by fixed principles
 Unlike Descartes, who believed that behaviour, although at times
voluntary, was not predictable
 Proposed that voluntary behaviour was governed by hedonism,
the pursuit of pleasure and avoidance of pain.
THEORY OF EVOLUTION, NATURAL
SELECTION – CHARLES DARWIN (1809-1882)
 Explains how species survive and change over generations
 All species descend (have evolved) from a common ancestor.
This assumes continuity from nonhuman to human animals.
 Environment poses an ‘evolutionary problem’ (Avoid
predators, find food and mate, etc.)
 Only some organisms will have “the solution”. These organisms
are the most likely to survive, reproduce, and pass on their
genes
EXAMPLE OF
NATURAL
SELECTION –
GREY PEPPERED
MOTHS
 Environmental issue for moths: Camouflage needed to avoid predators
 3 mechanisms support natural selection:
 Variability: A population has various characteristics, e.g., some
moths are light and some dark
 Differential reproductive success: Those with “the solution” to the
evolutionary problem are more likely to reproduce, e.g., light moths
are more likely to survive and reproduce than dark moths
 Heritability: Pass on the genes and therefore pass on characteristic
facilitating survival, e.g., more light moths over generations
THEORY OF EVOLUTION – NATURAL SELECTION
 These changes happen slowly over
time
 But…what is selected at one time,
may be detrimental later
 When the bark changes back!
THEORY OF EVOLUTION – NATURAL SELECTION
 According to theory:
 Both physical traits and psychological or mental abilities play a role in
natural selection and evolution of humans and non-human animals
 In both humans and non-human animals, relevant psychological abilities
include activity level, aggression, introversion, extroversion, anxiety,
curiosity, imitation, attention, memory, reasoning, aesthetic sensibility,
and belief in spiritual agencies
 “Turkey death dance”
https://www.youtube.com/watch?v=VZr7uzw6gpg
THEORY OF EVOLUTION – NATURAL SELECTION
LIMITS OF NATURAL SELECTION
 Natural selection works over many generations.
 Sudden changes in the environment/animal may lead to extinction (not enough time to adapt)
 Polar Bears – Global Warming
 Some adaptations that occur may not be useful (and can even be harmful) when the environment changes
TODAY – 3 TYPES OF BEHAVIOURS
 Elicited Behaviour: Reflexes
 Elicited Behaviour: Modal Action Patterns
 Learning
REFLEXES
 The simplest form of elicited behaviour
 Involves an eliciting stimulus and a specific
response (S – R)
 Highly stereotypic for all members of a
species
 But! Some variability via sensitisation and
habituation
REFLEXES, SOME EXAMPLES
 Puff of air to the eye → eye blinking
 Dust on the face → sneezing
 Fearful stimulus → higher galvanic skin response (i.e., increased skin conductance from sweating)
REFLEXES
 A consequence of activation of the nervous system via a reflex arc
1. The environmental stimulus activates an afferent (sensory) neuron, which carries the impulse to the spinal cord.
2. The neural impulses are relayed to an interneuron and then to an efferent neuron (motor neuron).
3. REFLEX ARC comprises afferent neuron, interneuron, efferent neuron
MODAL ACTION PATTERNS (MAPS)
 Response sequence to a specific “sign”
stimulus, species-specific
 Involves interconnected actions. These are not
learnt and are the same across generations
 E.g., Baby penguin feeding: Baby penguin will
tap on parents' beak (stimulus) for food, and
with that stimulus the parent penguin will
know to feed baby penguin
MODAL ACTION PATTERNS (MAPS)
 Series of interrelated actions that can be found in all members of the species
 Similar to reflexes:
1. Genetic basis (no learning)
2. Little variability within species or within individuals
3. Reliably elicited by particular events
 Different from reflexes:
1. More complex
2. More variable
MODAL ACTION PATTERNS (MAPS)
 Can also be known as ‘Fixed action’ patterns
 These MAPS solve evolutionary problems
 They provide the members of the species with a ready-made solution to problems they are sure to
encounter
 No need to learn the behaviours that will solve the problem
 They are likely to have evolved gradually.
 Although they are complex behaviours, they are not intentional acts…
ELICITING STIMULI FOR MAPS
 MAPs are triggered by events called “releasers”
 The elicitation of MAPS often occurs within a complex array of competing stimuli.
 The few essential features for eliciting a specific response are called sign stimuli (or releasing stimuli)
 Supernormal stimuli are exaggerated versions of sign stimuli that are even more effective in eliciting the behaviour.
MAPS – SOME EXAMPLES
 Cats’ reaction to threat: arched back, hissing, growls, flick of tail, etc
 Migration of geese in V pattern
 Courtship and mating in many animals
 Laying eggs
BEHAVIOURS ORGANISED INTO SEQUENCES
 Appetitive behaviours
 Early components of a behavioural sequence.
 Represent desire or need for a particular consequence.
 They tend to vary (e.g., strategies to get to food)
 Consummatory behaviours
 The end components.
 Represent the consummation or completion of a response sequence
 They tend to be fixed (e.g., eating)
LEARNING
LEARNING - DEFINITION
 “an enduring change in the mechanisms of behaviour involving specific stimuli and/or responses that
results from prior experience with those or similar stimuli and responses.” – Domjan, 2020, p.14
IS LEARNING EXPLICIT OR IMPLICIT?
 Explicit - Learning can result from special instructional training such as schooling, and from common
contacts with the environment.
 Implicit - Much of our behaviour (and learning) is done outside of conscious awareness
 Much of the field of learning and behaviour is from the behaviourist perspective, i.e., empirically
rigorous science focused on observable behaviours rather than non-observable internal mental processes
 Learning and behaviour can be understood through an analysis of its antecedents and consequences
METHODOLOGICAL APPROACH – HOW DO WE
MEASURE LEARNING?
UNDER WHAT CONDITIONS SHOULD LEARNING (NOT) BE
STUDIED?
 Anecdotal evidence: “I know someone who…” or “everyone knows that…”
→ Biased, insufficient sample
 Case studies: Few individuals in detail
→ Costly, questions of generalisability, no possibility to assert causality
 Descriptive studies (correlational)
→ Cannot infer causality. You would have heard me say this many times in statistics!
UNDER WHAT CONDITIONS SHOULD LEARNING BE STUDIED?
 Experimental studies: Individuals who received the training procedure have to be compared with individuals
who do not receive that training
 Ensuring the cause is either present or absent
 Helps eliminate any confounding variables
 Can be between-subjects or within-subjects designs
 Between-subjects designs
 Different groups of people, assigned to different conditions (levels) of the independent variable
 Within-subjects designs
 Same group of people completing every condition (level) of the independent variable
HOW DO WE MEASURE LEARNING?
 To study learning: measure behaviour change
 Operational definitions (operations used to measure/define the behaviour)
 Number of errors made (fewer errors as we learn), e.g., spelling tests
 Change in topography (e.g., improved accuracy)
 Change in intensity
 Change in speed (e.g., faster)
 Change in latency (e.g., faster response over time)
 Change in rate (e.g., greater frequency over time)
WHAT ARE THE PROBLEMS WITH EXPERIMENTAL RESEARCH?
 Lack of generalizability
 Uses arbitrary responses and stimuli
 E.g., Salivation and bells
 (for the benefit of increased control)
 But! it provides the basic principles on which applied research can be built
 It makes sense to study simplified problems before adding complexity
GENERAL PROCESS APPROACHTO LEARNING
 Assumption: learning phenomena are products of elemental processes that operate
consistently across situations and species.
 Across species, learning is governed by common fundamental rules or “principles”
 Seeks to formulate laws to organize and explain a diversity of events, through experiments in
many distinct situations and species
GENERALITY OF LEARNING
 If general processes of learning exist, then we should be able to discover these rules in any
situation where learning occurs, and across any species.
 Animal studies allow us to conduct research that could not be conducted with humans
 Ethics approval: cost-benefit analysis
HABITUATION AND SENSITIZATION
HABITUATION AND SENSITIZATION
 Habituation and sensitization are caused by repeated presentation of the same stimulus.
 They both serve to focus on the relevant stimuli.
 They are both affected by the level of arousal / state system
 They occur in the central nervous system.
 Habituation effects = decreases in responsiveness due to repeated stimulation, e.g., we stop
noticing background music after a while (particularly if the room is quiet, as habituation is
more evident at lower arousal)
 Sensitization effects = increases in responsiveness due to repeated stimulus presentation,
e.g., “happy” music becomes annoying after a while, particularly if someone is talking loudly
(sensitization increases with arousal)
HABITUATION – STIMULUS SPECIFIC EFFECT
 Habituation is a reaction to a specific stimulus.
 Changes in the stimuli may reduce habituation
 Focusing on alternative stimuli while initiating the consummatory behaviour reduces
habituation
 E.g., We may habituate less to specific foods if we eat them with different spices, while
listening to music, or in company
CASE EXAMPLE – THE STARTLE RESPONSE
THE STARTLE RESPONSE IS A DEFENSIVE REACTION TO THREAT OR FEAR, CHARACTERIZED BY A SUDDEN
TENSING OF THE UPPER BODY IN RESPONSE TO A SUDDEN STIMULUS
CASE EXAMPLE – THE STARTLE RESPONSE
 https://www.youtube.com/watch?v=FOUZ7xmUkC8
HABITUATION OF THE STARTLE RESPONSE
SENSITIZATION MODULATES THE STARTLE RESPONSE
 Sensitization is influenced by:
 Physiological arousal
 How often the stimulus is presented
 Sensitization of the startle response is situation dependent. When an organism is already
aroused, it is more likely to sensitize to a stimulus. This can be used to “increase”:
 Excitement e.g., loud music, movies, and sporting events
 Fear e.g., music during a horror movie
 Pain e.g., repeated exposure, could lead to stronger feelings of pain.
SENSITIZATION AND THE STARTLE RESPONSE
DUAL PROCESS THEORY
 The dual process theory assumes that different neural processes underlie habituation and
sensitization (and that these processes are not mutually exclusive).
 Habituation and sensitization effects are assumed to reflect the sum of the outcomes of the
habituation and sensitization processes, depending on which system is stronger in a given situation.
 For example, rats habituate to a loud bang if the background noise is quiet; but sensitise to the same
loud bang if the background noise is loud.
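The summation idea above can be pictured with a toy simulation (my illustration, with made-up rates and units; this is not a model from the lecture or Domjan): the S-R system adds habituation on every presentation, the state system adds sensitization only when arousal is high, and the observed response is the net of the two.

```python
# Toy dual-process sketch: observed responding is the net outcome of a
# habituation process (S-R system, grows with every presentation) and a
# sensitization process (state system, grows only with arousal).
# All rates and units are arbitrary illustrative assumptions.

def net_startle(n_trials, arousal, hab_rate=0.15, sens_rate=0.3):
    """Startle magnitude after each repeated presentation of a loud bang."""
    responses, habituation, sensitization = [], 0.0, 0.0
    for _ in range(n_trials):
        habituation += hab_rate * (1 - habituation)                 # S-R system
        sensitization += sens_rate * arousal * (1 - sensitization)  # state system
        responses.append(1.0 + sensitization - habituation)
    return responses

quiet = net_startle(10, arousal=0.0)  # quiet background: habituation dominates
loud = net_startle(10, arousal=1.0)   # loud background: sensitization dominates
print(quiet[0], quiet[-1])  # response declines across trials
print(loud[0], loud[-1])    # response is maintained across trials
```

With the same stimulus and the same number of presentations, only the arousal level differs, which reproduces the rat example: responding declines against a quiet background but not against a loud one.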
DUAL PROCESS THEORY
 The habituation process occurs in the Stimulus-Response system (S-R system) of the
nervous system.
 The S-R system is the shortest neural pathway between the sense organ and responding
muscle.
 Every presentation of an eliciting stimulus activates the Stimulus-Response system.
 E.g., we habituate to food, music, sex, drugs, environment (bushwalk, clock ticking)
DUAL PROCESS THEORY
 The sensitization process occurs in the state system.
 The state system consists of other regions of the nervous system responsible for general
arousal levels.
 The state system is only activated in special circumstances, such as presentation of an intense
stimulus.
 E.g.,We sensitize to music, pain, drugs, etc.
OPPONENT PROCESS THEORY OF MOTIVATION
 Some stimuli elicit biphasic emotional responses.
 Biphasic responses have:
 An initial strong response
 An adaptation response
 An opposite response
 E.g., alcohol and drugs (anxiolytic and anxiogenic), love and attachment
OPPONENT PROCESS THEORY OF MOTIVATION
 Homeostatic theory – postulates that opposite neurophysiological mechanisms involved in
emotional behaviour serve to maintain emotional stability.
 An emotionally arousing stimulus pushes an individual’s state away from neutral which
triggers an opponent process that counters the shift.
 That is, when experiencing an intense positive emotion, the opponent process pushes us to
feel down/low.
OPPONENT PROCESS THEORY OF MOTIVATION - ADDICTION
Initial presentation (recreational use) vs. after habituation (addiction stage)
SUMMARY
 Historical perspectives
 Natural selection
 Behaviour traits
 Elicited behaviour
 Reflexes
 Modal Action Patterns
 Learning
 Habituation and sensitization
NEXT WEEK
CLASSICAL CONDITIONING
PSYC214 – LEARNING AND BEHAVIOUR
WEEK TWO LECTURE
DOMJAN CHAPTER 3
HOUSEKEEPING
¡ My name is Thomas Nicholl, I will be working with Dylan Fuller for PSYC214
¡ Tutorials Week 1 to 10
¡ Lectures Week 2 to 10
¡ Post questions in the CANVAS discussion board first
¡ Contact via CANVAS (not email) for tutorial based questions
¡ Contact Dylan (LiC) for course/assignment/exam related questions.
¡ Data Collection for Lab Report in Tutorials
¡ Online Study
¡ Please bring laptops to class
TODAY’S LECTURE
¡ Classical (Pavlovian) conditioning
¡ History
¡ Examples
¡ Higher order conditioning
¡ Fear conditioning
¡ Eyeblink conditioning
¡ Sign tracking/Auto shaping vs Goal tracking
¡ How is it studied/measured?
¡ Excitatory vs inhibitory classical conditioning
LEARNING OUTCOMES
¡ Understand, explain & exemplify the following concepts:
¡ Mechanisms of classical conditioning
¡ Factors that contribute to effective conditional and unconditional stimuli
¡ Learning associations, higher-order conditioning and sensory preconditioning
¡ Unconditioned Response, Unconditioned Stimulus, Conditioned Response, Conditioned Stimulus, Neutral Stimulus
¡ Effects of the US and CS on the CR
¡ Classical and higher order conditioning
¡ Fear Conditioning
¡ Excitatory vs inhibitory classical conditioning
¡ Sign tracking
PAVLOV
¡ Russian physiologist, well known for his work on classical conditioning
¡ While studying the functioning of the digestive system, he
found that dogs salivated not only upon actually eating, but also
when they saw the food, noticed the man who brought it, or even
heard his footsteps.
¡ Pavlov began to study this phenomenon, which he called ‘conditioning’
PAVLOV
¡ Initially, dog salivates when food is
presented
¡ This is a reflex of the salivary gland
¡ Then, dog salivates before food arrives
¡ How could this happen as a result of
experience?
¡ Psychic reflex (psychic secretions)
https://www.youtube.com/watch?v=S6AYofQchoM
3 minute video explaining Pavlov’s interest &
contribution to classical conditioning & associative
learning
PAVLOVIAN OR CLASSICAL CONDITIONING
¡ Simplest mechanism whereby organisms learn about relations between one event and another
¡ Two types of reflexes:
¡ Unconditional (unconditioned) reflexes
¡ US → UR
¡ Food in mouth → salivation
¡ Conditional (conditioned) reflexes
¡ CS → CR
¡ Experimenter → salivation
CLASSICAL CONDITIONING
Before conditioning: CS = NS → NR (the stimulus is neutral and elicits no response)
CLASSICAL CONDITIONING – UNCONDITIONED STIMULUS
¡ A stimulus that elicits a particular response without the necessity of prior training
CLASSICAL CONDITIONING – UNCONDITIONED STIMULUS
¡ A stimulus that elicits a particular response without the necessity of prior training (e.g., food)
CLASSICAL CONDITIONING – UNCONDITIONED RESPONSE
¡ A response that occurs without the necessity of prior training (e.g., salivating)
Repeated Pairings
Note: The NS does not have to be a bell! It can be
another neutral thing, like a light bulb flashing
CLASSICAL CONDITIONING – CONDITIONED STIMULUS
¡ A stimulus that does not elicit a particular response initially but comes to do so after being associated with an
UNCONDITIONED STIMULUS (e.g., meat)
TEST TRIAL: The CONDITIONED STIMULUS is presented
without the UNCONDITIONED STIMULUS. This allows
measurement of the conditioned response in the absence of the
UNCONDITIONED STIMULUS.
INITIAL RESPONSES TO THE STIMULI
¡ US effective in eliciting target response from outset
¡ CS does not elicit conditioned response initially; results from association with US
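The growth of conditioned responding over pairings can be sketched as a simple error-correction update (my illustration in the spirit of a Rescorla-Wagner-style rule, which is not covered in this lecture; the learning rate and asymptote are arbitrary assumptions):

```python
# Toy acquisition sketch: associative strength V starts at 0 (the CS
# elicits no CR before training) and moves toward an asymptote of 1.0
# with each CS-US pairing. The learning rate is an illustrative assumption.

LEARNING_RATE = 0.3

def acquisition(n_pairings):
    """Associative strength of the CS after each CS-US pairing."""
    v, history = 0.0, []
    for _ in range(n_pairings):
        v += LEARNING_RATE * (1.0 - v)  # error-correction toward the asymptote
        history.append(v)
    return history

strengths = acquisition(10)
print(round(strengths[0], 2))   # weak CR after a single pairing
print(round(strengths[-1], 2))  # near-asymptotic CR after ten pairings
# A test trial would present the CS alone and read off the CR magnitude
# (here, the current associative strength) without the US present.
```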
US and CS relative to each other
RECAP – A CLASSIC LOOK AT CLASSICAL CONDITIONING
HIGHER ORDER CONDITIONING
LEARNING WITHOUT AN UNCONDITIONED STIMULUS
¡ Higher-Order Conditioning:
¡ CS1 is paired with US often enough to condition strong response to CS1
¡ Once CS1 elicits conditioned response, pairing CS1 with new stimulus CS2 conditions CS2 to also elicit the conditioned
response
¡ Conditioning occurs in the absence of US
¡ This is the basis of “irrational fears”
HIGHER-ORDER CONDITIONING
HIGHER ORDER CONDITIONING
LEARNING WITHOUT AN UNCONDITIONED STIMULUS (CONT.)
¡ Sensory Pre-Conditioning:
¡ CS1 and CS2 become associated
¡ CS1 paired with illness (US); CR develops to CS1
¡ Participants with aversion to CS1 also show aversion to CS2, even though CS2 was never directly paired with US
EVERYDAY EXAMPLES OF HIGHER ORDER CONDITIONING
¡ Prof Sapolsky highlights that the outcome is not about the
pleasure/reward, but the anticipation of it (refer to video link below)
¡ Other examples of conditioned stimuli leading to CR?
¡ In restaurants
¡ Smart phone tones
¡ THAT song when your relationship ended ….
¡ Use of celebrities in advertising
www.youtube.com/watch?v=axrywDP9Ii0 Prof Sapolsky on anticipation of
pleasure
https://www.youtube.com/watch?v=YvhlOtQAU0A Simpsons video of
conditioning
FEAR CONDITIONING
FEAR CONDITIONING – CONDITIONED SUPPRESSION
¡ www.youtube.com/watch?v=ZlZekx1P1g4
Conditioned suppression of a rat's lever pressing
video
¡ Suppression of ongoing behaviour (e.g., drinking or
suppression of lever to get food) produced by the
presentation of a CONDITIONED STIMULUS that
has been conditioned to elicit fear through
association with an aversive UNCONDITIONED
STIMULUS (e.g., shock)
FEAR CONDITIONING – CONDITIONED SUPPRESSION
FEAR CONDITIONING – LICK SUPPRESSION PROCEDURE
¡ A procedure to test fear conditioning
¡ Presentation of a fear conditioned CONDITIONED STIMULUS (e.g., a light preceding an electric shock) slows
down the rate of drinking
HOW DOES THIS TRANSLATE TO HUMAN BEHAVIOUR?
HOW DOES THIS TRANSLATE TO HUMAN BEHAVIOUR?
¡ “At approximately nine months of age we ran Albert through the emotional tests that have become a part of our
regular routine in determining whether fear reactions can be called out by other stimuli than sharp noises and the
sudden removal of support....”
¡ “In brief, the infant was confronted suddenly and for the first time successively with a white rat, a rabbit, a dog, a monkey
with masks with and without hair, cotton wool, burning newspapers, etc. A permanent record of Albert's reactions to
these objects and situations has been preserved in a motion picture study....”
¡ “At no time did this infant ever show fear in any situation.”
¡ http://www.youtube.com/watch?v=Xt0ucxOrPQE&feature=related (John Watson overview)
LITTLE ALBERT: WATSON AND RAYNER (1920)
¡ “The sound stimulus, thus, at nine months of age,
gives us the means of testing several important
factors:
¡ Can we condition fear of an animal, e.g., a white rat,
by visually presenting it and simultaneously striking a
steel bar?
¡ If such a conditioned emotional response can be
established, will there be a transfer to other animals
or other objects?
¡ What is the effect of time upon such conditioned
emotional responses?
¡ If after a reasonable period such emotional
responses have not died out, what laboratory
methods can be devised for their removal?”
LITTLE ALBERT: WATSON AND RAYNER (1920)
¡ “These experiments would seem to show conclusively that directly conditioned emotional responses as well as
those conditioned by transfer persist, although with a certain loss in the intensity of the reaction, for a longer
period than one month. Our view is that they persist and modify personality throughout life.”
¡ “Unfortunately, Albert was taken from the hospital the day the above tests were made. Hence the opportunity of
building up an experimental technique by means of which we could remove the conditioned emotional responses
was denied us.
¡ Our own view, expressed above, which is possibly not very well grounded, is that these responses in the home
environment are likely to persist indefinitely, unless an accidental method for removing them is hit upon.”
WHAT HAPPENED TO LITTLE ALBERT?
¡ Recent investigations indicate that Albert was likely to have been a pseudonym for Douglas Merritte, who died
aged 6 from hydrocephalus in 1925.
¡ Subsequent investigations suggest that Albert/Douglas suffered from congenital hydrocephalus:
¡ This calls into question Watson and Rayner’s assertion that he was a healthy child at the time of the experiments.
¡ Recent studies claim that the archival footage of Albert/Douglas indicates delayed development and abnormal
responses.
WATSON AND RAYNER
¡ The pair married in 1921 after Watson’s divorce.
¡ Rosalie assisted Watson to write the most popular childrearing book of
the time, “The Psychological Care of Infant and Child” (1928).
¡ “Never hug and kiss them, never let them sit on your lap. If you must,
kiss them once on the forehead when they say good night. Shake hands
with them in the morning.”
¡ Watson’s 4 children were documented to experience a range of
psychological problems, and one became a psychoanalyst.
EYEBLINK CONDITIONING
¡ Eyeblink reflex is an early component of the startle response
¡ A defensive reaction to threat or fear, characterised by a sudden tensing of the upper body in response to a sudden stimulus
¡ E.g., puff of air to the eye, someone clapping in your face
¡ https://www.youtube.com/watch?v=hoA2Pm9cjQs
SIGN TRACKING OR AUTO SHAPING
SIGN TRACKING OR AUTO SHAPING
¡ Example: Pigeons
¡ CS (light) paired with food (US)
¡ Conditioned pecking of light even though
not required to gain access to food (and it was
2.5 meters away from food dispenser!)
SIGN TRACKING OR AUTO SHAPING
¡ Example: Quails
¡ CS (wood block) paired with females (US)
¡ Conditioned standing on wood even though not
required to gain access to females (and it was
away from females!)
SIGN TRACKING VERSUS GOAL TRACKING
¡ Individual differences (e.g., rats)
¡ Some individuals show sign tracking (peck on light linked to food or go to block associated with females).
¡ Other individuals show goal tracking (peck on food or follow female)
¡ These individual differences appear to have a genetic basis
CLASSICAL CONDITIONING: MORE THAN STIMULUS-RESPONSE?
¡ Stimulus substitution theory:
¡ The behaviourist tradition viewed classical
conditioning as a simple mechanical process in which
control over a reflex response is passed from
one stimulus (UCS) to another (CS)
¡ Evidence in support of the stimulus
substitution hypothesis:
¡ Jenkins & Moore (1973) study:
¡ Autoshaping in pigeons:
¡ One group had CS (light) → US (grain)
¡ Photos showed pigeons trying to “eat” the lit key (open
beak and closed eyes) when they pecked
¡ 2nd group had CS (light) → US (water)
¡ Photos showed pigeons trying to “drink” the lit key
(closed beak and open eyes) when they pecked
ACQUIRED TASTE PREFERENCES AND AVERSIONS
¡ We can acquire new preferences given specific circumstances
¡ Taste aversion learning
¡ Flavour-illness pairing
¡ Single trial learning
¡ Long delay learning
¡ Evaluative conditioning
¡ Learn to like/dislike new flavour
¡ Neutral flavour paired with already liked or disliked flavour
CLASSICAL CONDITIONING: MORE THAN STIMULUS-RESPONSE?
¡ Evidence against the stimulus substitution
hypothesis
¡ Any study in which the elicited CR is different from
the UCR
¡ e.g., when a tone is paired with shock, rats will jump to
the UCS (shock), but the CR is typically freezing
¡ e.g., when a light is paired with food, rats will rear to
the light (CR) but the UCR is approach to the food
dispenser
¡ Preparatory Response Model
¡ Kimble’s (1961, 1967) theory proposed that the CR
is a response that serves to prepare the organism for
the upcoming UCS
¡ e.g., following acquisition of CRs in eyeblink
conditioning, the CR eyeblink may actually prepare the
person for the upcoming air puff such that the eye
would be partially closed when the air puff occurs
IS IT POSSIBLE TO TRAIN ANY ANIMAL ANY CS-UCS
ASSOCIATION?
¡ No. What can be learned is strongly constrained by the organism's evolutionary history
¡ We call the biological constraints on classical conditioning Biological Preparedness (Seligman, 1970)
¡ Not all CSs are created equal
¡ Mere contiguity (temporal pairing) of NS and UCS is not sufficient
WITHIN SPECIES THERE ARE BIOLOGICAL CONSTRAINTS ON WHAT
ASSOCIATIONS CAN BE LEARNED
¡ The table shows the results from studies with rats
in which the experience of an electric shock (which
produces a pain response) or X-rays (which
produces a nausea response) was paired with three
different kinds of CSs: Light, sound and taste.
¡ What do the results tell us about the preparedness
of rats for learning signals that predict painful or
nauseating stimuli?
BIOLOGICAL PREPAREDNESS
¡ Wilcoxon et al (1971) conducted experiments to test for biological preparedness
¡ Presented a compound stimulus (blue sour water) to quails & rats to test if colour or taste were used in learning
(in food)
¡ Two species with different biological preparedness for learning taste aversion
¡ 1. Rats – seek food based on smell/taste
¡ 2. Quails – seek food based on sight/colour
¡ What might we expect in the results?
CLASSICAL CONDITIONING: MORE THAN STIMULUS-RESPONSE?
¡ The compensatory-response model is one version of
preparatory-response theory
¡ In this model of classical conditioning, the
compensatory after-effects to a US are what come
to be elicited by the CS
¡ Based on the opponent-process theory of emotion /
motivation
¡ Opponent-Process Theory of Emotion
¡ Emotional events elicit two competing processes:
¡ The primary/A process that is immediately
elicited by the event
¡ e.g., taking an exam elicits an unpleasant A-
state
¡ An opponent/B process that is the opposite of
the A-process and counteracts it
¡ e.g., the pain during the exam (A-state) creates
a pleasant relief response (B-state) following
the exam
DRUGS,ADDICTION,AND PREPARATORY RESPONSE THEORY
¡ Taking a drug disturbs the homeostasis of the body (up or down).
¡ The body has a natural reflex response to respond to the effect of the drug by compensating for its effect to
return homeostasis.
¡ This is called the compensatory response (reflex)
¡ The act of taking a drug is accompanied by many environmental stimuli
¡ When you repeatedly use a drug, you become tolerant of its effects. Tolerance occurs when the effect of a drug
decreases over the course of repeated administrations.
¡ You also experience cravings (withdrawal)
¡ In terms of classical conditioning – try to develop an explanation for how the effects of tolerance and
craving (withdrawal) might arise.
CONDITIONED COMPENSATORY RESPONSE
¡ The reflex is the body’s natural compensatory response to the effect of the drug
¡ Situational cues (initially neutral) that become associated with drug use become CSs
¡ Repeated cueing of the drug’s effect by situational cues establishes a conditioned compensatory response.
¡ When the CSs are present, the body prepares for the likely effect of the drug before it has been administered.
¡ Tolerance develops after pairings of the pre-drug CSs
with the drug effect UCS.
¡ Produces a conditional compensatory response to
the CSs alone.
¡ The conditioned compensatory response
counteracts the drug effect, producing tolerance.
¡ As the drug is administered more and more often, and
the conditional compensatory response grows in
strength, the weakening of the drug effect becomes
more pronounced.
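The arithmetic of this account can be sketched in a toy model (my illustration, not from the slides; the effect size and growth rate are arbitrary assumptions): the net drug effect is the fixed direct effect minus a conditioned compensatory response that strengthens with each cue-drug pairing, so the net effect shrinks in the familiar (cued) context but not in a novel one.

```python
# Toy conditioned-tolerance sketch: net effect = direct drug effect minus
# a cue-elicited compensatory response that grows with cue-drug pairings.
# All numbers are arbitrary illustrative assumptions.

DIRECT_EFFECT = 10.0  # arbitrary units

def compensatory_response(n_pairings, rate=0.25):
    """Strength of the conditioned compensatory response after repeated
    cue-drug pairings (grows toward the size of the direct effect)."""
    cr = 0.0
    for _ in range(n_pairings):
        cr += rate * (DIRECT_EFFECT - cr)
    return cr

def net_effect(n_pairings, familiar_context=True):
    # In a novel context the pre-drug CSs are absent, so no compensation.
    cr = compensatory_response(n_pairings) if familiar_context else 0.0
    return DIRECT_EFFECT - cr

print(net_effect(0))                           # first use: full effect
print(net_effect(20))                          # tolerance in the usual context
print(net_effect(20, familiar_context=False))  # novel context: full effect returns
```

The novel-context case illustrates why, on this account, tolerance is partly situational: remove the cues and the compensatory response is not triggered, so the drug hits at full strength.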
EXCITATORY AND INHIBITORY CONDITIONING
¡ Excitatory Conditioning – Neutral Stimulus (NS)
associated with presentation of Unconditioned
Stimulus (US)
¡ Inhibitory Conditioning – Neutral Stimulus (NS)
associated with absence or removal of
Unconditioned Stimulus (US)
What if a stimulus is associated with the absence of
the US rather than its presentation?
¡ In excitatory conditioning, organisms learn a
relationship between a CS and US
¡ As a result of this learning, presentation of the CS
activates behavioural and neural activity related to
the US in the absence of the actual presentation of
that US.
¡ E.g., Pavlov's Dog
INHIBITORY CONDITIONING
¡ Organisms learn to predict the absence of the US
¡ PREREQUISITE: the US must occur regularly beforehand (you need to know WHEN bad things happen,
e.g., shocks, panic attacks, bullying, running out of petrol)
¡ Something is introduced that prevents an outcome that would occur otherwise
¡ The organism learns to expect the absence of the US
¡ Examples:
¡ Panic attack in crowds → No panic attack when avoiding crowds
¡ Bullied when teacher is away → No bullying when teacher is around
¡ Out of petrol when sign is on → With petrol when sign is off
PSEUDO CONDITIONING
SUMMARY
¡ Classical conditioning
¡ Form of associative learning
¡ Not just restricted to reflexes – higher order conditioning
¡ Examples include
¡ Fear conditioning
¡ Eyeblink conditioning
¡ Sign tracking and goal tracking
¡ Excitatory and Inhibitory conditioning
¡ Procedures
¡ Measures of conditioned responding
RECOMMENDED READINGS
¡ Fridlund, A. J., Beck, H. P., Goldie, W. D., & Irons, G. (2012). Little Albert: A neurologically impaired child.
History of Psychology. Advance online publication. doi:10.1037/a0026720
¡ Jones, M. C. (1924). A laboratory study of fear: The case of Peter. The Pedagogical Seminary and Journal of Genetic
Psychology, 31(4), 308-316. doi:10.1080/08856559.1924.9944851
¡ Rescorla, R. A. (1988). Pavlovian conditioning: It’s not what you think it is. American Psychologist, 43(3), 151-160.
¡ Siegel, S. (2005). Drug tolerance, drug addiction, and drug anticipation. Current Directions in Psychological Science, 14, 296-300. doi:
10.1111/j.0963-7214.2005.00384.x
¡ Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3(1), 1-14.
https://doi.org/10.1037/h0069608
¡ Wilcoxon, H. C., Dragoin, W. B., & Kral, P. A. (1971). Illness-induced aversions in rat and quail: Relative salience of
visual and gustatory cues. Science, 171(3973), 826-828.
NEXT WEEK
¡ Factors that affect CC
¡ Alternative models of CC
PSYC214 – LEARNING AND BEHAVIOUR
WEEK THREE LECTURE
DOMJAN CHAPTER 4
HOUSEKEEPING
¡ Lab report guide and required readings now available on Canvas.
¡ Attendance at Tutorials for support with lab report.
¡ Mid-semester exam in Week 7
REVIEWING CLASSICAL CONDITIONING
CLASSICAL CONDITIONING
CLASSICAL CONDITIONING – UNCONDITIONED STIMULUS
¡ A stimulus that elicits a particular response without the necessity of prior training
CLASSICAL CONDITIONING – UNCONDITIONED STIMULUS
¡ A stimulus that elicits a particular response without the necessity of prior training (e.g., food)
CLASSICAL CONDITIONING – UNCONDITIONED RESPONSE
¡ A response that occurs without the necessity of prior training (e.g., salivating)
PSYC214 lecture slides for mid-sem exam (weeks 1-6)
CLASSICAL CONDITIONING – CONDITIONED STIMULUS
¡ A stimulus that does not elicit a particular response initially, but comes to do so after being associated with an
UNCONDITIONED STIMULUS (e.g., a bell)
REVISION QUESTION
¡ Identify each of the key concepts (i.e., US, UR, CS, CR)
¡ EXAMPLE 1: Every time someone turns on the washing machine in your house, the shower
becomes very cold and causes you to jump back. Over time, you begin to jump back
automatically after hearing the washing machine working, before the water temperature
changes
¡ NS → no response
¡ US → UR
¡ NS + US → UR
¡ CS → CR
REVISION QUESTION
¡ Identify each of the key concepts (i.e., US, UR, CS, CR)
¡ EXAMPLE 2: You eat a new food and then get sick because of the flu. However, you develop a
dislike for the food and feel nauseated whenever you smell it.
¡ NS → no response
¡ US → UR
¡ NS + US → UR
¡ CS → CR
HIGHER ORDER CONDITIONING
CONDITIONING - CONTINUED
¡ Examples:
¡ Fear conditioning:
¡ Sign tracking – movement towards/contact with a stimulus that signals the availability of a positive reinforcer.
¡ Goal tracking – conditioned behaviour elicited by a CS that consists of approaching the location where the US is present
¡ Eyeblink conditioning
¡ Stimulus Substitution Theory
¡ Opponent ProcessTheory
¡ Excitatory (presentation) and Inhibitory (absence/removal) Conditioning
CLASSICAL CONDITIONING CONTINUED
TODAY’S LECTURE
¡ What makes an effective CS and US?
¡ Novelty
¡ Belongingness
¡ Salience
¡ Intensity
¡ What determines the nature of CR?
¡ The US
¡ The CS
¡ How do CS and US become associated?
¡ Rescorla-Wagner Model
¡ Attentional models
¡ Temporal Coding Hypothesis
WHAT MAKES EFFECTIVE CS AND US?
¡ Initial responses to stimuli
¡ CS does not elicit the conditioned response initially, but comes to do so after being associated with US
¡ US elicits the response unconditionally
¡ Nearly any stimulus can be CS and US
¡ Novelty (more novel, faster to learn about the CS and US)
¡ CS and US intensity and salience
¡ Belongingness
¡ Learning without a US?
EFFECTIVE CS AND US: NOVELTY
¡ The Latent-Inhibition or CS-Preexposure Effect - novel (new) stimuli
are more effective than familiar stimuli
¡ Experiments on latent-inhibition effect have two phases:
¡ Participants given repeated presentations of CS by itself: the CS
preexposure makes CS familiar
¡ CS is paired with a US. Participants are slower to acquire responding
because of the CS preexposure
¡ The US-Preexposure Effect
¡ Experiments on US novelty are similar to CS-preexposure experiments
¡ Conditioning proceeds faster with novel than with familiar stimuli
EFFECTIVE CS AND US: INTENSITY & SALIENCE
¡ Learning is facilitated by higher stimulus intensity, which gives the stimulus more salience
¡ Salience means significance or noticeability – salient stimuli capture attention and are more likely to be relevant in the natural environment
¡ Examples:
¡ Faster learning using food as a US if the subject is hungry
¡ Stronger fear of dogs if attacked by a big dog as a child
¡ Faster learning when the US is one that occurs in the natural environment, e.g., using a female quail as the US/sexual reinforcer
makes male quail more responsive to CSs such as a light or a block of wood
CS–US RELEVANCE/BELONGINGNESS
¡ Learning depends on relevance of the CS to the US
¡ Kind of stimuli presented with USs important
¡ Example: taste readily associated with illness; audio-visual cues readily associated with peripheral pain
¡ Belongingness Experiment (Garcia and Koelling, 1966):
¡ Rats drank from a tube before administration of one of two USs: (i) shock or (ii) illness
¡ Rapid learning occurred only if the CS was combined with the appropriate US
¡ Rats conditioned with illness learned a stronger aversion to the taste CS (than to the audio-visual CS).
¡ Rats conditioned with shock learned a stronger aversion to the audio-visual CS (than to the taste CS).
CS–US RELEVANCE/BELONGINGNESS
¡ When the CS “belongs to” / is related to / is biologically relevant to the US, it is easier / faster to learn that
they are associated.
¡ It reflects a sensitization effect of CS pre-exposure
¡ For example:
¡ Rhesus monkeys and humans learn fear conditioning faster if the CS that signal danger are fear
relevant/biologically dangerous cues (snake, image of a skull) versus fear irrelevant (mushroom)
¡ Can you think of another example?
LEARNING WITHOUT AN UNCONDITIONED STIMULUS
¡ Pavlovian conditioning: food, shock,… but what about everything else?
¡ Higher-Order Conditioning:
¡ CS1 is paired with US often enough to condition strong response to CS1
¡ Once CS1 elicits conditioned response, pairing CS1 with new stimulus CS2 conditions CS2 to also elicit the conditioned
response
¡ Conditioning occurs in absence of US
¡ This is the basis of “irrational fears”
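A minimal sketch of this two-phase process, assuming (purely for illustration) that simple error-correction learning applies and that CS1’s learned value serves as the training target when CS2 is paired with CS1; the learning rate and trial counts are arbitrary:

```python
# Sketch of higher-order conditioning: CS2 acquires a response without
# ever being paired with the US itself. Treating CS1's learned value as
# the training target for CS2 is an illustrative assumption.

def train(v, target, alpha=0.3, trials=5):
    """Simple error-correction learning toward `target`."""
    for _ in range(trials):
        v += alpha * (target - v)
    return v

v_cs1 = train(0.0, target=1.0)     # Phase 1: CS1 paired with US
v_cs2 = train(0.0, target=v_cs1)   # Phase 2: CS2 paired with CS1 (no US)

# CS2 ends with substantial strength despite never meeting the US,
# though typically less than CS1.
print(round(v_cs1, 3), round(v_cs2, 3))
```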
LEARNING WITHOUT AN UNCONDITIONED STIMULUS
¡ Stimulus Substitution Model
¡ The theory that as a result of classical conditioning, subjects come to
respond to the CS in the same way that they respond to the US
¡ The CS can then act as a US
¡ It assumes that
¡ The CS becomes a surrogate of the US
¡ The nature of the US dictates the CR
¡ The Conditioned Stimulus leads to the Unconditioned Response via excitation of
Unconditioned Stimulus centres (e.g., amygdala).
STIMULUS SUBSTITUTION MODEL – LEARNING WITHOUT A US: HIGHER ORDER CONDITIONING EXAMPLE
¡ You are afraid of crowds and feel anxious in them (CS1).
¡ Perhaps because when you were in a crowd once (CS1), someone
pushed you and hurt you (US).
¡ You go to the movies, and a group of people rush in (CS1) and you
feel anxious (CR).
¡ The next time you think about going to the movies (CS2) with
friends, you feel anxious (CR). (Adapted from Wolpe, 1990)
WHAT DETERMINES THE NATURE OF THE CR?
¡ The US
¡ Core factor – the type of US should direct the UR/CR
¡ Example of the pigeons with food vs. water and their response (beak open/closed)
¡ Stimulus substitution model (previous slides)
¡ The CS
¡ The type of CS will determine the CR, dependent on the organism
¡ Example of a rat being presented as a CS (UR of gnawing/biting vs. CR of social orientation)
¡ The US-CS interval
¡ The interval (time, distance) between the US and the CS is vital
¡ Example of a car coming toward you and the potential for injury – the response depends on the distance from the car.
HOW DO THE CS AND US BECOME ASSOCIATED? [THEORIES OF
CLASSICAL CONDITIONING]
¡ The blocking effect
¡ “not learning an association”
¡ The Rescorla-Wagner model
¡ Attentional models
¡ Temporal coding hypothesis models
BLOCKING EFFECT
¡ Every Sunday you visit your partner’s parents… They always serve a cherry cake that slightly disagrees with you.
You don’t want to upset them, so you don’t say anything
¡ You acquire an aversion to the cherry cake (so that every time you are supposed to eat it, you feel unpleasant).
¡ On a special occasion, your partner’s parents add a special chocolate sauce to the cherry cake.
¡ You feel sick
Will you develop an aversion to the
chocolate sauce?
BLOCKING EFFECT
¡ Classical conditioning may not occur in some instances (e.g., chocolate might make you sick… but don’t learn an
association)
¡ Kamin: Classical conditioning occurs only when the US is unexpected
¡ The presence of a previously conditioned stimulus (e.g., cherry cake) may block / interfere with the conditioning of a novel
stimulus (e.g., chocolate sauce)
BLOCKING EFFECT, EXPERIMENT
¡ Phase 1: Training links bell & food
¡ Bell (CS1) → salivation (CR) because it anticipates food (US)
¡ Phase 2: Training links bell + light & food
¡ Bell (CS1) + simultaneous light (CS2) → salivation (CR) because it anticipates food (US)
¡ Phase 3: Testing the light alone
¡ Light (CS2) → NO salivation (no CR) as it does not anticipate food
¡ Salivation on only 20% of trials if the light alone is presented
¡ Learning of the light (CS2) as a CS predicting food is “blocked” by
the existing association between the bell (CS1) and the food
RESCORLA-WAGNER MODEL (1972)
¡ Mathematical model proposed by Robert Rescorla and Allan Wagner (1972)
¡ The idea that the effectiveness of a US is determined by how surprising it is forms the basis of a formal
mathematical model of conditioning.
¡ The model looks at the implications of US surprise for a wide variety of conditioning phenomena.
¡ According to the model, an unexpectedly large US is the basis for excitatory conditioning and an unexpectedly
small US (or absence) is the basis for inhibitory learning.
¡ The model suggests assumptions can be made about the expectation of the US.
¡ We will look at their mathematical model in a later slide.
BLOCKING EFFECT
¡ On the blocking effect…
¡ The RW model clearly predicts the blocking effect.
¡ If one CS already fully predicts that the US will come, nothing will be learned about a second CS that
accompanies the first CS
¡ On extinction: it is new learning that undoes an existing association
¡ It is not the reverse of acquisition
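The blocking prediction can be checked directly with the Rescorla-Wagner error-correction rule, ΔV = αβ(λ − ΣV) summed over the CSs present on a trial. The parameter values and trial counts below are illustrative, not from the lecture:

```python
# Rescorla-Wagner simulation of Kamin's blocking design.
# Update rule: dV = alpha * beta * (lam - V_total) for each present CS.
# Parameter values are illustrative.

ALPHA, BETA, LAM = 0.3, 1.0, 1.0

def rw_trial(values, present):
    """One conditioning trial: update V for every CS that is present."""
    v_total = sum(values[cs] for cs in present)
    error = LAM - v_total              # surprise: how unexpected the US is
    for cs in present:
        values[cs] += ALPHA * BETA * error

V = {"bell": 0.0, "light": 0.0}

for _ in range(30):                    # Phase 1: bell -> food
    rw_trial(V, ["bell"])
for _ in range(30):                    # Phase 2: bell + light -> food
    rw_trial(V, ["bell", "light"])

# The bell already predicts food, so there is little error left and the
# light gains almost no associative strength: blocking.
print(round(V["bell"], 3), round(V["light"], 5))
```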
RESCORLA-WAGNER MODEL (1972): LOSS OF ASSOCIATIVE VALUE IF THERE ARE TWO CSs
¡ The associative value of a CS (e.g., bell) in predicting a US (e.g., food) is lost:
¡ After both CSs (e.g., bell and light) have been associated with the US
(e.g., food) in separate trials
¡ When the CS is then presented together with the other CS (e.g., light) on a
conditioning trial
RESCORLA-WAGNER MODEL (1972): LOSS OF ASSOCIATIVE VALUE IF THERE ARE TWO CSs
¡ Example:
¡ One learning trial light → food (& light leads to salivation)
¡ One learning trial bell → food (& bell leads to salivation)
¡ Final trial light + bell → food (& light + bell does not lead to salivation)
¡ Why?
¡ Because light + bell leads to an over-expectation of double the food
¡ The subject has to decrease its expectation → light + bell lose associative value
RESCORLA-WAGNER MODEL (1972): LEARNING IS ABOUT SURPRISE
¡ How much you learn depends on the effectiveness of the US, how “surprising” and “unpredictable” the US is
¡ This generates a strong conditioned response (CR)
¡ The value of the CS (e.g., bell, light) is stronger / highly salient when it predicts the onset of the US (e.g., food)
RESCORLA-WAGNER MODEL (1972)
¡ The delta rule
¡ Learning at the start: big errors in predicting the “uncertain” US at the
start of learning (i.e., you don’t know when the US is going to happen)
¡ As learning goes on, “error correction” makes the US more
predictable
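The error-correction dynamic can be made concrete in a few lines. This is a minimal sketch of the delta rule; the combined αβ learning rate and the asymptote λ are arbitrary illustrative values:

```python
# The delta rule in action: the prediction error (lambda - V) is large
# at the start and shrinks as the US becomes predictable.
# alpha_beta and lam are illustrative values.

alpha_beta, lam = 0.2, 1.0
v = 0.0
errors = []
for trial in range(15):
    error = lam - v            # how surprising the US is on this trial
    errors.append(error)
    v += alpha_beta * error    # error-correction update

# Errors decay geometrically: big surprises early, little left to learn late.
print([round(e, 2) for e in errors])
```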
ATTENTIONAL MODELS
¡ American psychologists focused on changes in the impact of the US, whilst British psychologists looked at the
impact of how the CS commands attention.
¡ Assumption that increased attention facilitates learning about a stimulus and procedures that disrupt attention to
a CS disrupt learning.
¡ The outcome of a given trial (e.g., reward/loss) alters degree of attention commanded by the CS on future trials
¡ More attention à More learning
¡ If the US is surprising, it boosts attention to the CS (not the salience of the US) on future
conditioning trials
TEMPORAL CODING HYPOTHESIS/MODELS
¡ Neither the RW model nor attentional models address timing, although both acknowledge it is a critical factor.
¡ Time is a critical factor in classical conditioning.
¡ Timing is important: organisms learn not only that the US will occur, but when
¡ “Temporal coding” – participants learn when the US occurs in relation to a CS and use this
information in blocking, second-order conditioning, and other training.
¡ Learn when the US occurs in relation to a CS – two temporal factors of interest
TEMPORAL CODING HYPOTHESIS / MODEL
¡ Conditioned Responding depends on
¡ 1. Duration of the CS-US interval (or interstimulus interval)
¡ How long one must wait for the US to occur after the CS
¡ Longer duration between Pavlov’s bell and the meat → less conditioned responding
¡ 2. Inter-trial Interval
¡ How much time passes between one CS-US trial and the next
¡ Longer duration → more conditioned responding
¡ “A CS is associable with a US … only to the extent that it reduces the expected time to the next US”
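The two temporal factors pull in a consistent direction, and one illustrative way to combine them is as a ratio: responding tends to track the inter-trial interval divided by the CS-US interval. This particular ratio rule is an assumption in the spirit of relative-waiting-time accounts, not a formula given in the lecture:

```python
# Illustrative ratio rule combining the two temporal factors.
# The specific formula (ITI / CS-US interval) is an assumption in the
# spirit of relative-waiting-time accounts, not lecture material.

def conditioning_index(cs_us_interval, inter_trial_interval):
    """Higher index -> stronger conditioned responding (toy measure)."""
    return inter_trial_interval / cs_us_interval

# Longer CS-US interval -> weaker conditioning (index drops):
weak = conditioning_index(cs_us_interval=16.0, inter_trial_interval=120.0)
# Shorter CS-US interval and longer ITI -> stronger conditioning:
strong = conditioning_index(cs_us_interval=4.0, inter_trial_interval=240.0)

print(weak, strong)
```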
RECAP
¡ Classical conditioning as a basic approach
¡ Fear conditioning
¡ Eyeblink conditioning
¡ Sign tracking / goal tracking
¡ Higher order conditioning
¡ Excitatory/Inhibitory Conditioning
¡ What makes CC effective?
¡ Novelty, belongingness, salience, intensity
¡ What if it doesn’t work….Aka the Blocking Effect (links to higher order conditioning)
¡ How do the CS and US become associated (and do they?)
¡ RW model
¡ Attentional Models
¡ Temporal Coding
WEEK 4 AND WEEK 5
¡ Instrumental Conditioning (IC) à Week 4
¡ Reinforcement schedules and Choice à Week 5
BEFORE I LEAVE YOU (FOR NOW)
¡ Remember to read and review:
¡ Assigned Readings
¡ Lab report guide
¡ Required readings for the lab report (we will begin to discuss in week four tutorials)
PSYC214 – LEARNING AND BEHAVIOUR
WEEK FOUR LECTURE
DOMJAN CHAPTER 5
§ Instrumental (Operant) Conditioning
§ Early investigations
§ Modern approaches
§ Procedures
§ Fundamental elements
TODAY’S LECTURE
INSTRUMENTAL (OPERANT) CONDITIONING
LEARNING VIA CONSEQUENCES
¡ Classical conditioning reflects how organisms adjust to events in their environment that they do not directly
control.
¡ Now we will look at learning situations in which the stimuli an organism encounters are a result or consequence of
its behaviour.
¡ This is referred to as goal-directed or instrumental because responding is necessary to produce a desired
environmental outcome.
¡ Here behaviour is instrumental in producing a significant stimulus or outcome.
§ Reflexes, habituation, sensitisation, Classical Conditioning:
§ Behaviours are elicited in response to environmental stimuli that the organism does not directly control.
§ Organisms are not required to respond in a particular way (or behave) to obtain an unconditioned or conditioned
stimulus.
§ Instrumental/operant conditioning:
§ The stimuli an organism encounters are a result or consequence of its behaviour. That is, the organism has control
of an outcome.
§ Behaviours that occur because they were previously effective in producing an outcome are instrumental.
§ We change our behaviour to maximise outcomes.
§ Focuses on effect of behaviour in environment
§ Behaviours are goal-directed to produce environmental outcome
§ Instrumental behaviour (the environment contains the opportunity for reward; behaviour occurs
because it was effective in producing [favourable] consequences)
§ Response-reinforcer
LEARNING
INSTRUMENTAL/OPERANT CONDITIONING VS. PAVLOVIAN/CLASSICAL
• Instrumental Conditioning
• Environmental event depends on behaviour
• Pigeon peck → food/water
• Voluntary behaviour
• (Though some involuntary behaviour)
• Classical Conditioning
• Environmental event depends on another stimulus, not on behaviour
• Bell → food
• Involuntary behaviour (reflex)
EARLY INVESTIGATIONS OF OPERANT CONDITIONING
INSTRUMENTAL (OPERANT) CONDITIONING: THORNDIKE
¡ E. L. Thorndike (1898)
¡ Systematic study of “animal intelligence”
¡ Puzzle box problem with cats
¡ Measure of performance: how quickly cat exited
box in successive trials
¡ Interpreted results as reflecting the learning of a new
S-R association.
THORNDIKE’S FINDINGS
¡ Findings:
¡ First trial, cat’s behaviour displayed high variability
¡ Took a long time to solve
¡ Every successive trial, the cat took less and less
time to exit
§ Law of effect – If response (R) in presence of stimulus (S) is followed by a satisfying event, association between
stimulus S and response R becomes strengthened
§ If the response is followed by annoying event, the S–R association is weakened
§ Association between response and stimuli present at the time of response is learned
§ Thorndike’s law of effect (1911):
§ “Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the
animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be
more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things
being equal, have their connections with that situation weakened so that, when it recurs, they will be less likely to
occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond.”
THORNDIKE’S LAW OF EFFECT (1911)
§ What does this all mean?
§ If a behaviour is followed by a “positive” consequence, the chances of it happening again increase
§ If a behaviour is followed by a “negative” consequence, the chances of it happening again decreases
BEHAVIOUR IS A FUNCTION OF ITS CONSEQUENCE
… so, we need to do research to prove that …
§ Notice that the consequence is not one of the elements in the association.
§ The level of satisfaction strengthens the response, whereas the level of annoyance weakens it.
§ Can this then explain compulsive habits that are difficult to break?
§ Once learned, habitual responses occur because they are triggered by an antecedent stimulus and not
because they result in a desired consequence. (Everitt & Robbins, 2005)
THORNDIKE’S LAW OF EFFECT (1911)
MODERN APPROACHES TO OPERANT CONDITIONING
MODERN APPROACHES TO INSTRUMENTAL CONDITIONING: THORNDIKE → SKINNER
§ Discrete Trial Procedures
§ Free-Operant Procedures
§ Discrete trial procedures:
§ Each trial begins with the organism being placed in an apparatus (S) and ends after the instrumental
response (R) has occurred.
§ Examples include mazes.
§ Target behaviour (and its consequence) demonstrated, then trial ended (animal removed).
§ Measure of performance (DV):
§ Time to perform target behaviour (i.e., running speed and latency)
§ Number of errors made before target behaviour
§ Skinner further developed the systematic study of behaviour via free-operant procedures
§ Free-Operant Procedures:
§ Invented by B. F. Skinner (1938)
§ "Allow an animal to repeat the instrumental response without constraint over and over again without
being taken out of the apparatus until the end of the experimental session” (p. 126).
§ Suggested ‘ongoing behaviour is continuous’.
OPERANT CONDITIONING: B. F.SKINNER
¡ Skinner: From single trial to operant procedures
¡ Systematic study of the establishment and maintenance of
behaviour controlled by its consequences…
¡ The organism can repeat the instrumental response numerous times.
¡ ‘Operant response’ is defined in terms of the effect that the
behaviour has on the environment.
¡ ‘Behaviour is not defined by a particular muscle movement,
but in terms of how the behaviour operates on the
environment’.
¡ Analysed how behaviour changes based on its consequences.
¡ Developed principles of reinforcement.
§ In operant conditioning, responses are said to be emitted (rather than elicited)
§ Because they are voluntary
§ Response à consequence
§ Behaviour is affected by its consequences (it is said to be caused by them)
§ You tell jokes because people laugh
§ Kid cries because he gets lolly
OPERANT CONDITIONING
Two Different Types of Behaviour
Respondent behaviours: those that occur automatically and reflexively (e.g., pulling hand from hot stove)
Operant behaviours: those that occur under conscious control (spontaneously or purposely). It is the
consequences of these actions that affect whether or not they occur again
OPERANT CONDITIONING: SKINNER
§ In short
§ It is adaptive to learn associations between voluntary behaviours, which will reliably predict punishing or
rewarding outcomes
§ Behaviour is shaped by the learner’s history of experiencing reinforcement (likely to be repeated) and
punishment (less likely to be repeated)
§ Skinner:“Life is guided by consequences”
§ Note that no need for awareness of the relationship between behaviour and consequence
§ Almost every behaviour is learned (notable biological and cognitive exceptions in CC – e.g., taste aversion, preparedness and phobias)
OPERANT CONDITIONING: REINFORCEMENT AND PUNISHMENT
§ Shaping:
§ Reduce complex behaviours into sequence of more simple behaviours
§ Reinforce successive approximations to the final behaviour
§ E.g., training rat to press lever
§ Reinforce (with food) when rat is:
§ Rearing (activity/exploration, anywhere in cage)
§ Rears at lever
§ Touches lever
§ Paws/presses lever
1. Begin by reinforcing a high frequency component of the desired response (e.g., rearing)
2. Then drop this reinforcement – behaviour becomes more variable again
3. Await a response that is still closer to the desired response – then reintroduce the reinforcer
4. Keep cycling through as closer and closer approximations to the desired behaviour are achieved
§ Enables the moulding of a response that is not normally part of the animal’s repertoire
OPERANT CONDITIONING: SHAPING
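The shaping cycle above can be sketched as a simulation. The response model (responses scattered around the animal’s current best approximation) and every number here are illustrative assumptions, not an account of real rat behaviour:

```python
# Sketch of shaping: reinforce successive approximations by tightening
# the criterion whenever the animal meets it. All values are
# illustrative assumptions.
import random

random.seed(1)

def emit_response(best_so_far):
    """Responses vary around the animal's current best approximation."""
    return max(0.0, best_so_far + random.gauss(0.3, 0.5))

target = 5.0        # full lever press
criterion = 1.0     # start by reinforcing any rough approximation
best = 0.0
reinforcers = 0

while best < target and reinforcers < 500:
    response = emit_response(best)
    if response >= criterion:                 # meets current criterion
        reinforcers += 1                      # deliver food
        best = max(best, response)
        criterion = min(target, best + 0.1)   # demand a closer approximation

print(f"target reached after {reinforcers} reinforcers")
```

Each reinforcement raises the criterion slightly, so the demanded behaviour drifts toward the final target, which is the point of the four-step cycle.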
§ Shaping:
§ Three vital components
§ Clearly defined final response
§ Clearly assess starting level of performance
§ Divide progression from starting point to the final target behaviour into appropriate training steps/successive
approximations
§ Free-operant procedures = ideal
§ Chaining:
§ Sequence of behaviours that are linked to form a complex behaviour
§ This in turn becomes a unit (behavioural units)
§ E.g., teaching a young person to practice good hygiene
§ Washing hands
§ Learning to create a lather (soap + water)
§ Rinse and dry
§ Washing hands then prompts other hygienic behaviours
OPERANT CONDITIONING: SHAPING AND CHAINING
§ Consequences to a behaviour can increase or decrease its likelihood
§ If it increases: reinforcement
§ If it decreases: punishment
§ Consequences may consist of presenting or removing a stimulus
§ If presenting: positive
§ If removing: negative
REINFORCEMENT VERSUS PUNISHMENT
INSTRUMENTAL (OPERANT) CONDITIONING PROCEDURES
REINFORCEMENT VS. PUNISHMENT
What happens to the behaviour in the next few trials?

                           More frequent              Less frequent
                           (reinforcement)            (punishment)
Presenting a stimulus      Positive Reinforcement     Positive Punishment
(positive)
Removing a stimulus        Negative Reinforcement     Negative Punishment
(negative)
REINFORCEMENT VS. PUNISHMENT – REWARD LEARNING
• Pigeon pecks key, receives food
• Girl does homework and Dad praises her
• Boy throws tantrum and Mum gives him the lollies he wants
In each case a stimulus is presented (positive) and the behaviour becomes more frequent: Positive Reinforcement.
REINFORCEMENT VS. PUNISHMENT
• Pigeon pecks on key and receives a shock
• Dog barks and collar releases aversive smell
• Nail biting and bad-tasting nail polish
In each case a stimulus is presented (positive) and the behaviour becomes less frequent: Positive Punishment.
REINFORCEMENT VS. PUNISHMENT – ESCAPE OR AVOIDANCE LEARNING
• Pigeon pecks key and shock is delayed/removed
• Open umbrella when it starts raining – don’t get wet
• Students worked hard during the week so teacher removes weekend homework
In each case a stimulus is removed (negative) and the behaviour becomes more frequent: Negative Reinforcement.
REINFORCEMENT VS. PUNISHMENT – OMISSION TRAINING
§ Pigeon pecks and available food is removed
§ Child misbehaves and TV watching is forbidden
§ ‘Time-out’ for naughty behaviour
In each case a stimulus is removed (negative) and the behaviour becomes less frequent: Negative Punishment.
REMEMBER
What happens to the behaviour in the next few trials?

                           More frequent              Less frequent
                           (reinforcement)            (punishment)
Presenting a stimulus      Positive Reinforcement     Positive Punishment
(positive)
Removing a stimulus        Negative Reinforcement     Negative Punishment
(negative)
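The table reduces to two binary questions, which a few lines of code can capture (a study aid, not course material):

```python
# Classify an operant procedure from the 2x2 contingency table:
# was a stimulus presented or removed, and did the behaviour become
# more or less frequent afterwards?

def classify(stimulus_change, behaviour_change):
    """stimulus_change: 'presented' or 'removed'.
    behaviour_change: 'more' or 'less' frequent afterwards."""
    sign = "Positive" if stimulus_change == "presented" else "Negative"
    kind = "Reinforcement" if behaviour_change == "more" else "Punishment"
    return f"{sign} {kind}"

# Dad praises homework and homework increases:
print(classify("presented", "more"))   # Positive Reinforcement
# Time-out removes attention and misbehaviour decreases:
print(classify("removed", "less"))     # Negative Punishment
```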
§ Does not ‘erase’ an undesirable habit
§ What could we do then to build in more ‘desirable’ behaviour?
§ May not teach a more desirable behaviour (if it is the only focus)
§ Often ineffective unless:
§ Given immediately after undesirable behaviour
§ Given each time the behaviour occurs (continuous schedule)
§ Think about managing challenging behaviours…..
DRAWBACKS OF PUNISHMENT
§ The three C’s
§ Contingency
§ Clear relationship between A and B
§ Contiguity
§ A → B, where contiguity (→) is the time/proximity between A and B
§ Consistency
§ Every time the behaviour occurs (continuous schedule – more on that next week!)
WHEN IS PUNISHMENT EFFECTIVE?
FUNDAMENTAL
ELEMENTS OF
INSTRUMENTAL
(OPERANT
CONDITIONING)
ELEMENTS OF INSTRUMENTAL CONDITIONING
¡ The instrumental response
¡ The outcome of the response
¡ The relation (or contingency) between the response and outcome
THE INSTRUMENTAL RESPONSE
¡ The outcome of instrumental conditioning depends in part on the nature of the response being conditioned.
¡ Behavioural variability vs. stereotypy
¡ Variability: performing the response differently on each trial (variation can itself be required for reinforcement).
¡ Stereotypy: develops if allowed or required.
¡ Thorndike and Skinner: Operant responses become more stereotyped with continued conditioning
¡ Variable or novel responses can be produced if response variation is needed for reinforcement
¡ See Ross and Neuringer (2002)
THE INSTRUMENTAL RESPONSE
¡ Relevance or belongingness in Instrumental (Operant) Conditioning
¡ Responses that naturally belong with a reinforcer
¡ Limitations: “A behaviour cannot be reinforced by a reinforcer if it is not naturally linked to that reinforcer in the repertoire
of the animal”
¡ In CC: a rat learns taste + sickness faster than taste + shock
¡ In IC: Thorndike’s cats learnt latch-operating/string-pulling + escape faster than yawning/scratching + escape
¡ The latter responses were mimicked once learned, whereas the former remained constant (natural, evolutionary
responses)
THE INSTRUMENTAL RESPONSE
¡ Behaviour systems and constraints on Instrumental (Operant) Conditioning
¡ Is the response part of their behavioural system?
¡ What is the context of the S and R?
¡ If not, difficulty learning (e.g., conditioning raccoons with food reinforcement to drop a coin into a slot, which is
incompatible with their pre-existing organisation of their feeding system)
¡ Here we acknowledge the constraints, dependent on the context and the behaviour system of the organism.
THE INSTRUMENTAL REINFORCER
¡ Quantity and quality of the reinforcer
¡ Larger and better reinforcers are more effective
¡ Study in individuals with substance addiction
• If participants remained drug free, they received $10 at end of the day
• Larger payments were significantly better at encouraging abstinence than smaller payments
• If participant received $ immediately after passing drug test, they were more likely to abstain in future vs. those who received
$ a few days after
¡ Shifts in reinforcer quality and quantity
¡ The effectiveness of a reinforcer is reduced if its strength is reduced (compared to previous trials)
• Rats given sucrose water with the sucrose concentration cut in half respond less than rats with no
previous exposure to the stronger solution
• Behavioural Contrast Effects
• A large reward is treated as particularly good after a small reward, while a small reward is treated as especially poor after
reinforcement with a larger reward
• This type of “anticipatory negative contrast may explain why individuals addicted to cocaine derive little satisfaction from
conventional reinforcers (a tasty meal) that others enjoy on a daily basis”.
THE RESPONSE-REINFORCER RELATION
¡ Two types of relationships between a response and a reinforcer:
1. Temporal relation: the time between the response and reinforcer
¡ Temporal contiguity – the delivery of the reinforcer immediately after the response
2. Response-reinforcer contingency: the extent to which the instrumental response is necessary and sufficient to
produce the reinforcer
¡ Temporal and causal factors are, however, independent of one another. A strong temporal relation does not require a strong
causal relation, and vice versa.
¡ E.g., there is a “strong causal relationship between taking your clothes to be dry cleaned and getting clean clothes back.
However, the temporal delay may be a day or two”.
§ Contiguity
§ Temporal proximity between behaviour and consequence
§ Learning is faster the closer the reinforcement is to the behaviour (in time)
§ Dickinson et al. (1992): shaping of lever pressing for food in rats with shorter (2-4 s) or longer (up to 64 s)
delays
§ Important because:
§ Delay allows intervening behaviour to occur – it is not clear what the desired response was
§ However, signalling the delay (i.e., marking) reduces the impact of the delay on learning (i.e.,
conditioned reinforcement)
TEMPORAL CONTIGUITY
• Marking group: A light was presented for 5
seconds at the beginning of the delay interval
(immediately after the instrumental response)
• No signal (30 second delay only)
• Blocking group: The light was introduced at the
end of the delay interval, just before the
delivery of food
IS CONTIGUITY ENOUGH? WHAT ABOUT CONTINGENCY…
¡ Contingency
¡ Correlation between behaviour and consequence
¡ Learning the degree to which our behaviour has an effect on the environment
CONTINGENCY
¡ Learning the degree to which our behaviour has an effect on the environment
¡ Learned Helplessness: no contingency, lack of control
¡ An animal that is first subjected to learned helplessness cannot learn to escape/avoid shock later
¡ I.e., they believe that they have no control over outcome
OTHER VARIABLES AFFECTING LEARNING
¡ Amount of reinforcement
¡ More is better (but not linear relationship)
¡ Quality / type of reinforcement
¡ Not all foods are created equal
¡ Task features:
¡ Difficulty
¡ Biological tendencies: pigeons autoshape; some things cannot be taught (more on this next week)
INSTRUMENTAL/OPERANT CONDITIONING VS. PAVLOVIAN/CLASSICAL
• Instrumental Conditioning
• Environmental event depends on behaviour
• Pigeon peck → food/water
• Voluntary behaviour
• (Though some involuntary behaviour)
• Classical Conditioning
• Environmental event depends on another stimulus, not on behaviour
• Bell → food
• Involuntary behaviour (reflex)
§ Not always a clear cut distinction, some learning is not easy to separate
§ i.e., some behaviours elicited in response to stimuli (CC), others are goal-directed to produce
environmental outcome (IC)
§ Training a dog to sit / stay / jump with food
§ Learned helplessness: pairing of two stimuli but also punishment of all behaviours…
§ Child being given ‘treat’ at the supermarket
§ Dwight in the Office
PAVLOVIAN/CLASSICAL VS. INSTRUMENTAL/OPERANT CONDITIONING
SUMMARY
¡ Operant Conditioning
¡ Reinforcement → increases behaviour
¡ Punishment → decreases behaviour
¡ Positive: consequence presented
¡ Negative: consequence removed
NEXT WEEK
¡ Reinforcement Schedules and Choice (Chapter 6)
PSYC214 – LEARNING AND BEHAVIOUR
WEEK FIVE LECTURE
DOMJAN CHAPTER 6
UPDATES ON CANVAS
¡ Mid-semester Exam:
¡ Week 7 (Friday)
¡ 8am to 12 midday
¡ 75 minutes to complete
¡ 40 MC and 1 SR
¡ Tutorials are on that day
¡ Practice Q’s available on CANVAS
¡ Lab Report
¡ Results released this week
¡ You are now able to complete results section
¡ This week we will cover the remainder of the introduction and methods (results next week)
REVIEW OF WEEK 4
¡ Instrumental (Operant) conditioning
¡ Reinforcement/punishment
¡ Positive/negative
¡ Shaping Behaviour
REMEMBER
LEARNING AND REINFORCEMENT
¡ Language is important à positive v. negative in the context of learning.
¡ Desirable v. undesirable behaviour // helpful v. unhelpful behaviours.
¡ Who determines it?
¡ Is learning always a “helpful” thing?
¡ “Undesirable” behaviours are also learned
¡ They must lead to some high value consequence
¡ Aggression, tantrums, etc.
¡ Are “consequences” or “reinforcers” always “good”?
¡ Cigarettes, drugs, etc.
TODAY’S LECTURE
¡ Extension from last week's lecture
¡ Simple schedules – ratio and interval
¡ Concurrent schedules – studying choice
¡ Complex choice and self-control
¡ Delay Discounting
SCHEDULES OF
REINFORCEMENT
SCHEDULES OF REINFORCEMENT
¡ So far, we have talked about behaviour acquisition and the role of reinforcement in the process
¡ Typically looked at cases where every response leads to reinforcement (continuous reinforcement)
¡ In real life, hardly ever does every single response lead to its corresponding reinforcement
¡ Reinforcement, in the world, is intermittent → not every response is reinforced
¡ Next, we will look at the effects of intermittent reinforcement on behaviour
EXAMPLE
What is the….
¡ Behaviour?
¡ Reinforcer?
¡ Is every response (i.e., every instance of the behaviour) reinforced?
SCHEDULES OF REINFORCEMENT
¡ Skinner developed free operant procedures
¡ Allowed to record multiple responses in a single
session
¡ Developed the “cumulative recorder”
¡ Pen moves up on paper for every response
SCHEDULES OF REINFORCEMENT
¡ Rules/system governing the delivery of reinforcement
¡ Behaviour dependent reinforcement – contingent
¡ Ratio schedules
¡ Fixed (FR) & Variable (VR)
¡ Interval schedules
¡ Fixed (FI) & Variable (VI)
¡ Behaviour independent, non-contingent –
¡ “time” schedules
¡ Fixed & variable
BEHAVIOUR DEPENDENT SCHEDULES
How often is the behaviour reinforced?
RATIO SCHEDULES
¡ Ratio schedules: establish a ratio of responses to reinforcers
¡ FIXED: a fixed number of responses are required for delivery of each reinforcer
¡ Denoted by FR# (# is the number of responses required)
¡ Continuous reinforcement (CRF): FR1
¡ Establishment of behaviour
¡ Not very representative of the world
¡ Anything above CRF/FR1 is considered “intermittent” reinforcement
FIXED RATIO (FR)
¡ FR schedules → high rates of responding, followed
by pauses:
¡ High and steady rate of responding that completes
each ratio requirement is called the run rate
¡ The zero rate of responding that usually occurs just
after reinforcement is called the post-reinforcement
pause (PRP)
¡ Run rate is independent of FR size, but PRPs are
longer with larger FR size
¡ Therefore, PRPs affect overall rate of responding
¡ Larger schedules → lower rates
¡ E.g., bonus for selling 5 houses
STRETCHING THE RATIO
¡ Subjects can work on really strenuous schedules (say, 100 responses for 2 pellets of food)
¡ They reach that level via shaping.
¡ Start as continuous reinforcement schedule and gradually stretch the schedule
¡ Any resource that depletes represents a stretching schedule
¡ Can be useful in therapeutic settings
¡ E.g., in exposure therapy for someone with agoraphobia – order favourite takeaway after a trip to the supermarket, then after a trip to X, then Y.
¡ Stretching ratio must be gradual or else can lead to ratio strain, where responding breaks down
RATIO SCHEDULES
¡ Ratio schedules: establish a ratio of responses to reinforcers
¡ VARIABLE: a certain number of responses on average are required for delivery of reinforcer
¡ Denoted by VR#
¡ Example: VR6 can be:
¡ Half reinforced after 2 responses, half after 10 responses
¡ Average of 2 & 10 is 6
¡ 1, 3, 6, 8, 12 → average is 6
VARIABLE RATIO (VR)
¡ VR schedules → steady performance
¡ High rate of responding
¡ Almost never a PRP
¡ If any PRPs occur, they are shorter than for FR
¡ Affected by the size of the VR and the size of the lowest
ratio
¡ So, for VR50 (20, 80) and VR50 (40, 60) the second will lead
to longer PRPs
¡ Example: gambling
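The ratio schedules above can be sketched in code. This is an illustrative sketch only (not from the lecture); the VR is approximated as a random ratio, which matches a true VR only in its average response requirement:

```python
import random

def fr_schedule(n):
    """Fixed ratio (FR n): every nth response is reinforced."""
    return lambda response_number: response_number % n == 0

def vr_schedule(n, rng=None):
    """Variable ratio (VR n), approximated as a random ratio:
    each response is reinforced with probability 1/n, so on
    average n responses are required per reinforcer."""
    rng = rng or random.Random(0)
    return lambda _response_number: rng.random() < 1 / n

fr5 = fr_schedule(5)
print([r for r in range(1, 21) if fr5(r)])  # [5, 10, 15, 20]
```

On an FR5, exactly every fifth response is reinforced; on the VR5 approximation only the long-run average requirement matches, which is why VR responding shows no predictable post-reinforcement pause.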
INTERVAL SCHEDULES
¡ Interval schedules: provide reinforcement for the first response after a specified period of time
¡ FIXED: a fixed period of time before a response leads to reinforcement
¡ Denoted by FI (time)
¡ Important to note that a response is required
FIXED INTERVAL (FI)
¡ FI schedules → moderate to low rate of responding
¡ Lead to a scalloped pattern
¡ Exams at fixed periods…
INTERVAL SCHEDULES
¡ Interval schedules: provide reinforcement for the first response after a specified period of time
¡ VARIABLE: the period before response leads to reinforcement varies around an average value
¡ Denoted by VI (time)
¡ Example: VI 6s can be:
¡ Half reinforcement for first response after 2s, half for first response after 10s, or
¡ 1s, 3s, 6s, 8s, 12s → average is 6s
VARIABLE INTERVAL (VI)
¡ VI schedules → lower rate of responding, but
steady (still moderate)
¡ Higher rates than an FI but not as high as FR or VR
¡ Closest representation of the variability of real-world reinforcement
¡ Bonus for worker when supervisor shows up
¡ Elevators
¡ Random pop-quiz
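The interval schedules can be sketched the same way. This is an illustrative sketch (not from the lecture) of the FI rule "the first response after the interval has elapsed since the last reinforcer is reinforced"; the response times are made-up numbers:

```python
def fi_schedule(interval):
    """Fixed interval (FI): the first response made at least `interval`
    seconds after the previous reinforcer is reinforced."""
    last_reinforcer = [0.0]
    def schedule(response_time):
        if response_time - last_reinforcer[0] >= interval:
            last_reinforcer[0] = response_time
            return True  # this response is reinforced
        return False
    return schedule

fi10 = fi_schedule(10)
response_times = [3, 7, 11, 14, 22, 25]
print([t for t in response_times if fi10(t)])  # [11, 22]
```

Note that responses at 3s, 7s, 14s and 25s earn nothing, which is why FI responding slows right after reinforcement and accelerates toward the end of the interval (the scalloped pattern).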
DO RATIO AND INTERVAL SCHEDULES MOTIVATE BEHAVIOUR
SIMILARLY?
¡ Not really. – Different mechanisms
¡ Pigeons trained on VR or VI (intervals based on the VR pigeon)
¡ VR pigeon shows more vigorous responding
¡ Why? Short Inter-Response Time and Feedback
BEHAVIOUR INDEPENDENT SCHEDULES
¡ Also called non-contingent
¡ Deliver reinforcement independent of responses
¡ Fixed time (FT)
¡ Birthdays, anniversaries
¡ Variable time (VT)
¡ Random calls, messages from friends
SOWHY STUDY REINFORCEMENT SCHEDULES?
¡ Used in choice research
¡ Organisms show preferences for some over other schedules
¡ FR5 preferred over FR15 (no surprise)
¡ Variable interval preferred to equivalent Fixed interval
¡ VI 15 (2, 28) preferred over an FI15
CONCURRENT SCHEDULES AND CHOICE
¡ Development of these complex schedules allows us to study more complex (i.e., ‘real life’) behaviour
¡ We hardly ever have a single response alternative available
¡ In fact, all behaviour is the product of choice…
¡ Having concurrent schedules (providing two options) allows us to study choice…
CONCURRENT SCHEDULES AND CHOICE
¡ Concurrent schedules have been extensively used in the study of choice
¡ Assumption: these schedules may provide a (simplified) model of the world
¡ Basic premises for studying choice:
¡ Organisms face choices
¡ Choices are characterised by consequences
¡ Consequences can be defined in terms of properties
¡ Rate of occurrence (probability)
¡ Magnitude
¡ Delay
¡ Organisms should maximise some function of consequences
CONCURRENT SCHEDULES AND CHOICE
¡ How to study?
¡ Present organisms with choices between alternatives that vary systematically
¡ Observe preferences (i.e., response behaviour)
CONCURRENT SCHEDULES AND CHOICE
¡ Rate of reinforcement
¡ If a pigeon has an option to peck two different keys associated with different schedules of reinforcement, which key are they
likely to peck most frequently?
¡ 1. Red key = FR5 vs Blue key = FR10?
¡ 2. Red key = FI5 vs Blue key =VI5?
¡ 3. Red key =VI5 vs Blue key = FR1?
CONCURRENT SCHEDULES AND CHOICE
¡ Herrnstein (1961) and Herrnstein and Mazur (1987) showed behaviour distribution can be modelled by an
equation
¡ Relative frequency of behaviour (B) equals relative frequency of reinforcement (r)
¡ If one option pays twice as much, the organism will spend twice as long there.
¡ Matching Law
MATCHING LAW
¡ Matching law has been extended to other properties of reinforcement:
¡ Magnitude/amount of reinforcement
¡ Relative frequency of responses matches relative amount of reinforcement
¡ A pigeon will spend twice as long responding to the key that gives twice the amount of food
¡ Delay of reinforcement
¡ Relative frequency of responses matches the relative immediacy of reinforcement
¡ A pigeon will spend twice as long responding to the key that keeps it waiting half as long
CHOICE BEHAVIOUR: RELATIVE RATE OF RESPONDING
¡ Relative rate of responding to option 1 (B1) is calculated by dividing the rate of responding to option 1 (B1) by
the total rate of responding to options 1 and 2 (B1 + B2)
¡ If response rate to B1 and B2 is equal, ratio will be = .5
¡ If response rate to B1 is less than B2, ratio will be less than .5
¡ If response rate to B1 is more than B2, ratio will be more than .5
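The relative-rate calculation above, and the matching-law prediction it is compared with, can be written out directly. This is a minimal sketch; the response and reinforcement rates are made-up numbers:

```python
def relative_rate(b1, b2):
    """Relative rate of responding to option 1: B1 / (B1 + B2)."""
    return b1 / (b1 + b2)

def matching_prediction(r1, r2):
    """Matching law: B1 / (B1 + B2) = r1 / (r1 + r2),
    i.e. relative responding matches relative reinforcement."""
    return r1 / (r1 + r2)

print(relative_rate(50, 50))        # 0.5  -> equal responding
print(relative_rate(20, 60))        # 0.25 -> less responding to option 1
print(matching_prediction(40, 20))  # ~0.67 -> the key paying twice as much
                                    # should draw two-thirds of responses
```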
CHOICE BEHAVIOUR: SELF-CONTROL
¡ When one looks at amount and delay to reinforcement at the same time...
¡ Suppose pigeon is given a choice between:
¡ B1: 4s access to food after a 4s delay
¡ B2: 2s access to food right now (after a 0.1s delay to reach the food hopper)
¡ Ratio = .04762 (Less than .5)
¡ Pigeon will prefer the immediate small reward (i.e., B2)
CHOICE BEHAVIOUR: SELF-CONTROL
¡ When one looks at amount and delay to reinforcement at the same time...
¡ Suppose pigeon is now given a choice between:
¡ B1: 4s access to food in 14s
¡ B2: 2s access to food in 10s
¡ Ratio = .59 (more than .5)
¡ Now pigeon prefers larger, delayed reward
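The two pigeon scenarios can be reproduced with a generalised matching calculation in which each option's value is its amount of food divided by its delay. This value rule is an assumption on my part, but it reproduces the ratios given on the slides:

```python
def choice_ratio(amount1, delay1, amount2, delay2):
    """Predicted relative rate of responding to option 1 when each
    option's value is taken as amount / delay (generalised matching)."""
    value1 = amount1 / delay1
    value2 = amount2 / delay2
    return value1 / (value1 + value2)

# Scenario 1: 4s of food after a 4s delay vs 2s of food after only 0.1s
print(round(choice_ratio(4, 4, 2, 0.1), 5))  # 0.04762 -> prefer small/now
# Scenario 2: both delayed: 4s of food in 14s vs 2s of food in 10s
print(round(choice_ratio(4, 14, 2, 10), 2))  # 0.59 -> preference reverses
```

Adding the same 10s to both delays is enough to flip the predicted preference from the small immediate reward to the large delayed one.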
CHOICE BEHAVIOUR: SELF-CONTROL
¡ Let’s recap...
¡ Matching law says:
¡ When faced with a choice between an immediate small and a delayed large reward, the organism should mostly choose the immediate small one
¡ But when both options are further delayed, preference will switch
SELF-CONTROL
DELAY DISCOUNTING
https://www.youtube.com/watch?v=QX_oy9614HQ
MARSHMALLOW TEST
¡ Replication crisis
¡ Partial failure to replicate
¡ Replication showed dramatic reduction in effect (though still sig.)
¡ Non-sig when controlling for environmental factors
¡ See Watts et al. (2018). Revisiting the marshmallow test: A conceptual replication investigating links between early delay of
gratification and later outcomes.
DELAY DISCOUNTING
¡ Originates in early studies on concurrent-chain schedules (Rachlin & Green, 1972)
¡ The value of a reinforcer declines as a function of how long you have to wait for it
¡ The value of a reinforcer is directly related to reward magnitude and inversely related to reward delay
¡ So, the longer a reinforcer is delayed, the smaller its value
¡ However, the value of a smaller reinforcer declines more rapidly with delay, so when both rewards are delayed the larger reinforcer can have the higher value
DELAY DISCOUNTING
¡ Often studied in cases with hypothetical $
¡ $100 today vs. $500 in a week?
¡ $100 today vs. $150 in a week?
¡ $100 today vs. $150 in a year?
¡ Can manipulate both time and value
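A standard way to model these hypothetical-money choices is Mazur's (1987) hyperbolic discounting equation, V = A / (1 + kD). The sketch below uses an arbitrary made-up discounting rate of k = 0.05 per day; in practice k is estimated for each person:

```python
def subjective_value(amount, delay_days, k=0.05):
    """Hyperbolic discounting: V = A / (1 + k * D).
    A steeper k means value falls faster with delay (less self-control)."""
    return amount / (1 + k * delay_days)

print(subjective_value(100, 0))             # 100.0  -> $100 today
print(round(subjective_value(150, 7), 2))   # 111.11 -> $150 in a week wins
print(round(subjective_value(150, 365), 2)) # 7.79   -> $150 in a year loses
```

With this k, the person takes the $150 over the $100 at a week's delay, but takes the immediate $100 when the $150 is a year away: the delayed option has been discounted below the immediate one.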
DELAY DISCOUNTING
¡ Subjective value of the smaller rewards decreases more rapidly with longer delay compared to the larger reward
FIGURE 6.7
The subjective value of 16 ml and 8 ml of juice as a function of
delay in college students. Curves represent best-fitting
hyperboloid functions (based on Jimura et al., 2009).
DELAY DISCOUNTING AND HUMAN BEHAVIOUR
¡ The steeper the delay discounting function, the more difficulty that person will have in exhibiting self-control
¡ The larger, more remote reward will seem much less valuable than a smaller, more immediate reward
¡ Poor self-control may be associated with a variety of human problem behaviours
¡ But…Is there evidence to support this idea?
DELAY DISCOUNTING AND HUMAN BEHAVIOUR
¡ Drug use/addictive behaviours → steeper discounting functions than controls
¡ Moffitt et al. (2011)
¡ Longitudinal NZ study of ~1000 children (from birth to 32 years old)
¡ Higher levels of self-control in childhood
→ better health, lower rates of drug use, higher income levels, lower rates of criminal behaviour
¡ Remember:“steep” delay discounting function = less self-control
CAN SELF CONTROL BE LEARNED?
¡ It seems the answer is yes!
¡ Training with delayed reward increased preference for larger delayed reward
¡ Shaping: initially no delay, then increasing the delay between small and large reward
¡ HOWEVER, more research is needed
LECTURE SUMMARY
¡ Instrumental Conditioning: reinforcement aims to increase behaviour – punishment aims to decrease behaviour
¡ We can increase one behaviour and thereby decrease another, as the new behaviour displaces it.
¡ “It’s not what we stop, it’s what we start”
¡ Extinction – new learning.
¡ Reinforcement schedules can vary in multiple ways
¡ Ratio: fixed vs variable
¡ Interval: fixed vs variable
¡ Different reinforcement schedules lead to different rates of responding
¡ Choice scenarios involve multiple response alternatives
¡ Delay discounting – indicator of self-control
NEXT WEEK
¡ Models of Instrumental Conditioning (Chapter 7)
¡ Associative Structure
¡ Response Allocation and Behavioural Economics
PSYC214 – LEARNING AND BEHAVIOUR
WEEK SIX LECTURE
DOMJAN CHAPTER 7
HOUSEKEEPING – MID SEMESTER EXAM
Mid-Semester Exam in Week 7:
The online mid-semester exam assesses content from lectures, tutorials and readings from Weeks 1 to 6 inclusive.
MELBOURNE: The exam will open at 12pm on Wednesday 11th of September and will close at 4pm on
Wednesday 11th September.
The exam consists of 40 multiple-choice questions worth one (1) mark each and 1 short-answer question worth ten
(10) marks. The short-answer response should be approximately 250 words.
You have 75 minutes to complete all of the questions. If you experience ongoing technological issues that prevent
you from completing the exam in the allocated timeframe, you MUST supply timestamped screenshots of the error
demonstrating that it persisted for the duration of the four-hour window you had to complete the exam, and flag any
difficulties with me ASAP, before any request for an opportunity to sit the exam at another time will be considered.
HOUSEKEEPING
Mid-Semester Exam in Week 7 (Continued):
You may make only one attempt at the exam.
The exam is worth 30% of your unit assessment.
You must attempt the exam in order to be eligible for a passing grade in PSYC214.
The exam is open book – you may refer to your notes and textbooks when attempting this exam.
You must however complete the exam independently – forms of academic misconduct, such as collusion are not
acceptable – please refer to the Academic Integrity and Misconduct Policy for further information.
In submitting your exam, you acknowledge that you have read and understand ACU’s Academic Integrity and
Misconduct Policy and have not engaged in any behaviour that would constitute Academic Misconduct.
Remember to check Canvas announcements for any further announcements on
assessments and the unit
TODAY’S LECTURE / OBJECTIVES
Models of Instrumental Conditioning: Explain/apply the key mechanisms that motivate and direct instrumental responses
Explain / Apply reward expectancy and S-O associations, Two-Process Theory
Exemplify:
The S–R Association and the Law of Effect
The Expectancy of Reward and the S–O Association
The R–O and S(R–O) Relations in Instrumental Conditioning
Describe and apply Antecedents of the Response Allocation Approach
Consummatory response
Premack/Differential Probability Principle
Response Deprivation Hypothesis
Response Allocation and Behavioural Economics
Antecedents of the Response-Allocation Approach
The Response Allocation Approach
WHAT MOTIVATES INSTRUMENTAL CONDITIONING (IC)?
Two competing theories:
1. Associative structure of instrumental conditioning
Thorndike and Pavlov
2. Response allocation and behavioural economics
Skinnerian tradition (context matters)
Neither approach can stand alone. These are competing theories that
have proceeded independently
INSTRUMENTAL BEHAVIOUR FROM TWO RADICALLY
DIFFERENT PERSPECTIVES
Associative Structure
Thorndike/Pavlovian conditioning
Relies heavily on associations; compatible with
Pavlovian conditioning
Relevant research stimulated by efforts to identify
role of Pavlovian mechanisms in instrumental
learning
Molecular perspective: focuses on individual
responses and specific stimulus antecedents/
response outcomes
Response Allocation
Skinnerian tradition & context
Relies on broader context of numerous activities
organisms are constantly doing
Concerned with how instrumental conditioning
procedure limits free flow of activities/
consequences, & behaviour follows as a
consequence of this limitation
Molar perspective: considers long-term goals and
how to achieve goals in context of behavioural
options
THE ASSOCIATIVE STRUCTURE OF INSTRUMENTAL
CONDITIONING
Thorndike recognised that Instrumental
Conditioning involves much more than just a
response and a reinforcer
Context matters
The instrumental response occurs in specific
contexts
Thorndike's Law of Effect (revisited)
Cats learned to escape puzzle box to obtain food
reward
Thorndike assumed this was caused by the
development of an S-R association
Reinforcement “stamped in” this association,
without itself being learned about
EXAMPLES OF INSTRUMENTAL BEHAVIOURS, WHICH OCCUR
IN THE CONTEXT OF SPECIFIC ENVIRONMENTAL STIMULI
Sending a text message
Context of tactile stimulus (holding phone) + visual cue (looking at screen)
Starting your car
Context of sitting in the driver’s seat + holding the car keys
Can you think of some other examples?
SKINNER: INSTRUMENTAL CONDITIONING HAS 3 KEY
ELEMENTS
1. Stimulus Context (S)
Sight, smell, or thought of pizza
2. Instrumental Response (R)
Going to a place to buy a pizza
Order delivery
Or you could make one!
3. Reinforcer or response outcome (O)
Pleasant taste, rewarding experience
THORNDIKE’S LAW OF EFFECT
Stimulus-Response (S-R) association is solely responsible for the occurrence of
instrumental conditioning
Stimulus (e.g. alcohol drink or pizza) = Contextual stimuli that are present when a response is reinforced
Response (e.g. drinking or eating) = Instrumental response
Reinforcer (e.g. positive feeling from alcohol use / pizza eating) = only “stamps in” or strengthens the
association between a stimulus and a response
THORNDIKE’S LAW OF EFFECT
The motivation for an instrumental behaviour is the activation of the S-R association due to exposure to
contextual stimuli/triggers that were present when the S-R association formed.
Applies to habit forming in the process of drug addiction
At the start the reward is the pleasant feeling from drug use
With the development of addiction, exposure to contextual stimuli/triggers (sight of alcohol) becomes enough to
trigger a response (drinking)
HULL (1930, 1931) AND SPENCE (1956)
Over the course of instrumental conditioning, the instrumental response increases because of:
1. Stimulus-Response/Thorndike association (classic conditioning):
Contextual stimuli trigger response (e.g. gambling response to visual cues)
AND
2. Stimulus-Outcome association (instrumental conditioning):
The response is made because a reward is expected
It is still debated how the S-O association motivates instrumental conditioning
TWO-PROCESS THEORY (RESCORLA & SOLOMON, 1967)
There are 2 types of learning:
Classical Conditioning (S-R) AND Instrumental Conditioning (S-O)
1. Stimulus-Response association formed during instrumental conditioning, as a result of classical
conditioning
Association is formed between alcohol (stimulus) and drinking behaviour (response)
Diagram: stimulus (alcohol) → response (drinking) → outcome (pleasant physiological response)
Seeing/thinking about a drink will activate the S-R association & motivate drinking…
TWO-PROCESS THEORY (RESCORLA & SOLOMON, 1967)
There are 2 types of learning:
Classical Conditioning (S-R) AND Instrumental Conditioning (S-O)
2. Stimulus-Outcome association motivates positive/negative emotional state, which in turns motivates
responding
Association formed between alcohol (stimulus) & pleasant physiological outcome (outcome)
Diagram: stimulus (alcohol) → response (drinking) → outcome (pleasant physiological response)
Seeing/thinking about a drink will activate the S-O association → positive emotional state → motivates drinking behaviour
THE PAVLOVIAN INSTRUMENTAL TRANSFER PROCEDURE/
TEST
PAVLOVIAN INSTRUMENTAL TRANSFER PROCEDURE/TEST
Test the idea that instrumental behaviour is motivated by the Stimulus-Outcome association (and
emotions related to outcome). Example of experiment with rats, 3 phases:
Phase I. INSTRUMENTAL CONDITIONING: RESPONSE-OUTCOME
Lever pressing → Food
Phase II. PAVLOVIAN CONDITIONING: STIMULUS-OUTCOME
Lever is removed
‘Pavlovian’ bell → Food
Phase III. CRITICAL TRANSFER PHASE
Allowed to lever press for food
Occasionally, also Pavlovian bell is rung
IF Pavlovian association motivates instrumental responding, THEN trials with bell will lead to more food intake than
trials without bell
S-R ASSOCIATION AND THE LAW OF EFFECT
S–R Association – Key to instrumental learning and central to law of effect
Law of Effect – Involves establishment of S-R association between instrumental response (R) and
contextual stimuli (S) present when response reinforced
Law of effect does not involve learning about the reinforcer or response outcome (O) or the relation between
response and reinforcing outcome (the R-O association)
Role of the reinforcer = “stamp in” or strengthen S-R association
Thorndike thought once established, S-R association solely responsible for instrumental behaviour
Fell into disfavour during cognitive revolution
Resurgence of interest in S-R mechanisms in recent efforts to characterise habitual behaviour in people
(example = drug addiction)
Habits are 45% of human behaviour!!
EXPECTANCY OF REWARD AND THE S-O ASSOCIATION
Specification of instrumental response ensures participant will always experience certain distinctive
stimuli (S) in connection with making response
Stimuli may involve distinctive place, texture, smell, sight cues
Reinforcement of instrumental response results in pairing stimuli (S) with reinforcer or response outcome
(O)
Pairings provide potential for classical conditioning and establishment of association between S and O
EVIDENCE FOR S–O ASSOCIATIONS IN INSTRUMENTAL
LEARNING
Two Process Theory (Rescorla & Solomon, 1967):
S – O & S – R associations are learned
The stimulus (S) comes to “motivate” responding
It does this because S associates with the emotional
aspects of the reinforcing outcome
Pavlovian and Instrumental Conditioning are related
This evoked emotional state energises instrumental
responding based on an underlying S – R association.
The Pavlovian CS (Tone) increases instrumental lever
pressing when it is paired with food.
But it decreases instrumental lever pressing when it is
paired with foot shock (CER).
CONDITIONED EMOTIONAL STATES OR REWARD-SPECIFIC
EXPECTANCIES?
Two-Process Theory: Assumes classical conditioning mediates instrumental behaviour through conditioning
of positive or negative emotions depending on emotional valence of reinforcer
Organisms also acquire specific reward expectancies instead of just categorical positive or negative
emotions during instrumental and classical conditioning
Expectancies for specific rewards rather than general positive emotional states determine results in
transfer test
BREAK
RE-THINKING REINFORCERS
Thorndike’s Law of Effect → postulates that a reinforcer = a stimulus that produces ‘a satisfying state of
affairs’ … This may be a limited interpretation
The Response-Outcome association does not explain what causes the response in the first place.
Response allocation approaches challenge the notion that reinforcers are “special” stimuli & focus on
a molar approach where instrumental conditioning procedures put limitations on an organism’s activities,
causing redistributions of behaviour among available response options
Challenging the idea that reinforcers were special stimuli that strengthened instrumental behaviour
RE-THINKING REINFORCERS
Antecedents of Response Allocation Approach
Consummatory-Response Theory
Premack Principle
Response Deprivation Hypothesis
CONSUMMATORY-RESPONSE THEORY
Theory claims that species-typical
consummatory responses (eating, drinking,
swallowing etc) are themselves the critical
feature of reinforcers
The REAL REINFORCER IS:
The consummatory response (e.g., eating, drinking,
etc.)
NOT the stimulus itself (e.g., food pellet, chocolate,
water, etc.)
…so eating the food pellet rather than the food
pellet itself is the reinforcer
According to CR theory, for example…
Eating the chocolate is the reinforcer – not the
chocolate itself
PREMACK THEORY
Premack suggested a way to predict a priori what events would be reinforcers
Denied assumptions of classical reinforcement theory
The reinforcement process:
Relation between responses
Not a relation between responses and consequential stimuli
No clear boundaries between behaviours and reinforcers
Is it food that is the reinforcer, or eating?
Is it the toy, or playing?
Premack’s principle of positive reinforcement:
If an instrumental response is followed by a contingent response that is more highly probable, the
instrumental response will increase in frequency
PREMACK THEORY
Frames behaviour as high and low probability
responses
High probability responses reinforce lower
probability responses
1. STRONG REINFORCERS ARE:
High probability responses: that one is likely to
make (naturally occurring/reinforcing)
2. INSTRUMENTAL RESPONSES ARE:
Low probability responses: unlikely to occur without
some reason to perform (unlikely to be made by
choice)
PREMACK PRINCIPLE
More preferred activities can be used to
reinforce less preferred activities
You must finish your VEGETABLES (low-frequency behaviour)
before you can have your ICE CREAM (high-frequency
behaviour)
The Premack principle is a theory of reinforcement
that states that a less desired behaviour can be
reinforced by the opportunity to engage in a more
desired behaviour
PREMACK THEORY
High vs low probability behaviours: A behaviour that naturally occurs frequently has a high probability of
reinforcing a behaviour that naturally occurs less frequently
High probability responses reinforce lower probability responses
1. STRONG REINFORCERS ARE:
High probability responses, that one is likely to make
2. INSTRUMENTAL RESPONSES ARE:
Low probability responses, unlikely made by choice
INSTRUMENTAL CONDITIONING IS
MORE POWERFUL WHEN THE
SUBJECT HAS A MARKEDLY
DIFFERENT LIKELIHOOD OF
PERFORMING THESE 2
RESPONSES
PREMACK THEORY
1. STRONG REINFORCERS ARE:
High probability responses, that one is likely to
make
2. INSTRUMENTAL RESPONSES ARE:
Low probability responses, unlikely made by
choice
(Diagram: EATING, a high-probability response, reinforces LEVER PRESSING, a low-probability response)
PREMACK THEORY
Measure unconstrained baseline behaviour and rank all activities in terms of their probability
Every behaviour can reinforce behaviour down the list and punish behaviour further up
The Premack principle implies the following:
1. That all responses are potentially reinforceable – whereas classical theory claims some are/some are not
2. That all responses are potentially reinforcers for other, less probable responses
Premack’s indifference principle
It is irrelevant how the current behaviour probabilities got to be what they are (e.g., through deprivation).
All that matters is the CURRENT probabilities
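The "measure baseline, rank by probability" procedure above can be sketched as follows; the activities and observation times are hypothetical illustrations, not Premack's data:

```python
# Hypothetical unconstrained baseline: minutes spent on each activity
baseline_minutes = {"playing": 40, "eating snacks": 25,
                    "drawing": 15, "homework": 5}

# Rank activities by probability (time allocated). Per the Premack
# principle, each activity can reinforce any activity ranked below it
# (and punish any activity ranked above it).
ranked = sorted(baseline_minutes, key=baseline_minutes.get, reverse=True)
for higher, lower in zip(ranked, ranked[1:]):
    print(f"'{higher}' can reinforce '{lower}'")
```

How the baseline probabilities arose is irrelevant under the indifference principle; only the current ranking matters.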
REINFORCING CONSUMMATORY RESPONSES
Premack (1959) conducted research on children
comparing behaviours: could play pinball or eat
chocolate
61% of children preferred to play pinball
39% preferred to eat chocolate
Premack divided each of these groups into two
subgroups
Eat-to-Play
Play-to-Eat
For the pinballers
Eat-to-Play increased Eat significantly
Play-to-Eat increased Play only a very small
amount
For the Eaters:
Eat-to-Play very small increase in Eat
Play-to-Eat increased Play significantly
Supports Premack’s theory: eating, a consummatory response (and thus a reinforcer
according to classical reinforcement theory), could itself be reinforced
APPLICATIONS OF PREMACK PRINCIPLE
Restriction of the reinforcer activity is the critical factor for instrumental reinforcement
Low probability responses can serve as reinforcer…as long as subjects are restricted from making the
response!
Can create a new reinforcer…simply by restricting access to it
Premack’s work influenced new theories of reinforcement such as Response Deprivation Theory &
Behaviour-Regulation theory with many applications IRL
Premack theory in application: https://www.youtube.com/watch?v=2HIiQ0ukHaU
THE RESPONSE DEPRIVATION HYPOTHESIS
You can create a new reinforcer simply by
restricting access to it!
A critical factor for instrumental reinforcement is
restricting access to the reinforcer activity
(which makes it more valuable)
If subjects are restricted from making a
response/behaviours, such responses/
behaviours can serve as reinforcer
THE RESPONSE DEPRIVATION HYPOTHESIS
According to the probability-differential view (the original Premack theory), a low probability response can
never reinforce a higher probability response
However, it has been shown that this CAN happen if the organism is prevented from emitting the lower-probability
activity at its baseline level.
Any response can therefore serve as a reinforcer
EXAMPLES: RESPONSE
DEPRIVATION
HYPOTHESIS
RATS
Given any choice rats may
prefer sitting to running on a
wheel
EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS – RATS
BUT, IF access to the running wheel is restricted THEN, running on the wheel could be used as a
reinforcer for lever pressing!
EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS – RATS
BUT, IF access to sitting is restricted THEN, sitting could be used as a reinforcer for
lever pressing!
…or the other way around
EXAMPLES: RESPONSE
DEPRIVATION
HYPOTHESIS
HUMANS
Given any choice you may
prefer sitting to standing
EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS – HUMANS
BUT, if your ability to stand is restricted (e.g., a long-
haul flight with the seat-belt sign continuously on),
THEN standing up can be used as a reinforcer
for some other behaviour
THE RESPONSE DEPRIVATION HYPOTHESIS
IN SUM
An organism will work to gain access to a reinforcing response if access to that response has
been restricted.
THE RESPONSE ALLOCATION APPROACH
The response allocation approach views instrumental conditioning (IC) in terms of the other
available behavioural response options, and how an individual
distributes their responses among the various options that are
available
IC puts limitations on an animal’s activities and causes a ‘redistribution’
of behaviour among the available options
BEHAVIOURAL BLISS POINT
Definition:
“The preferred distribution of an organism’s activities before an
instrumental conditioning procedure is introduced that sets
constraints and limitations on response allocation” (Domjan, 2015,
p. 210).
That is, the distribution of responses among available alternatives in the
absence of restrictions
RESPONSE ALLOCATION
Increased performance of the instrumental response
RESULTS FROM reallocating responses so as to minimise deviations from the bliss point, as much as
possible
IF THERE ARE OTHER REINFORCERS IN THE ENVIRONMENT
Other ‘enjoyable’ behaviour options can undermine the instrumental behaviour
RESPONSE ALLOCATION: EXAMPLE
Instrumental response: Studying
Reinforcer: Facebook/Instagram/Snapchat
Bliss point: 3 hrs/night of Facebook/Instagram/Snapchat
REINFORCEMENT SCHEDULE:
A schedule could be set up so that, for 1 hr of Facebook, the person must do 1 hr of study (i.e., FB becomes contingent on
studying)
This deprives the person of time on FB and motivates an increase in time spent studying
Time studying will increase to bring the time allowed for FB closer to the preferred level / bliss point
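The reallocation described on this slide can be made concrete with a toy minimum-deviation model. The numbers and the quadratic cost are my own illustrative assumptions, not from the lecture: a hypothetical bliss point of 3 h of Facebook and 0.5 h of study per night, with a 1-hr-FB-per-1-hr-study contingency.

```python
def reallocate(bliss_fb, bliss_study, fb_per_study=1.0):
    """Choose study time t to minimise squared deviation from the bliss
    point, subject to the contingency fb = fb_per_study * t.
    cost(t) = (fb_per_study*t - bliss_fb)**2 + (t - bliss_study)**2
    Setting d(cost)/dt = 0 gives the closed form below."""
    t = (fb_per_study * bliss_fb + bliss_study) / (fb_per_study ** 2 + 1)
    return t, fb_per_study * t

study, fb = reallocate(bliss_fb=3.0, bliss_study=0.5)
print(study, fb)  # 1.75 1.75: study rises above 0.5 h, FB falls below 3 h
```

The solution sits between the two bliss-point coordinates: the contingency pulls study time up and FB time down, which is exactly the compromise the slide describes.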
RESPONSE ALLOCATION: EXAMPLE
BUT, if other reinforcers are also present in the environment…e.g., Instagram, Netflix, PlayStation,
eating, cleaning, etc
… then the success of the instrumental conditioning contingency will be undermined
The person may be ok to ‘give up’ FB time if they have other pleasant options (or reinforcers) which are
not contingent on studying!
SUMMARY: ANTECEDENTS OF THE RESPONSE ALLOCATION
APPROACH
Response Allocation – Molar approach focusing on how instrumental conditioning procedures put
limitations on activities and cause redistributions of behaviour among available response options
Consummatory-Response – Species-typical consummatory responses (e.g., eating, drinking) are the
critical feature of reinforcers
Premack/Differential Probability Principle – Difference in likelihood of instrumental and reinforcer
responses
Encourages thinking about reinforcers as responses rather than as stimuli
Greatly expands the range of activities investigators can use as reinforcers; any behaviour can serve as a reinforcer
provided it is more likely than the instrumental response
Response-Deprivation Hypothesis – Restriction of the reinforcer activity is the critical factor for instrumental
reinforcement
BEHAVIOURAL ECONOMICS AND RESPONSE ALLOCATION
Economics is the study of the allocation of behaviour
within a system of constraints
Instrumental conditioning is similar to economics
The ability to make responses is ‘income’
Available time & energy
The number of responses required (the “effort cost”) is the
“price”
The schedule of reinforcement determines the “price” of the
reinforcer
The number of reinforcers earned is the amount
purchased
Consumer demand
Relationship between price and amount purchased
The steepness of the demand curve reflects the elasticity of demand
BEHAVIOURAL ECONOMICS
Similarities between economic restrictions in the marketplace and schedule constraints in instrumental
conditioning
Demand Curve – Relation between the price of a commodity and the amount purchased
Elasticity of Demand – Degree to which price influences demand
Determinants of the Elasticity of Demand
Availability of Substitutes
Price Range
Income Level
Link to a Complementary Commodity
Consumer demand is used to analyse instrumental behaviour by treating the number of
responses performed (or time spent responding) as analogous to money, and the reinforcer
obtained as analogous to the purchased commodity
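Elasticity can be illustrated with the standard arc (midpoint) formula applied to hypothetical session data, where the “price” is the ratio requirement and the “amount purchased” is the number of reinforcers earned; the numbers below are illustrative only, not from the lecture.

```python
def arc_elasticity(p1, q1, p2, q2):
    """Arc (midpoint) elasticity of demand:
    % change in quantity purchased divided by % change in price."""
    pct_q = (q2 - q1) / ((q1 + q2) / 2)
    pct_p = (p2 - p1) / ((p1 + p2) / 2)
    return pct_q / pct_p

# Hypothetical data: raising the requirement from 5 to 20 responses per
# food pellet drops pellets earned from 40 to 30 per session.
e = arc_elasticity(5, 40, 20, 30)
print(round(e, 2), "inelastic" if abs(e) < 1 else "elastic")  # -0.24 inelastic
```

Because |e| < 1, a fourfold price rise reduces consumption only modestly: demand for a necessity like food tends to be inelastic, whereas demand for reinforcers with ready substitutes is typically elastic.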
CONTRIBUTIONS OF RESPONSE ALLOCATION & BEHAVIOURAL ECONOMICS TO
REINFORCEMENT CONDITIONING THEORY & BEHAVIOUR REGULATION
Better understanding of the motivational mechanisms
Think of the cause of reinforcement as constraints on the free flow of behaviour (rather than thinking
of reinforcers as special kinds of stimuli/responses)
Instrumental conditioning procedures do not “stamp in” or strengthen instrumental behaviour
Instead, instrumental conditioning creates a new distribution/allocation of responses
The resulting reallocation depends on trade-offs between the various options, usefully characterised by
behavioural economics
The Response Allocation Approach and Behavioural Economics provide new and precise ways of describing the
constraints that various instrumental conditioning procedures impose on an organism’s behaviour
Studying complex examples of choice, self-control and economic behaviour
requires the more complex models of the response allocation approach
CONTRIBUTIONS OF RESPONSE ALLOCATION & BEHAVIOURAL ECONOMICS TO
REINFORCEMENT CONDITIONING THEORY & BEHAVIOUR REGULATION
Behavioural Economics emphasises that instrumental behaviour cannot be studied in a vacuum
Instead, all response options must be considered as a system; changes in one part of the system determine how other
parts of the system can be altered
Changed the concept of the reinforcer and the way instrumental conditioning procedures are viewed
The optimal distribution of behaviour is determined by physiological needs, ecological niche and species-
specific response tendencies
Emphasis on a broader behavioural context for understanding instrumental behaviour
SUMMARY
The Associative Structure of Instrumental Conditioning
The S-R Association and the Law of Effect
Expectancy of Reward and the S-O Association
R-O and S(R-O) Relations in Instrumental Conditioning
Response Allocation and Behavioural Economics
Antecedents of the Response Allocation Approach
The Response Allocation Approach
Behavioural Economics
Contributions of the Response Allocation Approach and Behavioural Economics
NEXT WEEK
MID-SEMESTER EXAM!
No lecture will be running.
Tutorials will run on Friday as usual.
AND of course, Good Luck!!
PSYC214 lecture slides for mid-sem exam (weeks 1-6)

  • 1. PSYC214 – LEARNING AND BEHAVIOUR WEEK ONE LECTURE DOMJAN CHAPTERS 1 AND 2
  • 2. HOUSEKEEPING  LIC and Lecturer: Dylan Fuller  Lectures 1, 11-12  dylan.fuller@acu.edu.au  Consultation: via appointment (email)  First point of contact for administrative needs related to PSYC214 (e.g., extensions, special consideration, other general questions, etc.)  Lecturer and Tutor: Tom Nicholl  Lectures 2-10  tom.nicholl@acu.edu.au
  • 3. HOUSEKEEPING  Prerequisites: PSYC100, PSYC101, PSYC104  You must have completed these units prior to commencing this unit  If you have not, contact your course coordinator immediately
  • 4. ABOUT THE UNIT  Main aims:  Use models of learning (e.g., classical conditioning, operant conditioning, behavioural economics) to describe, explain, predict and change human behaviour  Show how models of learning can be used in everyday life / clinical settings.
  • 5. ASSESSMENTS AssessmentTask Due Date Weighting Mid-Semester Exam During lecture time in Week 7 30% Lab Report Monday 7th of October @ 5pm 40% Final Exam During examinations period 30%
  • 6. MAIN TEXTBOOK  The Principles of Learning and Behaviour, 7th ed (Domjan, 2020).  Information for this book is in ‘Readings’ under Information and Resources on Canvas  You must complete the weekly readings – lectures cannot possibly cover everything!
  • 7. TODAY’S LECTURE  Introduction to the study of learning and behaviour  History  Definition  Methodological approaches  Elicited behaviour  Habituation and sensitization
  • 8. LEARNING AND BEHAVIOUR THEORY – A BRIEF HISTORY
  • 9. RENÉ DESCARTES (1596-1650)  Theories of Learning arguably began with the philosophy of Descartes  Previously, people believed that behaviour was due to conscious choices/free will  Descartes disagreed! He put forward Cartesian Dualism:  Involuntary Behaviour: Automatic reactions to external stimuli (i.e., reflex)  Voluntary Behaviour: Conscious actions made by people  Believed that voluntary behaviour was uniquely human (no voluntary behaviours in animals)
  • 10. NATIVISMVS EMPIRICISM  Some philosophers like Descartes believed in Nativism:  Innate knowledge that people are born just knowing  E.g., the concept of self  Other philosophers believed in Empiricism:  We are born as a ‘clean slate’ with no previous knowledge  Learn things as we go  Predictability of behaviours was also an issue of contention
  • 11. THOMAS HOBBES (1588– 1679)  Believed that human behaviour was guided in a predictable manner by fixed principles  This was unlike Descartes who believed that although at times voluntary, behaviour was not predictable  Proposed that voluntary behaviour was governed by hedonism, the pursuit of pleasure and avoidance of pain.
  • 12. THEORY OF EVOLUTION, NATURAL SELECTION – CHARLES DARWIN (1809 - 1892)  Postulates how species have been surviving  All species descend (have evolved) from a common ancestor. This assumes continuity from nonhuman to human animals.  Environment poses an ‘evolutionary problem’ (Avoid predators, find food and mate, etc.)  Only some organisms will have “the solution”.These organisms are the most likely to survive and reproduce and pass on their genes
  • 13. EXAMPLE OF NATURAL SELECTION – GREY PEPPERED MOTHS  Environmental issue for moths: Camouflage needed to avoid predators  3 mechanisms support natural selection:  Variability: A population has various characteristics, e.g., some moths are light and some dark  Differential reproductive success:Those with “the solution” to evolutionary problem, are more likely to reproduce, e.g., light moths are more likely to survive and reproduce than dark moths  Heritability: Pass on the genes and therefore pass on characteristic facilitating survival, e.g., more light moths over generations
  • 14. THEORY OF EVOLUTION – NATURAL SELECTION  These changes happen slowly over time  But…what is selected at one time, may be detrimental later  When the bark changes back!
  • 15. THEORY OF EVOLUTION – NATURAL SELECTION  According to theory:  Both physical traits and psychological or mental abilities play a role in natural selection and evolution of humans and non-human animals  In both humans and non-human animals, relevant psychological abilities such as activity level, aggression, introversion, extroversion, anxiety, curiosity, imitation, attention, memory, reasoning, aesthetic sensibility, belief in spiritual agencies  “Turkey death dance” https://www.youtube.com/watch?v=VZr7uzw6gpg
  • 16. THEORY OF EVOLUTION – NATURAL SELECTION
  • 17. LIMITS OF NATURAL SELECTION  Natural selection works over many generations.  Sudden changes in the environment/animal may lead to extinction (not enough time to adapt)  Polar Bears – Global Warming  Some adaptations that occur may not be useful (and can even be harmful) when the environment changes
  • 18. TODAY – 3 TYPES OF BEHAVIOURS  Elicited Behaviour: Reflexes  Elicited Behaviour: Modal Action Patterns  Learning
  • 19. REFLEXES  The simplest form of elicited behaviour  Involves an elicited stimulus and a specific response (S – R)  Highly stereotypic for all members of a species  But! Some variability via sensitisation and habituation
  • 20. REFLEXES, SOME EXAMPLES  Puff of air to the eye → eye blinking  Dust on the face → sneezing  Fearful stimulus → higher galvanic skin response (e.g., goosebumps)
  • 21. REFLEXES  A consequence of activation of the nervous system via a reflex arc 1. The environmental stimulus activates an afferent neuron (sensory neuron) in the spinal cord. 2. The neural impulses are relayed to an interneuron and then to an efferent neuron (motor neuron). 3. REFLEX ARC comprises afferent neuron, interneuron, efferent neuron
  • 22. MODAL ACTION PATTERNS (MAPS)  Response sequence to a specific “sign” stimulus, species-specific  Involves interconnected actions.These are not learnt, and same across generations  E.g., Baby penguin feeding: Baby penguin will tap on parents' beak (stimulus) for food, and with that stimulus the parent penguin will know to feed baby penguin
  • 23. MODAL ACTION PATTERNS (MAPS)  Series of interrelated actions that can be found in all members of the species  Similar to reflexes: 1. Genetic basis (no learning) 2. Little variability within species or within individuals 3. Reliably elicited by particular events  Different from reflexes: 1. More complex 2. More variable
  • 24. MODAL ACTION PATTERNS (MAPS)  Can also be known as ‘Fixed action’ patterns  These MAPS solve evolutionary problems  They provide the members of the species with a ready-made solution to problems they are sure to encounter  No need to learn the behaviours that will solve the problem  They are likely to have evolved gradually.  Although they are complex behaviours, they are not intentional acts…
  • 25. ELICITING STIMULI FOR MAPS  MAPs are triggered by events called “releasers”  The elicitation of MAPS often occurs within a complex array of competing stimuli.  The few essential features for eliciting a specific response are called sign stimuli (or releasing stimuli)  Supernormal stimuli combine eliciting stimulus features and are effective in eliciting behaviour.
  • 26. MAPS – SOME EXAMPLES  Cats’ reaction to threat: arched back, hissing, growls, flick of tail, etc  Migration of geese inV pattern  Courtship and mating in many animals  Laying eggs
  • 27. BEHAVIOURS ORGANISED INTO SEQUENCES  Appetitive behaviours  Early components of a behavioural sequence.  Represent desire or need for a particular consequence.  They tend to vary (e.g., strategies to get to food)  Consummatory behaviours  The end components.  Represent the consummation or completion of a response sequence  They tend to be fixed (e.g., eating)
  • 29. LEARNING - DEFINITION  “an enduring change in the mechanisms of behaviour involving specific stimuli and/or responses that results from prior experience with those or similar stimuli and responses.” – Domjan, 2020, p.14
  • 30. IS LEARNING EXPLICIT OR IMPLICIT?  Explicit - Learning can result from special instructional training such as schooling, and from common contacts with the environment.  Implicit - Much of our behaviour (and learning) is done outside of conscious awareness  Much of the field of learning and behaviour is from the behaviourist perspective * i.e., empirically rigorous science focused on observable behaviours and non-observable internal mental processes  Learning and behaviour can be understood through an analysis of its antecedents and consequences
  • 31. METHODOLOGICAL APPROACH – HOW DO WE MEASURE LEARNING?
  • 32. UNDER WHAT CONDITIONS SHOULD LEARNING (NOT) BE STUDIED?  Anecdotal evidence:“I know someone who…” or “everyone knows that…” →Biased, insufficient sample  Case studies: Few individuals in detail →Costly, questions of generalisability, no possibility to assert causality  Descriptive studies (correlational) →Cannot infer causality.You would have heard me say this many times in statistics!
  • 33. UNDER WHAT CONDITIONS SHOULD LEARNING BE STUDIED?  Experimental studies: Individuals who received the training procedure have to be compared with individuals who do not receive that training  Ensuring the cause is either present or absent  Helps eliminate any confounding variables  Can be between subjects or within subjects’ designs  Between subjects’ designs  Different groups of people, assigned to different conditions (levels) of the independent variable  Within subjects’ designs  Same group of people completing every condition (level) of the independent variable
  • 34. HOW DO WE MEASURE LEARNING?  To study learning: measure behaviour change  Operational definitions (operations used to measure/define the behaviour)  Number of errors made (less errors as we learn)  E.g., spelling tests
  • 35. HOW DO WE MEASURE LEARNING?  To study learning: measure behaviour change  Operational definitions (operations used to measure/define the behaviour)  Number of errors made (less errors as we learn)  Change in topography (e.g., improve accuracy)
  • 36. HOW DO WE MEASURE LEARNING?  To study learning: measure behaviour change  Operational definitions (operations used to measure/define the behaviour)  Number of errors made (less errors as we learn)  Change in topography (e.g., improve accuracy)  Change in intensity
  • 37. HOW DO WE MEASURE LEARNING?  To study learning: measure behaviour change  Operational definitions (operations used to measure/define the behaviour)  Number of errors made (less errors as we learn)  Change in topography (e.g., improve accuracy)  Change in intensity  Change in speed (e.g., faster)
  • 38. HOW DO WE MEASURE LEARNING?  To study learning: measure behaviour change  Operational definitions (operations used to measure/define the behaviour)  Number of errors made (less errors as we learn)  Change in topography (e.g., improve accuracy)  Change in intensity  Change in speed (e.g., faster)  Change in latency (e.g., faster response over time)
  • 39. HOW DO WE MEASURE LEARNING?  To study learning: measure behaviour change  Operational definitions (operations used to measure/define the behaviour)  Number of errors made (less errors as we learn)  Change in topography (e.g., improve accuracy)  Change in intensity  Change in speed (e.g., faster)  Change in latency (e.g., faster response over time)  Change in rate (e.g., greater frequency over time)
  • 40. WHAT ARE THE PROBLEMS WITH EXPERIMENTAL RESEARCH?  Lack of generalizability  Uses arbitrary responses and stimuli  E.g., Salivation and bells  (for the benefit of increased control)  But! it provides the basic principles on which applied research can be built  Simplifying problems before making them complicated makes sense
  • 41. GENERAL PROCESS APPROACHTO LEARNING  Assumption: learning phenomena are products of elemental processes that operate consistently across situations and species.  Across species, learning is governed by common fundamental rules or “principles”  Seeks to formulate laws to organize and explain a diversity of events, through experiments in many distinct situations and species
  • 42. GENERALITY OF LEARNING  If general processes of learning exist, then we should be able to discover these rules in any situation where learning occurs, and across any species.  Animal studies allow us to conduct research that could not be conducted with humans  Ethics approval: cost-benefit analysis
  • 44. HABITUATION AND SENSITIZATION  Habituation and sensitization are caused by repeated presentation of the same stimulus.  They both serve to focus on the relevant stimuli.  They are both affected by the level of arousal / state system  They occur in the central nervous system.  Habituation effects = decreases in responsiveness due to repeated stimulation. E.g., cannot notice background music after a while (particularly if the room is quiet, as increases with less arousal)  Sensitization effects = increases in responsiveness due to repeated stimulus presentation. E.g.,“happy” music becomes annoying after a while. Particularly if someone is talking loudly (increases with arousal)
  • 45. HABITUATION – STIMULUS SPECIFIC EFFECT  Habituation is a reaction to a specific stimulus.  Changes in the stimuli may reduce habituation  Focusing on alternative stimuli while initiating the consummatory behaviour reduces habituation  E.g.,We may habituate less to specific foods if we eat them with different spices, or while listening to music or in company
  • 46. CASE EXAMPLE – THE STARTLE RESPONSE THE STARTLE RESPONSE IS A DEFENSIVE REACTION TO THREAT OR FEAR, CHARACTERIZED BY A SUDDEN TENSING OF THE UPPER BODY IN RESPONSE TO A SUDDEN STIMULI
  • 47. CASE EXAMPLE – THE STARTLE RESPONSE  https://www.youtube.com/watch?v=FOUZ7xmUkC 8
  • 48. HABITUATION OF THE STARTLE RESPONSE
  • 49. SENSITIZATION MODULATES THE STARTLE RESPONSE  Sensitization is influenced by:  Physiological arousal  How often the stimuli is presented  Sensitization of the startle response is situation dependent.When an organism is already aroused, it is more likely to sensitize to a stimulus.This can be used to “increase”  Excitement e.g., loud music, movies, and sporting events  Fear e.g., music during a horror movie  Pain e.g., repeated exposure, could lead to stronger feelings of pain.
  • 50. SENSITIZATION AND THE STARTLE RESPONSE
  • 51. DUAL PROCESS THEORY  The dual process theory assumes that different neural processes underlie habituation and sensitization (and that these processes are not mutually exclusive).  Habituation and sensitization effects are assumed to reflect the sum of the outcomes of the habituation and sensitization processes, depending on which system is stronger in a given situation.  For example, rats habituate to a loud bang if the background noise is quiet; but sensitise to the same loud bang if the background noise is loud.
  • 52. DUAL PROCESS THEORY  The habituation process occurs in the Stimulus-Response system (S-R system) of the nervous system.  The S-R system is the shortest neural pathway between the sense organ and responding muscle.  Every presentation of an eliciting stimulus activates the Stimulus-Response system.  E.g., we habituate to food, music, sex, drugs, environment (bushwalk, clock ticking)
  • 53. DUAL PROCESS THEORY  The sensitization process occurs in the state system.  The state system consists of other regions of the nervous system responsible for general arousal levels.  The state system is only activated in special circumstances, such as presentation of an intense stimulus.  E.g.,We sensitize to music, pain, drugs, etc.
  • 54. OPPONENT PROCESS THEORY OF MOTIVATION  Some stimuli elicit biphasic emotional responses.  Biphasic responses have:  An initial strong response  An adaptation response  An opposite response  E.g., alcohol and drugs (anxiolytic and anxiogenic), love and attachment
  • 55. OPPONENT PROCESS THEORY OF MOTIVATION  Homeostatic theory – postulates that opposite neurophysiological mechanisms involved in emotional behaviour serve to maintain emotional stability.  An emotionally arousing stimulus pushes an individual’s state away from neutral which triggers an opponent process that counters the shift.  That is, when experiencing an intense positive emotion, the opponent process pushes us to feel down/low.
  • 56. OPPONENT PROCESS THEORY OF MOTIVATION - ADDICTION Initial presentation (recreational use) After habituation (addiction stage)
  • 57. SUMMARY  Historical perspectives  Natural selection  Behaviour traits  Elicited behaviour  Reflexes  Modal Action Patterns  Learning  Habituation and sensitization
  • 59. PSYC214 – LEARNING AND BEHAVIOUR WEEK TWO LECTURE DOMJAN CHAPTER 3
  • 60. HOUSEKEEPING ¡ My name isThomas Nicholl, I will be working with Dylan Fuller for PSYC214 ¡ Tutorials Week 1 to 10 ¡ Lectures Week 2 to 10 ¡ Post questions in the CANVAS discussion board first ¡ Contact via CANVAS (not email) for tutorial based questions ¡ Contact Dylan (LiC) for course/assignment/exam related questions. ¡ Data Collection for Lab Report in Tutorials ¡ Online Study ¡ Please bring laptops to class
  • 61. TODAY’S LECTURE ¡ Classical (Pavlovian) conditioning ¡ History ¡ Examples ¡ Higher order conditioning ¡ Fear conditioning ¡ Eyeblink conditioning ¡ Sign tracking/Auto shaping vs Goal tracking ¡ How is it studied/measured? ¡ Excitatory vs inhibitory classical conditioning
  • 62. LEARNING OUTCOMES ¡ Understand, explain & exemplify the following concepts: ¡ Mechanisms of classical conditioning ¡ Factors that contribute to effective conditional and unconditional stimuli ¡ Learning associations, higher-order conditioning and sensory preconditioning ¡ Unconditioned Response, Unconditioned Stimulus, Conditioned Response, Conditioned Stimulus, Neutral Stimulus ¡ Effects of the US and CS on the CR ¡ Classical and higher order conditioning ¡ Fear Conditioning ¡ Excitatory vs inhibitory classical conditioning ¡ Sign tracking
  • 63. PAVLOV ¡ Russian physiologist, well known for his work on classical conditioning ¡ In his experiment while studying the functioning of digestive system, found that dogs not only salivate upon actually eating, but also when they saw the food, noticed the man who brought it, or even heard his footstep. ¡ Pavlov began to study this phenomenon, which he called ‘conditioning’
  • 64. PAVLOV ¡ Initially, dog salivates when food is presented ¡ This is a reflex of the salivary gland ¡ Then, dog salivates before food arrives ¡ How could this happen as a result of experience? ¡ Psychic reflex (psychic secretions) https://www.youtube.com/watch?v=S6AYofQchoM 3 minute video explaining Pavlov’s interest & contribution to classical conditioning & associative learning
  • 65. PAVLOVIAN OR CLASSICAL CONDITIONING ¡ Simplest mechanism whereby organisms learn about relations between one event and another ¡ Two types of reflexes: ¡ Unconditional (unconditioned) reflexes ¡ US à UR ¡ Food in mouth à salivation ¡ Conditional (conditioned) reflexes ¡ CS à CR ¡ Experimenter à salivation
  • 67. CLASSICAL CONDITIONING – UNCONDITIONED STIMULUS ¡ A stimulus that elicits a particular response without the necessity of prior training
  • 68. CLASSICAL CONDITIONING – UNCONDITIONED STIMULUS ¡ A stimulus that elicits a particular response without the necessity of prior training (e.g., food)
  • 69. CLASSICAL CONDITIONING – UNCONDITIONED RESPONSE ¡ A response that occurs without the necessity of prior training (e.g., salivating)
  • 71. Note:The NS does not have to be a bell! Can be another neutral thing light a light bulb flashing
  • 72. CLASSICAL CONDITIONING – CONDITIONED RESPONSE ¡ A stimulus that does not elicit a particular response initially but comes to do so after being associated with an UNCONDITIONED STIMULUS (E.g., meat)
  • 73. TESTTRIAL: The CONDITIONED STIMULUS is presented without the UNCONDITIONED STIMULUS.This allows measurement of the conditioned response in the absence of the UNCONDITIONED STIMULUS.
  • 74. INITIAL RESPONSES TO THE STIMULI ¡ US effective in eliciting target response from outset ¡ CS does not elicit conditioned response initially; results from association with US US and CS relative to each other
  • 75. RECAP – A CLASSIC LOOK AT CLASSICAL CONDITIONING
  • 77. LEARNING WITHOUT AN UNCONDITIONED STIMULUS ¡ Higher-Order Conditioning: ¡ CS1 is paired with US often enough to condition strong response to CS1 ¡ Once CS1 elicits conditioned response, pairing CS1 with new stimulus CS2 conditions CS2 to also elicit the conditioned response ¡ Conditioning occurs in the absence of US ¡ This is the basis of “irrational fears”
  • 80. LEARNING WITHOUT A CONDITIONED STIMULUS (CONT.) ¡ Sensory Pre-Conditioning: ¡ CS1 and CS2 become associated ¡ CS1 paired illness; CR develops to CS1 ¡ Participants with aversion to CS1 also show aversion to CS2, even though CS2 was never directly paired with US
  • 81. EVERYDAY EXAMPLES OF HIGHER ORDER CONDITIONING ¡ Prof Sapolsky highlights that the outcome is not about the pleasure/reward, but the anticipation of it (refer to video link below) ¡ Other examples of conditioned stimuli leading to CR? ¡ In restaurants ¡ Smart phone tones ¡ THAT song when your relationship ended …. ¡ Use of celebrities in advertising www.youtube.com/watch?v=axrywDP9Ii0 Prof Sapolsky on anticipation of pleasure https://www.youtube.com/watch?v=YvhlOtQAU0A Simpsons video of conditioning
  • 83. FEAR CONDITIONING – CONDITIONED SUPPRESSION ¡ www.youtube.com/watch?v=ZlZekx1P1g4 Conditioned suppression of a rat's lever pressing video ¡ Suppression of ongoing behaviour (e.g., drinking or suppression of lever to get food) produced by the presentation of a CONDITIONED STIMULUS that has been conditioned to elicit fear through association with an aversive UNCONDITIONED STIMULUS (e.g., shock)
  • 84. FEAR CONDITIONING – CONDITIONED SUPPRESSION
  • 85. FEAR CONDITIONING – LICK SUPPRESSION PROCEDURE ¡ A procedure to test fear conditioning ¡ Presentation of a fear-conditioned CONDITIONED STIMULUS (e.g., a light preceding an electric shock) slows down the rate of drinking
  • 86. HOW DOES THIS TRANSLATE TO HUMAN BEHAVIOUR?
  • 87. HOW DOES THIS TRANSLATE TO HUMAN BEHAVIOUR? ¡ “At approximately nine months of age we ran Albert through the emotional tests that have become a part of our regular routine in determining whether fear reactions can be called out by other stimuli than sharp noises and the sudden removal of support....” ¡ “In brief, the infant was confronted suddenly and for the first time successively with a white rat, a rabbit, a dog, a monkey, with masks with and without hair, cotton wool, burning newspapers, etc. A permanent record of Albert's reactions to these objects and situations has been preserved in a motion picture study....” ¡ “At no time did this infant ever show fear in any situation.” ¡ http://www.youtube.com/watch?v=Xt0ucxOrPQE&feature=related (John Watson overview)
  • 88. LITTLE ALBERT: WATSON AND RAYNER (1920) ¡ “The sound stimulus, thus, at nine months of age, gives us the means of testing several important factors: ¡ Can we condition fear of an animal, e.g., a white rat, by visually presenting it and simultaneously striking a steel bar? ¡ If such a conditioned emotional response can be established, will there be a transfer to other animals or other objects? ¡ What is the effect of time upon such conditioned emotional responses? ¡ If after a reasonable period such emotional responses have not died out, what laboratory methods can be devised for their removal ?”
  • 89. LITTLE ALBERT: WATSON AND RAYNER (1920) ¡ “These experiments would seem to show conclusively that directly conditioned emotional responses as well as those conditioned by transfer persist, although with a certain loss in the intensity of the reaction, for a longer period than one month. Our view is that they persist and modify personality throughout life.” ¡ “Unfortunately, Albert was taken from the hospital the day the above tests were made. Hence the opportunity of building up an experimental technique by means of which we could remove the conditioned emotional responses was denied us. ¡ Our own view, expressed above, which is possibly not very well grounded, is that these responses in the home environment are likely to persist indefinitely, unless an accidental method for removing them is hit upon.”
  • 90. WHAT HAPPENED TO LITTLE ALBERT? ¡ Recent investigations indicate that Albert was likely to have been a pseudonym for Douglas Merritte, who died aged 6 from hydrocephalus in 1925. ¡ Subsequent investigations suggest that Albert/Douglas suffered from congenital hydrocephalus: ¡ This calls into question Watson and Rayner’s assertion that he was a healthy child at the time of the experiments. ¡ Recent studies claim that the archival footage of Albert/Douglas indicates delayed development and abnormal responses.
  • 91. WATSON AND RAYNER ¡ The pair married in 1921 after Watson’s divorce. ¡ Rosalie assisted Watson to write the most popular childrearing book of the time, “The Psychological Care of Infant and Child” (1928). ¡ “Never hug and kiss them, never let them sit on your lap. If you must, kiss them once on the forehead when they say good night. Shake hands with them in the morning.” ¡ Watson’s 4 children were documented to experience a range of psychological problems, and one became a psychoanalyst.
  • 92. EYEBLINK CONDITIONING ¡ The eyeblink reflex is an early component of the startle response ¡ A defensive reaction to threat or fear, characterised by a sudden tensing of the upper body in response to a sudden stimulus ¡ E.g., a puff of air to the eye, someone clapping in your face ¡ https://www.youtube.com/watch?v=hoA2Pm9cjQs
  • 93. SIGN TRACKING OR AUTO SHAPING
  • 94. SIGN TRACKING OR AUTO SHAPING ¡ Example: Pigeons ¡ CS (light) paired with food (US) ¡ Conditioned pecking of light even though not required to gain access to food (and it was 2.5 meters away from food dispenser!)
  • 95. SIGN TRACKING OR AUTO SHAPING ¡ Example: Quails ¡ CS (wood block) paired with females (US) ¡ Conditioned standing on wood even though not required to gain access to females (and it was away from females!)
  • 96. SIGN TRACKING VERSUS GOAL TRACKING ¡ Individual differences (e.g., rats) ¡ Some individuals show sign tracking (peck on the light linked to food, or go to the block associated with females). ¡ Other individuals show goal tracking (peck at the food or follow the female) ¡ These individual differences appear to have a genetic basis
  • 97. CLASSICAL CONDITIONING: MORE THAN STIMULUS-RESPONSE? ¡ Stimulus substitution theory: ¡ The behaviourist tradition viewed classical conditioning as a simple mechanical process in which control over a reflex response is passed from one stimulus (UCS) to another (CS) ¡ Evidence in support of the stimulus substitution hypothesis: ¡ Jenkins & Moore (1973) study: ¡ Autoshaping in pigeons: ¡ One group had CS (light) → US (grain) ¡ Photos showed pigeons trying to “eat” the lit key (open beak and closed eyes) when they pecked ¡ 2nd group had CS (light) → US (water) ¡ Photos showed pigeons trying to “drink” the lit key (closed beak and open eyes) when they pecked
  • 98. ACQUIRED TASTE PREFERENCES AND AVERSIONS ¡ We can acquire new preferences given specific circumstances ¡ Taste aversion learning ¡ Flavour-illness pairing ¡ Single trial learning ¡ Long delay learning ¡ Evaluative conditioning ¡ Learn to like/dislike new flavour ¡ Neutral flavour paired with already liked or disliked flavour
  • 99. CLASSICAL CONDITIONING: MORE THAN STIMULUS-RESPONSE? ¡ Evidence against the stimulus substitution hypothesis ¡ Any study in which the elicited CR is different from the UCR ¡ e.g., when a tone is paired with shock, rats will jump to the UCS (shock), but the CR is typically freezing ¡ e.g., when a light is paired with food, rats will rear to the light (CR) but the UCR is approach to the food dispenser ¡ Preparatory Response Model ¡ Kimble’s (1961, 1967) theory proposed that the CR is a response that serves to prepare the organism for the upcoming UCS ¡ e.g., following acquisition of CRs in eyeblink conditioning, the CR eyeblink may actually prepare the person for the upcoming air puff such that the eye would be partially closed when the air puff occurs
  • 100. IS IT POSSIBLE TO TRAIN ANY ANIMAL ANY CS-UCS ASSOCIATION? ¡ No. What can be learned is strongly constrained by evolutionary history of learning ¡ We call the biological constraints on classical conditioning Biological Preparedness (Seligman, 1970) ¡ Not all CSs are created equal ¡ Mere contiguity (temporal pairing) of NS and UCS is not sufficient
  • 101. WITHIN SPECIES THERE ARE BIOLOGICAL CONSTRAINTS ON WHAT ASSOCIATIONS CAN BE LEARNED ¡ The table shows the results from studies with rats in which the experience of an electric shock (which produces a pain response) or X-rays (which produce a nausea response) was paired with three different kinds of CSs: light, sound and taste. ¡ What do the results tell us about the preparedness of rats for learning signals that predict painful or nauseating stimuli?
  • 102. BIOLOGICAL PREPAREDNESS ¡ Wilcoxon et al. (1971) conducted experiments to test for biological preparedness ¡ Presented a compound stimulus (blue, sour water) to quails and rats to test whether colour or taste was used in learning about food ¡ Two species with different biological preparedness for learning taste aversion: ¡ 1. Rats – seek food based on smell/taste ¡ 2. Quails – seek food based on sight/colour ¡ What might we expect in the results?
  • 103. CLASSICAL CONDITIONING: MORE THAN STIMULUS-RESPONSE? ¡ The compensatory-response model is one version of preparatory-response theory ¡ In this model of classical conditioning, the compensatory after-effects to a US are what come to be elicited by the CS ¡ Based on the opponent-process theory of emotion / motivation ¡ Opponent-Process Theory of Emotion ¡ Emotional events elicit two competing processes: ¡ The primary/A process that is immediately elicited by the event ¡ e.g., taking an exam elicits an unpleasant A-state ¡ An opponent/B process that is the opposite of the A-process and counteracts it ¡ e.g., the pain during the exam (A-state) creates a pleasant relief response (B-state) following the exam
  • 104. DRUGS, ADDICTION, AND PREPARATORY RESPONSE THEORY ¡ Taking a drug disturbs the homeostasis of the body (up or down). ¡ The body has a natural reflex response that compensates for the drug’s effect to return to homeostasis. ¡ This is called the compensatory response (reflex) ¡ The act of taking a drug is accompanied by many environmental stimuli ¡ When you repeatedly use a drug, you become tolerant of its effects. Tolerance occurs when the effect of a drug decreases over the course of repeated administrations. ¡ You also experience cravings (withdrawal) ¡ In terms of classical conditioning – try to develop an explanation for how the effects of tolerance and craving (withdrawal) might arise.
  • 105. CONDITIONED COMPENSATORY RESPONSE ¡ The reflex is the body’s natural compensatory response to the effect of the drug ¡ Situational cues (initially neutral) that become associated with drug use become CSs ¡ Repeated cueing of the drug’s effect by situational cues establishes a conditioned compensatory response. ¡ When the CSs are present, the body prepares for the likely effect of the drug before it has been administered.
  • 106. ¡ Tolerance develops after pairings of the pre-drug CSs with the drug effect UCS. ¡ Produces a conditional compensatory response to the CSs alone. ¡ The conditioned compensatory response counteracts the drug effect, producing tolerance. ¡ As the drug is administered more and more often, and the conditional compensatory response grows in strength, the weakening of the drug effect becomes more pronounced.
  • 107. EXCITATORY AND INHIBITORY CONDITIONING ¡ Excitatory Conditioning – Neutral Stimulus (NS) associated with presentation of Unconditioned Stimulus (US) ¡ Inhibitory Conditioning – Neutral Stimulus (NS) associated with absence or removal of Unconditioned Stimulus (US) What if a stimulus is associated with the absence of the US rather than its presentation? ¡ In excitatory conditioning, organisms learn a relationship between a CS and US ¡ As a result of this learning, presentation of the CS activates behavioural and neural activity related to the US in the absence of the actual presentation of that US. ¡ E.g., Pavlov's Dog
  • 108. INHIBITORY CONDITIONING ¡ Organisms learn to predict the absence of the US ¡ PREREQUISITE: the US must occur regularly beforehand (you need to know WHEN bad things happen, e.g., shocks, panic attacks, bullying, running out of petrol) ¡ Something is introduced that prevents an outcome that would otherwise occur ¡ The organism learns the absence of the US ¡ Examples: ¡ Panic attack in crowds → No panic attack when avoiding crowds ¡ Bullied when teacher is away → No bullying when teacher is around ¡ Out of petrol when sign is on → Petrol when sign is off
  • 110. SUMMARY ¡ Classical conditioning ¡ Form of associative learning ¡ Not just restricted to reflexes – higher order conditioning ¡ Examples include ¡ Fear conditioning ¡ Eyeblink conditioning ¡ Sign tracking and goal tracking ¡ Excitatory and Inhibitory conditioning ¡ Procedures ¡ Measures of conditioned responding
  • 111. RECOMMENDED READINGS ¡ Fridlund, A. J., Beck, H. P., Goldie, W. D., & Irons, G. (2012, January 23). Little Albert: A neurologically impaired child. History of Psychology. Advance online publication. doi:10.1037/a0026720 ¡ Jones, M. C. (1924). A laboratory study of fear: The case of Peter. The Pedagogical Seminary and Journal of Genetic Psychology, 31(4), 308-316. doi:10.1080/08856559.1924.9944851 ¡ Rescorla, R. A. (1988). Pavlovian conditioning: It’s not what you think it is. American Psychologist, 43(3), 151-160. ¡ Siegel, S. (2005). Drug tolerance, drug addiction, and drug anticipation. Current Directions in Psychological Science, 14, 296-300. doi:10.1111/j.0963-7214.2005.00384.x ¡ Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3(1), 1-14. https://doi.org/10.1037/h0069608 ¡ Wilcoxon, H. C., Dragoin, W. B., & Kral, P. A. (1971). Illness-induced aversions in rat and quail: Relative salience of visual and gustatory cues. Science, 171(3973), 826-828.
  • 112. NEXT WEEK ¡ Factors that affect CC ¡ Alternative models of CC
  • 113. PSYC214 – LEARNING AND BEHAVIOUR WEEK THREE LECTURE DOMJAN CHAPTER 4
  • 114. HOUSEKEEPING ¡ Lab report guide and required readings now available on Canvas. ¡ Attendance at Tutorials for support with lab report. ¡ Mid-semester exam in Week 7
  • 118. CLASSICAL CONDITIONING – UNCONDITIONED STIMULUS ¡ A stimulus that elicits a particular response without the necessity of prior training (e.g., food)
  • 119. CLASSICAL CONDITIONING – UNCONDITIONED RESPONSE ¡ A response that occurs without the necessity of prior training (e.g., salivating)
  • 122. CLASSICAL CONDITIONING – CONDITIONED STIMULUS ¡ A stimulus that does not elicit a particular response initially, but comes to do so after being associated with an UNCONDITIONED STIMULUS (e.g., meat)
  • 123. REVISION QUESTION ¡ Identify each of the key concepts (i.e., US, UR, CS, CR) ¡ EXAMPLE 1: Every time someone turns on the washing machine in your house, the shower becomes very cold and causes you to jump back. Over time, the person begins to jump back automatically after hearing the washing machine working, before the water temperature changes ¡ NS = NR ¡ US = UR ¡ NS + US = UR ¡ CS = CR
  • 124. REVISION QUESTION ¡ Identify each of the key concepts (i.e., US, UR, CS, CR) ¡ EXAMPLE 2: You eat a new food and then get sick because of the flu. However, you develop a dislike for the food and feel nauseated whenever you smell it. ¡ NS = NR ¡ US = UR ¡ NS + US = UR ¡ CS = CR
  • 126. CONDITIONING - CONTINUED ¡ Examples: ¡ Fear conditioning ¡ Sign tracking – movement towards/contact with a stimulus that signals the availability of a positive reinforcer ¡ Goal tracking – conditioned behaviour elicited by a CS that consists of approaching the location where the US is presented ¡ Eyeblink conditioning ¡ Stimulus Substitution Theory ¡ Opponent Process Theory ¡ Excitatory (presentation) and Inhibitory (absence/removal) Conditioning
  • 128. TODAY’S LECTURE ¡ What makes an effective CS and US? ¡ Novelty ¡ Belongingness ¡ Salience ¡ Intensity ¡ What determines the nature of the CR? ¡ The US ¡ The CS ¡ How do the CS and US become associated? ¡ Rescorla-Wagner Model ¡ Attentional models ¡ Temporal Coding Hypothesis
  • 129. WHAT MAKES EFFECTIVE CS AND US? ¡ Initial responses to stimuli ¡ CS does not elicit the conditioned response initially, but comes to do so after being associated with US ¡ US elicits the response unconditionally ¡ Nearly any stimulus can be CS and US ¡ Novelty (more novel, faster to learn about the CS and US) ¡ CS and US intensity and salience ¡ Belongingness ¡ Learning without a US?
  • 130. EFFECTIVE CS AND US: NOVELTY ¡ The Latent-Inhibition or CS-Preexposure Effect - novel (new) stimuli are more effective than familiar stimuli ¡ Experiments on latent-inhibition effect have two phases: ¡ Participants given repeated presentations of CS by itself: the CS preexposure makes CS familiar ¡ CS is paired with a US. Participants are slower to acquire responding because of the CS preexposure ¡ The US-Preexposure Effect ¡ Experiments on US novelty similar to CS-preexposure experiments ¡ Conditioning proceeded faster for novel stimuli than familiar stimuli
  • 131. EFFECTIVE CS AND US: INTENSITY & SALIENCE ¡ Learning is facilitated by higher stimulus intensity, which gives the stimulus more salience ¡ Salience refers to significance and noticeability: salient stimuli attract attention and are more likely to occur in the natural environment ¡ Examples: ¡ Faster learning using food as the US if the subject is hungry ¡ Stronger fear of dogs if attacked by a big dog as a child ¡ Use stimuli that are likely to occur in the natural environment as reinforcers, e.g., use a female quail as the US/sexual reinforcer to make a male quail more responsive (faster learning) to a CS such as a light or a block of wood
  • 132. CS–US RELEVANCE/BELONGINGNESS ¡ Learning depends on relevance of the CS to the US ¡ Kind of stimuli presented with USs important ¡ Example: taste readily associated with illness; audio-visual cues readily associated with peripheral pain ¡ Belongingness Experiment (Garcia and Koelling, 1966): ¡ Rats drink from a tube, before administration of one of 2 US: (i) shock or (ii) illness ¡ Rapid learning occurred only if CS was combined with appropriate US
  • 133. ¡ *Rapid learning occurred only if CS was combined with appropriate US* ¡ Rats conditioned with illness learned a stronger aversion to taste CS (than to audio-visual CS). ¡ Rats conditioned with shock learned stronger aversion to audio-visual CS (than to taste CS).
  • 134. CS–US RELEVANCE/BELONGINGNESS ¡ When the CS “belongs to” / is related to / is biologically relevant to the US, it is easier / faster to learn that they are associated. ¡ It reflects sensitization effect of CS pre-exposure ¡ For example: ¡ Rhesus monkeys and humans learn fear conditioning faster if the CS that signal danger are fear relevant/biologically dangerous cues (snake, image of a skull) versus fear irrelevant (mushroom) ¡ Can you think of another example?
  • 135. LEARNING WITHOUT AN UNCONDITIONED STIMULUS ¡ Pavlovian conditioning: food, shock,… but what about everything else? ¡ Higher-Order Conditioning: ¡ CS1 is paired with US often enough to condition strong response to CS1 ¡ Once CS1 elicits conditioned response, pairing CS1 with new stimulus CS2 conditions CS2 to also elicit the conditioned response ¡ Conditioning occurs in absence of US ¡ This is the basis of “irrational fears”
  • 136. LEARNING WITHOUT AN UNCONDITIONED STIMULUS ¡ Stimulus Substitution Model ¡ The theory that as a result of classical conditioning, subjects come to respond to the CS in the same way that they respond to the US ¡ The CS can then act as a US ¡ It assumes that ¡ The CS becomes a surrogate of the US ¡ The nature of the US dictates the CR ¡ The Conditioned Stimulus leads to the Unconditioned Response via excitation of Unconditioned Stimulus centres (e.g., amygdala).
  • 137. STIMULUS SUBSTITUTION MODEL - LEARNING WITHOUT AN US: HIGHER ORDER CONDITIONING EXAMPLE ¡ You are afraid of crowds and feel anxious in them (CS1). ¡ Perhaps because when you were in a crowd once (CS), someone pushed you and hurt you (US). ¡ You go to the movies, and a group of people rush in (CS) and you feel anxious (CR). ¡ The next time you think about going to the movies (CS2) with friends, you feel anxious (CR). (Adapted from Wolpe, 1990)
  • 138. WHAT DETERMINES THE NATURE OF THE CR? ¡ The US ¡ Core factor – the type of US should direct the UR/CR ¡ Example of the pigeons with food vs. water and their response (beak open/closed) ¡ Stimulus substitution model (previous slides) ¡ The CS ¡ Its type will determine the CR, dependent on the organism ¡ Example of a rat being presented (UR of gnawing/biting vs. CR of social orientation) ¡ The US-CS interval ¡ The interval (time, distance) between the US and the CS is vital ¡ Example of a car coming toward you and the potential for injury – the response is dependent on the distance from the car.
  • 139. HOW DO THE CS AND US BECOME ASSOCIATED? [THEORIES OF CLASSICAL CONDITIONING] ¡ The blocking effect ¡ “not learning an association” ¡ The Rescorla-Wagner model ¡ Attentional models ¡ Temporal coding hypothesis models
  • 140. BLOCKING EFFECT ¡ Every Sunday you visit your partner’s parents… They always serve a cherry cake that slightly disagrees with you. You don’t want to upset them, so you don’t say anything ¡ You acquire an aversion to the cherry cake (so that every time you are supposed to eat it, you feel unpleasant). ¡ On a special occasion, your partner’s parents add a special chocolate sauce to the cherry cake. ¡ You feel sick. Will you develop an aversion to the chocolate sauce?
  • 141. BLOCKING EFFECT ¡ Classical conditioning may not occur in some instances (e.g., the chocolate sauce might make you sick… but you don’t learn an association) ¡ Kamin: Classical conditioning occurs only when the US is unexpected ¡ The presence of a previously conditioned stimulus (e.g., the cherry cake) may block / interfere with the conditioning of a novel stimulus (e.g., the chocolate sauce)
  • 142. BLOCKING EFFECT, EXPERIMENT ¡ Phase 1: Training links bell and food ¡ Bell (CS1) → salivation (CR) because it anticipates food (US) ¡ Phase 2: Training links bell + light and food ¡ Bell (CS1) + simultaneous light (CS2) → salivation (CR) because it anticipates food (US) ¡ Phase 3: Test light alone ¡ Light (CS2) → NO salivation (no CR), as it does not anticipate food → salivation on only 20% of trials when the light alone is presented ¡ Learning of the light as a CS (CS2) preceding food is “blocked” by the existing association between the bell (CS1) and the food
  • 143. RESCORLA-WAGNER MODEL (1972) ¡ Mathematical model proposed by Robert Rescorla and Allan Wagner (1972) ¡ The idea that the effectiveness of a US is determined by how surprising it is forms the basis of a formal mathematical model of conditioning. ¡ The model looks at the implications of US surprise for a wide variety of conditioning phenomena. ¡ According to the model, an unexpectedly large US is the basis for excitatory conditioning, and an unexpectedly small US (or its absence) is the basis for inhibitory learning. ¡ The model suggests assumptions can be made about the expectation of the US. ¡ We will look at their mathematical model in a later slide.
  • 144. BLOCKING EFFECT ¡ On the blocking effect… ¡ The RW model clearly predicts the blocking effect. ¡ If one CS already fully predicts that the US will come, nothing will be learned about a second CS that accompanies the first CS ¡ On extinction: it is new learning about undoing an existing association ¡ It is not the reverse of acquisition
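The blocking prediction can be made concrete with a small simulation of the model. This is an illustrative sketch, not part of the lecture; the learning rate (0.3) and the number of trials are arbitrary assumptions.

```python
def rw_trial(V, stimuli, lam, alpha=0.3):
    """One Rescorla-Wagner trial: every CS present shares a single
    prediction error, lam minus the summed associative strength."""
    error = lam - sum(V[s] for s in stimuli)
    for s in stimuli:
        V[s] += alpha * error  # each present CS updates by the same error

V = {"bell": 0.0, "light": 0.0}

# Phase 1: bell alone is paired with food (lam = 1).
for _ in range(30):
    rw_trial(V, ["bell"], lam=1.0)

# Phase 2: bell + light compound, food still follows.
for _ in range(30):
    rw_trial(V, ["bell", "light"], lam=1.0)

# The bell already predicts the US, so almost no error remains
# to drive learning about the light: blocking.
print(round(V["bell"], 2), round(V["light"], 2))  # → 1.0 0.0
```

Deleting the phase-1 loop (so both CSs start at zero) removes the blocking: the two stimuli then share the associative strength between them.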
  • 145. RESCORLA-WAGNER MODEL (1972): LOSS OF ASSOCIATIVE VALUE WITH TWO CSs ¡ The associative value of a CS (e.g., bell) in predicting a US (e.g., food) is lost: ¡ After both CSs (e.g., bell, light) have been associated with the US (e.g., food) in separate trials ¡ When it is then presented together with the other CS (e.g., light) on a conditioning trial
  • 146. RESCORLA-WAGNER MODEL (1972): LOSS OF ASSOCIATIVE VALUE WITH TWO CSs ¡ Example: ¡ One learning trial: light → food (light leads to salivation) ¡ One learning trial: bell → food (bell leads to salivation) ¡ Final trial: light + bell → food (light + bell leads to less salivation than expected) ¡ Why? ¡ Because light + bell lead to an over-expectation of double the food ¡ The subject has to decrease its expectation → light and bell lose associative value
  • 147. RESCORLA-WAGNER MODEL (1972): LEARNING IS ABOUT SURPRISE ¡ How much you learn depends on the effectiveness of the US: how “surprising” and “unpredictable” the US is ¡ A surprising US generates a strong conditioned response (CR) ¡ The value of the CS (e.g., bell, light) is stronger / more salient when it predicts the onset of the US (e.g., food)
  • 148. RESCORLA-WAGNER MODEL (1972) ¡ The delta rule ¡ At the start of learning there are big errors in predicting the “uncertain” US (i.e., you don’t know when the US is going to happen) ¡ As learning goes on, “error correction” reduces these errors and the US becomes more predictable
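In symbols, the delta rule is ΔV = αβ(λ − ΣV): the change in associative strength on each trial is proportional to the prediction error. A minimal acquisition sketch (the parameter values are illustrative assumptions, not prescribed by the model):

```python
# Delta rule: dV = alpha * beta * (lam - V_total)
#   alpha: salience of the CS
#   beta:  learning rate determined by the US
#   lam:   asymptote of conditioning the US supports (0 if the US is omitted)
alpha, beta, lam = 0.3, 1.0, 1.0

V = 0.0
curve = []
for trial in range(10):
    V += alpha * beta * (lam - V)  # error shrinks as V approaches lam
    curve.append(round(V, 3))

# Large early errors give fast learning; the curve is negatively
# accelerated and approaches the asymptote lam.
print(curve)
```

Running it shows the classic negatively accelerated learning curve: 0.3, 0.51, 0.657, … climbing toward 1.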
  • 149. ATTENTIONAL MODELS ¡ American psychologists focused on changes in the impact of the US, whilst British psychologists looked at how the CS commands attention. ¡ Assumption that increased attention facilitates learning about a stimulus, and that procedures that disrupt attention to a CS disrupt learning. ¡ The outcome of a given trial (e.g., reward/loss) alters the degree of attention commanded by the CS on future trials ¡ More attention → More learning ¡ If the US is surprising, it boosts attention to the CS (rather than the salience of the US) on subsequent trials
  • 150. TEMPORAL CODING HYPOTHESIS/MODELS ¡ Time is a critical factor in classical conditioning ¡ Neither the RW model nor the attentional models explain the effects of time in conditioning, although both acknowledge that timing is critical ¡ Organisms learn not only that the US will occur, but when ¡ “Temporal coding” – participants learn when the US occurs in relation to a CS and use this information in blocking, second-order conditioning, and other training ¡ Two temporal factors are of interest
  • 151. TEMPORAL CODING HYPOTHESIS / MODEL ¡ Conditioned responding depends on: ¡ 1. Duration of the CS-US interval (or interstimulus interval) ¡ How long one must wait for the US to occur after the CS ¡ Longer duration between Pavlov’s bell and the meat → less conditioned responding ¡ 2. Inter-trial interval ¡ How much time passes between one CS-US trial and the next ¡ Longer duration → more conditioned responding ¡ “A CS is associable with a US … only to the extent that it reduces the expected time to the next US”
  • 152. RECAP ¡ Classical conditioning as a basic approach ¡ Fear conditioning ¡ Eyeblink conditioning ¡ Sign tracking / goal tracking ¡ Higher order conditioning ¡ Excitatory/Inhibitory Conditioning ¡ What makes CC effective? ¡ Novelty, belongingness, salience, intensity ¡ What if it doesn’t work? Aka the Blocking Effect (links to higher order conditioning) ¡ How do the CS and US become associated (and do they?) ¡ RW model ¡ Attentional Models ¡ Temporal Coding
  • 153. WEEK 4 AND WEEK 5 ¡ Instrumental Conditioning (IC) → Week 4 ¡ Reinforcement schedules and Choice → Week 5
  • 154. BEFORE I LEAVE YOU (FOR NOW) ¡ Remember to read and review: ¡ Assigned Readings ¡ Lab report guide ¡ Required readings for the lab report (we will begin to discuss these in week four tutorials)
  • 155. PSYC214 – LEARNING AND BEHAVIOUR WEEK FOUR LECTURE DOMJAN CHAPTER 5
  • 156. § Instrumental (Operant) Conditioning § Early investigations § Modern approaches § Procedures § Fundamental elements TODAY’S LECTURE
  • 158. ¡ Classical conditioning reflects how organisms adjust to events in their environment that they do not directly control. ¡ Now we will look at learning situations in which the stimuli an organism encounters are a result or consequence of its behaviour. ¡ This is referred to as goal-directed or instrumental because responding is necessary to produce a desired environmental outcome. ¡ Here behaviour is instrumental in producing a significant stimulus or outcome.
  • 159. § Reflexes, habituation, sensitisation, Classical Conditioning: § Behaviours are elicited in response to environmental stimuli that the organism does not directly control. § Organisms are not required to respond in a particular way (or behave) to obtain an unconditioned or conditioned stimulus. § Instrumental/operant conditioning: § The stimuli an organism encounters are a result or consequence of its behaviour. That is, the organism has control of an outcome. § Behaviours that occur because they were previously effective in producing an outcome are instrumental. § We change our behaviour to maximise outcomes. § Focuses on the effect of behaviour on the environment § Behaviours are goal-directed to produce an environmental outcome § Instrumental behaviour (the environment contains the opportunity for reward; behaviour occurs because it was effective in producing [favourable] consequences) § Response-reinforcer LEARNING
  • 160. INSTRUMENTAL/OPERANT CONDITIONING VS. PAVLOVIAN/CLASSICAL • Instrumental Conditioning: • Environmental event depends on behaviour • Pigeon peck → food/water • Voluntary behaviour (though some involuntary behaviour) • Classical Conditioning: • Environmental event depends on another stimulus, not on behaviour • Bell → food • Involuntary behaviour (reflex)
  • 162. INSTRUMENTAL (OPERANT) CONDITIONING: THORNDIKE ¡ E. L. Thorndike (1898) ¡ Systematic study of “animal intelligence” ¡ Puzzle box problem with cats ¡ Measure of performance: how quickly the cat exited the box in successive trials ¡ Interpreted results as reflecting the learning of a new S-R association.
  • 163. THORNDIKE’S FINDINGS ¡ Findings: ¡ First trial, cat’s behaviour displayed high variability ¡ Took a long time to solve ¡ Every successive trial, the cat took less and less time to exit
  • 164. § Law of effect – If a response (R) in the presence of a stimulus (S) is followed by a satisfying event, the association between stimulus S and response R is strengthened § If the response is followed by an annoying event, the S–R association is weakened § The association between the response and the stimuli present at the time of the response is learned § Thorndike’s law of effect (1911): § “Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond.” THORNDIKE’S LAW OF EFFECT (1911)
  • 165. § What does this all mean? § If a behaviour is followed by a “positive” consequence, the chances of it happening again increase § If a behaviour is followed by a “negative” consequence, the chances of it happening again decrease § BEHAVIOUR IS A FUNCTION OF ITS CONSEQUENCE … so, we need to do research to prove that … § Notice that the consequence is not one of the elements in the association. § The level of satisfaction strengthens the response, whereas the level of annoyance weakens it. § Can this then explain compulsive habits that are difficult to break? § Once learned, habitual responses occur because they are triggered by an antecedent stimulus and not because they result in a desired consequence (Everitt & Robbins, 2005). THORNDIKE’S LAW OF EFFECT (1911)
  • 167. § Discrete Trial Procedures § Free-Operant Procedures MODERN APPROACHES TO INSTRUMENTAL CONDITIONING: THORNDIKE → SKINNER
  • 168. § Discrete trial procedures: § Each trial begins with the organism being placed in an apparatus (S) and ends after the instrumental response (R) has occurred. § Examples include mazes. § Target behaviour (and its consequence) demonstrated, then trial ended (animal removed). § Measure of performance (DV): § Time to perform target behaviour (i.e., running speed and latency) § Number of errors made before target behaviour § Skinner further developed the systematic study of behaviour via free-operant procedures MODERN APPROACHES TO INSTRUMENTAL CONDITIONING: THORNDIKE → SKINNER
  • 169. § Free-Operant Procedures: § Invented by B. F. Skinner (1938) § "Allow an animal to repeat the instrumental response without constraint over and over again without being taken out of the apparatus until the end of the experimental session” (p. 126). § Suggested ‘ongoing behaviour was continuous’. MODERN APPROACHES TO INSTRUMENTAL CONDITIONING: THORNDIKE → SKINNER
  • 170. OPERANT CONDITIONING: B. F. SKINNER ¡ Skinner: From single-trial to operant procedures ¡ Systematic study of the establishment and maintenance of behaviour controlled by its consequences… ¡ Organism can repeat the instrumental response numerous times. ¡ ‘Operant response’ is defined in terms of the effect that the behaviour has on the environment. ¡ ‘Behaviour is not defined by a particular muscle movement, but in terms of how the behaviour operates on the environment’. ¡ Analysed how behaviour changes based on its consequences. ¡ Developed principles of reinforcement.
  • 171. § In operant conditioning, responses are said to be emitted (rather than elicited) § Because they are voluntary § Response → consequence § Behaviour is affected by its consequences (it is said to be caused by them) § You tell jokes because people laugh § A kid cries because he gets a lolly OPERANT CONDITIONING
  • 172. Two Different Types of Behaviour § Respondent behaviours: those that occur automatically and reflexively (e.g., pulling hand from hot stove) § Operant behaviours: those that occur under conscious control (spontaneously or purposely). It is the consequences of these actions that affect whether or not they occur again OPERANT CONDITIONING: SKINNER
  • 173. § In short § It is adaptive to learn associations between voluntary behaviours and the punishing or rewarding outcomes they reliably predict § Behaviour is shaped by the learner’s history of experiencing reinforcement (behaviour likely to be repeated) and punishment (less likely to be repeated) § Skinner: “Life is guided by consequences” § Note that there is no need for awareness of the relationship between behaviour and consequence § Almost every behaviour is learned (notable biological and cognitive exceptions in CC – e.g., taste aversion, preparedness and phobias) OPERANT CONDITIONING: REINFORCEMENT AND PUNISHMENT
  • 174. § Shaping: § Reduce complex behaviours into a sequence of simpler behaviours § Reinforce successive approximations to the final behaviour § E.g., training a rat to press a lever § Reinforce (with food) when rat is: § Rearing (activity/exploration, anywhere in cage) § Rearing at lever § Touching lever § Pawing/pressing lever 1. Begin by reinforcing a high-frequency component of the desired response (e.g., rearing) 2. Then drop this reinforcement – behaviour becomes more variable again 3. Await a response that is still closer to the desired response – then reintroduce the reinforcer 4. Keep cycling through as closer and closer approximations to the desired behaviour are achieved § Enables the moulding of a response that is not normally part of the animal’s repertoire OPERANT CONDITIONING: SHAPING
  • 175. § Shaping: § Three vital components § Clearly define the final response § Clearly assess the starting level of performance § Divide the progression from starting point to final target behaviour into appropriate training steps/successive approximations § Free-operant procedures = ideal § Chaining: § A sequence of behaviours is linked to form a complex behaviour § This in turn becomes a unit (behavioural unit) § E.g., teaching a young person to practise good hygiene § Washing hands § Learning to create a lather (soap + water) § Rinse and dry § Washing hands then prompts other hygienic behaviours OPERANT CONDITIONING: SHAPING AND CHAINING
  • 176. § Consequences to a behaviour can increase or decrease its likelihood § If it increases: reinforcement § If it decreases: punishment § Consequences may consist of presenting or removing a stimulus § If presenting: positive § If removing: negative REINFORCEMENT VERSUS PUNISHMENT
  • 178. REINFORCEMENT VS. PUNISHMENT § What happens to the behaviour in the next few trials? More frequent (reinforcement) or less frequent (punishment) § Are you presenting (positive) or removing (negative) a stimulus? § Presenting + more frequent: Positive Reinforcement § Presenting + less frequent: Positive Punishment § Removing + more frequent: Negative Reinforcement § Removing + less frequent: Negative Punishment
  • 179. REINFORCEMENT VS. PUNISHMENT: POSITIVE REINFORCEMENT (REWARD LEARNING) § A stimulus is presented and the behaviour becomes more frequent • Pigeon pecks key, receives food • Girl does homework and Dad praises her • Boy throws tantrum and Mum gives him the lollies he wants
  • 180. REINFORCEMENT VS. PUNISHMENT: POSITIVE PUNISHMENT § A stimulus is presented and the behaviour becomes less frequent • Pigeon pecks on key and receives a shock • Dog barks and collar releases aversive smell • Nail biting and bad-tasting nail polish
  • 181. REINFORCEMENT VS. PUNISHMENT: NEGATIVE REINFORCEMENT (ESCAPE OR AVOIDANCE LEARNING) § A stimulus is removed and the behaviour becomes more frequent • Pigeon pecks key and shock is delayed/removed • Open umbrella when it starts raining – don’t get wet • Students worked hard during the week so teacher removes weekend homework
  • 182. REINFORCEMENT VS. PUNISHMENT: NEGATIVE PUNISHMENT (OMISSION TRAINING) § A stimulus is removed and the behaviour becomes less frequent § Pigeon pecks and available food is removed § Child misbehaves and TV watching is forbidden § ‘Time-out’ for naughty behaviour
  • 183. REMEMBER § Presenting a stimulus + behaviour more frequent: Positive Reinforcement § Presenting a stimulus + behaviour less frequent: Positive Punishment § Removing a stimulus + behaviour more frequent: Negative Reinforcement § Removing a stimulus + behaviour less frequent: Negative Punishment
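As a quick self-check, the two-question logic of the reinforcement/punishment table can be written in a few lines of Python (the function name and string labels are just illustrative):

```python
def classify(consequence_effect, stimulus_change):
    """Name the quadrant of the reinforcement/punishment table.
    consequence_effect: 'more' (behaviour becomes more frequent) or 'less'
    stimulus_change: 'presented' or 'removed'"""
    kind = "reinforcement" if consequence_effect == "more" else "punishment"
    sign = "positive" if stimulus_change == "presented" else "negative"
    return f"{sign} {kind}"

classify("more", "presented")  # pigeon pecks key, receives food
classify("more", "removed")    # open umbrella, avoid getting wet
classify("less", "presented")  # nail biting, bad-tasting polish
classify("less", "removed")    # time-out: activity removed
```

The point of the sketch is that "positive/negative" names the stimulus operation, not whether the outcome is pleasant.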
  • 184. § Does not ‘erase’ an undesirable habit § What could we do then to build in more ‘desirable’ behaviour? § May not teach a more desirable behaviour (if the only focus) § Often ineffective unless: § Given immediately after the undesirable behaviour § Given each time the behaviour occurs (continuous schedule) § Think about managing challenging behaviours… DRAWBACKS OF PUNISHMENT
  • 185. § The three C’s § Contingency § Clear relationship between A and B § Contiguity § A → B, where contiguity (→) is the time/proximity between A and B § Consistency § Every time the behaviour occurs (continuous schedule – more on that next week!) WHEN IS PUNISHMENT EFFECTIVE?
  • 187. ELEMENTS OF INSTRUMENTAL CONDITIONING ¡ The instrumental response ¡ The outcome of the response ¡ The relation (or contingency) between the response and outcome
  • 188. THE INSTRUMENTAL RESPONSE ¡ The outcome of instrumental conditioning depends in part on the nature of the response being conditioned. ¡ Behavioural variability vs. stereotypy ¡ Variability: performing the response differently from trial to trial. ¡ Stereotypy: responding the same way each time; develops if allowed/required. ¡ Thorndike and Skinner: Operant responses become more stereotyped with continued conditioning ¡ Variable or novel responses can be produced if response variation is needed for reinforcement ¡ See Ross and Neuringer (2002)
  • 189. THE INSTRUMENTAL RESPONSE ¡ Relevance or belongingness in Instrumental (Operant) Conditioning ¡ Responses that naturally belong with a reinforcer ¡ Limitation: “A behaviour cannot be reinforced by a reinforcer if it is not naturally linked to that reinforcer in the repertoire of the animal” ¡ In CC: Rat learns taste + sickness faster than taste + shock ¡ In IC: Thorndike’s cats learnt operating latch/pulling string + escape faster than yawning/scratching + escape ¡ The latter responses were only mimicked once learned, whereas the former remained constant (natural, evolutionary responses)
  • 190. THE INSTRUMENTAL RESPONSE ¡ Behaviour systems and constraints on Instrumental (Operant) Conditioning ¡ Is the response part of the organism’s behavioural system? ¡ What is the context of the S and R? ¡ If not, learning is difficult (e.g., conditioning raccoons with food reinforcement to drop a coin into a slot, which is incompatible with the pre-existing organisation of their feeding system) ¡ Here we acknowledge the constraints, dependent on the context and the behaviour system of the organism.
  • 191. THE INSTRUMENTAL REINFORCER ¡ Quantity and quality of the reinforcer ¡ Larger and better reinforcers are more effective ¡ Study in individuals with substance addiction • If participants remained drug free, they received $10 at the end of the day • Larger payments were significantly better at encouraging abstinence than smaller payments • If participants received the money immediately after passing a drug test, they were more likely to abstain in future than those who received it a few days later ¡ Shifts in reinforcer quality and quantity ¡ The effectiveness of a reinforcer is reduced if its strength is reduced compared to previous trials • Rats whose sucrose concentration was halved responded less than rats that had never experienced the higher concentration • Behavioural Contrast Effects • A large reward is particularly effective after a small reward, while a small reward is treated especially poorly after reinforcement with a larger reward • This type of “anticipatory negative contrast may explain why individuals addicted to cocaine derive little satisfaction from conventional reinforcers (a tasty meal) that others enjoy on a daily basis”.
  • 192. THE RESPONSE-REINFORCER RELATION ¡ Two types of relationships between a response and a reinforcer: 1. Temporal relation: the time between the response and reinforcer ¡ Temporal contiguity – the delivery of the reinforcer immediately after the response 2. Response-reinforcer contingency: the extent to which the instrumental response is necessary and sufficient to produce the reinforcer ¡ Temporal and causal factors are, however, independent of one another. A strong temporal relation does not require a strong causal relation, and vice versa. ¡ E.g., there is a “strong causal relationship between taking your clothes to be dry cleaned and getting clean clothes back. However, the temporal delay may be a day or two”.
  • 193. § Contiguity § Temporal proximity between behaviour and consequence § Learning is faster the closer the reinforcement is to the behaviour (in time) § Dickinson et al. (1992): shaping of lever pressing for food in rats with shorter (2-4s) or longer (up to 64s) delays § Important because: § Delay allows for intervening behaviour to occur – not clear what the desired response was § However, signalling the delay (i.e., marking) reduces the impact of the delay on learning (i.e., conditioned reinforcement) TEMPORAL CONTIGUITY
  • 194. • Marking group: A light was presented for 5 seconds at the beginning of the delay interval (immediately after the instrumental response) • No-signal group: 30-second delay only • Blocking group: The light was introduced at the end of the delay interval, just before the delivery of food TEMPORAL CONTIGUITY
  • 195. IS CONTIGUITY ENOUGH? WHAT ABOUT CONTINGENCY… ¡ Contingency ¡ Correlation between behaviour and consequence ¡ Learning the degree to which our behaviour has an effect on the environment
  • 196. CONTINGENCY ¡ Learning the degree to which our behaviour has an effect on the environment ¡ Learned Helplessness: no contingency, lack of control ¡ An animal first subjected to learned-helplessness training (uncontrollable shock) cannot later learn to escape/avoid shock ¡ I.e., it behaves as if it has no control over the outcome
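Contingency is often quantified as ΔP: the probability of the outcome given a response minus its probability given no response. A ΔP of zero means responding has no effect on the outcome — the learned-helplessness situation. A minimal sketch (the counts below are made up for illustration):

```python
def contingency(n_r_o, n_r_noo, n_nor_o, n_nor_noo):
    """Delta-P contingency: P(outcome | response) - P(outcome | no response).
    Arguments are counts of trials: response+outcome, response+no outcome,
    no response+outcome, no response+no outcome.
    +1 = perfect positive contingency; 0 = no contingency (no control)."""
    p_o_given_r = n_r_o / (n_r_o + n_r_noo)
    p_o_given_nor = n_nor_o / (n_nor_o + n_nor_noo)
    return p_o_given_r - p_o_given_nor

contingency(18, 2, 1, 19)    # responding reliably produces the outcome
contingency(10, 10, 10, 10)  # outcome unrelated to responding: Delta-P = 0
```

The second call models the helplessness condition: the outcome occurs equally often whether or not the animal responds.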
  • 197. OTHER VARIABLES AFFECTING LEARNING ¡ Amount of reinforcement ¡ More is better (but not a linear relationship) ¡ Quality/type of reinforcement ¡ Not all foods are created equal ¡ Task features: ¡ Difficulty ¡ Biological tendencies: pigeons autoshape; some things cannot be taught (more on this next week)
  • 198. INSTRUMENTAL/OPERANT CONDITIONING VS. PAVLOVIAN/CLASSICAL • Instrumental Conditioning: environmental event depends on behaviour • Pigeon peck → food/water • Voluntary behaviour (though some involuntary behaviour) • Classical Conditioning: environmental event depends on another stimulus, not on behaviour • Bell → food • Involuntary behaviour (reflex)
  • 199. § Not always a clear-cut distinction; some learning is not easy to separate § i.e., some behaviours are elicited in response to stimuli (CC), others are goal-directed to produce an environmental outcome (IC) § Training a dog to sit/stay/jump with food § Learned helplessness: pairing of two stimuli but also punishment of all behaviours… § Child being given a ‘treat’ at the supermarket § Dwight in The Office PAVLOVIAN/CLASSICAL VS. INSTRUMENTAL/OPERANT CONDITIONING
  • 200. SUMMARY ¡ Operant Conditioning ¡ Reinforcement → increases behaviour ¡ Punishment → decreases behaviour ¡ Positive: consequence presented ¡ Negative: consequence removed
  • 201. NEXT WEEK ¡ Reinforcement Schedules and Choice (Chapter 6)
  • 202. PSYC214 – LEARNING AND BEHAVIOUR WEEK FIVE LECTURE DOMJAN CHAPTER 6
  • 203. UPDATES ON CANVAS ¡ Mid-semester Exam: ¡ Week 7 (Friday) ¡ 8am to 12 midday ¡ 75 minutes to complete ¡ 40 MC and 1 SR ¡ Tutorials are on that day ¡ Practice Q’s available on CANVAS ¡ Lab Report ¡ Results released this week ¡ You are now able to complete the results section ¡ This week we will cover the remainder of the introduction and methods (results next week)
  • 204. REVIEW OF WEEK 4 ¡ Instrumental (Operant) conditioning ¡ Reinforcement/punishment ¡ Positive/negative ¡ Shaping Behaviour
  • 206. LEARNING AND REINFORCEMENT ¡ Language is important à positive v. negative in the context of learning. ¡ Desirable v. undesirable behaviour // helpful v. unhelpful behaviours. ¡ Who determines it? ¡ Is learning always a “helpful” thing? ¡ “Undesirable” behaviours are also learned ¡ They must lead to some high value consequence ¡ Aggression, tantrums, etc. ¡ Are “consequences” or “reinforcers” always “good”? ¡ Cigarettes, drugs, etc.
  • 207. TODAY’S LECTURE ¡ Extension from last week’s lecture ¡ Simple schedules – ratio and interval ¡ Concurrent schedules – studying choice ¡ Complex choice and self-control ¡ Delay Discounting
  • 209. SCHEDULES OF REINFORCEMENT ¡ So far, we have talked about behaviour acquisition and the role of reinforcement in the process ¡ Typically looked at cases where every response leads to reinforcement (continuous reinforcement) ¡ In real life, hardly ever does every single response lead to its corresponding reinforcement ¡ Reinforcement, in the world, is intermittent → not every response is reinforced ¡ Next, we will look at the effects of intermittent reinforcement on behaviour
  • 210. EXAMPLE What is the… ¡ Behaviour? ¡ Reinforcer? ¡ Is every response (i.e., instance of the behaviour) reinforced?
  • 212. SCHEDULES OF REINFORCEMENT ¡ Skinner developed free operant procedures ¡ Allowed to record multiple responses in a single session
  • 213. SCHEDULES OF REINFORCEMENT ¡ Skinner developed free operant procedures ¡ Allowed to record multiple responses in a single session ¡ Developed the “cumulative recorder” ¡ Pen moves up on paper for every response
  • 214. SCHEDULES OF REINFORCEMENT ¡ Rules/system governing the delivery of reinforcement ¡ Behaviour-dependent reinforcement – contingent ¡ Ratio schedules ¡ Fixed (FR) & Variable (VR) ¡ Interval schedules ¡ Fixed (FI) & Variable (VI) ¡ Behaviour-independent, non-contingent ¡ “Time” schedules ¡ Fixed & variable
  • 215. BEHAVIOUR DEPENDENT SCHEDULES How often is the behaviour reinforced?
  • 216. RATIO SCHEDULES ¡ Ratio schedules: establish a ratio of responses to reinforcers ¡ FIXED: a fixed number of responses are required for delivery of each reinforcer ¡ Denoted by FR# (# is the number of responses required) ¡ Continuous reinforcement (CRF): FR1 ¡ Establishment of behaviour ¡ Not very representative of the world ¡ Anything above CRF/FR1 is considered “intermittent” reinforcement
  • 217. FIXED RATIO (FR) ¡ FR schedules → high rates of responding, followed by pauses: ¡ The high and steady rate of responding that completes each ratio requirement is called the run rate ¡ The zero rate of responding that usually occurs just after reinforcement is called the post-reinforcement pause (PRP) ¡ Run rate is independent of FR size, but PRPs are longer with larger FR size ¡ Therefore, PRPs affect the overall rate of responding ¡ Larger schedules → lower overall rates ¡ E.g., bonus for selling 5 houses
  • 218. STRETCHING THE RATIO ¡ Subjects can work on really strenuous schedules (say, 100 responses for 2 pellets of food) ¡ They reach that level via shaping. ¡ Start with a continuous reinforcement schedule and gradually stretch the schedule ¡ Any resource that depletes represents a stretching schedule ¡ Can be useful in therapeutic settings ¡ E.g., in exposure therapy for someone with agoraphobia – order favourite takeaway after a trip to the supermarket, then after a trip to X, then Y. ¡ Stretching the ratio must be gradual or else it can lead to ratio strain, where responding breaks down
  • 219. RATIO SCHEDULES ¡ Ratio schedules: establish a ratio of responses to reinforcers ¡ VARIABLE: a certain number of responses on average are required for delivery of the reinforcer ¡ Denoted by VR# ¡ Example: VR6 can be: ¡ Half reinforced after 2 responses, half after 10 responses ¡ Average of 2 & 10 is 6 ¡ 1, 3, 6, 8, 12 → average is 6
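One simple way to realise the "average of 6" idea is to sample each trial's response requirement uniformly from 1 to 11, whose mean is 6. This is a sketch only — real VR schedules in the lab often use other distributions of ratios, and the sampling scheme here is an assumption:

```python
import random

def variable_ratio_requirements(mean_ratio, n, seed=1):
    """Sketch of a VR schedule: each reinforcer requires a number of
    responses drawn uniformly from 1 .. (2*mean - 1), so the long-run
    average requirement equals mean_ratio."""
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(n)]

reqs = variable_ratio_requirements(6, 10000)  # a VR6 schedule
avg = sum(reqs) / len(reqs)                   # long-run average close to 6
```

Because the animal cannot predict which response will be reinforced, responding stays high and steady — the VR pattern described on the next slide.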
  • 220. VARIABLE RATIO (VR) ¡ VR schedules → steady performance ¡ High rate of responding ¡ Almost never a PRP ¡ If any PRPs occur, they are shorter than for FR ¡ Affected by the size of the VR and the size of the lowest ratio ¡ So, for VR50 (20, 80) and VR50 (40, 60), the second will lead to longer PRPs ¡ Example: gambling
  • 221. INTERVAL SCHEDULES ¡ Interval schedules: provide reinforcement for the first response after a specified period of time ¡ FIXED: a fixed period of time before a response leads to reinforcement ¡ Denoted by FI (time) ¡ Important to note that a response is required
  • 222. FIXED INTERVAL (FI) ¡ FI schedules → moderate to low rate of responding ¡ Lead to a scalloped pattern ¡ Exams at fixed periods…
  • 223. INTERVAL SCHEDULES ¡ Interval schedules: provide reinforcement for the first response after a specified period of time ¡ VARIABLE: the period before a response leads to reinforcement varies around an average value ¡ Denoted by VI(time) ¡ Example: VI 6s can be: ¡ Half reinforcement for the first response after 2s, half for the first response after 10s, or ¡ 1s, 3s, 6s, 8s, 12s → average is 6s
  • 224. VARIABLE INTERVAL (VI) ¡ VI schedules → lower rate of responding, but steady (still moderate) ¡ Higher rates than an FI but not as high as FR or VR ¡ Closest representation of the high variability of real-world reinforcement ¡ Bonus for worker when supervisor shows up ¡ Elevators ¡ Random pop-quiz
  • 226. DO RATIO AND INTERVAL SCHEDULES MOTIVATE BEHAVIOUR SIMILARLY? ¡ Not really – different mechanisms ¡ Pigeons trained on VR or VI (intervals based on the VR pigeon) ¡ VR pigeon shows more vigorous responding ¡ Why? Short Inter-Response Times and Feedback
  • 227. BEHAVIOUR INDEPENDENT SCHEDULES ¡ Also called non-contingent ¡ Deliver reinforcement independent of responses ¡ Fixed time (FT) ¡ Birthdays, anniversaries ¡ Variable time (VT) ¡ Random calls, messages from friends
  • 228. SO WHY STUDY REINFORCEMENT SCHEDULES? ¡ Used in choice research ¡ Organisms show preferences for some schedules over others ¡ FR5 preferred over FR15 (no surprise) ¡ Variable interval preferred to equivalent fixed interval ¡ VI15 (2, 28) preferred over an FI15
  • 229. CONCURRENT SCHEDULES AND CHOICE ¡ Development of these complex schedules allows us to study more complex (i.e., ‘real life’) behaviour ¡ We hardly ever have a single response alternative available ¡ In fact, all behaviour is the product of choice… ¡ Having concurrent schedules (providing two options) allows us to study choice…
  • 230. CONCURRENT SCHEDULES AND CHOICE ¡ Concurrent schedules have been extensively used in the study of choice ¡ Assumption: these schedules may provide (simplified) model of world ¡ Basic premises for studying choice: ¡ Organisms face choices ¡ Choices are characterised by consequences ¡ Consequences can be defined in terms of properties ¡ Rate of occurrence (probability) ¡ Magnitude ¡ Delay ¡ Organisms should maximise some function of consequences
  • 231. CONCURRENT SCHEDULES AND CHOICE ¡ How to study? ¡ Present organisms with choices between alternatives that vary systematically ¡ Observe preferences (i.e., response behaviour)
  • 232. CONCURRENT SCHEDULES AND CHOICE ¡ Rate of reinforcement ¡ If a pigeon has the option to peck two different keys associated with different schedules of reinforcement, which key is it likely to peck most frequently? ¡ 1. Red key = FR5 vs Blue key = FR10? ¡ 2. Red key = FI5 vs Blue key = VI5? ¡ 3. Red key = VI5 vs Blue key = FR1?
  • 233. CONCURRENT SCHEDULES AND CHOICE ¡ Herrnstein (1961) and Herrnstein and Mazur (1987) showed behaviour distribution can be modelled by an equation ¡ Relative frequency of behaviour (B) equals relative frequency of reinforcement (r) ¡ If one pays twice as much, I will spend twice as long there. ¡ Matching Law
  • 234. MATCHING LAW ¡ The matching law has been extended to other properties of reinforcement: ¡ Magnitude/amount of reinforcement ¡ Relative frequency of responses matches the relative amount of reinforcement ¡ A pigeon will spend twice as long responding to the key that gives twice the amount of food ¡ Delay of reinforcement ¡ Relative frequency of responses matches the relative immediacy of reinforcement ¡ A pigeon will spend twice as long responding to the key that makes it wait half as long
  • 235. CHOICE BEHAVIOUR: RELATIVE RATE OF RESPONDING ¡ Relative rate of responding to option 1 (B1) is calculated by dividing the rate of responding to option 1 (B1) by the total rate of responding to options 1 and 2 (B1 + B2) ¡ If response rate to B1 and B2 is equal, ratio will be = .5 ¡ If response rate to B1 is less than B2, ratio will be less than .5 ¡ If response rate to B1 is more than B2, ratio will be more than .5
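The relative-rate formula above, and the matching-law prediction from the previous slides, are both one-liners. An illustrative sketch (function names are my own):

```python
def relative_rate(b1, b2):
    """Relative rate of responding to option 1: B1 / (B1 + B2)."""
    return b1 / (b1 + b2)

def matching_prediction(r1, r2):
    """Matching law: relative responding matches relative reinforcement,
    i.e. B1 / (B1 + B2) = r1 / (r1 + r2)."""
    return r1 / (r1 + r2)

relative_rate(50, 50)        # equal responding to both keys -> 0.5
matching_prediction(40, 20)  # key 1 pays twice as often -> 2/3 of responses
```

So a key delivering twice the reinforcement should attract two-thirds of all responding, matching Herrnstein's (1961) result.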
  • 236. CHOICE BEHAVIOUR: SELF-CONTROL ¡ When one looks at amount and delay to reinforcement at the same time… ¡ Suppose a pigeon is given a choice between: ¡ B1: 4s access to food after a 4s delay ¡ B2: 2s access to food right now (0.1s after it reaches the food hopper) ¡ Ratio = .04762 (less than .5) ¡ Pigeon will prefer the immediate small reward (i.e., B2)
  • 237. CHOICE BEHAVIOUR: SELF-CONTROL ¡ When one looks at amount and delay to reinforcement at the same time... ¡ Suppose pigeon is now given a choice between: ¡ B1: 4s access to food in 14s ¡ B2: 2s access to food in 10s ¡ Ratio = .59 (more than .5) ¡ Now pigeon prefers larger, delayed reward
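The two ratios on these slides fall out of matching on amount and delay together, treating each option's value as amount ÷ delay. A sketch reproducing the .04762 and .59 figures (the value rule is the standard matching-style one; the helper name is illustrative):

```python
def choice_ratio(a1, d1, a2, d2):
    """Relative value of option 1 when value = amount / delay
    (matching on magnitude and immediacy at the same time)."""
    v1, v2 = a1 / d1, a2 / d2
    return v1 / (v1 + v2)

# 4s of food after a 4s delay vs 2s of food almost immediately (0.1s)
r_now = choice_ratio(4, 4, 2, 0.1)    # well below .5: choose small immediate

# the same rewards, but both pushed further into the future
r_later = choice_ratio(4, 14, 2, 10)  # above .5: preference switches
```

Adding the same extra delay to both options shrinks the immediacy advantage of the small reward, which is exactly the preference reversal the recap slide describes.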
  • 238. CHOICE BEHAVIOUR: SELF-CONTROL ¡ Let’s recap… ¡ The matching law says: ¡ When faced with a choice between a small immediate and a large delayed reward, the organism should mostly choose the small immediate reward ¡ But when both options are further delayed, the preference will switch
  • 241. MARSHMALLOW TEST ¡ Replication crisis ¡ Partial failure to replicate ¡ Replication showed a dramatic reduction in effect (though still sig.) ¡ Non-sig. when controlling for environmental factors ¡ See Watts et al. (2018). Revisiting the marshmallow test: a conceptual replication investigating links between early delay of gratification and later outcomes
  • 242. DELAY DISCOUNTING ¡ Originates in early studies on concurrent-chain schedules (Rachlin & Green, 1972) ¡ The value of a reinforcer declines as a function of how long you have to wait for it ¡ The value of a reinforcer is directly related to reward magnitude and inversely related to reward delay ¡ So, the longer a reinforcer is delayed, the smaller its value ¡ However, with increasing time (delay) a larger reinforcer will have a higher value than a smaller reinforcer
  • 243. DELAY DISCOUNTING ¡ Often studied in cases with hypothetical $ ¡ $100 today vs. $500 in a week? ¡ $100 today vs. $150 in a week? ¡ $100 today vs. $150 in a year? ¡ Can manipulate both time and value
  • 244. DELAY DISCOUNTING ¡ Subjective value of the smaller rewards decreases more rapidly with longer delay compared to the larger reward FIGURE 6.7 The subjective value of 16 ml and 8 ml of juice as a function of delay in college students. Curves represent best-fitting hyperboloid functions (based on Jimura et al., 2009).
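The hyperboloid curves in Figure 6.7 follow the standard hyperbolic discounting form V = A / (1 + kD). A sketch with an assumed discounting-rate parameter k = 0.1 and illustrative dollar amounts and delays in weeks (none of these numbers come from Jimura et al.), showing the preference reversal from the earlier self-control slides:

```python
def hyperbolic_value(amount, delay, k=0.1):
    """Hyperbolic delay discounting: V = A / (1 + k * D).
    k is an assumed discounting rate; steeper k = less self-control."""
    return amount / (1 + k * delay)

small_now   = hyperbolic_value(100, 0)   # $100 today
large_late  = hyperbolic_value(150, 7)   # $150 in a week: discounted below 100

# push both rewards a year into the future, keeping the one-week gap
small_late  = hyperbolic_value(100, 52)  # $100 in a year
large_later = hyperbolic_value(150, 59)  # $150 a week after that
```

With these assumed numbers the immediate $100 beats the week-delayed $150, yet once both are a year away the larger reward wins — the signature of hyperbolic (as opposed to exponential) discounting.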
  • 246. DELAY DISCOUNTING AND HUMAN BEHAVIOUR ¡ The steeper the delay discounting function the more difficulty that person will have in exhibiting self-control ¡ The larger, more remote reward will seem much less valuable than a smaller, more immediate reward ¡ Poor self-control may be associated with a variety of human problem behaviours ¡ But…Is there evidence to support this idea?
  • 247. DELAY DISCOUNTING AND HUMAN BEHAVIOUR ¡ Drug use/addictive behaviours → steeper discounting functions than controls ¡ Moffitt et al. (2011) ¡ Longitudinal NZ study of ~1000 children (from birth to 32 years old) ¡ Higher levels of self-control in childhood → better health, lower rates of drug use, higher income levels, lower rates of criminal behaviour ¡ Remember: “steep” delay discounting function = less self-control
  • 248. CAN SELF-CONTROL BE LEARNED? ¡ It seems the answer is yes! ¡ Training with delayed reward increased preference for the larger delayed reward ¡ Shaping: initially no delay, then increasing the delay between small and large reward ¡ HOWEVER, more research is needed
  • 249. LECTURE SUMMARY ¡ Instrumental Conditioning: reinforcement aims to increase behaviour – punishment aims to decrease behaviour ¡ We can decrease one behaviour by increasing another in its place. ¡ “It’s not what we stop, it’s what we start” ¡ Extinction – new learning. ¡ Reinforcement schedules can vary in multiple ways ¡ Ratio: fixed vs variable ¡ Interval: fixed vs variable ¡ Different reinforcement schedules lead to different rates of responding ¡ Choice scenarios involve multiple response alternatives ¡ Delay discounting – an indicator of self-control
  • 250. NEXT WEEK ¡ Models of Instrumental Conditioning (Chapter 7) ¡ Associative Structure ¡ Response Allocation and Behavioural Economics
  • 251. PSYC214 – LEARNING AND BEHAVIOUR WEEK THREE LECTURE DOMJAN CHAPTER 7
  • 252. HOUSEKEEPING – MID-SEMESTER EXAM Mid-Semester Exam in Week 7: The online mid-semester exam assesses content from lectures, tutorials and readings from Weeks 1 to 6 inclusive. MELBOURNE: The exam will open at 12pm and close at 4pm on Wednesday 11th of September. The exam consists of 40 multiple-choice questions worth one (1) mark each and 1 short-answer question worth ten (10) marks. The short-answer response should be approximately 250 words. You have 75 minutes to complete all of the questions. If you experience ongoing technological issues that prevent you from completing the exam in the allocated timeframe, you MUST supply timestamped screenshots demonstrating that the error persisted for the duration of the four-hour window and flag any difficulties with me ASAP before any request to sit the exam at another time will be considered.
  • 253. HOUSEKEEPING Mid-Semester Exam in Week 7 (continued): You may make only one attempt at the exam. The exam is worth 30% of your unit assessment. You must attempt the exam in order to be eligible for a passing grade in PSYC214. The exam is open book – you may refer to your notes and textbooks when attempting this exam. You must, however, complete the exam independently – forms of academic misconduct, such as collusion, are not acceptable – please refer to the Academic Integrity and Misconduct Policy for further information. In submitting your exam, you acknowledge that you have read and understood ACU’s Academic Integrity and Misconduct Policy and have not engaged in any behaviour that would constitute academic misconduct. Remember to check Canvas announcements for any further announcements on assessments and the unit
  • 254. TODAY’S LECTURE / OBJECTIVES Models of Instrumental Conditioning: Explain/apply the key mechanisms that motivate and direct instrumental responses Explain / Apply reward expectancy and S-O associations, Two-Process Theory Exemplify: The S–R Association and the Law of Effect The Expectancy of Reward and the S–O Association The R–O and S(R–O) Relations in Instrumental Conditioning Describe and apply Antecedents of the Response Allocation Approach Consummatory response Premack/Differential Probability Principle Response Deprivation Hypothesis Response Allocation and Behavioural Economics Antecedents of the Response-Allocation Approach The Response Allocation Approach
  • 255. WHAT MOTIVATES INSTRUMENTAL CONDITIONING (IC)? Two competing theories: 1. Associative structure of instrumental conditioning Thorndike and Pavlov 2. Response allocation and behavioural economics Skinnerian tradition (context matters) Neither approach can stand alone. These are competing theories that have proceeded independently
  • 256. INSTRUMENTAL BEHAVIOUR FROM TWO RADICALLY DIFFERENT PERSPECTIVES Associative Structure Thorndike/Pavlovian conditioning Relies heavily on associations; compatible with Pavlovian conditioning Relevant research stimulated by efforts to identify role of Pavlovian mechanisms in instrumental learning Molecular perspective: focuses on individual responses and specific stimulus antecedents/ response outcomes Response Allocation Skinnerian tradition & context Relies on broader context of numerous activities organisms are constantly doing Concerned with how instrumental conditioning procedure limits free flow of activities/ consequences, & behaviour follows as a consequence of this limitation Molar perspective: considers long-term goals and how to achieve goals in context of behavioural options
  • 257. THE ASSOCIATIVE STRUCTURE OF INSTRUMENTAL CONDITIONING Thorndike recognised that instrumental conditioning involves much more than just a response and a reinforcer Context matters The instrumental response occurs in specific contexts Thorndike's Law of Effect (revisited) Cats learned to escape the puzzle box to obtain a food reward Thorndike assumed this was caused by the development of an S-R association Reinforcement “stamped in” this association, without itself being learned about
  • 258. EXAMPLES OF INSTRUMENTAL BEHAVIOURS, WHICH OCCUR IN THE CONTEXT OF SPECIFIC ENVIRONMENTAL STIMULI Sending a text message: context of tactile stimulus (holding phone) + visual cue (looking at screen) Starting your car: context of sitting in the driver’s seat + holding the car keys Can you think of some other examples?
  • 259. SKINNER: INSTRUMENTAL CONDITIONING HAS 3 KEY ELEMENTS 1. Stimulus Context (S) Sight, smell, or thought of pizza 2. Instrumental Response (R) Going to a place to buy a pizza Order delivery Or you could make one! 3. Reinforcer or response outcome (O) Pleasant taste, rewarding experience
  • 260. THORNDIKE’S LAW OF EFFECT The Stimulus-Response association is solely responsible for the occurrence of instrumental conditioning Stimulus (e.g. alcoholic drink or pizza) = contextual stimuli that are present when a response is reinforced Response (e.g. drinking or eating) = instrumental response Reinforcer (e.g. positive feeling from alcohol use or pizza eating) = only “stamps in” or strengthens the association between a stimulus and a response
  • 261. THORNDIKE’S LAW OF EFFECT The motivation for an instrumental behaviour is the activation of the S-R association due to exposure to contextual stimuli/triggers that were present when the S-R association formed. Applies to habit forming in the process of drug addiction At the start the reward is the pleasant feeling from drug use With the development of addiction, exposure to contextual stimuli/triggers (sight of alcohol), enough to trigger a response (drinking)
  • 262. HULL (1930, 1931) AND SPENCE (1956) Over the course of instrumental conditioning, the instrumental response increases because of: 1. The Stimulus-Response/Thorndike association: contextual stimuli directly trigger the response (e.g., a gambling response to visual cues) AND 2. The Stimulus-Outcome association (established through classical conditioning): the response is made because a reward is expected Yet it remained debated how the S-O association motivates instrumental responding
  • 263. TWO-PROCESS THEORY (RESCORLA & SOLOMON, 1967) There are 2 types of learning: Instrumental Conditioning (S-R) AND Classical Conditioning (S-O) 1. A Stimulus-Response association is formed during instrumental conditioning An association is formed between alcohol (stimulus) and drinking behaviour (response) stimulus: alcohol → response: drinking → outcome: pleasant physiological response Seeing/thinking about a drink will activate the S-R association and motivate drinking…
  • 264. TWO-PROCESS THEORY (RESCORLA & SOLOMON, 1967) There are 2 types of learning: Instrumental Conditioning (S-R) AND Classical Conditioning (S-O) 2. The Stimulus-Outcome association conditions a positive/negative emotional state, which in turn motivates responding An association is formed between alcohol (stimulus) and the pleasant physiological outcome (outcome) stimulus: alcohol → response: drinking → outcome: pleasant physiological response Seeing/thinking about a drink will activate the S-O association → positive emotional state → motivates drinking behaviour
  • 265. THE PAVLOVIAN INSTRUMENTAL TRANSFER PROCEDURE/ TEST
  • 266. PAVLOVIAN INSTRUMENTAL TRANSFER PROCEDURE/TEST Tests the idea that instrumental behaviour is motivated by the Stimulus-Outcome association (and the emotions related to the outcome). Example experiment with rats, in 3 phases: Phase I. INSTRUMENTAL CONDITIONING: Lever pressing → food Phase II. PAVLOVIAN CONDITIONING (STIMULUS-OUTCOME): The lever is removed; ‘Pavlovian’ bell → food Phase III. CRITICAL TRANSFER PHASE: Rats are allowed to lever press for food; occasionally, the Pavlovian bell is also rung IF the Pavlovian association motivates instrumental responding, THEN trials with the bell will produce more lever pressing than trials without the bell
  • 267. S-R ASSOCIATION AND THE LAW OF EFFECT S-R association – key to instrumental learning and central to the Law of Effect Law of Effect – involves the establishment of an S-R association between the instrumental response (R) and the contextual stimuli (S) present when the response is reinforced The Law of Effect does not involve learning about the reinforcer or response outcome (O), or the relation between the response and the reinforcing outcome (the R-O association) Role of the reinforcer = to “stamp in” or strengthen the S-R association Thorndike thought that, once established, the S-R association was solely responsible for instrumental behaviour Fell into disfavour during the cognitive revolution Resurgence of interest in S-R mechanisms in recent efforts to characterise habitual behaviour in people (example = drug addiction) Habits are 45% of human behaviour!!
  • 268. EXPECTANCY OF REWARD AND THE S-O ASSOCIATION Specification of instrumental response ensures participant will always experience certain distinctive stimuli (S) in connection with making response Stimuli may involve distinctive place, texture, smell, sight cues Reinforcement of instrumental response results in pairing stimuli (S) with reinforcer or response outcome (O) Pairings provide potential for classical conditioning and establishment of association between S and O
  • 269. EVIDENCE FOR S-O ASSOCIATIONS IN INSTRUMENTAL LEARNING Two-Process Theory (Rescorla & Solomon, 1967): S-O and S-R associations are learned The stimulus (S) comes to “motivate” responding It does this because S becomes associated with the emotional aspects of the reinforcing outcome Pavlovian and instrumental conditioning are thus related The evoked emotional state energises instrumental responding based on an underlying S-R association A Pavlovian CS (tone) increases instrumental lever pressing when it has been paired with food, but decreases instrumental lever pressing when it has been paired with foot shock (CER)
  • 270. CONDITIONED EMOTIONAL STATES OR REWARD-SPECIFIC EXPECTANCIES? Two-Process Theory: Assumes classical conditioning mediates instrumental behaviour through conditioning of positive or negative emotions depending on emotional valence of reinforcer Organisms also acquire specific reward expectancies instead of just categorical positive or negative emotions during instrumental and classical conditioning Expectancies for specific rewards rather than general positive emotional states determine results in transfer test
  • 271. BREAK
  • 272. RE-THINKING REINFORCERS Thorndike’s Law of Effect postulates that a reinforcer = a stimulus that produces ‘a satisfying state of affairs’ … This may be a limited interpretation The Response-Outcome association does not explain what causes the response in the first place Response allocation approaches challenge the notion that reinforcers are “special” stimuli that strengthen instrumental behaviour, and take a molar approach: instrumental conditioning procedures put limitations on an organism’s activities, causing redistributions of behaviour among the available response options
  • 273. RE-THINKING REINFORCERS Antecedents of Response Allocation Approach Consummatory-Response Theory Premack Principle Response Deprivation Hypothesis
  • 274. CONSUMMATORY-RESPONSE THEORY The theory claims that species-typical consummatory responses (eating, drinking, swallowing, etc.) are themselves the critical feature of reinforcers The REAL REINFORCER IS: the consummatory response (e.g., eating, drinking) NOT the reinforcing stimulus (e.g., food pellet, chocolate, water) …so eating the food pellet, rather than the food pellet itself, is the reinforcer
  • 275. CONSUMMATORY-RESPONSE THEORY According to CR theory, for example… Eating the chocolate is the reinforcer – not the chocolate itself!
  • 276. PREMACK THEORY Premack suggested a way to predict a priori what events would be reinforcers Denied assumptions of classical reinforcement theory The reinforcement process: Relation between responses Not a relation between responses and consequential stimuli No clear boundaries between behaviours and reinforcers Is it food that is the reinforcer, or eating? Is it the toy, or playing? Premack’s principle of positive reinforcement: If an instrumental response is followed by a contingent response that is more highly probable, the instrumental response will increase in frequency
  • 277. PREMACK THEORY Frames behaviour as high and low probability responses High probability responses reinforce lower probability responses 1. STRONG REINFORCERS ARE: High probability responses that one is likely to make (naturally occurring/reinforcing) 2. INSTRUMENTAL RESPONSES ARE: Low probability responses, unlikely to occur without some reason to perform them (unlikely to be made by choice) PREMACK PRINCIPLE More preferred activities can be used to reinforce less preferred activities You must finish your VEGETABLES (low frequency) before you can have your ICE CREAM (high frequency) The Premack principle is a theory of reinforcement that states that a less desired behaviour can be reinforced by the opportunity to engage in a more desired behaviour
  • 278. PREMACK THEORY High vs low probability behaviours: a behaviour that naturally occurs frequently has a high probability of reinforcing a behaviour that naturally occurs less frequently High probability responses reinforce lower probability responses 1. STRONG REINFORCERS ARE: High probability responses that one is likely to make 2. INSTRUMENTAL RESPONSES ARE: Low probability responses, unlikely to be made by choice INSTRUMENTAL CONDITIONING IS MORE POWERFUL WHEN THE SUBJECT HAS A MARKEDLY DIFFERENT LIKELIHOOD OF PERFORMING THESE 2 RESPONSES
  • 279. PREMACK THEORY 1. STRONG REINFORCERS ARE: High probability responses that one is likely to make (e.g., EATING) 2. INSTRUMENTAL RESPONSES ARE: Low probability responses, unlikely to be made by choice (e.g., LEVER PRESSING)
  • 280. PREMACK THEORY Measure unconstrained baseline behaviour and rank all activities in terms of their probability Every behaviour can reinforce behaviour down the list and punish behaviour further up The Premack principle implies the following: 1. That all responses are potentially reinforceable – whereas classical theory claims some are/some are not 2. That all responses are potentially reinforcers for other, less probable responses Premack’s indifference principle It is irrelevant how the current behaviour probabilities got to be what they are (e.g., through deprivation). All that matters is the CURRENT probabilities
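The differential-probability rule above can be sketched in a few lines of Python. The activities and baseline minutes below are made up for illustration; the point is only that, under the original Premack principle, reinforcement status is predicted by comparing free-access baseline probabilities, not by any intrinsic property of the activity.

```python
# Sketch of Premack's differential probability principle.
# Baseline minutes are hypothetical free-access observations, not real data.
baseline_minutes = {"eating": 40, "pinball": 25, "lever_press": 5}

def can_reinforce(reinforcer, instrumental, baseline=baseline_minutes):
    """Under the original Premack rule, an activity can reinforce another
    only if its baseline probability (free-access time share) is higher.
    Per the indifference principle, only CURRENT probabilities matter."""
    return baseline[reinforcer] > baseline[instrumental]

print(can_reinforce("eating", "lever_press"))  # True: high-prob reinforces low-prob
print(can_reinforce("lever_press", "eating"))  # False under the original principle
```

Note how this ranking makes every behaviour a potential reinforcer for behaviours below it on the list, which is exactly the expansion over classical reinforcement theory described above.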
  • 281. REINFORCING CONSUMMATORY RESPONSES Premack (1959) conducted research on children comparing behaviours: they could play pinball or eat chocolate 61% of children preferred to play pinball 39% preferred to eat chocolate Premack divided each of these groups into two subgroups: Eat-to-Play and Play-to-Eat For the pinballers: Eat-to-Play increased Eat significantly Play-to-Eat increased Play only a very small amount For the eaters: Eat-to-Play produced only a very small increase in Eat Play-to-Eat increased Play significantly Supports Premack’s theory: even eating (a consummatory response, and thus a reinforcer according to classical reinforcement theory) could itself be reinforced
  • 282. APPLICATIONS OF PREMACK PRINCIPLE Restriction of the reinforcer activity is the critical factor for instrumental reinforcement Low probability responses can serve as reinforcers…as long as subjects are restricted from making the response! You can create a new reinforcer…simply by restricting access to it Premack’s work influenced new theories of reinforcement, such as Response Deprivation Theory and Behaviour-Regulation Theory, with many applications in everyday life Premack theory in application: https://www.youtube.com/watch?v=2HIiQ0ukHaU
  • 283. THE RESPONSE DEPRIVATION HYPOTHESIS You can create a new reinforcer simply by restricting access to it! The critical factor for instrumental reinforcement is restricting access to the reinforcer activity (which makes it more valuable) If subjects are restricted from making a response/behaviour, that response/behaviour can serve as a reinforcer
  • 284. THE RESPONSE DEPRIVATION HYPOTHESIS According to the probability-differential view (the original Premack theory), a low probability response can never reinforce a higher probability response However, it has been shown that this CAN happen if the organism is prevented from emitting the lower probability activity at its baseline level Any response can therefore serve as a reinforcer
  • 285. EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS RATS Given any choice rats may prefer sitting to running on a wheel
  • 286. EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS – RATS BUT, IF access to the running wheel is restricted THEN, running on the wheel could be used as a reinforcer for lever pressing!
  • 287. EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS – RATS BUT, IF access to sitting is restricted THEN, sitting could be used as a reinforcer for lever pressing! …or the other way around
  • 288. EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS HUMANS Given any choice you may prefer sitting to standing
  • 289. EXAMPLES: RESPONSE DEPRIVATION HYPOTHESIS – HUMANS But IF your ability to stand is restricted (e.g., long haul flight with the seat belt sign continuously on) THEN standing up can be used as a reinforcer for some other behaviour
  • 290. THE RESPONSE DEPRIVATION HYPOTHESIS IN SUM An organism will work to gain access to a reinforcer response if access to that reinforcer response has been restricted.
  • 291. THE RESPONSE ALLOCATION APPROACH The response allocation approach views instrumental conditioning in terms of the other available behavioural response options, and how an individual distributes their responses among the various options that are available Instrumental conditioning puts limitations on an animal’s activities and causes a ‘redistribution’ of behaviour among the available options
  • 292. BEHAVIOURAL BLISS POINT Definition: “The preferred distribution of an organism’s activities before an instrumental conditioning procedure is introduced that sets constraints and limitations on response allocation” Domjan (2015, p.210). A distribution of responses, among available alternatives, in the absence of restrictions
  • 293. RESPONSE ALLOCATION Increased performance of the instrumental response RESULTS FROM reallocating responses so as to minimise deviations from the bliss point, as far as possible BUT, IF THERE ARE OTHER REINFORCERS IN THE ENVIRONMENT: other ‘enjoyable’ behaviour options can undermine the instrumental behaviour
  • 294. RESPONSE ALLOCATION: EXAMPLE Instrumental response: Studying Reinforcer: Facebook/Instagram/Snapchat Bliss point: 3hrs/night of Facebook/Instagram/Snapchat REINFORCEMENT SCHEDULE: Could be set up that for 1 hr of Facebook person must do 1hr study (i.e., FB becomes contingent on studying) This deprives person of time in FB and motivates an increase in time studying Time studying will increase to bring the time allowed for FB closer to the preferred level / bliss point
  • 295. RESPONSE ALLOCATION: EXAMPLE BUT, if other reinforcers are also present in the environment…e.g., Instagram, Netflix, PlayStation, eating, cleaning, etc … then the success of the instrumental conditioning contingency will be undermined The person may be ok to ‘give up’ FB time if they have other pleasant options (or reinforcers) which are not contingent on studying!
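The study/social-media example above can be sketched as a minimum-deviation calculation, a simplified version of minimum-distance models of response allocation. All numbers here are hypothetical: a free-choice bliss point of 0.5 h/night of study and 3 h/night of social media, and a 1:1 schedule that makes each hour of media contingent on an hour of study.

```python
# Minimal sketch of the behavioural bliss point / minimum-deviation idea.
# Hypothetical bliss point: 0.5 h study, 3.0 h social media per night.
# Schedule constraint (1:1 contingency): media time = study time.

def deviation_from_bliss(study, media, bliss_study=0.5, bliss_media=3.0):
    """Squared Euclidean distance from the bliss point."""
    return (study - bliss_study) ** 2 + (media - bliss_media) ** 2

def best_allocation(ratio=1.0, step=0.01, max_hours=6.0):
    """Grid-search the schedule line media = ratio * study for the
    allocation closest to the bliss point."""
    candidates = [i * step for i in range(int(round(max_hours / step)) + 1)]
    return min(candidates, key=lambda s: deviation_from_bliss(s, ratio * s))

study = best_allocation()
print(round(study, 2))  # -> 1.75: studying rises well above its 0.5 h baseline
```

Under the contingency, neither activity can sit at its preferred level, so the predicted compromise (1.75 h of each) deprives the person of some media time while pushing study time far above baseline, which is exactly why the schedule works as reinforcement.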
  • 296. SUMMARY: ANTECEDENTS OF THE RESPONSE ALLOCATION APPROACH Response Allocation– Molar approach focusing on how instrumental conditioning procedures put limitations on activities, cause redistributions of behaviour among available response options Consummatory-Response– Species-typical consummatory responses (example = eating, drinking) are critical feature of reinforcers Premack/Differential Probability Principle– Difference in likelihood of instrumental and reinforcer responses Encourages thinking about reinforcers as responses rather than as stimuli Greatly expands range of activities investigators use as reinforcers; any behaviour can serve as reinforcer provided it is more likely than the instrumental response Response-Deprivation Hypothesis– Restriction of reinforcer activity critical factor for instrumental reinforcement
  • 297. BEHAVIOURAL ECONOMICS AND RESPONSE ALLOCATION Economics is the study of the allocation of behaviour within a system of constraints Instrumental conditioning is similar: The ability to make responses (available time and energy) is ‘income’ The number of responses required (“effort cost”) is the “price”: the schedule of reinforcement determines the “price” of the reinforcer The number of reinforcers earned is the amount purchased Consumer demand: the relationship between price and amount purchased The slope of the demand curve reflects the elasticity of demand
  • 298. BEHAVIOURAL ECONOMICS Similarities between economic restrictions in the marketplace and schedule constraints in instrumental conditioning Demand Curve - the relation between the price of a commodity and the amount purchased Elasticity of Demand - the degree to which price influences demand Determinants of the elasticity of demand: Availability of substitutes Price range Income level Link to a complementary commodity Consumer demand is used to analyse instrumental behaviour by treating the number of responses performed (or time spent responding) as analogous to money, and the reinforcer obtained as analogous to the purchased commodity
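As a rough sketch of how elasticity of demand can be computed, the midpoint (arc) formula below compares reinforcer consumption at two "prices" (responses required per reinforcer). The numbers are illustrative only, not from any actual experiment.

```python
# Sketch of elasticity of demand using the midpoint (arc) formula.
# "Price" = responses required per reinforcer; data are illustrative only.

def arc_elasticity(p1, q1, p2, q2):
    """Percentage change in quantity divided by percentage change in price,
    each computed against the midpoint of the two observations."""
    pct_q = (q2 - q1) / ((q1 + q2) / 2)
    pct_p = (p2 - p1) / ((p1 + p2) / 2)
    return pct_q / pct_p

# Doubling the price from 10 to 20 responses per pellet drops intake 100 -> 90:
e = arc_elasticity(10, 100, 20, 90)
print(round(e, 2))  # -> -0.16; |e| < 1 means inelastic demand
```

A magnitude below 1 (inelastic demand) means consumption barely falls as price rises, the pattern expected for an "essential" reinforcer with no substitutes; elastic demand (|e| > 1) is expected when substitutes are freely available.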
  • 299. CONTRIBUTIONS OF RESPONSE ALLOCATION & BEHAVIOURAL ECONOMICS TO REINFORCEMENT CONDITIONING THEORY & BEHAVIOUR REGULATION Better understanding of the motivational mechanisms Think about the cause of reinforcement as constraints on the free flow of behaviour (rather than thinking of reinforcers as special kinds of stimuli/responses) Instrumental conditioning procedures do not “stamp in” or strengthen instrumental behaviour Instead, instrumental conditioning creates a new distribution/allocation of responses The resulting reallocation depends on trade-offs between the various options, usefully characterised by behavioural economics The response allocation approach and behavioural economics provide new and precise ways of describing the constraints that various instrumental conditioning procedures impose on an organism’s behaviour Studying the complex examples of choice, self-control and economic behaviour requires the more complex models of the response allocation approach
  • 300. CONTRIBUTIONS OF RESPONSE ALLOCATION & BEHAVIOURAL ECONOMICS TO REINFORCEMENT CONDITIONING THEORY & BEHAVIOUR REGULATION Behavioural economics emphasises that instrumental behaviour cannot be studied in a vacuum Instead, all response options must be considered as a system; changes in one part of the system determine how other parts of the system can be altered Changed the concept of the reinforcer and the way instrumental conditioning procedures are viewed The optimal distribution of behaviour is determined by physiological needs, ecological niche and species-specific response tendencies Emphasis on the broader behavioural context for understanding instrumental behaviour
  • 301. SUMMARY The Associative Structure of Instrumental Conditioning The S-R Association and the Law of Effect Expectancy of Reward and the S-O Association R-O and S(R-O) Relations in Instrumental Conditioning Response Allocation and Behavioural Economics Antecedents of the Response Allocation Approach The Response Allocation Approach Behavioural Economics Contributions of the Response Allocation Approach and Behavioural Economics
  • 302. NEXT WEEK MID SEMESTER EXAM! NO Lecture will be running. Tutorials will run on Friday as usual. AND of course, Good Luck!!