Research Instruments
Language Tests: ‘Validity’ in Focus
Prof. Dr. Ron Martinez
Language Tests as Research Instruments
• Why ‘test’ language in applied linguistics (AL) research?
• Have you ever used any kind of test in your research? Which ones, and why?
Reading for today
A valid test?
Alderson & Banerjee (2002, p. 79)
“Validity is not a characteristic of a
test, but a feature of the inferences
made on the basis of test scores and
the uses to which a test is put.”
‘Interactional Model’: Language Ability
and Test Method
Development and validation of a vocabulary size test of multiword expressions
Ron Martinez, University of Nottingham
for the Department of Education
University of Oxford
17 November 2011
“There is an obvious payoff for learners of English in
concentrating initially on the 2,000 most frequent
words, since they have been repeatedly shown to
account for at least 80% of the running words in any
written or spoken text.” (Read, 2004: 148)
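A coverage figure of this kind can be checked for any text by counting what share of its running words (tokens) falls inside a high-frequency list. The following is a minimal sketch, not the method behind the quoted figure: `TOY_LIST` is a tiny hypothetical stand-in for a real 2,000-word list, `tokenize` and `coverage` are illustrative names, and the tokenizer counts single word forms only (no word families, no multiword expressions).

```python
import re

# Hypothetical stand-in for a real high-frequency list (e.g. the top 2,000 words).
TOY_LIST = {"i", "am", "a", "little", "over", "the", "hill",
            "but", "you", "can", "tell", "not"}


def tokenize(text):
    """Lowercase the text and split it into running words (tokens)."""
    return re.findall(r"[a-z]+", text.lower())


def coverage(text, word_list):
    """Return the proportion of tokens in `text` that appear in `word_list`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for tok in tokens if tok in word_list)
    return known / len(tokens)


if __name__ == "__main__":
    # Every token is in the toy list, so coverage is complete.
    print(coverage("I am over the hill", TOY_LIST))
    # One of five tokens ("mountain") is outside the list.
    print(coverage("I am over the mountain", TOY_LIST))
```

A word-by-word count like this treats "over the hill" as three known words, so a text can reach full coverage while still containing expressions whose meaning is not transparent from the individual words.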
Lexical Profile using VocabProfile
Multiword-Inclusive Profile
‘Frequency’: potentially problematic from the perspective of both learner and teacher/tester.
Martinez and Murphy (2011)
• 101 adult Brazilian learners of English (‘intermediate’ or higher).
• Within-groups design: paired samples of reading comprehension measures on a two-part reading test.
• All texts in both test parts written ‘symmetrically’, using the exact same pool of top 2,000 word families in English (BNC).
Let me tell you about my home. It’s on this little hill out in
the country. But I’m not far from the city (I don’t like the
city – do you?) – not much time to get here. I can’t wait to
show you a photo… or you can call me to come over to
see in person! 07786 237 679
I don’t get out much – it’s about time I do. I’m not from
here – this country or city. (But I like this country.) I’m far
from home. I’m a little over the hill, let me tell you, but you
can’t tell! (I can show you my photo, or wait to come see
me in person!) Call me on 07786 554 0978
Exact same words in both texts, all very frequent (top 2,000)
Test Overview
• Part 1: 4 texts, 7 questions each – compositional
formulations (meanings transparent from
individual words).
• Part 2: 4 texts, 7 questions each, exact same
words – less compositional.
• Rating scale for self-reported comprehension
after each text.
1. He wants to go out but has a problem with time.
2. He is foreign.
3. He lives in a remote area.
4. He wants to keep his location a secret.
5. He thinks he looks younger than his age.
6. He probably lives in an area with hills.
7. He lives on the hill, but not on top of it.
My comprehension of this text: 5% 25% 50% 75% 100%
I don’t get out much – it’s about time I do. I’m not from here
– this country or city. (But I like this country.) I’m far from
home. I’m a little over the hill, let me tell you, but you can’t
tell! (I can show you my photo, or wait to come see me in
person!) Call me on 07786 554 0978
The results
                Min.   Max.   Mean    SD
Part 1 total     18     28    24.09   2.44
Part 2 total      6     25    14.76   3.93
t = 24.10 (p ≤ 0.001), eta squared = 0.828
Reported Comprehension vs. Actual
Comprehension
• No statistically significant difference for Part 1
(87.38% reported vs 86.03% actual).
• Reported comprehension significantly overestimated in Part 2 (t = 3.95, p ≤ 0.001, eta squared = 0.07): 60.29% reported vs 52.58% actual.
‘on occasion’
INTERMEDIATE
HIGHER
The Yes-No Test (Meara, 1992)
The Vocabulary Levels Test (Nation, 1983;
Schmitt, Schmitt & Clapham, 2001)
1. original
2. private
3. royal
4. slow
5. sorry
6. total
_____ first
_____ not public
_____ all added together
Vocabulary Size Test (Nation & Beglar, 2007)
Research question
How can a test be devised that assesses
knowledge of multiword expressions in the
same or similar way as current widely-used
vocabulary tests?
Challenges
1. Narrowing down the phraseological field (i.e. which formulaic sequence?)
2. Pinning down the extent (i.e. where do you stop?)
3. Finding the expressions (i.e. what tools and resources can be used?)
4. Adopting an appropriate test format (i.e. how to test the sequences?)
The Yes-No Test (Meara, 1992)
The Vocabulary Levels Test (Nation, 1983;
Schmitt, Schmitt & Clapham, 2001)
1. original
2. private
3. royal
4. slow
5. sorry
6. total
_____ first
_____ not public
_____ all added together
Vocabulary Size Test (Nation & Beglar, 2007)
at all times | at all costs | at all
More compositional? Less compositional?
Meaning still retained when each lexical word is replaced with its own definition (Grant & Bauer, 2004)
A ‘phrasal expression’
• A fixed or semi-fixed sequence of two or
more co-occurring but not necessarily
contiguous words with a cohesive
meaning or function that is not easily
discernible by decoding the individual
words alone.
• take place, to a large extent, take sth over
Challenges
1. Narrowing down the phraseological field (i.e. which formulaic sequence?)
2. Pinning down the extent (i.e. where do you stop?)
3. Finding the expressions (i.e. what tools and resources can be used?)
4. Adopting an appropriate test format (i.e. how to test the sequences?)
Frequency
• VLT stopped at the 5,000-word frequency band, which “represents the upper limit of general high-frequency vocabulary” (Read, 2000: 119)
• A vocabulary size of 5,000 allows for “pleasurable reading” of simple fiction (Hirsh & Nation, 1992)
• The English Profile Wordlist project has 4,667 entries through B2 (CEFR)
• By advanced levels, students “would probably be expected to recognize over 4500” word families (Milton, 2009: 180)
BNC Band Cut-off Points
Frequency band   Token frequency cut-off   Frequency band   Token frequency cut-off
 1,000           12,639 +                   8,000           434 +
 2,000            4,491 +                   9,000           356 +
 3,000            2,089 +                  10,000           295 +
 4,000            1,210 +                  11,000           249 +
 5,000              787 +                  12,000           213 +
 6,000              620 +                  13,000           184 +
 7,000              547 +                  14,000           162 +
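The cut-off table can be read as a lookup: an item belongs to the most frequent band whose token-frequency cut-off it meets. The sketch below assumes that the "+" in the table means "at or above" and that the same lookup would apply to a multiword expression's corpus frequency; `BNC_CUTOFFS` and `frequency_band` are illustrative names, not part of any published tool.

```python
# Token-frequency cut-offs per 1,000-word band, copied from the BNC table above.
BNC_CUTOFFS = [
    (1000, 12639), (2000, 4491), (3000, 2089), (4000, 1210), (5000, 787),
    (6000, 620), (7000, 547), (8000, 434), (9000, 356), (10000, 295),
    (11000, 249), (12000, 213), (13000, 184), (14000, 162),
]


def frequency_band(token_frequency):
    """Return the most frequent band whose cut-off is met, or None if below all bands.

    Assumption: a cut-off written as "787 +" means "787 occurrences or more".
    """
    for band, cutoff in BNC_CUTOFFS:
        if token_frequency >= cutoff:
            return band
    return None


if __name__ == "__main__":
    # A hypothetical expression with 5,000 BNC occurrences falls short of the
    # 1,000 band (12,639+) but meets the 2,000 band cut-off (4,491+).
    print(frequency_band(5000))
```

Because the bands are listed from most to least frequent, the first cut-off that is met identifies the band, so no sorting or search structure is needed for a table this small.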
Initial data deletion using criteria
PHRASE List sample
Single word – multiword expression frequency matching
Before / After (integrated word list)
Challenges
1. Narrowing down the phraseological field (i.e. which formulaic sequence?)
2. Pinning down the extent (i.e. where do you stop?)
3. Finding the expressions (i.e. what tools and resources can be used?)
4. Adopting an appropriate test format (i.e. how to test the sequences?)
Pilot 1 (n=10): VLT format
Vocabulary Size Test (VST) (Nation & Beglar, 2007)
Pilot 2: VST + VLT (n=34)
Pilot 2 (VST-VLT comparison)
• 48 overlapping items, counterbalanced forms
(VLT/VST)
• immediate post-test interviews
• VST format preferred by 100% of candidates
Declared knowledge discrepancies
• Vocabulary Levels Test (VLT) version significantly more prone to knowledge discrepancies (t = 5.439, p ≤ 0.001)
                   VST         VLT
Discrepancies       11          77
(max. = 48)      M = 1.50    M = 8.80
Field test (n = 2203)
Test version    N     Mean    SD
A              742    22.67   5.30
B              731    22.32   5.76
C              730    21.95   5.59
Freq.   Version A       Version B       Version C
        M      SD       M      SD       M      SD
1K      5.50   0.87     4.78   1.26     4.25   0.97
2K      5.05   1.20     5.17   1.14     4.65   1.41
3K      4.33   1.34     4.63   1.44     4.72   1.59
4K      4.21   1.65     3.52   1.62     4.01   1.56
5K      3.60   1.65     4.22   1.67     2.32   1.63
K3
12. at once: I did it at once.   Facility   Upper    Lower      D
a. one time                        .47       .16      .78      -.62
b. many times                      .00       .00      .00       .00
c. early                           .02       .00      .06      -.06
d. immediately                     .43       .81      .16       .65
No attempt                                  4 (2%)   29 (16%)
K3, Item B12 (item-total correlation .503)
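The D column in the item table above appears to be the proportion of the upper group choosing an option minus the proportion of the lower group choosing it (for the key, .81 − .16 = .65). The following is a minimal sketch under that assumption; the function names and the `AT_ONCE` dictionary are illustrative, and the figures are copied from the "at once" item.

```python
def discrimination_index(upper, lower):
    """Return D: upper-group proportion minus lower-group proportion, to 2 decimals."""
    return round(upper - lower, 2)


# Option -> (upper-group proportion, lower-group proportion) for "at once" (K3, item 12).
AT_ONCE = {
    "a. one time": (0.16, 0.78),
    "b. many times": (0.00, 0.00),
    "c. early": (0.00, 0.06),
    "d. immediately": (0.81, 0.16),  # the keyed (correct) option
}


def item_report(options):
    """Compute D for every option of one multiple-choice item."""
    return {
        option: discrimination_index(upper, lower)
        for option, (upper, lower) in options.items()
    }


if __name__ == "__main__":
    for option, d in item_report(AT_ONCE).items():
        print(f"{option:15s} D = {d:+.2f}")
```

On this reading, a clearly positive D for the key together with negative D values for the distractors suggests the item separates stronger from weaker candidates; here the distractor "one time" mainly attracts the lower group.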
K2
3. so far: It’s good so far.     Facility   Upper    Lower      D
a. until now                       .90      1.00      .75       .25
b. but not really                  .04       .00      .08      -.08
c. sometimes                       .01       .00      .02      -.02
d. from a distance                 .05       .00      .15      -.15
No attempt                                  0 (0%)   12 (5%)
K1
14. used to: I used to go.       Facility   Upper    Lower      D
a. want to                         .12       .01      .29      -.28
b. did before                      .26       .55      .07       .48
c. usually                         .56       .40      .54      -.14
d. always                          .07       .05      .09      -.04
                                                Answer type totals   Combined totals*
Answer type (consistent)
  ‘0’ = Incorrect answer and translation                33
  ‘1’ = Correct answer and translation                 707           740 (consistent)
Answer type (discrepant)
  ‘2’ = Incorrect answer, correct translation            6
  ‘3’ = Correct answer, incorrect translation            2             8 (discrepant)
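The combined totals in the answer-type table follow from adding the two consistent types (0 and 1) and the two discrepant types (2 and 3). The sketch below reproduces that arithmetic and also derives the share of consistent responses, roughly 98.9%. That proportion is not reported on the slide and is computed here only for illustration; the names are hypothetical.

```python
# Counts copied from the answer-type table; keys are the answer-type codes.
ANSWER_TYPES = {
    0: 33,   # incorrect answer and incorrect translation (consistent)
    1: 707,  # correct answer and correct translation (consistent)
    2: 6,    # incorrect answer, correct translation (discrepant)
    3: 2,    # correct answer, incorrect translation (discrepant)
}


def combined_totals(counts):
    """Return (consistent, discrepant) totals from the four answer-type counts."""
    consistent = counts[0] + counts[1]
    discrepant = counts[2] + counts[3]
    return consistent, discrepant


def consistency_rate(counts):
    """Return the proportion of responses where answer and translation agree."""
    consistent, discrepant = combined_totals(counts)
    return consistent / (consistent + discrepant)


if __name__ == "__main__":
    print(combined_totals(ANSWER_TYPES))
    print(round(consistency_rate(ANSWER_TYPES), 4))
```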
‘Cognitive Validity’
“The relevance of the individual’s test
responses to the behaviour under
consideration, rather than on the apparent
relevance of the item content” (Anastasi,
1988: 131).
“Even small changes to parameters of
context validity are likely to impact
significantly on cognitive validity and
subsequently on the score or grade a
candidate receives on a test” (O’Sullivan
and Weir, 2011: 28).

Editor's Notes

  • #32: Used to say ‘if you know X amt, you should be able to do Y’.
  • #38: Used to say ‘if you know X amt, you should be able to do Y’.