The Language Acquisition Device and innate language ability
1. Do children come to the task of
language acquisition with an innate
‘language acquisition device’?
By Leon and Penny
2. Innate Psychological Capacities in language acquisition:
language-specific or domain-general?
In language acquisition:
- Something must be learned, or we would all speak the same
way (e.g. all languages would use SVO word order). Clearly
input plays some role.
- Equally, there must be some innate capability to account for
why development follows a given schedule whose milestones
are not triggered by particular events. Something must be
innate and not learnable from input; otherwise non-human
species would have language as complex as ours.
The question is whether these innate capacities are domain-
specific, restricted and specialised for language only (as in
Chomsky’s theory) or domain-general applying across many
domains in addition to language.
3. Mentalist theories of Language Acquisition
- Mentalist theories argue that humans are born with an innate capacity to learn
language. This proposed capacity is specialised for language i.e. domain-specific
- Main proponent of this theory was Noam Chomsky
- Chomsky (1965) argued this innate capacity involves a property of the child’s brain
called the Language Acquisition Device which processes linguistic input, using
Universal Grammar and acquisition strategies to gradually produce grammar and a
lexicon.
4. Language
Acquisition
Device
"Language Acquisition Device" (LAD) - pre-programmed knowledge about language
structure
Main arguments:
1. Universal Grammar
Universal Grammar provides a set of principles which are universal properties
of language and restrict the type of grammar the child will develop. Languages
differ in terms of parameters; language development involves parameter
setting based on input
2. Poverty of the stimulus
Children hear a finite number of sentences from which they must generalise to
an infinite set of sentences
The argument is that the input available to children is not sufficient for them
to learn certain structures without innate language knowledge guiding
language development.
On this view, the LAD accounts for complex knowledge of language which
input alone would not allow for.
5. Empirical theories of Language Acquisition
- Empirical theories propose we are not born with language knowledge, and that all
our knowledge is learned through experience.
- They argue we learn from the environment via domain-general learning mechanisms
such as statistical learning, which involves extracting regularities from the sensory
environment.
- In the context of language acquisition, children identify and extract statistical
regularities present in the speech they hear around them, essentially learning
patterns and structures within the language through exposure to large amounts of
data, rather than relying solely on innate knowledge or explicit instruction.
- Statistical learning is domain-general, as it is also used in visual perception
(predicting what appears next) and music perception (identifying rhythm).
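One way to picture "extracting statistical regularities" from speech is to compute how predictable each syllable is from the one before it. The sketch below is only an illustration under stated assumptions: the slides do not name a specific statistic, so the use of transitional probabilities, the function name `transitional_probabilities`, and the made-up syllable stream are all hypothetical choices.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Return P(next syllable | current syllable) for every adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor.
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Hypothetical stream built from three invented "words":
# bidaku, padoti, golabu (no pauses between words).
stream = ("bi da ku pa do ti bi da ku go la bu "
          "pa do ti go la bu bi da ku").split()

tp = transitional_probabilities(stream)
print(tp[("bi", "da")])  # within a word: 1.0
print(tp[("ku", "pa")])  # across a word boundary: 0.5
```

In this toy stream, syllable pairs inside a word are fully predictable while pairs spanning a word boundary are not, which is the kind of regularity a domain-general learner could in principle exploit.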
6. ‘Either-or’ or ‘interaction’
Either-or: No LAD – Domain-general abilities only
- Similarities among languages are a result of the biologically based
way that humans learn languages and, more generally, the
biologically based way that humans are sensitive to patterns. No
innate language-specific knowledge or abilities (LAD).
Interaction:
Language-specific knowledge in the LAD defines the hypothesis space
of possible grammars, while domain-general statistical learning
mechanisms help navigate this hypothesis space to converge on the
correct hypothesis about a language’s specific grammar. These
domain-general mechanisms may reduce the role of the LAD in
defining which structures are allowed: with fewer language-specific
constraints, the hypothesis space is more loosely defined.
Statistical learning may either (i) replace language-specific
knowledge that guides children through a predefined hypothesis
space or (ii) reduce the need for language-specific knowledge
which constrains the child’s hypothesis space so tightly.
7. Structural dependency of rules: domain-specific
Linear (word-based) rule = relies only on the analysis of the sentence
into individual words and their part-of-speech labels. Does not
depend on sentence structure
Structure-dependent rule = refers to abstract labels such as “noun
phrase,” i.e. groupings of words into constituents, and so depends
on sentence structure. Structure dependency of rules is
domain-specific to language
- Appealing to the underlying structure is important
“You have to have a set of prejudices in advance for
induction to take place” (Chomsky, 1980)
• The source of these prejudices (e.g. principles like
structure dependence, etc.) is not found in the
evidence itself, and thus must be a different,
perhaps innate, source.
8. Children don’t learn structure-independent (e.g. linear) rules
Chomsky (1971) demonstrates the A-over-A constraint in his account of the active–passive
relation, under which a noun phrase (NP) following the main verb is fronted:
(1) I believe the dog to be hungry
(2) The dog is believed to be hungry
Active–passive alternation rule:
Structure-independent rule – move the first NP after the verb
Structure-dependent rule – move the highest NP after the verb
(3) I believe the dog's owner to be hungry
(4) The dog's owner is believed to be hungry
(5) *The dog is believed's owner to be hungry
The data in (3–5) illustrate that the actual rule is formulated in terms of structure. The
A-over-A constraint ensures that the rule applies to the higher, more inclusive node in a
phrase’s c-structure. If the rule were stated in terms of linear order, then (4) would be
ungrammatical and (5) would be grammatical. But the opposite is true. However, children may
not be exposed to sentences like (3–5) as evidence in favor of the correct grammar.
Thus, the fact that all adult speakers agree that (4) is grammatical and (5) is not
suggests that the linear rule was never even considered and that children are
predisposed to a structure-based grammatical system.
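The contrast between the two candidate rules can be made concrete with a small sketch. It is a toy illustration, not Chomsky's formalism: the nested-tuple tree for sentence (3), the helper names, and the way the "linear" rule is approximated (take the leftmost NP that contains no further NP) are all assumptions made for this example.

```python
# A constituent is (label, children); children are words or constituents.
# Hypothetical analysis of the material after "believe" in sentence (3).
post_verbal = [
    ("NP", [("NP", ["the", "dog"]), "'s", "owner"]),
    ("VP", ["to", "be", "hungry"]),
]

def words(nodes):
    """Flatten a list of constituents/words into a list of words."""
    out = []
    for n in nodes:
        out.extend([n] if isinstance(n, str) else words(n[1]))
    return out

def highest_np(nodes):
    """Structure-dependent rule: the first top-level NP."""
    return next(n for n in nodes if not isinstance(n, str) and n[0] == "NP")

def first_linear_np(nodes):
    """Stand-in for the linear rule: leftmost NP with no NP inside it."""
    for n in nodes:
        if isinstance(n, str):
            continue
        inner = first_linear_np(n[1])
        if inner is not None:
            return inner
        if n[0] == "NP":
            return n
    return None

def remove(nodes, target):
    """Copy the tree, leaving out the constituent that was moved."""
    return [n if isinstance(n, str) else (n[0], remove(n[1], target))
            for n in nodes if n is not target]

def passivize(nodes, pick_np):
    np = pick_np(nodes)
    tokens = words([np]) + ["is", "believed"] + words(remove(nodes, np))
    return " ".join(tokens).replace(" 's", "'s")

structural = passivize(post_verbal, highest_np)
linear = passivize(post_verbal, first_linear_np)
print(structural)  # the dog's owner is believed to be hungry
print(linear)      # the dog is believed's owner to be hungry
```

The structure-dependent choice reproduces (4), while the linear stand-in reproduces the ungrammatical (5).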
9. Syntactic categories
are learned –
domain-general
Mintz’s (2003) study tested whether frequent frames (recurring word pairs in speech with one
varying word between them) could be used to categorize words into grammatical classes. The study
provides evidence for statistical learning as a mechanism for language acquisition, challenging the
idea that a Language Acquisition Device (LAD) is necessary.
Objective: Investigate whether grammatical categories can be learned from patterns in speech
Method: Corpus analysis
• Analysed large samples of child-directed speech from existing language corpora (transcripts of
speech directed at young children), i.e. the real-world language input that children hear during
development.
• Identified frequent frames: word pairs with a variable middle word (e.g., “You _ it” → “You see it,”
“You like it”).
• Examined whether words appearing in the same frame tended to belong to the same grammatical
category (e.g., nouns, verbs, adjectives).
Findings:
• Frequent frames reliably categorized words (e.g., “The _ is” → mostly nouns).
• Categorization accuracy exceeded 90%, suggesting children can use these patterns to learn syntactic
categories.
Implications:
• Challenges Chomsky’s LAD by showing speech contains enough structure for learning of syntactic
categories.
• Supports statistical learning, where children acquire language by recognizing patterns rather than relying
on innate rules.
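The frame-based analysis described above can be sketched in a few lines. This is a minimal illustration, assuming a tiny invented corpus and a hypothetical helper `frequent_frames`; it is not Mintz's actual procedure, corpus, or frequency threshold.

```python
from collections import Counter, defaultdict

def frequent_frames(utterances, top_n=2):
    """Group middle words by the (previous word, next word) frame around them."""
    frame_counts = Counter()
    frame_fillers = defaultdict(set)
    for utterance in utterances:
        w = utterance.lower().split()
        # Slide a three-word window: a _ b is the frame, x the middle word.
        for a, x, b in zip(w, w[1:], w[2:]):
            frame_counts[(a, b)] += 1
            frame_fillers[(a, b)].add(x)
    top = [frame for frame, _ in frame_counts.most_common(top_n)]
    return {frame: sorted(frame_fillers[frame]) for frame in top}

# Invented child-directed utterances, for illustration only.
corpus = [
    "you see it", "you like it", "you want it",
    "the dog is here", "the ball is red", "the cat is asleep",
    "you see the dog",
]

frames = frequent_frames(corpus)
print(frames[("you", "it")])  # ['like', 'see', 'want']
print(frames[("the", "is")])  # ['ball', 'cat', 'dog']
```

In this toy corpus the frame "you _ it" collects verbs and "the _ is" collects nouns, without the procedure being told anything about grammatical categories.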
10. From
distributional to
grammatical
categories
What could guide the linking between distributional categories and
grammatical categories?
Semantic bootstrapping (Pinker, 1984, 1989)
Child is born with UG and linking rules:
• Innate knowledge
o Syntactic categories: nouns (persons/things); verbs (actions)
o Semantic-syntactic linking rules: agent=subject; patient=object
• Children learn word meanings and categorise them into semantic categories
o E.g., does this word refer to an agent, action or patient?
• And they then ‘bootstrap’ to the (adult-like) UG syntactic system
• Errors arise at the semantic level, not the syntactic level
Distributional approach
• Categorization processes can operate on distributional information from the outset, thus
making for a more economical theory, and avoiding some of the pitfalls inherent in
semantically based categorization proposals
• The bootstrap categories are defined distributionally rather than on semantic grounds.
13. Prosody as a
Segmentation
Cue
Jusczyk, Houston & Newsome (1999) – Prosody as a Segmentation Cue
Study: Investigated how infants use prosodic stress patterns to segment words.
Method:
• 7.5-month-old English-learning babies were familiarized with bisyllabic words (either strong-weak
or weak-strong stress).
• Later, they heard passages that either contained or lacked these words.
• Segmentation was measured by longer listening times for passages containing the familiarized words.
Findings:
• 7.5-month-olds segmented strong-weak words but struggled with weak-strong words: they
showed no listening-time preference that would indicate segmentation.
• 10.5-month-olds segmented both stress patterns, showing a listening-time preference for both.
Conclusion:
• Since most English words follow a strong-weak pattern, younger infants rely on stress as a word
boundary cue.
• By 10.5 months, they integrate additional segmentation cues for better word recognition.
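A stress-based strategy of the kind attributed to the younger infants can be sketched as "start a new word at every strong syllable". The code below is a toy illustration under that assumption; the function name `segment_at_strong`, the syllable lists, and the stress markings are hypothetical and are not the study's stimuli or analysis.

```python
def segment_at_strong(syllables):
    """Split a stream of (syllable, is_strong) pairs before each strong syllable."""
    words, current = [], []
    for text, is_strong in syllables:
        if is_strong and current:
            # A strong syllable is treated as the start of a new word.
            words.append("".join(current))
            current = []
        current.append(text)
    if current:
        words.append("".join(current))
    return words

# Strong-weak words: the strategy recovers the intended words.
strong_weak = [("doc", True), ("tor", False), ("can", True), ("dle", False)]
# Weak-strong words: the same strategy cuts in the wrong places.
weak_strong = [("gui", False), ("tar", True), ("sur", False), ("prise", True)]

sw_result = segment_at_strong(strong_weak)
ws_result = segment_at_strong(weak_strong)
print(sw_result)  # ['doctor', 'candle']
print(ws_result)  # ['gui', 'tarsur', 'prise']
```

Under this strategy, strong-weak words come out intact while weak-strong words are split at the stressed syllable, which is consistent with the pattern reported for 7.5-month-olds above.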
14. Conclusion
There is considerable evidence suggesting that a
Language Acquisition Device enables children’s
acquisition of language: Universal Grammar,
poverty of the stimulus, and virtuous errors.
However, there is no direct way to test whether
such a device exists, and we have not been able to
provide evidence for its location in the brain.
15. Questions?
Which one? Statistical learning may either (i) replace
language-specific knowledge that guides children
through a predefined hypothesis space or (ii) reduce
the need for language-specific knowledge which constrains
the child’s hypothesis space so tightly.
16. References
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Chomsky, N. (1971). Problems of knowledge and freedom. New York: Pantheon.
Chomsky, N. (1980). Rules and representations. Behavioral and Brain Sciences.
Gómez, R., & Maye, J. (2005). The developmental trajectory of nonadjacent
dependency learning. Infancy, 7(2), 183–206.
https://doi.org/10.1207/s15327078in0702_4
Jusczyk, P. W., & Aslin, R. N. (1995). Infants' detection of the sound patterns of
words in fluent speech. Cognitive Psychology, 29(1), 1–23.
https://doi.org/10.1006/cogp.1995.1010
Mintz, T. H. (2003). Frequent frames as a cue for grammatical categories in child directed
speech. Cognition, 90(1), 91–117. https://doi.org/10.1016/s0010-0277(03)00140-9
Yang 2004