2.4 Replacement grammars
Replacement grammars are the most common way computer scientists define languages today, and
the way we will use for the rest of this book.
A grammar is a set of rules for generating all strings in the language.
We use the Backus-Naur Form (BNF) notation to define a
grammar.
BNF grammars are exactly as powerful as recursive transition networks
(Exploration 2.1 explains what this means and why it is the case), but easier to
write down.
BNF was invented by John Backus in the late 1950s. Backus led efforts at IBM to define
and implement Fortran, the first widely used programming language.
Fortran enabled computer programs to be written in a language more like
familiar algebraic formulas than low-level machine instructions, so programs could
be written more quickly.
To define the Fortran language, Backus and his team relied on ad hoc English
descriptions.
These ad hoc descriptions were often misinterpreted, motivating the need for a
more precise way of defining a language.
Rules in a Backus-Naur Form grammar have the form:
nonterminal ::⇒ replacement
"I flunked out every year. I never studied. I hated studying. I was just goofing around.
It had the delightful consequence that every year I went to summer school in
New Hampshire where I spent the summer sailing and having a nice time."
(John Backus)
The left side of a rule is always a single symbol, known as a nonterminal since it
can never appear in the final generated string.
The right side of a rule contains one or more symbols.
These symbols may include nonterminals, which will be replaced using
replacement rules before generating the final string.
They may also be terminals, which are output symbols that never appear as the left side
of a rule. When we describe grammars, we use italics to represent nonterminal symbols,
and bold to represent terminal symbols.
The terminals are the primitives in the language; the grammar rules are its means
of combination.
We can generate a string in the language described by a replacement grammar by
starting from a designated start symbol (e.g., Sentence), and at each step selecting
a nonterminal in the working string, and replacing it with the right side of a replacement
rule whose left side matches the nonterminal.
When several rules have that nonterminal on the left side, we may use any one of them.
A string is generated once there are no nonterminals remaining.
Here is an example BNF grammar (that describes the same language as the RTN
in Figure 2.1):
1. Sentence ::⇒ Noun Verb
2. Noun ::⇒ Alice
3. Noun ::⇒ Bob
4. Verb ::⇒ jumps
5. Verb ::⇒ runs
Starting from Sentence, the grammar can generate four sentences: “Alice jumps”, “Alice runs”,
“Bob jumps”, and “Bob runs”.
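The generation process can be sketched in Python. This is only an illustration, not part of the original text: the dictionary representation and the names GRAMMAR and expand are assumptions made for this example. It treats any symbol that has no rule as a terminal and enumerates every string the five-rule grammar can produce.

```python
from itertools import product

# Each nonterminal maps to a list of alternative replacements; each
# replacement is a list of symbols. Any symbol that is not a key of the
# dictionary is treated as a terminal. (Representation assumed for this sketch.)
GRAMMAR = {
    "Sentence": [["Noun", "Verb"]],
    "Noun": [["Alice"], ["Bob"]],
    "Verb": [["jumps"], ["runs"]],
}

def expand(symbol, grammar):
    """Return every terminal string (as a list of words) derivable from symbol.

    This only terminates for grammars without recursive rules.
    """
    if symbol not in grammar:  # a terminal is its own expansion
        return [[symbol]]
    results = []
    for replacement in grammar[symbol]:
        # Expand each symbol of the replacement, then combine the choices.
        parts = [expand(s, grammar) for s in replacement]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results

sentences = [" ".join(words) for words in expand("Sentence", GRAMMAR)]
print(sentences)
# ['Alice jumps', 'Alice runs', 'Bob jumps', 'Bob runs']
```

Adding another alternative to the list for Noun or Verb changes the output without touching the expand function, which is the sense in which the rules, not the procedure, define the language.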
A derivation shows how a grammar generates a given string. Here is the derivation of
“Alice runs”:
Sentence ::⇒ Noun Verb     using Rule 1
         ::⇒ Alice Verb    replacing Noun using Rule 2
         ::⇒ Alice runs    replacing Verb using Rule 5
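A derivation like this one can be replayed mechanically. The following sketch is a hypothetical illustration (the names RULES, apply_rule, and derive are not from the text): it numbers the rules as in the example grammar and applies a chosen sequence of rule numbers to the working string, always replacing the leftmost matching nonterminal.

```python
# Rules numbered as in the example grammar: (left side, right side).
RULES = {
    1: ("Sentence", ["Noun", "Verb"]),
    2: ("Noun", ["Alice"]),
    3: ("Noun", ["Bob"]),
    4: ("Verb", ["jumps"]),
    5: ("Verb", ["runs"]),
}

def apply_rule(working, rule_number):
    """Replace the leftmost occurrence of the rule's left side in working."""
    left, right = RULES[rule_number]
    i = working.index(left)  # raises ValueError if the nonterminal is absent
    return working[:i] + right + working[i + 1:]

def derive(start, rule_numbers):
    """Return the list of working strings produced by applying the rules in order."""
    steps = [[start]]
    for n in rule_numbers:
        steps.append(apply_rule(steps[-1], n))
    return steps

steps = derive("Sentence", [1, 2, 5])
for step in steps:
    print(" ".join(step))
# Sentence
# Noun Verb
# Alice Verb
# Alice runs
```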
We can represent a grammar derivation as a tree, where the root of the tree is the
starting nonterminal (Sentence in this case), and the leaves of the tree are the terminals
that form the derived sentence.
Such a tree is known as a parse tree. Here is the parse tree for the derivation of “Alice runs”:
Sentence
├── Noun
│   └── Alice
└── Verb
    └── runs
BNF grammars can be more compact than just listing strings in the language since a grammar can
have many replacements for each nonterminal.
For example, adding the rule, Noun ::⇒ Colleen, to the grammar adds two new strings
(“Colleen runs” and “Colleen jumps”) to the language.
Recursive Grammars. The real power of BNF as a compact notation
for describing languages, though, comes once we start adding recursive rules to our grammar.
A grammar is recursive if the grammar contains a nonterminal that can produce a production that
contains itself.
Suppose we add the rule,
Sentence ::⇒ Sentence and Sentence
to our example grammar. Now, how many sentences can we generate?
Infinitely many! This grammar describes the same language as the RTN in Figure 2.2.
It can generate "Alice runs and Bob jumps" and "Alice runs and Bob jumps and Alice runs" and
sentences with any number of repetitions of "Alice runs".
This is very powerful: by using recursive rules a compact grammar can be used to
define a language containing infinitely many strings.
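To see the growth concretely, here is a rough sketch (an illustration under assumptions, not the book's method) that enumerates the strings of the recursive grammar while limiting how deeply replacements may nest. The depth limit and the names GRAMMAR and expand are introduced only for this example. Without the limit the enumeration would never finish, because the language is infinite.

```python
from itertools import product

# The example grammar plus the recursive rule for Sentence.
GRAMMAR = {
    "Sentence": [["Noun", "Verb"], ["Sentence", "and", "Sentence"]],
    "Noun": [["Alice"], ["Bob"]],
    "Verb": [["jumps"], ["runs"]],
}

def expand(symbol, grammar, depth):
    """Return the set of strings derivable from symbol using at most
    `depth` levels of nested replacements."""
    if symbol not in grammar:  # terminal
        return {symbol}
    if depth == 0:             # no replacements left to spend
        return set()
    results = set()
    for replacement in grammar[symbol]:
        parts = [expand(s, grammar, depth - 1) for s in replacement]
        for combo in product(*parts):
            results.add(" ".join(combo))
    return results

for depth in range(2, 5):
    print(depth, len(expand("Sentence", GRAMMAR, depth)))
# 2 4
# 3 20
# 4 340
```

Each extra level of nesting allows longer sentences, so the count keeps growing as the limit is raised.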
Example 2.1: Whole Numbers
This grammar defines the language of the whole numbers (0, 1, ...) with leading zeros
allowed:
Number ::⇒ Digit MoreDigits
MoreDigits ::⇒
MoreDigits ::⇒ Number
Digit ::⇒ 0
Digit ::⇒ 1
Digit ::⇒ 2
Digit ::⇒ 3
Digit ::⇒ 4
Digit ::⇒ 5
Digit ::⇒ 6
Digit ::⇒ 7
Digit ::⇒ 8
Digit ::⇒ 9
Here is the parse tree for a derivation of 37 from Number:
Number
├── Digit
│   └── 3
└── MoreDigits
    └── Number
        ├── Digit
        │   └── 7
        └── MoreDigits
            └── ϵ
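One way to get a feel for this grammar is to turn each nonterminal into a function that checks whether a string can be generated from it. The sketch below is an illustration only; the function names is_number and is_more_digits are assumptions, and the rules they follow are the three rules for Number and MoreDigits described in this example, together with the ten Digit rules.

```python
DIGITS = set("0123456789")  # the ten terminals produced by the Digit rules

def is_number(s):
    """Number: one Digit followed by MoreDigits."""
    return len(s) > 0 and s[0] in DIGITS and is_more_digits(s[1:])

def is_more_digits(s):
    """MoreDigits: either nothing at all, or a Number."""
    return s == "" or is_number(s)

print(is_number("37"))   # True
print(is_number("007"))  # True (leading zeros are allowed)
print(is_number(""))     # False (a Number needs at least one Digit)
print(is_number("3a"))   # False
```

The two functions call each other just as the two nonterminals refer to each other in the rules; the check for the empty string in is_more_digits is what lets the recursion end.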
Circular vs. Recursive Definitions. The second rule means we can replace MoreDigits
with nothing. This is sometimes written as ϵ to make it clear that the replacement
is empty: MoreDigits ::⇒ ϵ.
This is a very important rule in the grammar: without it no strings could be generated;
with it infinitely many strings can be generated.
The key is that we can only produce a string when all nonterminals in the string have
been replaced with terminals.
Without the MoreDigits ::⇒ ϵ rule, the only rule we would have with MoreDigits on the left
side is the third rule: MoreDigits ::⇒ Number.
The only rule we have with Number on the left side is the first rule, which
replaces Number with Digit MoreDigits.
Every time we follow this rule, we replace MoreDigits with Digit MoreDigits. We can produce as
many Digits as we want, but without the MoreDigits ::⇒ ϵ rule we can never stop.
This is the difference between a circular definition and a recursive definition. Without the
stopping rule, MoreDigits would be defined in a circular way.
There is no way to start with MoreDigits and generate a production that does not
contain MoreDigits (or a nonterminal that eventually must produce MoreDigits).
With the MoreDigits ::⇒ ϵ rule, however, we have a way to produce something
terminal from MoreDigits. This is known as a base case: a rule that turns an otherwise
circular definition into a meaningful, recursive definition.
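The difference can be checked mechanically. The following sketch is a hypothetical illustration (the function productive_nonterminals and the dictionary representation are assumptions made here): it repeatedly marks a nonterminal as able to finish once it has some replacement made up entirely of terminals and already-marked nonterminals. The empty replacement counts, since it contains nothing left to replace.

```python
def productive_nonterminals(grammar):
    """Return the nonterminals that can derive a string of terminals only."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for left, replacements in grammar.items():
            if left in productive:
                continue
            for replacement in replacements:
                # A replacement works if every symbol in it is a terminal
                # or a nonterminal already known to be productive.
                if all(s not in grammar or s in productive for s in replacement):
                    productive.add(left)
                    changed = True
                    break
    return productive

DIGIT_RULES = [[d] for d in "0123456789"]

with_base_case = {
    "Number": [["Digit", "MoreDigits"]],
    "MoreDigits": [[], ["Number"]],  # [] is the empty replacement
    "Digit": DIGIT_RULES,
}
without_base_case = {
    "Number": [["Digit", "MoreDigits"]],
    "MoreDigits": [["Number"]],
    "Digit": DIGIT_RULES,
}

print(sorted(productive_nonterminals(with_base_case)))
# ['Digit', 'MoreDigits', 'Number']
print(sorted(productive_nonterminals(without_base_case)))
# ['Digit']
```

Without the base case, only Digit can ever finish, so no string can be generated from Number or MoreDigits.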
Condensed Notation. It is common to have many grammar rules with the same left side
nonterminal.
For example, the whole numbers grammar has ten rules with Digit on the left side to produce the
ten terminal digits. Each of these is an alternative rule that can be used when the
production string contains the nonterminal Digit.
A compact notation for these types of rules is to use the vertical bar (|) to separate alternative
replacements. For example, we could write the ten Digit rules compactly as:
Digit ::⇒ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
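The condensed form is only shorthand, which a few lines of Python can make explicit. This is a minimal sketch under stated assumptions: the function name expand_condensed is made up for this example, rules are given as text with ::⇒ as the separator, and no terminal contains a space or a vertical bar.

```python
def expand_condensed(rule, arrow="::⇒"):
    """Split a condensed rule into one (left side, right side) pair per alternative."""
    left, _, right = rule.partition(arrow)
    left = left.strip()
    # Each alternative becomes its own rule; an empty alternative
    # yields an empty right side.
    return [(left, alternative.split()) for alternative in right.split("|")]

rules = expand_condensed("Digit ::⇒ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9")
print(len(rules))   # 10
print(rules[0])     # ('Digit', ['0'])
print(rules[-1])    # ('Digit', ['9'])
```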
