Token, Pattern and Lexeme
Token
A token is a named category for a valid sequence of characters in the source program; the characters themselves form the lexeme. In a programming language,
• keywords,
• constants,
• identifiers,
• numbers,
• operators and
• punctuation symbols
are possible tokens to be identified.
Lexemes
A lexeme is a sequence of characters in the source program that matches the pattern for a token and is
identified by the lexical analyzer as an instance of that token.
Pattern
A pattern describes the rule that a sequence of characters (a lexeme) must match to form a token. It can
be defined by regular expressions or grammar rules. In the case of a keyword as a token, the pattern is just
the sequence of characters that form the keyword.
Example: c=a+b*5;
Lexemes and tokens
Lexeme    Token
c         identifier
=         assignment symbol
a         identifier
+         addition symbol
b         identifier
*         multiplication symbol
5         number
;         punctuation symbol
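The mapping above can be sketched as a small pattern-driven scanner. This is only an illustration: the token names, the regular expressions, and the helper `tokenize` are assumptions chosen to mirror the table, not part of any particular compiler.

```python
import re

# Hypothetical (token name, pattern) pairs; each pattern is a regular expression.
TOKEN_PATTERNS = [
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("number", r"[0-9]+"),
    ("assignment symbol", r"="),
    ("addition symbol", r"\+"),
    ("multiplication symbol", r"\*"),
    ("punctuation symbol", r";"),
]

# One alternation with a named group per pattern (g0, g1, ...).
MASTER = re.compile(
    "|".join(f"(?P<g{i}>{pattern})" for i, (_, pattern) in enumerate(TOKEN_PATTERNS))
)


def tokenize(source):
    """Return a list of (lexeme, token) pairs; whitespace is skipped."""
    pairs = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(source, pos)
        if match is None:
            # No pattern matches here: this is a lexical error.
            raise ValueError(f"no pattern matches {source[pos]!r} at position {pos}")
        token = TOKEN_PATTERNS[int(match.lastgroup[1:])][0]
        pairs.append((match.group(), token))
        pos = match.end()
    return pairs


if __name__ == "__main__":
    for lexeme, token in tokenize("c=a+b*5;"):
        print(f"{lexeme:<4}{token}")
```

Each lexeme is the exact text matched, while the token is only the name of the pattern that matched it, which is why `c`, `a` and `b` all come out as the same token.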
Attributes of Tokens
The lexical analyzer collects information about tokens into their associated attributes. As a practical matter,
a token usually has only a single attribute: a pointer to the symbol-table entry in which the information
about the token is kept. The pointer becomes the attribute for the token.
Let num be the token representing an integer. When a sequence of digits appears in the input stream, the
lexical analyzer will pass num to the parser. The value of the integer will be passed along as an attribute of
the token num. Logically, the lexical analyzer passes both the token and the attribute to the parser.
If we write a token and its attribute as a tuple enclosed between < >, the input 33 + 89 - 60 is transformed into
the sequence of tuples <num, 33> <+, > <num, 89> <-, > <num, 60>
The tokens + and - have no attribute. The second components of the tuples, the attributes, play no role during
parsing, but are needed during translation.
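A minimal sketch of how a scanner might produce these tuples is shown below. The function name `scan` and the use of `None` for an empty attribute are assumptions made for illustration.

```python
def scan(text):
    """Turn an arithmetic string into (token, attribute) tuples.

    Integers become ("num", value); the operators + and - carry no
    attribute, represented here by None.
    """
    tuples = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif "0" <= ch <= "9":
            j = i
            while j < len(text) and "0" <= text[j] <= "9":
                j += 1
            # The token is num; the integer value is its attribute.
            tuples.append(("num", int(text[i:j])))
            i = j
        elif ch in "+-":
            tuples.append((ch, None))
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tuples


if __name__ == "__main__":
    print(scan("33 + 89 - 60"))
```

A parser checking the shape num op num op num would look only at the first component of each tuple; the integer values matter later, when the expression is translated or evaluated.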
The token names and associated attribute values for the Fortran statement E = M * C ** 2
are written below as a sequence of pairs.
<id, pointer to symbol-table entry for E>
<assign-op>
<id, pointer to symbol-table entry for M>
<mult-op>
<id, pointer to symbol-table entry for C>
<exp-op>
<number, integer value 2>
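The sequence above can be reproduced with a short sketch, assuming the statement is `E = M * C ** 2` (which is what the token sequence implies). The symbol table is modeled as a plain list and a "pointer" as an index into it. Both choices, and the name `scan_with_symbol_table`, are illustrative assumptions.

```python
import re

# Hypothetical operator token names, following the pairs listed above.
OPERATORS = {"=": "assign-op", "*": "mult-op", "**": "exp-op"}

# "**" is listed before "*" so the longer operator is matched first.
LEXEME = re.compile(r"[A-Za-z][A-Za-z0-9]*|[0-9]+|\*\*|[=*]")


def scan_with_symbol_table(source):
    """Return (tokens, symbol_table) for a simple assignment statement.

    Identifiers get the index of their symbol-table entry as attribute,
    numbers get their integer value, and operators get None.
    Note: re.findall silently skips characters that match no pattern.
    """
    symbol_table = []  # one entry per distinct identifier
    index_of = {}      # identifier name -> position in symbol_table
    tokens = []
    for lexeme in LEXEME.findall(source):
        if lexeme in OPERATORS:
            tokens.append((OPERATORS[lexeme], None))
        elif lexeme[0].isdigit():
            tokens.append(("number", int(lexeme)))
        else:
            if lexeme not in index_of:
                index_of[lexeme] = len(symbol_table)
                symbol_table.append(lexeme)
            tokens.append(("id", index_of[lexeme]))
    return tokens, symbol_table


if __name__ == "__main__":
    tokens, table = scan_with_symbol_table("E = M * C ** 2")
    print(tokens)
    print(table)
```

If an identifier occurs twice, both occurrences receive the same index, so everything later recorded about that name lives in a single entry.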
Lexical Errors
It is hard for a lexical analyzer to tell, without the aid of other components, that there is a source-code error.
For instance, if the string fi is encountered for the first time in a C program in the context:
a lexical analyzer cannot tell whether fi is a misspelling of the keyword if or an undeclared function identifier.
Since fi is a valid lexeme for the token id, the lexical analyzer must return the token id to the parser.
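• A character sequence that cannot be scanned into any valid token is a lexical error.
• Lexical errors are uncommon, but they still must be handled by a scanner.
• Misspellings of identifiers, keywords, or operators are often classed as lexical errors, although, as the fi example shows, the scanner can detect them only when the misspelled sequence matches no token pattern.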
Usually, a lexical error is caused by the appearance of some illegal character, mostly at the beginning of a
token.
Error Recovery Strategies
The simplest recovery strategy is "panic mode" recovery. We delete successive characters from the
remaining input, until the lexical analyzer can find a well-formed token at the beginning of what input is left.
This recovery technique may confuse the parser, but in an interactive computing environment, it may be
quite adequate.
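Panic-mode recovery can be sketched as follows. The token pattern and the name `scan_with_panic_mode` are assumptions for illustration; a real scanner would also record the position of each skipped character for its error message.

```python
import re

# Hypothetical set of valid tokens: identifiers, integers, a few symbols.
VALID_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[0-9]+|[=+\-*;]")


def scan_with_panic_mode(source):
    """Return (lexemes, skipped) using panic-mode recovery.

    When no token pattern matches at the current position, one character
    is deleted from the remaining input and scanning is retried, until a
    well-formed token starts at the front of what is left.
    """
    lexemes = []
    skipped = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = VALID_TOKEN.match(source, pos)
        if match:
            lexemes.append(match.group())
            pos = match.end()
        else:
            # Panic mode: drop the offending character and try again.
            skipped.append(source[pos])
            pos += 1
    return lexemes, skipped


if __name__ == "__main__":
    print(scan_with_panic_mode("c = a @# + b"))
```

The parser receives the same tokens as it would for `c = a + b`, which shows both why the strategy is simple and why it can hide the real cause of an error.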
The following are the error-recovery actions in lexical analysis:
1. Deleting an extraneous character.
2. Inserting a missing character.
3. Replacing an incorrect character by a correct character.
4. Transposing two adjacent characters.
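The four actions can be illustrated by generating every string reachable from a lexeme with one such edit (the last being a swap of two adjacent characters) and checking which results are keywords. This is a sketch under assumptions: the keyword set, the lowercase alphabet, and the function names are hypothetical, and a scanner that returns fi as an identifier would not normally attempt this repair.

```python
import string

# Hypothetical keyword set used only for this illustration.
KEYWORDS = {"if", "else", "while", "for", "return"}


def single_edit_repairs(lexeme, alphabet=string.ascii_lowercase):
    """Return all strings obtained from lexeme by exactly one repair action."""
    splits = [(lexeme[:i], lexeme[i:]) for i in range(len(lexeme) + 1)]
    # 1. Delete an extraneous character.
    deletions = {a + b[1:] for a, b in splits if b}
    # 2. Insert a missing character.
    insertions = {a + c + b for a, b in splits for c in alphabet}
    # 3. Replace an incorrect character by another character.
    replacements = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    # 4. Transpose two adjacent characters.
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    return deletions | insertions | replacements | transpositions


def suggest_keywords(lexeme):
    """Return the keywords that are one repair action away from lexeme."""
    return sorted(single_edit_repairs(lexeme) & KEYWORDS)


if __name__ == "__main__":
    for bad in ["fi", "whle", "retturn", "elze"]:
        print(bad, suggest_keywords(bad))
```

Each of the four sample inputs exercises a different action: `fi` needs a transposition, `whle` an insertion, `retturn` a deletion, and `elze` a replacement.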
