V.Anusuya,AP(SG)/CSE
1/21/2020 V.Anusuya,AP(SG)/CSE 1
LEX
• Lex is a tool for building a lexical analyzer: you specify regular
  expressions (REs) that describe the patterns for tokens.
• Input notation: the Lex language (the specification).
• Lex compiler: transforms the input patterns into a transition
  diagram and generates code in a file called lex.yy.c.
Lexical Analyzer Generator - Lex
Lex source program (lex.l) -> Lex compiler -> lex.yy.c
lex.yy.c -> C compiler -> a.out
Input stream -> a.out -> Sequence of tokens
Structure of Lex programs
A Lex program has the following form:
    declarations
    %%
    translation rules
    %%
    auxiliary functions
Each translation rule is written as Pattern { Action }.
• The declarations section includes declarations of variables,
  manifest constants (identifiers declared to stand for a constant,
  e.g., the name of a token), and regular definitions.
• The translation rules each have the form
      Pattern { Action }
  ◦ Pattern is a regular expression.
  ◦ Action is a fragment of code written in C.
• The third section holds whatever additional functions are used in
  the actions.
  ◦ Alternatively, these functions can be compiled separately and
    loaded with the lexical analyzer.
• The lexical analyzer returns a single value, the token name, to the
  parser, but uses the shared integer variable yylval to pass
  additional information about the lexeme found, if needed.
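The division of labour between the return value and yylval can be sketched in plain C. This is a hand-written illustration, not code generated by Lex: the names next_token, yylval_demo and the TOK_ constants are assumptions made up for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token names; a real parser would supply its own. */
enum token { TOK_EOF = 0, TOK_NUMBER = 258, TOK_OTHER = 259 };

/* Shared integer variable, playing the role of yylval. */
int yylval_demo;

/* Scan one token starting at *cursor and advance *cursor past it. */
enum token next_token(const char **cursor)
{
    const char *p = *cursor;

    while (*p == ' ' || *p == '\t' || *p == '\n')   /* skip whitespace */
        p++;

    if (*p == '\0') {
        *cursor = p;
        return TOK_EOF;
    }

    if (isdigit((unsigned char)*p)) {
        int value = 0;
        while (isdigit((unsigned char)*p)) {
            value = value * 10 + (*p - '0');
            p++;
        }
        yylval_demo = value;   /* attribute: the numeric value */
        *cursor = p;
        return TOK_NUMBER;     /* token name goes back to the caller */
    }

    yylval_demo = (unsigned char)*p;   /* attribute: the character code */
    *cursor = p + 1;
    return TOK_OTHER;
}
```

Called repeatedly on the string "42 +", this sketch returns TOK_NUMBER with yylval_demo set to 42, then TOK_OTHER with yylval_demo set to '+', then TOK_EOF. The caller learns what kind of token it got from the return value and reads any extra detail from the shared variable.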
Sample Lex Program:
digit   [0-9]
letter  [a-zA-Z]
%%
{letter}({letter}|{digit})*   printf("id: %s\n", yytext);
\n                            printf("new line\n");
%%
int main(void) {
    yylex();
    return 0;
}
• Lex Predefined Variables
  ◦ yytext -- a string containing the lexeme
  ◦ yyleng -- the length of the lexeme
  ◦ yyin -- the input stream pointer
  ◦ yyout -- the output stream pointer
• Lex Library Routines
  ◦ yylex() - the scanner routine itself; the default main() contains
    a call to yylex()
  ◦ yymore() - append the next matched text to the current contents
    of yytext instead of replacing it
  ◦ yyless(n) - retain the first n characters in yytext and return
    the rest to the input
  ◦ yywrap() - called whenever Lex reaches an end-of-file; by default
    yywrap() always returns 1
Program: LEX Specification
%{
#include <stdio.h>
#include <stdlib.h>
%}
digit   [0-9]
letter  [a-zA-Z]
%%
{letter}({letter}|{digit})*   printf("id: %s\n", yytext);
\n                            printf("new line\n");
%%
int main(int argc, char **argv)
{
    if (argc > 1)
    {
        FILE *file;
        file = fopen(argv[1], "r");
        if (!file)
        {
            printf("could not open %s\n", argv[1]);
            exit(1);
        }
        yyin = file;   /* scan the named file instead of stdin */
    }
    yylex();
    fclose(yyin);
    return 0;
}
int yywrap(void)
{
    return 1;   /* no further input files */
}
Input File
Sam.c
void main()
{
int a=5,b;
char c[]="GOOD";
b=a+10;
printf("%d",b);
}
Output (excerpt; characters that match no rule are copied to the output unchanged)
id: c
[]="id: GOOD
";new line
id: b
=id: a
+10;new line
id: printf
("%id: d
",id: b
);new line
}new line
new line
• Conflict Resolution in Lex
• Lex uses two rules to decide on the proper lexeme to select when
  several prefixes of the input match one or more patterns:
  1. Always prefer a longer prefix to a shorter prefix.
  2. If the longest possible prefix matches two or more patterns,
     prefer the pattern listed first in the Lex program.
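The two rules can be sketched in plain C. This is only an illustration under stated assumptions, not how the generated lex.yy.c works internally: it uses two hand-written matchers, a keyword rule for "if" listed first and an identifier rule listed second, and the names match_if, match_id and pick_rule are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Rule 0: the keyword "if". Returns the matched length, 0 if no match. */
static size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Rule 1: letter (letter | digit)*. Returns the matched length. */
static size_t match_id(const char *s)
{
    size_t n = 0;
    if (!isalpha((unsigned char)s[0]))
        return 0;
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns the index of the winning rule, or -1 if no rule matches. */
int pick_rule(const char *input)
{
    size_t len[2];
    int best = -1;
    size_t best_len = 0;

    len[0] = match_if(input);   /* listed first in the "program" */
    len[1] = match_id(input);   /* listed second */

    for (int i = 0; i < 2; i++) {
        /* Strict '>' prefers the longer prefix (rule 1) and keeps
           the earlier rule when the lengths tie (rule 2). */
        if (len[i] > best_len) {
            best_len = len[i];
            best = i;
        }
    }
    return best;
}
```

For the input "if(x)" both rules match two characters, so the keyword rule wins because it is listed first. For "iffy = 1" the identifier rule matches four characters against the keyword's two, so the longer prefix wins.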
The Lookahead Operator
• Lex automatically reads one character ahead of the last character
  that forms the selected lexeme, and then retracts the input so only
  the lexeme itself is consumed from the input.
• However, sometimes we want a certain pattern to be matched to the
  input only when it is followed by certain other characters. If so,
  we may use the slash in a pattern to indicate the end of the part
  of the pattern that matches the lexeme.
• What follows / is an additional pattern that must be matched before
  we can decide that the token in question was seen, but what matches
  this second pattern is not part of the lexeme.
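As a rough illustration, consider a Lex pattern such as IF/\( which should match the lexeme IF only when an opening parenthesis follows. The plain C sketch below mimics that behaviour by hand; the function name match_if_before_paren is an assumption made up for this example and is not part of Lex.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hand-written equivalent of the pattern  IF/\(  .
 * Returns the lexeme length (2) when the input starts with "IF" and the
 * next character is '(', and 0 otherwise. The '(' is checked but never
 * counted, so it stays in the input for whatever rule comes next.
 */
size_t match_if_before_paren(const char *input)
{
    if (strncmp(input, "IF", 2) != 0)
        return 0;          /* the lexeme part does not match */
    if (input[2] != '(')
        return 0;          /* the lookahead part does not match */
    return 2;              /* only "IF" is consumed */
}
```

With the input "IF(A)" the function reports a lexeme of length 2, leaving "(A)" unread. With "IFX" or a bare "IF" it reports no match, because the required trailing context is missing.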
%{
/* program to recognize a C program */
#include <stdio.h>
#include <stdlib.h>
int COMMENT = 0;   /* 1 while the scanner is inside a comment */
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.*   { printf("\n%s is a PREPROCESSOR DIRECTIVE", yytext); }
int|float|char|double|while|for|do|if|break|continue|void|switch|case|long|struct|const|typedef|return|else|goto   { printf("\n\t%s is a KEYWORD", yytext); }
"/*"   { COMMENT = 1; }
"*/"   { COMMENT = 0; }
{identifier}\(   { if (!COMMENT) printf("\n\nFUNCTION\n\t%s", yytext); }
"{"   { if (!COMMENT) printf("\n BLOCK BEGINS"); }
"}"   { if (!COMMENT) printf("\n BLOCK ENDS"); }
{identifier}(\[[0-9]*\])?   { if (!COMMENT) printf("\n %s IDENTIFIER", yytext); }
"\\a"|"\\n"|"\\b"|"\\t"   printf("\n%s\tis Escape sequences", yytext);
"\"%d\""|"%s"|"%c"|"%f"|"%e"   printf("\n%s\tis a format specifier", yytext);
\".*\"   { if (!COMMENT) printf("\n\t%s is a STRING", yytext); }
[0-9]+   { if (!COMMENT) printf("\n\t%s is a NUMBER", yytext); }
"="   { if (!COMMENT) printf("\n\t%s is an ASSIGNMENT OPERATOR", yytext); }
"+"|"-"|"*"|"/"|"%"   { printf("\n %s is an arithmetic operator", yytext); }
"<="|">="|"<"|"=="|">"   { if (!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR", yytext); }
%%
int main(int argc, char **argv)
{
    if (argc > 1)
    {
        FILE *file;
        file = fopen(argv[1], "r");
        if (!file)
        {
            printf("could not open %s\n", argv[1]);
            exit(1);
        }
        yyin = file;   /* scan the named file instead of stdin */
    }
    yylex();
    fclose(yyin);
    return 0;
}
int yywrap(void)
{
    return 1;   /* no further input files */
}
Lexical analyzer generator lex
