PUNE VIDYARTHI GRIHA’s
COLLEGE OF ENGINEERING, NASHIK.
“INTRODUCTION OF COMPILER AND LEXICAL ANALYSIS”
PREPARED BY:
PROF. ANAND N. GHARU
ASSISTANT PROFESSOR
COMPUTER
CONTENTS
• COMPILER
• INTERPRETER
• ANALYSIS SYNTHESIS MODEL
• LANGUAGE PROCESSING SYSTEM
• COMPILER PROCESS IN BRIEF
3/17/2019 PROF. ANAND GHARU
COMPILERS
• “Compilation”
– Translation of a program written in a source language into a semantically equivalent program written in a target language
[Diagram: Source Program → Compiler → Target Program; the compiler also emits Error messages. The Target Program then turns Input into Output.]
INTERPRETERS
• “Interpretation”
– Performing the operations implied by the source program
[Diagram: Source Program and Input → Interpreter → Output; the interpreter also emits Error messages.]
ANALYSIS–SYNTHESIS MODEL
• There are two parts to compilation:
– Analysis determines the operations implied by the source program, which are recorded in a tree structure
– Synthesis takes the tree structure and translates the operations therein into the target program
ANALYSIS
Breaks up the source program into constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program.
If the analysis part detects that the source program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so the user can take corrective action.
The analysis part also collects information about the source program and stores it in a data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part.
SYNTHESIS
• The synthesis part constructs the desired target program from the intermediate representation and the information in the symbol table
• ANALYSIS: front end of compiler
• SYNTHESIS: back end of compiler
COMPILER'S ROLE
• An essential function of a compiler: record the variable names used in the source program and collect information about various attributes of each name.
• These attributes may provide information about the storage allocated for a name, its type and its scope and, for procedure names, the number and types of its arguments, the method of passing each argument and the type returned.
ISSUES IN COMPILATION
• The hierarchy of operations needs to be maintained to determine the correct order of expression evaluation
• Maintain data type integrity with automatic type conversions
• Handle user-defined data types
• Develop appropriate storage mappings
ISSUES IN COMPILATION
• Resolve the occurrence of each variable name in a program, i.e. construct separate symbol tables for different namespaces
• Handle different control structures
• Perform optimization
ISSUES IN COMPILATION
                        Single Pass                 Multi Pass
Speed                   better                      worse
Memory                  better for large programs   (potentially) better for small programs
Modularity              worse                       better
Flexibility             worse                       better
“Global” optimization   impossible                  possible
Source Language         single pass compilers are not possible for many programming languages
COMPILER PASSES
• A pass is a complete traversal of the source program, or a complete traversal of some internal representation of the source program.
• A pass can correspond to a “phase” but it does not have to!
• Sometimes a single “pass” corresponds to several phases that are interleaved in time.
• What and how many passes a compiler does over the source program is an important design decision.
SINGLE PASS COMPILER
A single pass compiler makes a single pass over the source text, parsing, analyzing and generating code all at once.
Dependency diagram of a typical Single Pass Compiler:
Compiler Driver
    calls Syntactic Analyzer
        calls Contextual Analyzer
        calls Code Generator
MULTI PASS COMPILER
A multi pass compiler makes several passes over the program. The output of a preceding phase is stored in a data structure and used by subsequent phases.
Dependency diagram of a typical Multi Pass Compiler:
Compiler Driver
    calls Syntactic Analyzer     (input: Source Text, output: AST)
    calls Contextual Analyzer    (input: AST, output: Decorated AST)
    calls Code Generator         (input: Decorated AST, output: Object Code)
SYMBOL TABLE MEANS
Symbol tables are data structures that are used by compilers to hold information about source-program constructs.
A symbol table is a necessary component because
• Declaration of identifiers appears once in a program
• Use of identifiers may appear in many places of the program text
INFORMATION PROVIDED BY SYMBOL TABLE
• Given an identifier, which name is it?
• What information is to be associated with a name?
• How do we access this information?
SYMBOL TABLE NAMES
A NAME can be a:
• Variable and labels
• Parameter
• Constant
• Record
• Record field
• Procedure
• Array and files
WHO CREATES SYMBOL TABLE?
• Identifiers and attributes are entered by the analysis phases when processing a definition (declaration) of an identifier
• In simple languages with only global variables and implicit declarations:
– The scanner can enter an identifier into a symbol table if it is not already there
• In block-structured languages with scopes and explicit declarations:
– The parser and/or semantic analyzer enter identifiers and corresponding attributes
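The "enter an identifier if it is not already there" step can be sketched as a small chained hash table. This is only an illustration: the names SymTab, Symbol, lookup and install, the bucket count and the single type attribute are assumptions made for this sketch, not something the slides prescribe.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64   /* assumption: small fixed-size hash table */

/* Hypothetical entry: a name plus one illustrative attribute. */
typedef struct Symbol {
    char *name;
    int type;                /* illustrative attribute, 0 = not yet known */
    struct Symbol *next;     /* chain of entries in the same bucket */
} Symbol;

typedef struct {
    Symbol *bucket[NBUCKETS];
} SymTab;

static unsigned hash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Return the entry for name, or NULL if it has not been entered. */
Symbol *lookup(SymTab *t, const char *name) {
    for (Symbol *p = t->bucket[hash(name)]; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0) return p;
    return NULL;
}

/* Enter name only if it is not already there; return its entry. */
Symbol *install(SymTab *t, const char *name) {
    Symbol *p = lookup(t, name);
    if (p != NULL) return p;
    p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->name = malloc(strlen(name) + 1);
    if (p->name == NULL) { free(p); return NULL; }
    strcpy(p->name, name);
    p->type = 0;
    unsigned h = hash(name);
    p->next = t->bucket[h];      /* push onto the front of the bucket chain */
    t->bucket[h] = p;
    return p;
}
```

A scanner for a simple language could call install on every identifier lexeme; a block-structured language would typically need one such table per scope, which this sketch does not attempt.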
USE OF SYMBOL TABLE
• Symbol table information is used by the analysis and synthesis phases
• To verify that used identifiers have been defined (declared)
• To verify that expressions and assignments are semantically correct – type checking
• To generate intermediate or target code
MEMORY MANAGEMENT
What has a compiler to do with memory management?
• compiler uses heap-allocated data structures
• modern languages have automatic data (de)allocation
• garbage collection is part of the runtime support system
• compiler usually assists in identifying pointers
GARBAGE COLLECTION
• Some systems require user to call free when finished with memory
– C / C++
– reason for destructors in C++
• Other systems detect unused memory and reclaim it
– Garbage Collection
– this is what Java does
GARBAGE COLLECTION
• Basic idea
– keep track of what memory is referenced and, when it is no longer accessible, reclaim the memory
• Example
– linked list
EXAMPLE
[Diagram, before: head → Obj1 → Obj2 → Obj3 ← tail, linked through their next fields]
• Assume programmer does the following
– obj1.next = obj2.next;
[Diagram, after: head → Obj1 → Obj3 ← tail; Obj2 still points to Obj3, but nothing points to Obj2]
EXAMPLE
• Now there is no way for programmer to reference obj2
– it’s garbage
• In system without garbage collection this is called a memory leak
– location can’t be used but can’t be reallocated
– waste of memory and can eventually crash a program
• In system with garbage collection this chunk will be reclaimed
MARK AND SWEEP
• Basic idea
– go through all memory and mark every chunk that is referenced
– make a second pass through memory and remove all chunks not marked
[Diagram: memory holding the OS followed by chunks 0, 1, 2 and 3; address marks 0, 100, 350, 450, 600; pointer labels p2 = 650 and p2 = 360]
• Mark chunks 0, 1, and 3 as marked
• Place chunk 2 on the free list (turn it into a hole)
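The two passes can be sketched over a tiny fixed heap. This is a simplified illustration under stated assumptions: the Chunk layout, the in_use flag standing in for the free list, the limit of two references per chunk and the explicit root set passed to collect are all hypothetical choices made for the sketch.

```c
#include <stddef.h>

#define NCHUNKS 4
#define NREFS   2

/* Hypothetical chunk header: each chunk may reference up to NREFS chunks. */
typedef struct Chunk {
    int in_use;                 /* 0 means the chunk is a hole on the free list */
    int marked;
    struct Chunk *ref[NREFS];   /* outgoing references, NULL if unused */
} Chunk;

/* Mark phase: mark everything reachable from c. */
static void mark(Chunk *c) {
    if (c == NULL || c->marked) return;
    c->marked = 1;
    for (int i = 0; i < NREFS; i++) mark(c->ref[i]);
}

/* Sweep phase: turn every unmarked chunk into a hole and clear the
   marks for the next cycle. Returns the number of chunks reclaimed. */
static int sweep(Chunk heap[], int n) {
    int reclaimed = 0;
    for (int i = 0; i < n; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;
            for (int j = 0; j < NREFS; j++) heap[i].ref[j] = NULL;
            reclaimed++;
        }
        heap[i].marked = 0;
    }
    return reclaimed;
}

/* One full collection: mark from the roots, then sweep the whole heap. */
int collect(Chunk heap[], int n, Chunk *roots[], int nroots) {
    for (int i = 0; i < nroots; i++) mark(roots[i]);
    return sweep(heap, n);
}
```

With chunks 0, 1 and 3 reachable from the roots and chunk 2 unreferenced, one call to collect reclaims exactly chunk 2, matching the situation on the slide.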
MARK AND SWEEP ISSUES
• Have to be able to identify all references
– this is difficult in some languages
– similar to compaction
• Requires jumping all over memory
– terrible for performance
    • cache hits
    • virtual memory
• Have to stop everything else to do
• Search time proportional to non-garbage
– may require lots of work for little
REFERENCE COUNTING
• Basic idea
– give each chunk a special field that is the number of references to chunk
– whenever a new reference is made, increment field by 1
– whenever a reference is removed, decrement field by 1
– when reference count goes to zero, collect chunk
• Requires compiler support
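The counting rules above can be sketched in C, where the bookkeeping a compiler would insert is written out by hand. The names Object, assign and release, and the live_objects counter used to observe the effect, are assumptions of this sketch rather than anything the slides define.

```c
#include <stdlib.h>

/* Hypothetical object header with the count field described above. */
typedef struct Object {
    int count;
    struct Object *next;   /* a single outgoing reference, may be NULL */
} Object;

int live_objects = 0;      /* bookkeeping for the demonstration only */

Object *new_object(void) {
    Object *o = calloc(1, sizeof *o);   /* count starts at 0, next at NULL */
    if (o != NULL) live_objects++;
    return o;
}

void release(Object *o);

/* Make *slot refer to target, adjusting both counts. */
void assign(Object **slot, Object *target) {
    if (target != NULL) target->count++;   /* increment first: safe for self-assignment */
    release(*slot);
    *slot = target;
}

/* Drop one reference; collect the chunk when the count reaches zero. */
void release(Object *o) {
    if (o == NULL) return;
    if (--o->count == 0) {
        release(o->next);   /* decrement whatever this chunk references */
        free(o);
        live_objects--;
    }
}
```

Every pointer store goes through assign, so the NULL check and the decrement of referenced chunks happen in one place. The sketch does not handle cyclic structures, which plain reference counting cannot reclaim.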
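REFERENCE COUNTING
• Example
– everything in italics is added by compiler (in this transcript: the count updates and the collect check)
Object p = new Object;
p.count++;
Object q = new Object;
q.count++;
p.count--;
if(p.count == 0)
    collect p
p = q;
p.count++;
[Diagram: the object first referenced by p goes from count 1 to 0; the object referenced by q goes from count 1 to 2]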
REFERENCE COUNTING
• Above example does not check for NULL reference
Object p = new Object;
p.count++;
p.count--;
p = NULL;
if(p != NULL)
    p.count++;
REFERENCE COUNTING ISSUES
• What about pointers inside a 0-referenced page?
[Diagram: a chunk with count 0 pointing to a chunk with count 1]
– both of these are garbage
– before reclaiming a chunk, must go through all references in the chunk
    • decrement the chunk they reference by 1
TOOLS USING ANALYSIS–SYNTHESIS MODEL
• Editors (syntax highlighting)
• Pretty printers (e.g. Doxygen)
• Static checkers (e.g. Lint and Splint)
• Interpreters
TOOLS USING ANALYSIS–SYNTHESIS MODEL
• Text formatters (e.g. TeX and LaTeX)
• Silicon compilers (e.g. VHDL)
• Query interpreters/compilers (Databases)
PREPROCESSORS, COMPILERS, ASSEMBLERS, AND LINKERS
Skeletal Source Program
    → Preprocessor
    → Compiler → Target Assembly Program
    → Assembler → Relocatable Object Code
    → Linker (together with Libraries and Relocatable Object Files)
    → Absolute Machine Code
Try for example:
gcc myprog.c
PHASES OF COMPILER
THE PHASES OF A COMPILER
Phase: Programmer (source code producer)
Output: Source string
Sample: A=B+C;

Phase: Scanner (performs lexical analysis)
Output: Token string and symbol table with names
Sample: ‘A’, ‘=’, ‘B’, ‘+’, ‘C’, ‘;’

Phase: Parser (performs syntax analysis based on the grammar of the programming language)
Output: Parse tree or abstract syntax tree
Sample:
    ;
    |
    =
   / \
  A   +
     / \
    B   C

Phase: Semantic analyzer (type checking, etc.)
Output: Annotated parse tree or abstract syntax tree

Phase: Intermediate code generator
Output: Three-address code, quads, or RTL
Sample:
    int2fp B t1
    + t1 C t2
    := t2 A

Phase: Optimizer
Output: Three-address code, quads, or RTL
Sample:
    int2fp B t1
    + t1 #2.3 A

Phase: Code generator
Output: Assembly code
Sample:
    MOVF #2.3,r1
    ADDF2 r1,r2
    MOVF r2,A
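The step from syntax tree to three-address code can be sketched as a post-order walk that hands out temporaries. This is a minimal illustration, assuming a hypothetical Node type and the "op arg1 arg2 result" quad layout used in the sample column; it leaves out the int2fp conversion, which would need type information from the semantic analyzer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical AST node: a leaf holds a name, an inner node holds an operator. */
typedef struct Node {
    char op;                          /* '\0' for a leaf, otherwise '+', '*', ... */
    const char *name;                 /* leaf only */
    const struct Node *left, *right;  /* inner node only */
} Node;

typedef struct {
    char text[256];   /* emitted quads, one per line */
    int ntemps;       /* number of temporaries handed out so far */
} Emitter;

/* Emit code for n and write the name that holds its value into place. */
void gen_expr(Emitter *e, const Node *n, char *place, size_t size) {
    if (n->op == '\0') {
        snprintf(place, size, "%s", n->name);
        return;
    }
    char l[16], r[16];
    gen_expr(e, n->left, l, sizeof l);     /* operands first (post-order) */
    gen_expr(e, n->right, r, sizeof r);
    snprintf(place, size, "t%d", ++e->ntemps);
    size_t used = strlen(e->text);
    snprintf(e->text + used, sizeof e->text - used,
             "%c %s %s %s\n", n->op, l, r, place);
}

/* Emit code for the assignment target := rhs. */
void gen_assign(Emitter *e, const char *target, const Node *rhs) {
    char place[16];
    gen_expr(e, rhs, place, sizeof place);
    size_t used = strlen(e->text);
    snprintf(e->text + used, sizeof e->text - used,
             ":= %s %s\n", place, target);
}
```

For the tree of A=B+C this produces one quad for the addition into a fresh temporary and one quad copying that temporary into A.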
THE GROUPING OF PHASES
Compiler front and back ends
• Front end: analysis (machine independent)
• Back end: synthesis (machine dependent)
Compiler passes
• A collection of phases is done only once (single pass) or multiple times (multi pass)
• Single pass: usually requires everything to be defined before being used in source program
• Multi pass: compiler may have to keep entire program representation in memory
COMPILER CONSTRUCTION TOOLS
Software development tools are available to implement one or more compiler phases
• Scanner generators
• Parser generators
• Syntax-directed translation engines
• Automatic code generators
• Data-flow engines
BLOCK SCHEMATIC OF LEXICAL ANALYZER
[Diagram: source program → lexical analyzer → parser. The parser sends "get next token" to the lexical analyzer, which returns a token; both the lexical analyzer and the parser access the symbol table.]
LEXICAL ANALYZER PERSPECTIVE
LEXICAL ANALYZER
• Scan Input
• Remove WS, NL, …
• Identify Tokens
• Create Symbol Table
• Insert Tokens into ST
• Generate Errors
• Send Tokens to Parser
PARSER
• Perform Syntax Analysis
• Actions Dictated by Token Order
• Update Symbol Table Entries
• Create Abstract Rep. of Source
• Generate Errors
SEPARATION OF LEXICAL ANALYSIS FROM SYNTAX ANALYSIS
• Separation of Lexical Analysis From Parsing Presents a Simpler Conceptual Model
– From a Software Engineering Perspective Division Emphasizes
    • High Cohesion and Low Coupling
    • Implies Well Specified → Parallel Implementation
• Separation Increases Compiler Efficiency (I/O Techniques to Enhance Lexical Analysis)
• Separation Promotes Portability.
– This is critical today, when platforms (OSs and Hardware) are numerous and varied!
BASIC TERMINOLOGIES OF LEXICAL ANALYSIS
• Major Terms for Lexical Analysis?
– TOKEN
    • A classification for a common set of strings
    • Examples Include <Identifier>, <number>, etc.
– PATTERN
    • The rules which characterize the set of strings for a token
    • Recall File and OS Wildcards ([A-Z]*.*)
– LEXEME
    • Actual sequence of characters that matches pattern and is classified by a token
    • Identifiers: x, count, name, etc…
INTRODUCING BASIC TERMINOLOGY
Token      Sample Lexemes           Informal Description of Pattern
const      const                    const
if         if                       if
relation   <, <=, =, <>, >, >=      < or <= or = or <> or >= or >
id         pi, count, D2            letter followed by letters and digits
num        3.1416, 0, 6.02E23       any numeric constant
literal    “core dumped”            any characters between “ and “ except “
The token classifies the pattern.
Actual values are critical. Info is:
1. Stored in symbol table
2. Returned to parser
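The table can be read as a function from a lexeme to its token. The sketch below hand-codes each informal pattern; the enum names, the TOK_ERROR case and the exact shape accepted for "any numeric constant" (digits, optional fraction, optional E exponent) are assumptions chosen to cover the sample lexemes, not a definition taken from the slides.

```c
#include <ctype.h>
#include <string.h>

typedef enum { TOK_CONST, TOK_IF, TOK_RELATION, TOK_ID,
               TOK_NUM, TOK_LITERAL, TOK_ERROR } Token;

/* digits [ . digits ] [ E [+|-] digits ]: one reading of "any numeric constant" */
static int is_num(const char *s) {
    if (!isdigit((unsigned char)*s)) return 0;
    while (isdigit((unsigned char)*s)) s++;
    if (*s == '.') {
        s++;
        if (!isdigit((unsigned char)*s)) return 0;
        while (isdigit((unsigned char)*s)) s++;
    }
    if (*s == 'E') {
        s++;
        if (*s == '+' || *s == '-') s++;
        if (!isdigit((unsigned char)*s)) return 0;
        while (isdigit((unsigned char)*s)) s++;
    }
    return *s == '\0';
}

/* letter followed by letters and digits */
static int is_id(const char *s) {
    if (!isalpha((unsigned char)*s)) return 0;
    for (s++; *s; s++)
        if (!isalnum((unsigned char)*s)) return 0;
    return 1;
}

/* Classify one complete lexeme according to the table. */
Token classify(const char *lexeme) {
    static const char *relops[] = { "<", "<=", "=", "<>", ">", ">=" };
    if (strcmp(lexeme, "const") == 0) return TOK_CONST;   /* keywords before id */
    if (strcmp(lexeme, "if") == 0) return TOK_IF;
    for (size_t i = 0; i < sizeof relops / sizeof relops[0]; i++)
        if (strcmp(lexeme, relops[i]) == 0) return TOK_RELATION;
    if (is_id(lexeme)) return TOK_ID;
    if (is_num(lexeme)) return TOK_NUM;
    size_t n = strlen(lexeme);
    if (n >= 2 && lexeme[0] == '"' && lexeme[n - 1] == '"' &&
        memchr(lexeme + 1, '"', n - 2) == NULL) return TOK_LITERAL;
    return TOK_ERROR;
}
```

Checking the keywords before the id pattern matters, because "if" and "const" also fit "letter followed by letters and digits". A real scanner would return the lexeme's value along with the token, as the slide notes.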
I/O - KEY FOR SUCCESSFUL LEXICAL ANALYSIS
• Character-at-a-time I/O
• Block / Buffered I/O
• Block/Buffered I/O
– Utilize Block of memory
– Stage data from source to buffer block at a time
– Maintain two blocks - Why (Recall OS)?
    • Asynchronous I/O - for 1 block
    • While Lexical Analysis on 2nd block
[Diagram: Block 1 | Block 2 with a pointer ptr moving across them. When Block 1 is done, issue I/O for it while still processing the token in the 2nd block.]
ALGORITHM: BUFFERED I/O WITH SENTINELS
[Diagram: a buffer in two halves, each half closed by an eof sentinel, holding "E = M * eof | C * * 2 eof ... eof". "lexeme beginning" marks the start of the current token; "forward" scans ahead to find a pattern match.]
forward := forward + 1;
if forward is at eof then begin
    if forward at end of first half then begin
        reload second half;                      (Block I/O)
        forward := forward + 1
    end
    else if forward at end of second half then begin
        reload first half;                       (Block I/O)
        move forward to beginning of first half
    end
    else /* eof within buffer signifying end of input */
        terminate lexical analysis               (2nd eof → no more input!)
end
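The sentinel scheme can be sketched in C over an in-memory source string. This is an illustration under stated assumptions: the Scanner type, the tiny HALF size, the use of '\0' as the sentinel (so the source must not contain '\0') and reading from a string instead of a file are all choices made for the sketch, and the lexeme-beginning pointer is left out.

```c
#include <stddef.h>
#include <string.h>

#define HALF 4            /* tiny on purpose so that reloads happen often */
#define SENTINEL '\0'     /* assumption: the source text never contains '\0' */

/* Hypothetical two-half buffer: each half is followed by one sentinel slot. */
typedef struct {
    char buf[2 * (HALF + 1)];
    const char *src;      /* remaining, unread source text */
    size_t forward;       /* index of the next character to examine */
} Scanner;

/* Copy up to HALF characters into the half starting at start, then put a
   sentinel right after the characters that were actually read. */
static void reload(Scanner *s, size_t start) {
    size_t n = strlen(s->src);
    if (n > HALF) n = HALF;
    memcpy(s->buf + start, s->src, n);
    s->buf[start + n] = SENTINEL;
    s->src += n;
}

void scanner_init(Scanner *s, const char *text) {
    s->src = text;
    s->forward = 0;
    reload(s, 0);
}

/* Return the next character, or -1 at end of input. In the common case
   only one comparison (against the sentinel) is made per character. */
int next_char(Scanner *s) {
    for (;;) {
        char c = s->buf[s->forward];
        if (c != SENTINEL) { s->forward++; return (unsigned char)c; }
        if (s->forward == HALF) {                 /* end of first half */
            reload(s, HALF + 1);
            s->forward = HALF + 1;
        } else if (s->forward == 2 * HALF + 1) {  /* end of second half */
            reload(s, 0);
            s->forward = 0;
        } else {
            return -1;            /* sentinel inside a half: end of input */
        }
    }
}
```

The buffer-boundary tests run only when a sentinel is seen, which is the point of the technique: ordinary characters cost a single test each.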
HANDLING LEXICAL ERRORS
• Error Handling is very localized, with Respect to Input Source
• For example: whil ( x := 0 ) do generates no lexical errors in PASCAL
• In what Situations do Errors Occur?
– Prefix of remaining input doesn’t match any defined token
• Possible error recovery actions:
– Deleting or Inserting Input Characters
– Replacing or Transposing Characters
• Or, skip over to next separator to “ignore” problem
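The last recovery action, skipping to the next separator, can be sketched in a few lines. Which characters count as separators is language specific; the set used here (whitespace, ';', ',', '(' and ')') and the function names are assumptions made for illustration.

```c
#include <ctype.h>
#include <string.h>

/* Assumption: whitespace, ';', ',', '(' and ')' act as separators;
   the end of the string also stops the skip. */
int is_separator(char c) {
    return c == '\0' || isspace((unsigned char)c) || strchr(";,()", c) != NULL;
}

/* Skip the offending characters and return a pointer to the next
   separator (or to the terminating '\0'), where scanning can resume. */
const char *skip_to_separator(const char *p) {
    while (!is_separator(*p)) p++;
    return p;
}
```

A scanner would call skip_to_separator when no pattern matches a prefix of the remaining input, report one error for the skipped run, and continue from the returned position.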
Lex is a tool that takes a set of descriptions of possible tokens and produces a C routine.
The set of descriptions is called the lex specification.
The token descriptions are known as regular expressions.
AUTOMATIC CONSTRUCTION OF LEXICAL ANALYZER: LEX
• Lex is a tool for creating lexical analyzers.
• Lexical analyzers tokenize input streams.
• Tokens are the terminals of a language.
• Regular expressions define tokens.
AUTOMATIC CONSTRUCTION OF LEXICAL ANALYZER: LEX
lex source program lex.l → Lex compiler → lex.yy.c
lex.yy.c → C compiler → a.out
input stream → a.out → sequence of tokens
LEX SPECIFICATION
Lex Program Structure:
declarations
%%
translation rules
%%
auxiliary procedures
Name the file e.g. test.lex
Then, “lex test.lex” produces the file “lex.yy.c” (a C-program)
LEX SPECIFICATION
C declarations:
%{
/* definitions of all constants
   LT, LE, EQ, NE, GT, GE, IF, THEN, ELSE, ... */
%}
Declarations:
......
letter    [A-Za-z]
digit     [0-9]
id        {letter}({letter}|{digit})*
......
Rules:
%%
if        { return(IF); }
then      { return(THEN); }
{id}      { yylval = install_id(); return(ID); }
......
Auxiliary procedures:
%%
install_id()
{ /* procedure to install the lexeme to the ST */
}
AUTOMATIC CONSTRUCTION OF LEXICAL ANALYZER: LEX
• To run lex on a source file, use the command: lex source.l
• This produces the file lex.yy.c which is the C source for the lexical analyzer.
• To compile this, use: cc -o prog -O lex.yy.c -ll
EXAMPLE OF LEX SPECIFICATION
%{
#include <stdio.h>
%}
%%
        /* translation rules */
[0-9]+  { printf("%s\n", yytext); /* yytext contains the matching lexeme */ }
.|\n    { }
%%
int main(void)
{ yylex();   /* invokes the lexical analyzer */
  return 0;
}

lex spec.l
gcc lex.yy.c -ll
./a.out < spec.l
EXAMPLE OF LEX SPECIFICATION
%{
#include <stdio.h>
int ch = 0, wd = 0, nl = 0;
%}
/* regular definition */
delim     [ \t]+
%%
          /* translation rules */
\n        { ch++; wd++; nl++; }
^{delim}  { ch+=yyleng; }
{delim}   { ch+=yyleng; wd++; }
.         { ch++; }
%%
int main(void)
{ yylex();
  printf("%8d%8d%8d\n", nl, wd, ch);
  return 0;
}
EXAMPLE OF LEX SPECIFICATION
%{
#include <stdio.h>
%}
/* regular definitions */
digit     [0-9]
letter    [A-Za-z]
id        {letter}({letter}|{digit})*
%%
          /* translation rules */
{digit}+  { printf("number: %s\n", yytext); }
{id}      { printf("ident: %s\n", yytext); }
.         { printf("other: %s\n", yytext); }
REGULAR EXPRESSIONS
• [xyz] match one character x, y, or z (use \ to escape -)
• [^xyz] match any character except x, y, and z
• [a-z] match one of a to z
• r* closure (match zero or more occurrences)
• r+ positive closure (match one or more occurrences)
• r? optional (match zero or one occurrence)
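These operators can be tried out from C with the POSIX regex.h interface. This is a sketch under an explicit assumption: POSIX extended regular expressions are a close relative of Lex patterns, not the same dialect, and the helper name matches is hypothetical. The ^ and $ anchors force the whole string to match, which Lex rules do not need.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if text matches pattern, 0 if it does not, and -1 if the
   pattern fails to compile. Callers anchor the pattern with ^ and $. */
int matches(const char *pattern, const char *text) {
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) return -1;
    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For instance, matches("^[0-9]+$", "123") succeeds while matches("^[0-9]+$", "") does not, which is the difference between positive closure and plain closure.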
EXAMPLE OF LEX PROGRAM
        int num_lines = 0, num_chars = 0;
%%
\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }
%%
int main( int argc, char **argv )
{
    ++argv, --argc;   /* skip over program name */
    if ( argc > 0 )
        yyin = fopen( argv[0], "r" );
    else
        yyin = stdin;
    yylex();
    printf( "# of lines = %d, # of chars = %d\n", num_lines, num_chars );
    return 0;
}
EXAMPLE OF LEX PROGRAM
%{
#include <stdio.h>
%}
WS      [ \t\n]*
%%
[0123456789]+           printf("NUMBER\n");
[a-zA-Z][a-zA-Z0-9]*    printf("WORD\n");
{WS}                    /* do nothing */
.                       printf("UNKNOWN\n");
%%
int main( int argc, char **argv )
{
    ++argv, --argc;   /* skip over program name */
    if ( argc > 0 )
        yyin = fopen( argv[0], "r" );
    else
        yyin = stdin;
    yylex();
    return 0;
}
My Blog : anandgharu.wordpress.com
•THANK YOU!!!!!!!!!!

More Related Content

PDF
Compiler design Introduction
PDF
Assignment1
PPTX
1 cc
PPT
Compiler Construction
PPTX
System software module 4 presentation file
PPTX
Plc part 2
PPT
Symbol Table, Error Handler & Code Generation
PDF
Chapter#01 cc
Compiler design Introduction
Assignment1
1 cc
Compiler Construction
System software module 4 presentation file
Plc part 2
Symbol Table, Error Handler & Code Generation
Chapter#01 cc

Similar to Compiler detail with examples and other things (20)

DOCX
System programmin practical file
PPTX
role of lexical anaysis
PPTX
CC week 1.pptx
PDF
Introduction compiler design pdf to read
PPTX
Cs419 Compiler lec1&2 introduction
PPTX
Compiler Design Unit1 PPT Phases of Compiler.pptx
PDF
PDF
design intoduction of_COMPILER_DESIGN.pdf
PDF
3_1_COMPILER_DESIGNGARGREREGREGREGREGREGRGRERE
PDF
COMPILER DESIGN Engineering learinin.pdf
PDF
Phases of compiler
PPTX
PPTX
ppt_cd.pptx ppt on phases of compiler of jntuk syllabus
PDF
Principles of Compiler Design
PPTX
Phases of Compiler
PPTX
Introduction to Compilers
PPTX
Chapter 1.pptx
PPTX
Unit-1compiler design and its lecture note .pptx
PDF
Compiler gate question key
DOC
Chapter 1 1
System programmin practical file
role of lexical anaysis
CC week 1.pptx
Introduction compiler design pdf to read
Cs419 Compiler lec1&2 introduction
Compiler Design Unit1 PPT Phases of Compiler.pptx
design intoduction of_COMPILER_DESIGN.pdf
3_1_COMPILER_DESIGNGARGREREGREGREGREGREGRGRERE
COMPILER DESIGN Engineering learinin.pdf
Phases of compiler
ppt_cd.pptx ppt on phases of compiler of jntuk syllabus
Principles of Compiler Design
Phases of Compiler
Introduction to Compilers
Chapter 1.pptx
Unit-1compiler design and its lecture note .pptx
Compiler gate question key
Chapter 1 1
Ad

More from AssadLeo1 (20)

PPT
Chagal chagal with khatch khatch model with detail
PPT
E commerce busin and some important issues
PPTX
What is SEO in pakistan with main components
PPT
business model and some other things that
PPTX
Software Evolution all in Mehmoona.pptx
PPTX
Behavioral Model with Maniha Butt and many More
PPTX
Software Quality Assurance Qurat ul ain.pptx
PPTX
UML Samra Bs it 4th all about aspire college
PPTX
Process Structure and some other important
PPT
Process importance with full detail about
PPTX
IPM Chapter 1 Complete detail and chapeter
PPTX
Hardware Firewall with all the detail of
PPTX
Law and Order in PK in a country is most important
PPTX
Types of Multipule things and other things
PPTX
Model_of_Heterogeneous_System and other things
PPTX
what a knowledge and other things in this slide
PPTX
full with knowledge and other things with
PPT
that is the most important part of this topic
PPT
Discrete and other examples with great intrest
PPTX
Decoding Insights and some extra examples
Chagal chagal with khatch khatch model with detail
E commerce busin and some important issues
What is SEO in pakistan with main components
business model and some other things that
Software Evolution all in Mehmoona.pptx
Behavioral Model with Maniha Butt and many More
Software Quality Assurance Qurat ul ain.pptx
UML Samra Bs it 4th all about aspire college
Process Structure and some other important
Process importance with full detail about
IPM Chapter 1 Complete detail and chapeter
Hardware Firewall with all the detail of
Law and Order in PK in a country is most important
Types of Multipule things and other things
Model_of_Heterogeneous_System and other things
what a knowledge and other things in this slide
full with knowledge and other things with
that is the most important part of this topic
Discrete and other examples with great intrest
Decoding Insights and some extra examples
Ad

Recently uploaded (20)

PPTX
Lecture (1)-Introduction.pptx business communication
PDF
Tata consultancy services case study shri Sharda college, basrur
PDF
Cours de Système d'information about ERP.pdf
PDF
How to Get Funding for Your Trucking Business
PDF
Deliverable file - Regulatory guideline analysis.pdf
PPTX
Principles of Marketing, Industrial, Consumers,
PDF
kom-180-proposal-for-a-directive-amending-directive-2014-45-eu-and-directive-...
PDF
Solara Labs: Empowering Health through Innovative Nutraceutical Solutions
PDF
pdfcoffee.com-opt-b1plus-sb-answers.pdfvi
PPTX
Board-Reporting-Package-by-Umbrex-5-23-23.pptx
PPTX
ICG2025_ICG 6th steering committee 30-8-24.pptx
PPTX
job Avenue by vinith.pptxvnbvnvnvbnvbnbmnbmbh
PPTX
CkgxkgxydkydyldylydlydyldlyddolydyoyyU2.pptx
DOCX
unit 1 COST ACCOUNTING AND COST SHEET
PDF
Power and position in leadershipDOC-20250808-WA0011..pdf
PDF
Chapter 5_Foreign Exchange Market in .pdf
PDF
Daniels 2024 Inclusive, Sustainable Development
PDF
Katrina Stoneking: Shaking Up the Alcohol Beverage Industry
PDF
Roadmap Map-digital Banking feature MB,IB,AB
PPTX
Probability Distribution, binomial distribution, poisson distribution
Lecture (1)-Introduction.pptx business communication
Tata consultancy services case study shri Sharda college, basrur
Cours de Système d'information about ERP.pdf
How to Get Funding for Your Trucking Business
Deliverable file - Regulatory guideline analysis.pdf
Principles of Marketing, Industrial, Consumers,
kom-180-proposal-for-a-directive-amending-directive-2014-45-eu-and-directive-...
Solara Labs: Empowering Health through Innovative Nutraceutical Solutions
pdfcoffee.com-opt-b1plus-sb-answers.pdfvi
Board-Reporting-Package-by-Umbrex-5-23-23.pptx
ICG2025_ICG 6th steering committee 30-8-24.pptx
job Avenue by vinith.pptxvnbvnvnvbnvbnbmnbmbh
CkgxkgxydkydyldylydlydyldlyddolydyoyyU2.pptx
unit 1 COST ACCOUNTING AND COST SHEET
Power and position in leadershipDOC-20250808-WA0011..pdf
Chapter 5_Foreign Exchange Market in .pdf
Daniels 2024 Inclusive, Sustainable Development
Katrina Stoneking: Shaking Up the Alcohol Beverage Industry
Roadmap Map-digital Banking feature MB,IB,AB
Probability Distribution, binomial distribution, poisson distribution

Compiler detail with examples and other things

  • 1. PUNE VIDYARTHI GRIHA’s COLLEGE OF ENGINEERING, NASHIK. • “INTRODUCTION OF COMPILER AND L1EXICAL ANALYSIS ” PREPARED BY : PROF. ANAND N. GHARU ASSISTANT PROFESSOR COMPUTER
  • 2. CONTENTS • COMPILER • INTERPRETER • ANALYSIS SYNTHESIS MODEL • LANGUAGE PROCESSINGSYSTEM • COMPILERPROCESSINBRIEF 2 3/17/2019 PROF. ANAND GHARU
  • 3. COMPILER S • “Compilation” – Translation of a program written in a source language into a semantical y equivalent program written in atarget language Input Compil er Error messages Source Program Target Progra m Outp ut 3 3/17/2019 PROF. ANAND GHARU
  • 4. Inte r • “ pretation” – Performing the operations implied by the source program INTERPRTER S Interpret er Source Progra m Inpu t Outp ut Error messages 4 3/17/2019 PROF. ANAND GHARU
  • 5. ANALYSIS– SYNTHESIS MODEL • There are two parts to compilation: Analysis determines the operations implied by the source program which are recorded in a tree structure Synthesis takes the tree structure and translates the operations therein into the target program 5 3/17/2019 PROF. ANAND GHARU
  • 6. ANALYSIS Breaks up the source program into constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program. If the analysis part detects that thesource program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so the user can take corrective action. The analysis part also collects information about the source program and stores it in a data structure called a symboltable, which is passed along with the intermediate representation to the synthesis part. 6 3/17/2019 PROF. ANAND GHARU
  • 7. • Front end of compiler •• BBaacckk endend ofof c c o o m m p p i i l l e e r r • The synthesis part constructs the desired target programfrom the intermediate representation and the information in the symbol table • Front end of compiler SYNTHESI S ANALYSIS SYNTHESIS 7 3/17/2019 PROF. ANAND GHARU
  • 8. Record the variable names used in thesource program and collect information aboutvarious attributes of eachname. • An ess ential function of a compiler– COMPILERS ROLE • These attributes may provide information about the storage allocated for a name , its type and its scope , procedure names ,number and types of its arguments, the method of passing each argument and the type returned 8 3/17/2019 PROF. ANAND GHARU
  • 9. ISSUES IN COMPILATION Hierarchy of operations need to be maintained to determine correct order of expression evaluation Maintain data type integrity with automatic type conversions Handle user defineddata types. Develop appropriate storage mappings 9 3/17/2019 PROF. ANAND GHARU
  • 10. ISSUES IN COMPILATION Resolve occurrence of each variable name ina program i.e construct separate symbol tables for different namespaces. Handle different controlstructures. Perform optimization 10 3/17/2019 PROF. ANAND GHARU
  • 11. Single Pass Multi Pass Spee d Memor y better worse better for large programs (potentially) better for small programs ISSUES IN COMPILATION 1 1 Modularit y Flexibilit y “Global” optimization Source Language worse better better worse impossible possible single pass compilers are notpossible for many programming languages 3/17/2019 PROF. ANAND GHARU
  • 12. 1 2 A passis COMPILER PASSES a complete traversal of the source program, or a complete traversal of some internal representation of the source program. A pass can correspond to a “phase” but it does not have to! Sometimes a single “pass” corresponds toseveral phases that are interleaved in time. What and how many passes a compiler does over the source program is an important designdecision. 3/17/2019 PROF. ANAND GHARU
  • 13. SINGLE PASS COMPILER Asingle pass compiler makes a single pass over the source text, parsing, analyzing and generating code all at once. Dependency diagram of a typical Single Pass Compiler: Compiler Driver calls SyntacticAnalyzer calls Contextual Analyzer Code Generator calls 13 3/17/2019 PROF. ANAND GHARU
  • 14. Amulti passc ompiler makes several passes over the program. The output of a preceding phase is stored in a data structureand used by subsequent phases. Dependency diagram of a typical Multi Pass Compiler: Compiler Driver MULTI PASS COMPILER calls Contextual Analyzer Code Generator calls calls input SyntacticAnalyzer input output output Source Text AST Decorated AST input output Object Code 14 3/17/2019 PROF. ANAND GHARU
  • 15. SYMBOL TABLE MEANS Symbol tables are data structures that are used by compilers to hold information about source-program constructs. Asymbol table is a necessary component because  Declaration of identifiers appears once in aprogram  Use of identifiers may appear in many places of the program text 15 3/17/2019 PROF. ANAND GHARU
  • 16. I N FORMATION PROVIDED BY SYMBOL TABLE  Given an Identifier which name isit?  What information is to be associated with a name?  How do we access thisinformation? 16 3/17/2019 PROF. ANAND GHARU
  • 17. Variable and labels Parameter Constant SYMBOL TABLE NAMES NAME Recor d RecordField Procedure Array and files 17 3/17/2019 PROF. ANAND GHARU
  • 18.  Identifi WHO CREATES SYMBOL TABLE ? ers and attributes are entered by the analysisphases when processing a definition (declaration) of an identifier  In simple languages with only global variables andimplicit declarations: The scanner can enter an identifier into a symboltable if it is not alreadythere  In block-structured languages with scopes and explicit declarations:  The parser and/or semantic analyzer enter identifiers and corresponding attributes 18 3/17/2019 PROF. ANAND GHARU
  • 19. • Symb ol table information is used by the analysis and used identifiers have been defined synthesis phases • Toverify that (declared) USE OF SYMBOL TABLE • To verify that expressions and assignments are semantically correct – type checking • To generate intermediate or target code 19 3/17/2019 PROF. ANAND GHARU
  • 20. MEMORY MANAGEMENT What has a compiler to do withmemory management? • compiler uses heap-allocated data structures • modern languages have automatic data (de)allocation • garbage collection part of runtimesupport system • compiler usually assists in identifyingpointers 3/17/2019 PROF. ANAND GHARU
  • 21. GARBAGE COLLECTION • Some systems require user to call free when finished with memory – C/ C++ – reason for destructors in C++ • Other systems detect unused memory and reclaim it – Garbage Collection – this is what Javadoes 3/17/2019 PROF. ANAND GHARU
  • 22. GARBAGE COLLECTION • Basic idea – keep track of what memory is referencedand when it is no longer accessible, reclaim the memory • Example – linked list 3/17/2019 PROF. ANAND GHARU
  • 23. nex t Obj 1 hea d tai l nex t nex t Obj2 Obj3 EXAMPL E • Assume programmer does the following – obj1.next =obj2.next; nex t Obj 1 hea d tai l nex t Obj 2 nex t Obj 3 3/17/2019 PROF. ANAND GHARU
  • 24. • No EXAMPL E w there is no way forprogrammer to reference obj2 – it’s garbage • In system without garbage collection this is cal ed a memory leak – location can’t be used but can’t bereallocated – waste of memory and can eventually crash a program • In system with garbage collection this chunk will be 3/17/2019 PROF. ANAND GHARU
  • 25. • Basi MARK AND SWEEP 0 100 350 450 600 •Mark chunks 0, 1, and 3 as marked •Place chunk 2 on the free list (turn it into a hole) c idea – go through all memory and mark every chunk that is referenced – make a second pass through memory and remove all chunks not marked OS 0 1 2 3 p2 = 650 p2 = 360 3/17/2019 PROF. ANAND GHARU
  • 26. • Hav MARK AND SWEEP ISSUES e to be able toidentify all references – this is difficult in somelanguages – similar to compaction • Requires jumping all over memory – terrible for performance • cache hits • virtual memory • Have to stop everything else to do • Search time proportional to non- garbage – may require lots of work for little 3/17/2019 PROF. ANAND GHARU
  • 27. • REFERENCE COUNTING Basic idea – give each chunk a special field that is the number of references to chunk – whenever a new reference is made, increment field by 1 – whenever a reference is removed, decrement field by 1 – when reference count goes to zero,collect chunk • Requires compiler support 3/17/2019 PROF. ANAND GHARU
  • 28. REFERENCE COUNTING • Example – the count updates and the zero check are added by the compiler
      Object p = new Object;
      p.count++;
      Object q = new Object;
      q.count++;
      p.count--;
      if(p.count == 0) collect p
      p = q;
      p.count++;
    • [Figure: the object first referenced by p goes from count 1 to 0; the object referenced by q goes from count 1 to 2]
  • 29. REFERENCE COUNTING • Above example does not check for NULL reference
      Object p = new Object;
      p.count++;
      p.count--;
      p = NULL;
      if(p != NULL) p.count++;
  • 30. REFERENCE COUNTING ISSUES • What about pointers inside a chunk whose reference count is 0? • [Figure: a chunk with count 0 pointing to a chunk with count 1] – both of these are garbage – before reclaiming a chunk, must go through all references in the chunk • decrement the count of each chunk they reference by 1
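The counting rules from the last few slides, including the NULL check and the cascade into references held inside a reclaimed chunk, can be sketched in C. The names struct obj, obj_new, retain, release, assign, and live_objects are hypothetical, and each object holds at most one inner reference to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counted object with at most one outgoing reference. */
struct obj {
    int count;          /* number of references to this object */
    struct obj *ref;    /* reference held inside the object, or NULL */
};

int live_objects = 0;   /* bookkeeping for demonstration only */

struct obj *obj_new(void)
{
    struct obj *o = calloc(1, sizeof *o);
    if (o != NULL)
        live_objects++;
    return o;
}

/* New reference made: increment, with the NULL check from the slide. */
struct obj *retain(struct obj *o)
{
    if (o != NULL)
        o->count++;
    return o;
}

/* Reference removed: decrement. At zero, drop the reference held
 * inside the chunk as well, then reclaim the chunk. */
void release(struct obj *o)
{
    while (o != NULL) {
        assert(o->count > 0);
        if (--o->count > 0)
            return;
        struct obj *inner = o->ref;
        free(o);
        live_objects--;
        o = inner;      /* loop instead of recursing on long chains */
    }
}

/* p = q with the bookkeeping a compiler would insert. */
void assign(struct obj **p, struct obj *q)
{
    retain(q);          /* retain first so that *p == q stays safe */
    release(*p);
    *p = q;
}
```

Dropping the last reference to an object that itself refers to another object frees both, which is the cascade the slide asks for. Objects that refer to each other in a cycle never reach a count of zero, so this scheme alone does not reclaim them.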
  • 31. TOOLS USING ANALYSIS – SYNTHESIS MODEL • Editors (syntax highlighting) • Pretty printers (e.g. Doxygen) • Static checkers (e.g. Lint and Splint) • Interpreters
  • 32. TOOLS USING ANALYSIS – SYNTHESIS MODEL • Text formatters (e.g. TeX and LaTeX) • Silicon compilers (e.g. VHDL) • Query interpreters/compilers (Databases)
  • 33. PREPROCESSORS, COMPILERS, ASSEMBLERS, AND LINKERS • Skeletal Source Program → Preprocessor → Compiler → Target Assembly Program → Assembler → Relocatable Object Code → Linker (together with Libraries and Relocatable Object Files) → Absolute Machine Code • Try for example: gcc myprog.c
  • 35. THE PHASES OF A COMPILER
      Phase | Output | Sample
      Programmer (source code producer) | Source string | A=B+C;
      Scanner (performs lexical analysis) | Token string and symbol table with names | 'A', '=', 'B', '+', 'C', ';'
      Parser (performs syntax analysis based on the grammar of the programming language) | Parse tree or abstract syntax tree | ; at the root with child =, whose children are A and +; + has children B and C
      Semantic analyzer (type checking, etc.) | Annotated parse tree or abstract syntax tree |
      Intermediate code generator | Three-address code, quads, or RTL | int2fp B t1; + t1 C t2; := t2 A
      Optimizer | Three-address code, quads, or RTL | int2fp B t1; + t1 #2.3 A
      Code generator | Assembly code | MOVF #2.3,r1; ADDF2 r1,r2; MOVF r2,A
  • 36. THE GROUPING OF PHASES • Compiler front and back ends – Front end: analysis (machine independent) – Back end: synthesis (machine dependent) • Compiler passes – A collection of phases is done only once (single pass) or multiple times (multi pass) – Single pass: usually requires everything to be defined before being used in the source program – Multi pass: compiler may have to keep the entire program representation in memory
  • 37. COMPILER CONSTRUCTION TOOLS • Software development tools are available to implement one or more compiler phases • Scanner generators • Parser generators • Syntax-directed translation engines • Automatic code generators • Data-flow engines
  • 39. LEXICAL ANALYZER PERSPECTIVE • LEXICAL ANALYZER – Scan input – Remove WS, NL, … – Identify Tokens – Create Symbol Table – Insert Tokens into ST – Generate Errors – Send Tokens to Parser • PARSER – Perform Syntax Analysis – Actions Dictated by Token Order – Update Symbol Table Entries – Create Abstract Rep. of Source – Generate Errors
  • 40. SEPARATION OF LEXICAL ANALYSIS FROM SYNTAX ANALYSIS • Separation of Lexical Analysis From Parsing Presents a Simpler Conceptual Model – From a Software Engineering Perspective, the Division Emphasizes • High Cohesion and Low Coupling • Implies Well Specified ⇒ Parallel Implementation • Separation Increases Compiler Efficiency (I/O Techniques to Enhance Lexical Analysis) • Separation Promotes Portability – This is critical today, when platforms (OSs and Hardware) are numerous and varied!
  • 41. BASIC TERMINOLOGIES OF LEXICAL ANALYSIS • Major Terms for Lexical Analysis • TOKEN – A classification for a common set of strings – Examples include <Identifier>, <number>, etc. • PATTERN – The rules which characterize the set of strings for a token – Recall File and OS Wildcards ([A-Z]*.*) • LEXEME – Actual sequence of characters that matches a pattern and is classified by a token – Identifiers: x, count, name, etc.
  • 42. INTRODUCING BASIC TERMINOLOGY
      Token | Sample Lexemes | Informal Description of Pattern
      const | const | const
      if | if | if
      relation | <, <=, =, <>, >, >= | < or <= or = or <> or >= or >
      id | pi, count, D2 | letter followed by letters and digits
      num | 3.1416, 0, 6.02E23 | any numeric constant
      literal | "core dumped" | any characters between " and " except "
    • The token classifies the pattern • Actual values are critical. Info is: 1. Stored in symbol table 2. Returned to parser
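The relationship between token, pattern, and lexeme can be made concrete with a small hand-written scanner in C. This is a sketch under stated assumptions: the enum token values and the function next_token are hypothetical names, TOK_ERROR and TOK_EOF are additions that the table does not contain, and the num pattern is simplified to digits with an optional fraction (exponents such as 6.02E23 are not handled).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Token classes from the table; TOK_ERROR and TOK_EOF are additions. */
enum token { TOK_CONST, TOK_IF, TOK_RELATION, TOK_ID, TOK_NUM,
             TOK_LITERAL, TOK_ERROR, TOK_EOF };

/* Scan one token starting at *src. On return, lexeme holds the matched
 * characters and *src points just past them. */
enum token next_token(const char **src, char *lexeme, size_t cap)
{
    const char *p = *src;
    while (isspace((unsigned char)*p))
        p++;                                    /* skip white space */
    const char *start = p;
    enum token t;

    if (*p == '\0') {
        t = TOK_EOF;
    } else if (isalpha((unsigned char)*p)) {    /* letter (letter|digit)* */
        while (isalnum((unsigned char)*p))
            p++;
        t = TOK_ID;
    } else if (isdigit((unsigned char)*p)) {    /* digits, optional fraction */
        while (isdigit((unsigned char)*p))
            p++;
        if (*p == '.' && isdigit((unsigned char)p[1])) {
            p++;
            while (isdigit((unsigned char)*p))
                p++;
        }
        t = TOK_NUM;
    } else if (*p == '<' || *p == '>' || *p == '=') {
        char first = *p++;
        if ((first == '<' && (*p == '=' || *p == '>')) ||
            (first == '>' && *p == '='))
            p++;                                /* <=, <>, >= */
        t = TOK_RELATION;
    } else if (*p == '"') {                     /* characters between quotes */
        p++;
        while (*p != '\0' && *p != '"')
            p++;
        if (*p == '"') {
            p++;
            t = TOK_LITERAL;
        } else {
            t = TOK_ERROR;                      /* unterminated literal */
        }
    } else {
        p++;
        t = TOK_ERROR;                          /* no pattern matches */
    }

    size_t len = (size_t)(p - start);
    assert(cap > 0);
    if (len >= cap)
        len = cap - 1;                          /* truncate long lexemes */
    memcpy(lexeme, start, len);
    lexeme[len] = '\0';

    /* Keywords match the id pattern, so they are separated afterwards. */
    if (t == TOK_ID && strcmp(lexeme, "if") == 0)
        t = TOK_IF;
    if (t == TOK_ID && strcmp(lexeme, "const") == 0)
        t = TOK_CONST;

    *src = p;
    return t;
}
```

Each call returns the token class to the caller (the parser's view) and copies the lexeme (the actual value) into the caller's buffer, which is the information that would go into the symbol table.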
  • 43. I/O - KEY FOR SUCCESSFUL LEXICAL ANALYSIS • Character-at-a-time I/O • Block / Buffered I/O – Utilize a block of memory – Stage data from source to buffer a block at a time – Maintain two blocks - Why (Recall OS)? • Asynchronous I/O for 1 block • While lexical analysis runs on the 2nd block • [Figure: Block 1 and Block 2 with a pointer ptr; when done with a block, issue I/O for it while still processing tokens in the 2nd block]
  • 44. Algorithm: Buffered I/O with Sentinels • [Figure: a buffer of two halves, each ending in an eof sentinel, holding E = M * and C * * 2; lexeme beginning marks the start of the current token and forward scans ahead to find a pattern match]
      forward := forward + 1;
      if forward is at eof then begin
          if forward at end of first half then begin
              reload second half;            /* Block I/O */
              forward := forward + 1
          end
          else if forward at end of second half then begin
              reload first half;             /* Block I/O */
              move forward to beginning of first half
          end
          else /* eof within buffer signifying end of input */
              terminate lexical analysis     /* 2nd eof ⇒ no more input! */
      end
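The sentinel scheme can be sketched in C as follows. Several assumptions are made for illustration: the input comes from an in-memory string rather than a file, the NUL character plays the role of the eof sentinel (so the source text must not contain NUL bytes), the half size is kept tiny so reloads are easy to observe, and the lexeme beginning pointer is omitted. The names struct scanner, scanner_init, reload, and next_char are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HALF 4                 /* tiny half size so reloads happen often */
#define SENTINEL '\0'          /* assumes the source text has no NUL bytes */

struct scanner {
    const char *src;           /* remaining input (stands in for a file) */
    char buf[2 * HALF + 2];    /* [half 1][sentinel][half 2][sentinel] */
    char *forward;
};

/* Block I/O: copy up to HALF characters into one half, then a sentinel. */
static void reload(struct scanner *s, char *half)
{
    size_t n = strlen(s->src);
    if (n > HALF)
        n = HALF;
    memcpy(half, s->src, n);
    half[n] = SENTINEL;        /* lands early when the input runs out */
    s->src += n;
}

void scanner_init(struct scanner *s, const char *text)
{
    assert(s != NULL && text != NULL);
    s->src = text;
    reload(s, s->buf);
    s->forward = s->buf;
}

/* Return the next character, or -1 at the end of input. The common
 * path costs one comparison; the half boundaries are examined only
 * after a sentinel has been seen. */
int next_char(struct scanner *s)
{
    for (;;) {
        char c = *s->forward;
        if (c != SENTINEL) {
            s->forward++;
            return (unsigned char)c;
        }
        if (s->forward == s->buf + HALF) {
            /* end of first half: reload second half and continue there */
            reload(s, s->buf + HALF + 1);
            s->forward = s->buf + HALF + 1;
        } else if (s->forward == s->buf + 2 * HALF + 1) {
            /* end of second half: reload first half and wrap around */
            reload(s, s->buf);
            s->forward = s->buf;
        } else {
            return -1;         /* sentinel inside a half: no more input */
        }
    }
}
```

The design point is that the check for the end of a half is folded into the check for the sentinel, so ordinary characters need only one test each.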
  • 45. HANDLING LEXICAL ERRORS • Error handling is very localized with respect to the input source • For example: whil ( x := 0 ) do generates no lexical errors in PASCAL • In what situations do errors occur? – A prefix of the remaining input doesn't match any defined token • Possible error recovery actions: – Deleting or inserting input characters – Replacing or transposing characters • Or, skip over to the next separator to "ignore" the problem
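The last recovery action, skipping to the next separator, can be sketched in C. The separator set and the names is_separator and recover are assumptions made for illustration; a real language defines its own separators.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical separator set: end of input, white space, and a few
 * punctuation characters. */
int is_separator(char c)
{
    return c == '\0' || isspace((unsigned char)c) ||
           strchr(";,()", c) != NULL;
}

/* Called when no token pattern matches a prefix of the remaining
 * input at p. Skips to the next separator and returns a pointer to
 * it; *skipped receives the number of characters discarded so the
 * caller can report them in an error message. */
const char *recover(const char *p, size_t *skipped)
{
    assert(p != NULL && skipped != NULL);
    const char *start = p;
    while (!is_separator(*p))
        p++;
    *skipped = (size_t)(p - start);
    return p;
}
```

After recover returns, scanning resumes at the separator, so one bad run of characters produces one error rather than a cascade.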
  • 47. • A tool that takes a set of descriptions of possible tokens and produces a C routine • The set of descriptions is called the lex specification • The token descriptions are known as regular expressions
  • 48. AUTOMATIC CONSTRUCTION OF LEXICAL ANALYZER …… LEX • Lex is a tool for creating lexical analyzers. • Lexical analyzers tokenize input streams. • Tokens are the terminals of a language. • Regular expressions define tokens.
  • 49. AUTOMATIC CONSTRUCTION OF LEXICAL ANALYZER …… LEX • lex source program lex.l → Lex compiler → lex.yy.c • lex.yy.c → C compiler → a.out • input stream → a.out → sequence of tokens
  • 50. LEX SPECIFICATION • Lex Program Structure:
      declarations
      %%
      translation rules
      %%
      auxiliary procedures
    • Name the file e.g. test.lex • Then, "lex test.lex" produces the file "lex.yy.c" (a C program)
  • 51. LEX SPECIFICATION • Sections in order: C declarations, declarations, rules, auxiliary procedures
      %{
      /* definitions of all constants
         LT, LE, EQ, NE, GT, GE, IF, THEN, ELSE, ... */
      %}
      ......
      letter   [A-Za-z]
      digit    [0-9]
      id       {letter}({letter}|{digit})*
      ......
      %%
      if       { return(IF); }
      then     { return(THEN); }
      {id}     { yylval = install_id(); return(ID); }
      ......
      %%
      install_id()
      {
          /* procedure to install the lexeme to the ST */
      }
  • 52. AUTOMATIC CONSTRUCTION OF LEXICAL ANALYZER …… LEX • To run lex on a source file, use the command: lex source.l • This produces the file lex.yy.c, which is the C source for the lexical analyzer. • To compile this, use: cc -o prog -O lex.yy.c -ll
  • 53. EXAMPLE OF LEX SPECIFICATION • The rules between the %% lines are the translation rules • yytext contains the matching lexeme • yylex() invokes the lexical analyzer
      %{
      #include <stdio.h>
      %}
      %%
      [0-9]+   { printf("%s\n", yytext); }
      .|\n     { }
      %%
      int main(void)
      {
          yylex();
          return 0;
      }
    • Build and run:
      lex spec.l
      gcc lex.yy.c -ll
      ./a.out < spec.l
  • 54. EXAMPLE OF LEX SPECIFICATION • delim is a regular definition • The lines between the %% markers are the translation rules
      %{
      #include <stdio.h>
      int ch = 0, wd = 0, nl = 0;
      %}
      delim     [ \t]+
      %%
      \n        { ch++; wd++; nl++; }
      ^{delim}  { ch+=yyleng; }
      {delim}   { ch+=yyleng; wd++; }
      .         { ch++; }
      %%
      int main(void)
      {
          yylex();
          printf("%8d%8d%8d\n", nl, wd, ch);
          return 0;
      }
  • 55. EXAMPLE OF LEX SPECIFICATION • digit, letter, and id are regular definitions • The lines after %% are the translation rules
      %{
      #include <stdio.h>
      %}
      digit    [0-9]
      letter   [A-Za-z]
      id       {letter}({letter}|{digit})*
      %%
      {digit}+ { printf("number: %s\n", yytext); }
      {id}     { printf("ident: %s\n", yytext); }
      .        { printf("other: %s\n", yytext); }
  • 56. REGULAR EXPRESSIONS • [xyz] match one character x, y, or z (use \ to escape -) • [^xyz] match any character except x, y, and z • [a-z] match one of a to z • r* closure (match zero or more occurrences) • r+ positive closure (match one or more occurrences) • r? optional (match zero or one occurrence)
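These operators can be tried outside Lex with the POSIX <regex.h> interface, assuming a POSIX system. POSIX extended regular expressions share the operators listed above with Lex, although the two syntaxes are not identical in general. The helper full_match is a hypothetical name introduced for this sketch.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Return true if the whole of text matches the extended regular
 * expression pattern. Anchors are added because regexec otherwise
 * looks for a match anywhere in the string. */
bool full_match(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;

    int n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    assert(n > 0 && (size_t)n < sizeof anchored);

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;               /* pattern did not compile */
    bool ok = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

For instance, full_match("[a-z]*", "") succeeds because closure allows zero occurrences, while full_match("[a-z]+", "") fails because positive closure needs at least one.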
  • 57. EXAMPLE OF LEX PROGRAM
      %{
      int num_lines = 0, num_chars = 0;
      %}
      %%
      \n      { ++num_lines; ++num_chars; }
      .       { ++num_chars; }
      %%
      int main(int argc, char **argv)
      {
          ++argv, --argc;  /* skip over program name */
  • 58. EXAMPLE OF LEX PROGRAM
          if ( argc > 0 )
              yyin = fopen( argv[0], "r" );
          else
              yyin = stdin;
          yylex();
          printf( "# of lines = %d, # of chars = %d\n", num_lines, num_chars );
          return 0;
      }
  • 59. EXAMPLE OF LEX PROGRAM
      %{
      #include <stdio.h>
      %}
      WS      [ \t\n]*
      %%
      [0123456789]+          printf("NUMBER\n");
      [a-zA-Z][a-zA-Z0-9]*   printf("WORD\n");
      {WS}                   /* do nothing */
      .                      printf("UNKNOWN\n");
      %%
  • 60. EXAMPLE OF LEX PROGRAM
      int main(int argc, char **argv)
      {
          ++argv, --argc;  /* skip over program name */
          if ( argc > 0 )
              yyin = fopen( argv[0], "r" );
          else
              yyin = stdin;
          yylex();
          return 0;
      }
  • 61. My Blog : anandgharu.wordpress.com • THANK YOU!