Introduction to Compiler
Construction
Prof. A. N. Kazi
Jawaharlal Darda Institute of Engineering &
Technology, Yavatmal
TRANSLATORS
• A translator is a program that takes a program in one form
(the input) and converts it into another form (the output).
The language of the input program is called the source
language, and the language of the output program is called
the target language.
Types of translators:
(1) Compilers
(2) Interpreters
(3) Assemblers
COMPILATION AND INTERPRETATION
• A compiler is a program that reads a program in one
language and translates it into an equivalent program
in another language. The translation done by a
compiler is called compilation.
• An interpreter is another common kind of language
processor. Instead of producing a target program as a
translation, an interpreter appears to directly execute
the operations specified in the source program on
inputs supplied by the user.
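The difference between the two can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple-based tree, the stack-machine instruction names (PUSH, LOAD, ADD, MUL) and the function names are all hypothetical and do not come from the slides. The interpreter executes the operations directly on the user's inputs, while the compiler only produces a target program that is run later.

```python
# Hypothetical source form: ("+", left, right) / ("*", left, right),
# a string for a variable name, or a number for a constant.

def interpret(tree, inputs):
    """Interpreter: execute the operations of the source directly."""
    if isinstance(tree, tuple):
        op, left, right = tree
        l, r = interpret(left, inputs), interpret(right, inputs)
        return l + r if op == "+" else l * r
    if isinstance(tree, str):
        return inputs[tree]          # variable supplied by the user
    return tree                      # constant

def compile_expr(tree):
    """Compiler: translate the source into a target program
    (a list of instructions for an assumed stack machine)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        last = ("ADD",) if op == "+" else ("MUL",)
        return compile_expr(left) + compile_expr(right) + [last]
    if isinstance(tree, str):
        return [("LOAD", tree)]
    return [("PUSH", tree)]

def run(target, inputs):
    """Execute a compiled target program on the user's inputs."""
    stack = []
    for instr in target:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        elif instr[0] == "LOAD":
            stack.append(inputs[instr[1]])
        else:
            r, l = stack.pop(), stack.pop()
            stack.append(l + r if instr[0] == "ADD" else l * r)
    return stack.pop()

source = ("+", "initial", ("*", "rate", 60))
inputs = {"initial": 10.0, "rate": 2.0}
target = compile_expr(source)
```

Here `interpret(source, inputs)` and `run(target, inputs)` give the same result, but only the compiled path leaves behind a reusable target program.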
Fig : Compiler
Fig : Interpreter
Compilers
• “Compilation”
– Translation of a program written in a source
language into a semantically equivalent
program written in a target language
Fig : A compiler takes a source program and produces a target program,
along with error messages; the target program then maps input to output.
An important role of a compiler is reporting errors to the programmer.
INTERPRETER
• An interpreter is a program that appears to execute a
source program as if it were machine language.
Fig: Execution in Interpreter
Languages such as BASIC, SNOBOL and LISP are commonly executed
using interpreters.
A compiler is a translator program that takes a program written in a
high-level language (HLL), the source program, and translates it into an
equivalent program in machine-level language (MLL), the target program.
Fig : Language Processing System (an HLL program consisting of directives
such as #include< > and #define SIZE is turned into pure HLL before compilation)
Fig : Structure of Compiler
Fig : Execution process of source program in Compiler
ASSEMBLER
1. Programmers found it difficult to write or read programs in
machine language. They began to use a mnemonic (symbol) for
each machine instruction, which they would subsequently translate
into machine language.
2. Such a mnemonic machine language is now called an assembly
language (ALP).
3. Programs known as assemblers were written to automate the
translation of assembly language into machine language.
LOADER AND LINK-EDITOR:
• Loader : Once the assembler produces an object
program, that program must be placed into memory and
executed. The assembler could place the object program
directly in memory and transfer control to it, thereby
causing the machine language program to be executed.
A loader is a program that performs this placement
separately from the assembler.
• Linker : Combines the object program with the library
files that the source program refers to.
LIST OF COMPILERS
1. Ada compilers
2. ALGOL compilers
3. BASIC compilers
4. C# compilers
5. C compilers
6. C++ compilers
7. COBOL compilers
8. Common Lisp compilers
9. ECMAScript interpreters
10. Fortran compilers
11. Java compilers
12. Pascal compilers
13. PL/I compilers
14. Python compilers
15. Smalltalk compilers
Preprocessors, Compilers, Assemblers, and Linkers
Skeletal Source Program → Preprocessor → Source Program → Compiler →
Target Assembly Program → Assembler → Relocatable Object Code →
Linker → Absolute Machine Code
The Linker also takes Libraries and Relocatable Object Files as input.
Try for example:
gcc -v myprog.c
Phases of a compiler
• Lexical Analysis
- The first phase of a compiler is called lexical
analysis or scanning or linear analysis. The lexical
analyzer reads the stream of characters making up
the source program and groups the characters into
meaningful sequences called lexemes.
For each lexeme, the lexical analyzer produces as output a token of
the form
<token-name, attribute-value>
The first component token-name is an abstract symbol that is used
during syntax analysis, and the second component attribute-value
points to an entry in the symbol table for this token.
Lexemes mapped into tokens
For the assignment statement position = initial + rate * 60:
(1) position is a lexeme that is mapped into the token <id, 1>,
where id is an abstract symbol standing for identifier and 1
points to the symbol-table entry for position.
(2) The assignment symbol = is a lexeme that is mapped into
the token <=>.
(3) initial is a lexeme that is mapped into the token <id, 2>.
(4) + is a lexeme that is mapped into the token <+>.
(5) rate is a lexeme that is mapped into the token <id, 3>.
(6) * is a lexeme that is mapped into the token <*>.
(7) 60 is a lexeme that is mapped into the token <60>.
The sequence of tokens produced after lexical analysis is:
<id, 1> <=> <id, 2> <+> <id, 3> <*> <60>
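The mapping from lexemes to tokens can be sketched with a small regular-expression scanner. This is a minimal sketch, not a production lexer: the function name `tokenize`, the regular expression and the tuple encoding of tokens are assumptions made for illustration, and only identifiers, integer literals and the operators =, + and * are recognised.

```python
import re

# One alternative per lexeme class; leading whitespace is skipped.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+*]))")

def tokenize(source):
    """Return (tokens, symbol_table) for a source string.

    Identifiers become ("id", n), where n is the 1-based index of the
    lexeme in the symbol table; operators and numbers become 1-tuples.
    """
    symtab = {}                      # lexeme -> symbol-table index
    tokens = []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.group("id"):
            # Reuse the entry if the identifier was seen before.
            index = symtab.setdefault(m.group("id"), len(symtab) + 1)
            tokens.append(("id", index))
        elif m.group("num"):
            tokens.append((m.group("num"),))
        else:
            tokens.append((m.group("op"),))
    return tokens, symtab

tokens, symtab = tokenize("position = initial + rate * 60")
```

For this input, `tokens` is `[("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("60",)]`, which corresponds to the token sequence listed above, and `symtab` maps position, initial and rate to entries 1, 2 and 3.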
Syntax Analysis
• The second phase of the compiler is syntax analysis or
parsing or hierarchical analysis.
• The parser uses the first components of the tokens
produced by the lexical analyzer to create a tree-like
intermediate representation that depicts the
grammatical structure of the token stream.
• The hierarchical tree structure generated in this phase
is called a parse tree or syntax tree.
Figure : Syntax tree for position = initial + rate * 60
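A recursive-descent parser is one common way to build such a tree; the slides do not say which parsing method is intended, so the following is only a sketch. The grammar (assignment, +, * with * binding tighter), the function name `parse_assignment` and the nested-tuple tree encoding are assumptions. The input is the token sequence <id, 1> <=> <id, 2> <+> <id, 3> <*> <60> written as Python tuples.

```python
def parse_assignment(tokens):
    """Build a syntax tree as nested tuples (operator, left, right).

    Assumed grammar:
        stmt   -> id = expr
        expr   -> term ( + term )*
        term   -> factor ( * factor )*
        factor -> id | number
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = peek()
        if tok is not None and (tok[0] == "id" or tok[0].isdigit()):
            return advance()
        raise SyntaxError(f"expected identifier or number, got {tok}")

    def term():
        node = factor()
        while peek() == ("*",):
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == ("+",):
            advance()
            node = ("+", node, term())
        return node

    if len(tokens) < 2 or tokens[0][0] != "id" or tokens[1] != ("=",):
        raise SyntaxError("expected 'id = expression'")
    target = advance()
    advance()                        # skip '='
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()}")
    return tree

tokens = [("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("60",)]
tree = parse_assignment(tokens)
```

Because `term` is called from `expr`, the multiplication ends up deeper in the tree than the addition, which is the shape the figure describes for position = initial + rate * 60.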
Semantic Analysis
• The semantic analyzer uses the syntax tree and the
information in the symbol table to check the source
program for semantic consistency with the language
definition.
• It helps ensure the correctness of the program. (Matching of
parentheses, by contrast, is normally checked earlier, by the parser.)
• An important part of semantic analysis is type
checking, where the compiler checks that each
operator has matching operands.
• The compiler must report an error if a floating-point
number is used to index an array.
Figure : Semantic tree for position = initial + rate * 60
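Type checking with an implicit conversion can be sketched as a walk over the syntax tree. This is an illustrative sketch under assumptions: the node encoding, the function name `check` and the `types` dictionary standing in for the symbol table are hypothetical, and position, initial and rate are assumed to be declared as floating-point. When an integer operand meets a floating-point one, the checker wraps it in an inttofloat node.

```python
def check(node, types):
    """Return (node, type), inserting inttofloat where an int meets a float.

    Leaves are ("id", name) or ("num", value); interior nodes are
    (operator, left, right). `types` maps names to "int" or "float".
    """
    kind = node[0]
    if kind == "id":
        if node[1] not in types:
            raise NameError(f"undeclared identifier {node[1]}")
        return node, types[node[1]]
    if kind == "num":
        return node, "int" if isinstance(node[1], int) else "float"

    op, left, right = node
    left, left_type = check(left, types)
    right, right_type = check(right, types)

    if op == "=":
        # Only widening (int value into float target) is allowed here.
        if left_type == "float" and right_type == "int":
            right = ("inttofloat", right)
        elif left_type != right_type:
            raise TypeError("cannot assign a float to an int target")
        return (op, left, right), left_type

    if left_type != right_type:
        # Mixed int/float arithmetic: convert the int side.
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        left_type = "float"
    return (op, left, right), left_type

types = {"position": "float", "initial": "float", "rate": "float"}
tree = ("=", ("id", "position"),
        ("+", ("id", "initial"), ("*", ("id", "rate"), ("num", 60))))
checked, result_type = check(tree, types)
```

In `checked`, the leaf for 60 is wrapped as `("inttofloat", ("num", 60))`, which mirrors the conversion node shown in the semantic tree.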
Intermediate Code Generation
• After syntax and semantic analysis of the source program,
many compilers generate an explicit low-level or machine-like
intermediate representation.
• The intermediate representation should have two important
properties:
a. It should be easy to produce.
b. It should be easy to translate into the target machine language.
Three-address code is one such intermediate representation. It
consists of a sequence of assembly-like instructions with up to
three operands per instruction.
• Each operand can act like a register.
• The output of the intermediate code generator
consists of the three-address code sequence for
position = initial + rate * 60:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
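Generating this sequence from the tree can be sketched as a post-order walk that allocates a fresh temporary for every operation. The function name `gen_tac` and the tuple encoding of the tree are assumptions made for illustration; leaves are plain strings such as "id2" or "60".

```python
from itertools import count

def gen_tac(tree):
    """Emit three-address code for ("=", target, expression)."""
    code = []
    temps = count(1)                 # t1, t2, t3, ...

    def new_temp():
        return f"t{next(temps)}"

    def walk(node):
        if isinstance(node, str):    # leaf: identifier or constant
            return node
        if node[0] == "inttofloat":  # unary conversion node
            arg = walk(node[1])
            t = new_temp()
            code.append(f"{t} = inttofloat({arg})")
            return t
        op, left, right = node       # binary operator node
        l = walk(left)
        r = walk(right)
        t = new_temp()
        code.append(f"{t} = {l} {op} {r}")
        return t

    _, target, value = tree
    code.append(f"{target} = {walk(value)}")
    return code

tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttofloat", "60"))))
tac = gen_tac(tree)
```

Each operand is visited before its operator, so the conversion is emitted first and the final copy into id1 last, giving the four instructions listed above.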
Code Optimization
• The machine-independent code-optimization phase
attempts to improve the intermediate code so that better
target code will result. Usually better means faster.
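• Optimization aims to improve the efficiency of the code so
that the running time and memory consumption of the target
program are reduced.
The optimizer can deduce that the conversion of 60 from integer to
floating point can be done once and for all at compile time, so the
inttofloat operation can be eliminated by replacing the integer 60
with the floating-point number 60.0. Moreover, t3 is used only once,
to transmit its value to id1, so the optimizer can transform the code
into the shorter sequence:
t1 = id3 * 60.0
id1 = id2 + t1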
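The two transformations can be sketched as simple passes over the three-address code. This is a toy sketch, not how a real optimizer is organised: instructions are assumed to be tuples (dest, op, arg1, arg2), the "copy" operator and the rule that temporaries are named t1, t2, ... are hypothetical conventions, and a final pass renumbers the surviving temporaries.

```python
from collections import Counter

def is_temp(name):
    """Assumed convention: temporaries are named t1, t2, ..."""
    return name is not None and name[0] == "t" and name[1:].isdigit()

def optimize(code):
    # Pass 1: fold inttofloat applied to an integer literal at compile time.
    consts, folded = {}, []
    for dest, op, a, b in code:
        a, b = consts.get(a, a), consts.get(b, b)
        if op == "inttofloat" and a.isdigit():
            consts[dest] = str(float(a))      # "60" -> "60.0"
        else:
            folded.append((dest, op, a, b))

    # Pass 2: merge "t = x op y" followed by "d = t" when t is used once.
    uses = Counter(arg for _, _, a, b in folded for arg in (a, b))
    merged = []
    for dest, op, a, b in folded:
        if (op == "copy" and merged and merged[-1][0] == a
                and is_temp(a) and uses[a] == 1):
            _, prev_op, prev_a, prev_b = merged.pop()
            merged.append((dest, prev_op, prev_a, prev_b))
        else:
            merged.append((dest, op, a, b))

    # Pass 3: renumber the remaining temporaries from t1.
    names, result = {}, []
    for dest, op, a, b in merged:
        a, b = names.get(a, a), names.get(b, b)
        if is_temp(dest):
            names[dest] = f"t{len(names) + 1}"
            dest = names[dest]
        result.append((dest, op, a, b))
    return result

code = [("t1", "inttofloat", "60", None),
        ("t2", "*", "id3", "t1"),
        ("t3", "+", "id2", "t2"),
        ("id1", "copy", "t3", None)]
optimized = optimize(code)
```

For this input, `optimized` holds two instructions, `("t1", "*", "id3", "60.0")` and `("id1", "+", "id2", "t1")`, matching the shorter sequence above.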
Code Generation
• The code generator takes as input an intermediate
representation of the source program and maps it
into the target language.
• If the target language is machine code, then registers
or memory locations are selected for each of the
variables used by the program.
The intermediate instructions are then translated into sequences of
machine instructions:
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
Figure : Translation of an
assignment statement
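A very naive code generator for the optimized sequence can be sketched as follows. Everything beyond the LDF, MULF, ADDF and STF mnemonics shown above is an assumption: the machine is taken to have just two registers with no spilling, the left operand of every instruction is assumed to be a variable or temporary, constants are emitted as immediates with a # prefix, and the order in which registers are handed out is chosen only so that the output lines up with the slide.

```python
MNEMONIC = {"*": "MULF", "+": "ADDF"}

def is_temp(name):
    """Assumed convention: temporaries are named t1, t2, ..."""
    return name[0] == "t" and name[1:].isdigit()

def generate(code):
    """Translate (dest, op, left, right) tuples into assembly-like text."""
    free = ["R1", "R2"]      # hypothetical register pool; pop() yields R2 first
    in_reg = {}              # temporary -> register currently holding it
    asm = []

    def load(name):
        if name in in_reg:                   # temporary already in a register
            return in_reg[name]
        if name[0].isdigit():
            raise NotImplementedError("constant as left operand")
        reg = free.pop()                     # no spilling in this sketch
        asm.append(f"LDF {reg}, {name}")
        return reg

    for dest, op, left, right in code:
        rl = load(left)
        rr = f"#{right}" if right[0].isdigit() else load(right)
        asm.append(f"{MNEMONIC[op]} {rl}, {rl}, {rr}")
        if is_temp(dest):
            in_reg[dest] = rl                # keep the result in its register
        else:
            asm.append(f"STF {dest}, {rl}")  # store into the named variable
    return asm

code = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]
asm = generate(code)
```

The temporary t1 never reaches memory: it lives in R2 until the ADDF consumes it, and only id1 is written back with STF.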
Symbol-Table Management
• The symbol table, which stores information about the
entire source program, is used by all phases of the
compiler.
• An essential function of a compiler is to record the
variable names used in the source program and collect
information about various attributes of each name.
• These attributes may provide information about the
storage allocated for a name, its type, and its scope.
A symbol table can be implemented in one of the following ways:
(1) Linear (sorted or unsorted) list
(2) Binary search tree
(3) Hash table
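Of these, the hash-table variant is easy to sketch in Python, where a dict is a hash table. The class name `SymbolTable`, its method names and the attribute fields (type, offset, depth) are hypothetical choices for illustration; nested scopes are modelled here as a stack of dicts, which is one possible design among several.

```python
class SymbolTable:
    """Hash-table symbol table with nested scopes (a stack of dicts)."""

    def __init__(self):
        self.scopes = [{}]           # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, offset):
        """Record a name and its attributes in the innermost scope."""
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name} already declared in this scope")
        scope[name] = {"type": type_, "offset": offset,
                       "depth": len(self.scopes) - 1}

    def lookup(self, name):
        """Search from the innermost scope outwards; None if absent."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
table.declare("rate", "float", 0)
table.enter_scope()
table.declare("rate", "int", 8)      # shadows the outer declaration
inner = table.lookup("rate")
table.exit_scope()
outer = table.lookup("rate")
```

Inside the nested scope `lookup` finds the int declaration; after `exit_scope` it finds the outer float one again. Declaration and lookup take expected constant time per scope, compared with linear time for an unsorted list.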
THE GROUPING OF PHASES
THE END
