COP4020
Programming
Languages
Compiler phases
Prof. Xin Yuan
COP4020 Spring 2014 2
9/13/2022
Overview
 Compiler phases
 Lexical analysis
 Syntax analysis
 Semantic analysis
 Intermediate (machine-independent) code generation
 Intermediate code optimization
 Target (machine-dependent) code generation
 Target code optimization
A typical compilation process
Source program with macros -> Preprocessor -> Source program -> Compiler ->
Target assembly program -> Assembler -> Relocatable machine code -> Linker ->
Absolute machine code
Try g++ with the -v, -E, and -S flags on linprog.
 What is a compiler?
 A program that reads a program written in one language
(source language) and translates it into an equivalent
program in another language (target language).
 Two components
 Understand the program (make sure it is correct)
 Rewrite the program in the target language.
 Traditionally, the source language is a high-level language
and the target language is a low-level language (machine
code).
Source program -> compiler -> Target program (plus error messages)
Compilation Phases and Passes
 Compilation of a program proceeds through a fixed
series of phases
 Each phase uses an (intermediate) form of the program produced
by an earlier phase
 Subsequent phases operate on lower-level code representations
 Each phase may consist of a number of passes over the
program representation
 Pascal, FORTRAN, and C were designed for one-pass
compilation, which explains the need for function prototypes
 Single-pass compilers need less memory to operate
 Java and Ada are compiled in multiple passes
Compiler Front- and Back-end
Front end (analysis):
Source program (character stream)
-> Scanner (lexical analysis) -> Tokens
-> Parser (syntax analysis) -> Parse tree
-> Semantic Analysis and Intermediate Code Generation -> Abstract syntax tree or other intermediate form
Back end (synthesis):
Abstract syntax tree or other intermediate form
-> Machine-Independent Code Improvement -> Modified intermediate form
-> Target Code Generation -> Assembly or object code
-> Machine-Specific Code Improvement -> Modified assembly or object code
Scanner: Lexical Analysis
 Lexical analysis breaks up a program into tokens
 Grouping characters into inseparable units (tokens)
 Changing a stream of characters into a stream of tokens
program gcd (input, output);
var i, j : integer;
begin
read (i, j);
while i <> j do
if i > j then i := i - j else j := j - i;
writeln (i)
end.
program gcd ( input , output ) ;
var i , j : integer ; begin
read ( i , j ) ; while
i <> j do if i > j
then i := i - j else j
:= j - i ; writeln ( i
) end .
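The grouping step can be sketched with a regular-expression scanner. This is a minimal illustration in Python, not the scanner of any real Pascal compiler: the token classes in TOKEN_SPEC and the function name tokenize are assumptions made for the example, and keywords are simply treated as identifiers.

```python
import re

# Hypothetical token classes for a small Pascal subset (illustrative only).
# Order matters: ":=" must be tried before the single ":" in PUNCT.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r":=|<>|<=|>=|[-+*/<>=]"),
    ("PUNCT",  r"[();:,.]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Turn a character stream into a list of token strings, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append(match.group())
        pos = match.end()
    return tokens

GCD_SOURCE = """
program gcd (input, output);
var i, j : integer;
begin
  read (i, j);
  while i <> j do
    if i > j then i := i - j else j := j - i;
  writeln (i)
end.
"""

# Prints the tokens separated by single spaces, as in the token stream above.
print(" ".join(tokenize(GCD_SOURCE)))
```

A production scanner would also attach a token class and source position to each token; the sketch keeps only the text to mirror the slide.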
Scanner: Lexical Analysis
 What kind of errors can be reported by the lexical analyzer?
A = b + @3;
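One possible answer: the scanner can only report characters or character sequences that cannot start any token, such as the @ above. The sketch below assumes a tiny set of lexical rules (identifiers, integers, a few operators) and a hypothetical function scan that records the error and keeps going rather than stopping at the first problem.

```python
import re

# Assumed lexical rules for illustration: integers, identifiers, a few operators.
VALID_TOKEN = re.compile(r"\d+|[A-Za-z_]\w*|:=|[-+*/=;()]")

def scan(line):
    """Return (tokens, errors); illegal characters are reported and skipped."""
    tokens, errors = [], []
    pos = 0
    while pos < len(line):
        if line[pos].isspace():
            pos += 1
            continue
        match = VALID_TOKEN.match(line, pos)
        if match:
            tokens.append(match.group())
            pos = match.end()
        else:
            # No token can start here: this is a lexical error.
            errors.append(f"column {pos + 1}: illegal character {line[pos]!r}")
            pos += 1
    return tokens, errors

tokens, errors = scan("A = b + @3;")
print(tokens)   # ['A', '=', 'b', '+', '3', ';']
print(errors)   # ["column 9: illegal character '@'"]
```

Note that the scanner cannot tell that "b + 3" was probably not what the programmer meant; it only sees that @ is not part of any token.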
Parser: Syntax Analysis
 Checks whether the token stream meets the
grammatical specification of the language and
generates the syntax tree.
 A syntax error is produced by the compiler when the program
does not meet the grammatical specification.
 For a grammatically correct program, this phase generates an
internal representation that is easy to manipulate in later phases
 Typically a syntax tree (also called a parse tree).
 The grammar of a programming language is typically
described by a context-free grammar, which also defines
the structure of the parse tree.
Context-Free Grammars
 A context-free grammar defines the syntax of a programming
language
 The syntax defines the syntactic categories for language constructs
 Statements
 Expressions
 Declarations
 Categories are subdivided into more detailed categories
 A Statement is a
 For-statement
 If-statement
 Assignment
<statement> ::= <for-statement> | <if-statement> | <assignment>
<for-statement> ::= for ( <expression> ; <expression> ; <expression> ) <statement>
<assignment> ::= <identifier> := <expression>
Example: Micro Pascal
<Program> ::= program <id> ( <id> <More_ids> ) ; <Block> .
<Block> ::= <Variables> begin <Stmt> <More_Stmts> end
<More_ids> ::= , <id> <More_ids>
    | ε
<Variables> ::= var <id> <More_ids> : <Type> ; <More_Variables>
    | ε
<More_Variables> ::= <id> <More_ids> : <Type> ; <More_Variables>
    | ε
<Stmt> ::= <id> := <Exp>
    | if <Exp> then <Stmt> else <Stmt>
    | while <Exp> do <Stmt>
    | begin <Stmt> <More_Stmts> end
<Exp> ::= <num>
    | <id>
    | <Exp> + <Exp>
    | <Exp> - <Exp>
Parsing examples
 Pos = init + / rate * 60 -> id1 = id2 + / id3 * const ->
syntax error (exp ::= exp + exp cannot be reduced).
 Pos = init + rate * 60 -> id1 = id2 + id3 * const ->
      :=
     /  \
  id1    +
        / \
     id2   *
          / \
       id3   60
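The two parsing examples can be reproduced with a small recursive-descent parser. This is only a sketch under stated assumptions: the Micro Pascal grammar above lists + and - only and does not fix precedence, so the rules term and expression below (with * and / binding tighter than + and -) are assumptions, as are the function names and the nested-tuple tree format.

```python
def parse_assignment(tokens):
    """Parse `id = expr` (or `id := expr`) and return a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def operand():
        token = advance()
        if token is None or not token.isalnum():
            raise SyntaxError(f"expected identifier or number, found {token!r}")
        return token

    def term():                      # term ::= operand { (*|/) operand }
        node = operand()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, operand())
        return node

    def expression():                # expression ::= term { (+|-) term }
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    target = operand()
    if advance() not in ("=", ":="):
        raise SyntaxError("expected assignment operator")
    tree = (":=", target, expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

# Second example: a well-formed statement yields the tree drawn above.
print(parse_assignment("id1 = id2 + id3 * 60".split()))

# First example: "+ /" cannot be parsed, so a syntax error is reported.
try:
    parse_assignment("id1 = id2 + / id3 * 60".split())
except SyntaxError as error:
    print("syntax error:", error)
```

The slide describes the failure in bottom-up terms (a reduction that cannot happen); a top-down parser like this one notices the same problem when it expects an operand and finds "/".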
Semantic Analysis
 Semantic analysis is applied by a compiler to
discover the meaning of a program by analyzing
its parse tree or abstract syntax tree.
 A program without grammatical errors is not
necessarily a correct program.
 pos = init + rate * 60
 What if pos is a class while init and rate are integers?
 This kind of error cannot be found by the parser
 Semantic analysis finds this type of error and ensures that the
program has a meaning.
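A rough sketch of how such a check might look, assuming the parser has already produced a nested-tuple tree and that a symbol table maps each identifier to a type name. The type names ("integer", "class") and both function names are made up for this example; a real compiler's type system is far richer.

```python
def expression_type(node, symbols):
    """Return the type of an expression tree, or raise TypeError."""
    if isinstance(node, int):                    # integer constant
        return "integer"
    if isinstance(node, str):                    # identifier: look it up
        if node not in symbols:
            raise TypeError(f"{node} is not declared")
        return symbols[node]
    op, left, right = node                       # binary operator node
    left_type = expression_type(left, symbols)
    right_type = expression_type(right, symbols)
    if left_type != "integer" or right_type != "integer":
        raise TypeError(f"operator {op} needs integer operands")
    return "integer"

def check_assignment(target, expression, symbols):
    """Return a list of semantic error messages (empty if the statement is fine)."""
    try:
        value_type = expression_type(expression, symbols)
        target_type = expression_type(target, symbols)
    except TypeError as error:
        return [str(error)]
    if target_type != value_type:
        return [f"cannot assign {value_type} to {target} of type {target_type}"]
    return []

tree = ("+", "init", ("*", "rate", 60))          # init + rate * 60

# All three names are integers: no errors.
print(check_assignment("pos", tree, {"pos": "integer", "init": "integer", "rate": "integer"}))

# pos is a class: the parser accepted the statement, the type check rejects it.
print(check_assignment("pos", tree, {"pos": "class", "init": "integer", "rate": "integer"}))
```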
Semantic Analysis
 Static semantic checks (done by the compiler) are performed at
compile time
 Type checking
 Every variable is declared before it is used
 Identifiers are used in appropriate contexts
 Check subroutine call arguments
 Check labels
 Dynamic semantic checks are performed at run time, and the
compiler produces code that performs these checks
 Array subscript values are within bounds
 Arithmetic errors, e.g. division by zero
 Pointers are not dereferenced unless pointing to a valid object
 Variables are not used before they have been initialized
 When a check fails at run time, an exception is raised
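To make the idea of compiler-produced checks concrete, the sketch below writes two of them by hand as Python helpers. The names RuntimeCheckError, checked_index, and checked_divide are hypothetical; in a real compiler the equivalent tests would be emitted inline in the generated code rather than called as functions.

```python
class RuntimeCheckError(Exception):
    """Raised when a dynamic semantic check fails (hypothetical name)."""

def checked_index(array, index):
    # Stands in for the test emitted before every a[i]:
    # the subscript must lie within the array bounds.
    if not 0 <= index < len(array):
        raise RuntimeCheckError(f"subscript {index} out of bounds 0..{len(array) - 1}")
    return array[index]

def checked_divide(numerator, denominator):
    # Stands in for the test emitted before every integer division.
    if denominator == 0:
        raise RuntimeCheckError("division by zero")
    return numerator // denominator

values = [10, 20, 30]
print(checked_index(values, 2))      # within bounds
print(checked_divide(7, 2))          # nonzero divisor

try:
    checked_index(values, 3)         # one past the end
except RuntimeCheckError as error:
    print("exception raised:", error)
```

Each check costs a comparison and a branch at run time, which is the overhead that a language with dynamic checks pays for safety.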
Semantic Analysis and Strong Typing
 A language is strongly typed "if (type) errors are always
detected"
 Errors are either detected at compile time or at run time
 Examples of such errors are listed on the previous slide
 Languages that are strongly typed are Ada, Java, ML, Haskell
 Languages that are not strongly typed are Fortran, Pascal,
C/C++, Lisp
 Strong typing makes a language safer and easier to use,
but potentially slower because of dynamic semantic
checks
 In some languages, most (type) errors are detected late,
at run time, which is detrimental to reliability, e.g. early
Basic, Lisp, Prolog, some scripting languages
Code Generation and Intermediate Code Forms
 A typical intermediate form of code produced by the semantic
analyzer is an abstract syntax tree (AST)
 The AST is annotated with useful information such as pointers to
the symbol table entry of identifiers
[Figure: example AST for the gcd program in Pascal]
Code Generation and Intermediate Code Forms
 Other intermediate code forms
 Intermediate code is both close to the final machine code
and easy to manipulate (for optimization). One example is
three-address code:
dst = op1 op op2
 The three-address code for the assignment statement id1 = id2 + id3 * 60:
temp1 = 60
temp2 = id3 * temp1
temp3 = id2 + temp2
id1 = temp3
 Machine-independent intermediate code improvement
temp1 = id3 * 60.0
id1 = id2 + temp1
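Three-address code can be produced by a post-order walk of the syntax tree that creates one temporary per operator. The generator below is a naive sketch, not the algorithm of any particular compiler: the name generate_tac and the nested-tuple tree format are assumptions. Its output falls between the two listings above, since it does not give the constant its own temporary and does not remove the final copy.

```python
import itertools

def generate_tac(target, expression):
    """Emit three-address code for `target = expression` (nested-tuple tree)."""
    code = []
    counter = itertools.count(1)

    def emit(node):
        # Leaves (identifiers, constants) are already valid operands.
        if not isinstance(node, tuple):
            return str(node)
        op, left, right = node
        left_operand = emit(left)            # children first (post-order)
        right_operand = emit(right)
        temp = f"temp{next(counter)}"
        code.append(f"{temp} = {left_operand} {op} {right_operand}")
        return temp

    result = emit(expression)
    code.append(f"{target} = {result}")
    return code

# id1 = id2 + id3 * 60.0
for line in generate_tac("id1", ("+", "id2", ("*", "id3", 60.0))):
    print(line)
# temp1 = id3 * 60.0
# temp2 = id2 + temp1
# id1 = temp2
```

An improvement pass could then notice that temp2 is only copied into id1 and fold the last two instructions into one, giving the improved listing.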
Target Code Generation and Optimization
 The compiler generates assembly or object code from the
machine-independent form
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
 This machine-specific code is optimized to exploit
specific hardware features
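To see that the five instructions above compute id1 = id2 + id3 * 60.0, they can be run through a toy interpreter. The sketch assumes that each instruction takes its operands as source, destination, that # marks an immediate constant, and that any operand starting with R is a register; the slides do not spell these conventions out, and the function name run is invented for the example.

```python
def run(program, memory):
    """Interpret the two-operand pseudo-assembly; operands are `source, destination`."""
    registers = {}

    def read(operand):
        if operand.startswith("#"):          # immediate constant
            return float(operand[1:])
        if operand.startswith("R"):          # register (assumed naming convention)
            return registers[operand]
        return memory[operand]               # named memory location

    def write(operand, value):
        if operand.startswith("R"):
            registers[operand] = value
        else:
            memory[operand] = value

    for line in program.strip().splitlines():
        opcode, operands = line.split(None, 1)
        source, destination = [part.strip() for part in operands.split(",")]
        if opcode == "MOVF":
            write(destination, read(source))
        elif opcode == "MULF":
            write(destination, read(destination) * read(source))
        elif opcode == "ADDF":
            write(destination, read(destination) + read(source))
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return memory

TARGET_CODE = """
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
"""

# With id2 = 10.0 and id3 = 2.0 the result is 10.0 + 2.0 * 60.0.
memory = run(TARGET_CODE, {"id2": 10.0, "id3": 2.0})
print(memory["id1"])   # 130.0
```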
Summary
 Compiler front-end: lexical analysis, syntax analysis,
semantic analysis
 Tasks: understanding the source code, making sure the source
code is written correctly
 Compiler back-end: Intermediate code
generation/improvement, and Machine code
generation/improvement
 Tasks: translating the program into a semantically equivalent
program (in a different language).

