Chapter 1 2301373: Introduction 1
Introduction
What is a Compiler?
• A compiler is a computer
program that translates a
program in a source language
into an equivalent program in a
target language.
• A source program/code is a
program/code written in the
source language, which is
usually a high-level language.
• A target program/code is a
program/code written in the
target language, which often is
a machine language or an
intermediate code.
[Figure: Source program -> compiler -> Target program; the compiler also emits error messages]
Process of Compiling
[Figure: phases of a compiler and the data passed between them]
Stream of characters -> scanner
Stream of tokens -> parser
Parse/syntax tree -> semantic analyzer
Annotated tree -> intermediate code generator
Intermediate code -> code optimization
Intermediate code -> code generator
Target code -> code optimization
Target code
Some Data Structures
• Symbol table
• Literal table
• Parse tree
Symbol Table
• Identifiers are names of variables, constants, functions, data types, etc.
• The symbol table stores information associated with identifiers
– Information associated with different kinds of identifiers can be different
• Information associated with variables includes name, type, address, size (for arrays), etc.
• Information associated with functions includes name, type of return value, parameters, address, etc.
Symbol Table (cont’d)
• Accessed in every phase of a compiler
– The scanner, parser, and semantic analyzer put names of identifiers in the symbol table.
– The semantic analyzer stores more information (e.g. data types) in the table.
– The intermediate code generator, code optimizer, and code generator use information in the symbol table to generate appropriate code.
• Mostly implemented as a hash table for efficiency.
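The symbol table described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed design: the names `Symbol`, `SymbolTable`, `insert`, and `lookup` are assumptions made for this example, and a Python `dict` stands in for the hash table.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Symbol:
    """One symbol-table entry (hypothetical layout for this sketch)."""
    name: str
    kind: str                                  # e.g. "variable" or "function"
    type: Optional[str] = None                 # filled in later by the semantic analyzer
    attrs: dict = field(default_factory=dict)  # size, parameters, address, ...


class SymbolTable:
    def __init__(self):
        # A Python dict is a hash table, so insert and lookup
        # take constant time on average.
        self._entries = {}

    def insert(self, name, kind):
        """Add an identifier if it is new and return its entry."""
        if name not in self._entries:
            self._entries[name] = Symbol(name, kind)
        return self._entries[name]

    def lookup(self, name):
        """Return the entry for name, or None if it was never inserted."""
        return self._entries.get(name)


if __name__ == "__main__":
    table = SymbolTable()
    table.insert("total", "variable")          # early phases record the name
    table.lookup("total").type = "int"         # a later phase adds the type
    print(table.lookup("total"))
```

Because every phase holds a reference to the same entry, information added by one phase is visible to the phases that follow.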
Literal table
• Stores constants and strings used in the program
– reduces memory use by storing each distinct constant or string once and reusing it
• Can be combined with the symbol table
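A minimal sketch of the reuse idea, assuming a hypothetical `LiteralTable` class whose `intern` method returns a slot number: a literal that has been seen before gets its old slot instead of a new one.

```python
class LiteralTable:
    """Stores each distinct literal once (illustrative sketch only)."""

    def __init__(self):
        self._index = {}    # (type name, value) -> slot number
        self._values = []   # slot number -> value

    def intern(self, value):
        # The type name is part of the key so that 1 and 1.0,
        # which compare equal in Python, get separate slots.
        key = (type(value).__name__, value)
        if key not in self._index:
            self._index[key] = len(self._values)
            self._values.append(value)
        return self._index[key]

    def get(self, slot):
        return self._values[slot]

    def __len__(self):
        return len(self._values)


if __name__ == "__main__":
    literals = LiteralTable()
    first = literals.intern("hello")
    second = literals.intern("hello")   # reuses the existing slot
    print(first == second, len(literals))
```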
Parse tree
• Dynamically allocated, pointer-based structure
• Information of different types associated with parse tree nodes needs to be stored somewhere.
– Nodes are variant records, storing information for different types of data
– Nodes store pointers to information stored in other data structures, e.g. the symbol table
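Both options above can be illustrated with Python dataclasses standing in for variant records. The node classes `Num`, `Var`, and `BinOp`, and the dictionary used as a symbol-table entry, are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int            # information stored directly in the node


@dataclass
class Var:
    entry: dict           # reference to an entry kept in the symbol table


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


# A node is one of several variants, like a variant record.
Node = Union[Num, Var, BinOp]


def show(node):
    """Render a tree as a fully parenthesized expression."""
    if isinstance(node, Num):
        return str(node.value)
    if isinstance(node, Var):
        return node.entry["name"]
    if isinstance(node, BinOp):
        return f"({show(node.left)} {node.op} {show(node.right)})"
    raise TypeError(f"unknown node: {node!r}")


if __name__ == "__main__":
    symbols = {"x": {"name": "x", "type": "int"}}
    tree = BinOp("+", Var(symbols["x"]), Num(2))
    print(show(tree))
```

The `Var` node does not copy the type of `x`; it points at the shared entry, so a later update to that entry is seen through the tree.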
Scanning
• A scanner reads a stream of characters and
puts them together into some meaningful
(with respect to the source language) units
called tokens.
• It produces a stream of tokens for the next
phase of the compiler.
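As a rough sketch of what a scanner does, the function below groups characters into tokens using Python's `re` module. The token names (`NUMBER`, `IDENT`, `OP`) and the tiny token set are assumptions chosen for this example, not part of any particular source language.

```python
import re

# Hypothetical token classes for a small expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),          # whitespace separates tokens but is not one
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Turn a stream of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(scan("x = y1 + 42"))
```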
Parsing
• A parser gets a stream of tokens from the
scanner, and determines if the syntax
(structure) of the program is correct
according to the (context-free) grammar of
the source language.
• Then, it produces a data structure, called a
parse tree or an abstract syntax tree, which
describes the syntactic structure of the
program.
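One way to see both jobs of a parser (checking syntax and building a tree) is a small recursive-descent parser. The sketch below assumes tokens arrive as `(kind, text)` pairs and uses a made-up expression grammar; trees are plain nested tuples.

```python
# Assumed grammar for this sketch:
#   expr   -> term   (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | IDENT | '(' expr ')'

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", "")

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, text = self.advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENT":
            return ("var", text)
        if (kind, text) == ("OP", "("):
            node = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")


def parse(tokens):
    """Return a syntax tree, or raise SyntaxError if the input is ill-formed."""
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {parser.peek()[1]!r}")
    return tree


if __name__ == "__main__":
    tokens = [("IDENT", "a"), ("OP", "+"), ("NUMBER", "2"), ("OP", "*"), ("NUMBER", "3")]
    print(parse(tokens))
```

Each grammar rule becomes one method, which is why this style only suits grammars that can be parsed top-down without left recursion.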
Semantic analysis
• The semantic analyzer gets the parse tree from the parser, together with information about some syntactic elements.
• It determines if the semantics, or meaning, of the program is correct.
• This part deals with static semantics:
– semantics of programs that can be checked by reading the program text alone, without running it.
– rules of the language which cannot be described in a context-free grammar.
• Mostly, a semantic analyzer does type checking.
• It annotates the parse tree with the information it derives (e.g. types), producing the annotated tree for (statically) semantically correct code.
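Type checking, the main job named above, can be sketched as a recursive walk over the tree. The tuple-based tree shape, the type names, and the rule that both operands must have the same type are all assumptions for this example.

```python
def check(node, symbols):
    """Return the static type of an expression tree or raise TypeError.

    Trees are nested tuples: ("num", 3), ("str", "hi"), ("var", "x"),
    or (operator, left, right). symbols maps identifier names to types.
    """
    kind = node[0]
    if kind == "num":
        return "int"
    if kind == "str":
        return "string"
    if kind == "var":
        name = node[1]
        if name not in symbols:
            # Declared-before-use is a rule a context-free grammar cannot express.
            raise TypeError(f"undeclared identifier {name!r}")
        return symbols[name]
    op, left, right = node
    left_type = check(left, symbols)
    right_type = check(right, symbols)
    if left_type != right_type:
        raise TypeError(f"operands of {op!r} have types {left_type} and {right_type}")
    return left_type


if __name__ == "__main__":
    symbols = {"x": "int"}
    print(check(("+", ("var", "x"), ("num", 1)), symbols))
```

Nothing here runs the program: the check uses only the program text (as a tree) and the declared types, which is what makes it static.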
Intermediate code generation
• An intermediate code generator
– takes a parse tree from the semantic analyzer
– generates a program in the intermediate
language.
• In some compilers, a source program is
translated into an intermediate code first
and then the intermediate code is translated
into the target language.
• In other compilers, a source program is
translated directly into the target language.
Intermediate code generation (cont’d)
• Using intermediate code is beneficial when compilers that translate a single source language into many target languages are required.
– The front end of a compiler (scanner to intermediate code generator) can be shared by all of these compilers.
– A different back end (code optimizer and code generator) is required for each target language.
• One popular intermediate code is three-address code. A three-address code instruction has the form x = y op z.
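To make the x = y op z form concrete, here is a small sketch that flattens an expression tree into three-address instructions. The tuple tree shape and the temporary names `t1`, `t2`, ... are assumptions of this example.

```python
import itertools


def to_tac(tree):
    """Translate a nested-tuple expression tree into three-address code.

    Leaves are ("num", value) or ("var", name); inner nodes are
    (operator, left, right). Returns the instruction list and the
    name that holds the final result.
    """
    code = []
    temps = itertools.count(1)

    def walk(node):
        if node[0] in ("num", "var"):
            return str(node[1])
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = f"t{next(temps)}"          # fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = walk(tree)
    return code, result


if __name__ == "__main__":
    # a + b * 2
    tree = ("+", ("var", "a"), ("*", ("var", "b"), ("num", 2)))
    for line in to_tac(tree)[0]:
        print(line)
```

Each instruction has at most one operator, so nested expressions are broken into a sequence of simple steps joined by temporaries.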
Code optimization
• Replacing an inefficient sequence of
instructions with a better sequence of
instructions.
• Sometimes called code improvement.
• Code optimization can be done:
– after semantic analysis
• performed on a parse tree
– after intermediate code generation
• performed on intermediate code
– after code generation
• performed on target code
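Constant folding is one simple example of replacing instructions with better ones, shown here on intermediate code. The instruction format, a `(dest, left, op, right)` tuple whose operands are either integer constants or names, is an assumption of this sketch, as is the function name `fold_constants`.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(code):
    """Evaluate operations on constants at compile time.

    Returns the remaining instructions and a dict of names whose
    values were computed by the optimizer.
    """
    known = {}        # name -> constant value computed so far
    remaining = []
    for dest, left, op, right in code:
        # Substitute names that are already known to be constant.
        left = known.get(left, left)
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            known[dest] = OPS[op](left, right)   # folded: no instruction emitted
        else:
            known.pop(dest, None)                # dest is no longer a known constant
            remaining.append((dest, left, op, right))
    return remaining, known


if __name__ == "__main__":
    code = [("t1", 2, "*", 3), ("t2", "a", "+", "t1")]
    print(fold_constants(code))
```

In the demo, the multiplication disappears from the output and the addition uses the computed constant directly, so the program does less work at run time.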
Code generation
• A code generator
– takes either an intermediate code or a parse
tree
– produces a target program.
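A code generator that starts from intermediate code can be sketched as a per-instruction translation. The target here is a made-up single-accumulator machine; the mnemonics `LOAD`, `ADD`, `SUB`, `MUL`, and `STORE` are hypothetical and do not correspond to a real instruction set.

```python
# Hypothetical mnemonics for an imaginary accumulator machine.
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(code):
    """Translate (dest, left, op, right) instructions into assembly text."""
    asm = []
    for dest, left, op, right in code:
        asm.append(f"LOAD {left}")               # accumulator = left
        asm.append(f"{MNEMONIC[op]} {right}")    # accumulator = accumulator op right
        asm.append(f"STORE {dest}")              # dest = accumulator
    return asm


if __name__ == "__main__":
    code = [("t1", "b", "*", 2), ("t2", "a", "+", "t1")]
    print("\n".join(generate(code)))
```

The output is correct but naive: it stores `t1` and then immediately uses it, which is the kind of sequence a target-code optimizer could improve.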
Error Handling
• Errors can be found in every phase of compilation.
– Errors found during compilation are called static (or compile-time) errors.
– Errors found during execution are called dynamic (or run-time) errors.
• Compilers need to detect, report, and recover from errors found in source programs.
• Error handlers are different in different phases of the compiler.
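Cross Compiler
• A cross compiler is a compiler which generates target code for a different machine from the one on which the compiler runs.
• A host language is a language in which the compiler is written.
– T-diagram
• Cross compilers are used very often in practice.
[Figure: T-diagram with source language S and target language T on the arms and host language H at the base]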
Cross Compilers (cont’d)
• If we want a compiler from language A to language B on a machine with language E, we can:
– write it in E
– write it in D if we have a compiler from D to E on some machine
• This is better than the former approach if D is a high-level language but E is a machine language
– write one from G to B in E if we have a compiler from A to G written in E
[Figure: T-diagrams for the three approaches, using languages A, B, D, E, and G]
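The approaches above are all ways of combining T-diagrams, and the combination rules can be written down as a small model. In this sketch a compiler is just a (source, target, host) triple; the function names `compile_compiler` and `chain`, and the machine name "M", are assumptions for illustration.

```python
from typing import NamedTuple


class Compiler(NamedTuple):
    source: str   # language it accepts
    target: str   # language it produces
    host: str     # language it is written in


def compile_compiler(program, translator):
    """Run a compiler's own code through another compiler.

    The result translates the same languages but is now expressed
    in the translator's target language.
    """
    if program.host != translator.source:
        raise ValueError("translator cannot read the compiler's host language")
    return Compiler(program.source, program.target, translator.target)


def chain(first, second):
    """Use two compilers on the same host one after the other."""
    if first.target != second.source or first.host != second.host:
        raise ValueError("compilers do not fit together")
    return Compiler(first.source, second.target, first.host)


if __name__ == "__main__":
    # Second approach: write A->B in D, then compile it with a D->E compiler.
    print(compile_compiler(Compiler("A", "B", "D"), Compiler("D", "E", "M")))
    # Third approach: an existing A->G compiler in E plus a new G->B compiler in E.
    print(chain(Compiler("A", "G", "E"), Compiler("G", "B", "E")))
```

Both demo calls yield a compiler from A to B hosted in E, which is the goal stated in the text; note that `chain` models a two-stage pipeline rather than a single program.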
Porting
• Porting: constructing a compiler between a source and a target language that runs on one host language, starting from a version of it that runs on another host language
[Figure: T-diagrams for porting a compiler for language A from host H to host K]
Bootstrapping
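• If we have to implement, from scratch, a compiler from a high-level language A to a machine language, which is also the host language, we can use:
– direct method
– bootstrapping
[Figure: T-diagrams for the direct method (A to H, written in H) and for bootstrapping through intermediate languages A1, A2, A3]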
Cousins of Compilers
• Linkers
• Loaders
• Interpreters
• Assemblers
History (1930’s -40’s)
• 1930’s
– Alan Turing defined the Turing machine and computability.
• 1940’s
– John von Neumann described the concept of the stored-program computer.
– Many early electromechanical and electronic computers were constructed.
• ABC (Atanasoff-Berry Computer) at Iowa
• Z1-4 (by Zuse) in Germany
• ENIAC (programmed by a plug board)
History (1950’s)
• Many electronic, stored-program computers were designed.
– EDVAC (by von Neumann)
– ACE (by Turing)
• Programs were written in machine languages.
– Example: 0A 1F 83 90 4B (op code, address, ..)
• Later, programs were written in assembly languages instead.
– Assemblers translate symbolic code and memory addresses to machine code.
– Example:
    LDI B, 4
    LDI C, 3
    LDI A, 0
ST: ADI A, C
    DEC B
    JNZ B, ST
    STO 0XF0, A
• John Backus developed FORTRAN (no recursive calls) and the FORTRAN compiler.
• Noam Chomsky studied the structure of languages and classified them, by the grammars that generate them, into classes called the Chomsky hierarchy.
History (1960’s)
• Recursive-descent parsing was introduced.
• Naur designed Algol60, Pascal’s ancestor,
which allows recursive calls.
• Backus-Naur form (BNF) was used to
describe Algol60.
• LL(1) parsing was proposed by Lewis and
Stearns.
• General LR parsing was invented by Knuth.
• SLR parsing was developed by DeRemer.
History (1970’s)
• LALR was developed by DeRemer.
• Aho and Ullman founded the theory of LR
parsing techniques.
• Yacc (Yet Another Compiler Compiler) was
developed by Johnson.
• Type inference was studied by Milner.

Introduction to Compiler
