SMARTIAN: Enhancing Smart Contract Fuzzing with Static and Dynamic Data-Flow Analyses
CODE BLUE 2022
Jaeseung Choi (KAIST), Doyeon Kim (LINE Plus), Soomin Kim (KAIST), Gustavo Grieco (Trail of Bits), Alex Groce (Northern Arizona University), Sang Kil Cha (KAIST)
Ethereum Smart Contract
• Ethereum: most popular smart contract platform based on blockchain
• Smart contract = (code + data) on blockchain
[Figure: ether as digital cash on the blockchain; smart contract code runs on the EVM (Ethereum Virtual Machine).]
Smart Contract is Stateful
• A smart contract defines functions that a user can call.
• Each function can read or write state variables, which are persistent.
Smart contract:
    f(uint x) {
        state_v = ...;
        ...
    }
    g(uint y) {
        ... = state_v + 1;
        ...
    }
[Figure: a user calls f() and g(); both access the persistent state variable state_v.]
Smart Contract Security
• Bugs in smart contracts can cause a catastrophic loss of digital assets.
• Examples: the reentrancy attack on the DAO [1] and integer overflow attacks on ERC20 tokens.
• Need testing!
[Figure: illustrations of the two attacks, with a loss figure of $70M shown.]
[1] P. Daian, “Analysis of the DAO exploit,” https://hackingdistributed.com/2016/06/18/analysis-of-the-dao-exploit/
Static Program Analysis
• Approximates program behavior without actually executing the program.
• Can investigate various semantic properties.
• Example: can a buffer overflow occur?
[Figure: program code is given to a static analysis, which answers a question about it.]
Fuzz Testing (Fuzzing)
• Repeatedly executes the target program with random inputs.
• A simple but effective technique for finding vulnerabilities.
• Employed by major software companies (e.g., Google and Microsoft). Example: Google’s OSS-Fuzz [1,2].
[Figure: fuzzing loop. Inputs are mutated and fed to the program until a crash is observed.]
[1] https://github.com/google/oss-fuzz
[2] https://github.com/google/clusterfuzz
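The fuzzing loop described on this slide can be sketched in a few lines of Python. This is only an illustration: `target`, `mutate`, and `fuzz` are hypothetical names, and the "crash" is a Python exception raised by a toy program, not a real memory error.

```python
import random

def target(data: bytes) -> None:
    """Hypothetical program under test: it 'crashes' on one specific first byte."""
    if data[0] == 0x42:
        raise RuntimeError("crash")

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed: bytes, iterations: int, rng: random.Random):
    """Mutate and execute repeatedly; return the first crashing input, or None."""
    current = seed
    for _ in range(iterations):
        candidate = mutate(current, rng)
        try:
            target(candidate)
        except RuntimeError:
            return candidate
        current = candidate
    return None

crash_input = fuzz(b"\x00\x00\x00\x00", 100_000, random.Random(0))
```

Real fuzzers such as those behind OSS-Fuzz add coverage feedback and a seed corpus on top of this loop; the sketch keeps only the mutate-and-execute core.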
Challenge in Fuzzing
• For smart contracts, a test case (seed) is a sequence of function calls.
• Deciding the order of function calls is important in fuzzing.
Smart contract:
    f(uint x) {
        state_v = x;
    }
    g() {
        if (state_v == 31337) {
            bug();
        }
    }
f(0) --> g(): can trigger the bug with mutation
g() --> f(0): cannot trigger the bug with mutation
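To make the ordering argument concrete, here is a small Python model of the contract on this slide. The class and helper names (`Contract`, `run`, `mutate_args`) are assumptions made for illustration; `bug()` is modeled as setting a flag.

```python
class Contract:
    """Python model of the slide's contract; state_v persists across calls."""
    def __init__(self):
        self.state_v = 0
        self.bug_hit = False

    def f(self, x):
        self.state_v = x

    def g(self):
        if self.state_v == 31337:
            self.bug_hit = True  # stands in for bug()

def run(sequence):
    """Execute a seed, i.e. a list of (function name, args), on a fresh contract."""
    contract = Contract()
    for name, args in sequence:
        getattr(contract, name)(*args)
    return contract.bug_hit

def mutate_args(sequence, value):
    """Argument mutation only: replace the argument of f, keep the call order."""
    return [(name, (value,) if name == "f" else args) for name, args in sequence]

f_then_g = [("f", (0,)), ("g", ())]
g_then_f = [("g", ()), ("f", (0,))]

print(run(mutate_args(f_then_g, 31337)))  # the bug is reachable in this order
print(run(mutate_args(g_then_f, 31337)))  # g() runs before state_v is written
```

No choice of argument helps the second order, because g() reads state_v before f() has written it.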
Existing Approach
• Traditional coverage-based fuzzing cannot distinguish the two sequences, because both yield the same code coverage.
• Previous work relies on machine learning [1] or runtime heuristics [2].
Smart contract:
    f(uint x) {
        state_v = x;
    }
    g() {
        if (state_v == 31337) {
            bug();
        }
    }
f(0) --> g() and g() --> f(0): same code coverage
[1] J. He et al., “Learning to fuzz from symbolic execution with application to smart contracts,” CCS 2019
[2] V. Wüstholz et al., “Harvey: A greybox fuzzer for smart contracts,” FSE 2020
Why is Line Coverage Not Enough?
• Traditional code coverage (e.g., line coverage) may miss a critical seed.
1 f(uint x, uint y) {
2   if (x == 41)
3     state_v = y;
4 }
5 g( ) {
6   if (state_v == 61)
7     bug();
8 }
9 h( ) { ... }
S1: f(0,0) --> g()
S2: f(0,0) --> h()
S2': f(41,0) --> h() (covers Line 3)
S1': f(41,0) --> g() (covers Line 3)
S_bug: f(41,61) --> g()
• We can miss the critical intermediate seed S1': once S2' has covered Line 3, S1' adds no new line coverage, although only S1' covers the data-flow Line 3 --> state_v --> Line 6 that leads to S_bug.
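The seed-selection problem on this slide can be replayed with a small Python model. The functions `run` and `keep_new_coverage` are hypothetical, and the "keep a seed only if it covers a new line" policy is a simplified stand-in for a coverage-guided fuzzer.

```python
def run(seed):
    """Execute a seed on a Python model of the contract; return covered line numbers."""
    state_v = 0
    covered = set()
    for name, args in seed:
        if name == "f":
            x, y = args
            covered.add(2)
            if x == 41:
                covered.add(3)
                state_v = y
        elif name == "g":
            covered.add(6)
            if state_v == 61:
                covered.add(7)
        elif name == "h":
            covered.add(9)
    return covered

def keep_new_coverage(seeds):
    """Keep a seed only if it covers a line that no earlier seed covered."""
    seen, kept = set(), []
    for label, seed in seeds:
        coverage = run(seed)
        if not coverage <= seen:
            kept.append(label)
            seen |= coverage
    return kept

seeds = [
    ("S1",  [("f", (0, 0)),  ("g", ())]),
    ("S2",  [("f", (0, 0)),  ("h", ())]),
    ("S2'", [("f", (41, 0)), ("h", ())]),
    ("S1'", [("f", (41, 0)), ("g", ())]),
]
kept = keep_new_coverage(seeds)
print(kept)
```

In this model S1' is discarded because Lines 2, 3, and 6 were each covered by an earlier seed, even though S1' is the only seed one argument mutation away from S_bug.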
Our Approach: Static Analysis
• Statically analyze data-flows between functions.
• Initialize fuzzing seeds to have promising function call orders.
Smart contract:
    f(uint x) {
        state_v = x;
    }
    g() {
        if (state_v == 31337) {
            bug();
        }
    }
Of the two orders f(0) --> g() and g() --> f(0), static analysis identifies f(0) --> g() as the promising sequence.
Our Work
• Integrate static analysis with fuzzing.
• Collect program knowledge that can improve fuzzing performance.
[Figure: static analysis of the program code combined (+) with the fuzzing loop of inputs, mutation, program, and crash.]
Our System: Smartian
[Figure: Smartian architecture. Components: contract code, static analyzer, initial seed pool, fuzzer with dynamic analysis, bugs.]
• Smartian runs on bytecode: the contract source (Src) is compiled to bytecode (Byte).
Binary-Only Smart Contract Fuzzing
• Smart contracts are deployed to the blockchain in bytecode form.
• For some contracts on the blockchain, the source code may be unavailable.
• Binary-only fuzzing broadens the range of testing targets.
ABI Specification
• During compilation, an ABI file is generated along with the bytecode.
• The ABI contains various information, e.g., the types of function parameters.
• Only the bytecode is uploaded to the blockchain.
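As an illustration of the parameter-type information an ABI file carries, the sketch below parses a hand-written ABI in the JSON format emitted by the Solidity compiler. The two entries describe hypothetical functions f(uint, uint) and g(), and `function_signatures` is an assumed helper name, not part of Smartian.

```python
import json

# Hand-written example ABI; `uint` in Solidity source appears as `uint256` here.
ABI_JSON = """
[
  {"type": "function", "name": "f",
   "inputs": [{"name": "x", "type": "uint256"}, {"name": "y", "type": "uint256"}],
   "outputs": [], "stateMutability": "nonpayable"},
  {"type": "function", "name": "g", "inputs": [], "outputs": [],
   "stateMutability": "nonpayable"}
]
"""

def function_signatures(abi_text):
    """Map each function name to the list of its parameter types."""
    return {
        entry["name"]: [param["type"] for param in entry["inputs"]]
        for entry in json.loads(abi_text)
        if entry.get("type") == "function"
    }

signatures = function_signatures(ABI_JSON)
print(signatures)
```

A fuzzer can use such a map to generate well-typed arguments for each call in a seed.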
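Analyzing State Variable Access
• Contract bytecode runs on a stack-based machine called the EVM.
• We must figure out the operands of storage access instructions.
Bytecode:
    PUSH 20
    ADD
    ...
    SLOAD // Storage load
[Figure: EVM with stack, memory, and storage. The stack initially holds 100 (top) and 200; PUSH 20 followed by ADD leaves 20 + 100 = 120 on top, so SLOAD reads storage slot 120, which holds state_v.]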
High Level Design
• We run a flow-sensitive analysis for each function.
− It approximates the state of the EVM along the execution.
• We identify which state variables are loaded and stored by each function, based on its SLOAD and SSTORE instructions.
[Figure: per-function summaries recovered from the bytecode.]
f(…): Store: var_x, var_y
g(…): Load: var_x
h(…): Load: var_y
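The operand-recovery idea can be sketched as a tiny abstract interpreter over a stack of known constants. This is a minimal sketch under strong assumptions: it handles only straight-line code and four opcodes, and the tuple-based instruction format and the name `summarize` are hypothetical. The actual analysis in Smartian works on real EVM bytecode and is flow-sensitive across branches.

```python
UNKNOWN = None  # abstract value for anything that is not a known constant

def summarize(instructions, initial_stack=()):
    """Return (loads, stores): storage slots read by SLOAD / written by SSTORE."""
    stack = list(initial_stack)  # the top of the stack is the end of the list
    loads, stores = set(), set()

    def pop():
        return stack.pop() if stack else UNKNOWN

    for op, *operand in instructions:
        if op == "PUSH":
            stack.append(operand[0])
        elif op == "ADD":
            a, b = pop(), pop()
            known = a is not UNKNOWN and b is not UNKNOWN
            stack.append(a + b if known else UNKNOWN)
        elif op == "SLOAD":
            loads.add(pop())       # operand: the storage slot
            stack.append(UNKNOWN)  # the loaded value itself is not tracked
        elif op == "SSTORE":
            slot = pop()           # the slot is on top, the value below it
            pop()
            stores.add(slot)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return loads, stores

# Stack holds 100 (top) and 200; PUSH 20; ADD; SLOAD reads slot 20 + 100.
loads, stores = summarize(
    [("PUSH", 20), ("ADD",), ("SLOAD",)], initial_stack=(200, 100)
)
print(loads, stores)
```

Running `summarize` once per function yields the kind of Load/Store summary listed for f, g, and h.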
Generating Initial Seeds for Fuzzing
• Identify function call orders that may produce data-flows across functions.
• Ensure that at least one seed includes each identified order.
f(…): Store: var_x, var_y
g(…): Load: var_x
h(…): Load: var_y
Data-flows: f() -> g(), f() -> h()
[Figure: the initial seed pool is generated from these data-flows.]
Seed Initialization Algorithm
• Funcs: the set of identified functions.
• Defs: a map from each identified function to the state variables defined by the function.
• Uses: a map from each identified function to the state variables used by the function.
• DataFlowGain: the function-level data flows of a given sequence, as triples <f1, v, f2>, where (1) f1 and f2 are functions that appear in the sequence, (2) f1 defines v, and (3) f2 uses that v.
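The definitions of Funcs, Defs, Uses, and DataFlowGain can be turned into a short Python sketch. This is not the algorithm from the paper: it is a simplified greedy pass over two-call sequences, and it assumes that f1 must be called before f2 for a flow to count. The names `data_flow_gain` and `initial_seeds` are hypothetical.

```python
from itertools import permutations

def data_flow_gain(sequence, defs, uses):
    """Triples (f1, v, f2): f1 is called before f2, f1 defines v, f2 uses v."""
    gain = set()
    for i, f1 in enumerate(sequence):
        for f2 in sequence[i + 1:]:
            for v in defs.get(f1, set()) & uses.get(f2, set()):
                gain.add((f1, v, f2))
    return gain

def initial_seeds(funcs, defs, uses):
    """Greedy sketch: keep two-call orders that add a data flow not covered yet."""
    covered, seeds = set(), []
    for pair in permutations(sorted(funcs), 2):
        gain = data_flow_gain(pair, defs, uses)
        if gain - covered:
            seeds.append(list(pair))
            covered |= gain
    return seeds

funcs = {"f", "g", "h"}
defs = {"f": {"var_x", "var_y"}}
uses = {"g": {"var_x"}, "h": {"var_y"}}
seeds = initial_seeds(funcs, defs, uses)
print(seeds)
```

With the Store/Load summaries of f, g, and h from the earlier slide, only the orders f --> g and f --> h produce a gain, matching the data-flows f()->g() and f()->h().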
Dynamic Data-Flow Analysis
• We should mutate function arguments to realize the expected data-flows.
• For this, we dynamically analyze concrete data-flows and use them as feedback.
1 f(uint x, uint y) {
2   if (x == 41)
3     state_v = y;
4 }
5 g( ) {
6   if (state_v == 61)
7     bug();
8 }
9 h( ) { ... }
S1: f(0,0) --> g() (initial seed)
S1': f(41,0) --> g() (intermediate seed; realizes the data-flow Line 3 --> state_v --> Line 6)
S_bug: f(41,61) --> g()
Mutation path: S1 --> S1' --> S_bug
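A sketch of data-flow feedback on the same contract follows. It records a flow (definition line, variable, use line) whenever g() reads a value of state_v that Line 3 wrote earlier in the sequence. The tracking scheme and the names `run` and `keep_new_flows` are assumptions for illustration; Smartian's real instrumentation operates on EVM bytecode.

```python
def run(seed):
    """Return (realized data-flows, bug flag) for a seed on the modeled contract."""
    state_v, last_def = 0, None  # last_def: line that most recently wrote state_v
    flows, bug = set(), False
    for name, args in seed:
        if name == "f":
            x, y = args
            if x == 41:
                state_v, last_def = y, 3
        elif name == "g":
            if last_def is not None:
                flows.add((last_def, "state_v", 6))
            if state_v == 61:
                bug = True
    return flows, bug

def keep_new_flows(mutants):
    """Keep a mutant if it realizes a data-flow that no earlier mutant realized."""
    seen, kept = set(), []
    for label, seed in mutants:
        flows, _ = run(seed)
        if not flows <= seen:
            kept.append(label)
            seen |= flows
    return kept

mutants = [
    ("S2'", [("f", (41, 0)), ("h", ())]),
    ("S1'", [("f", (41, 0)), ("g", ())]),
]
kept = keep_new_flows(mutants)
s_bug = [("f", (41, 61)), ("g", ())]
print(kept, run(s_bug)[1])
```

Under this feedback S1' is kept even if S2' was seen first, so a later mutation of its second argument can reach S_bug.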
Bug Oracles for Fuzzing
• Smart contract bugs (mostly) do not cause a crash.
− We must implement bug oracles that monitor the execution.
• Smartian implements bug oracles for 13 classes of bugs.
− We surveyed previous work on finding bugs in smart contracts.
Bug Oracles
• Assertion Failure (AF): The condition of an assert statement is not satisfied.
− Check if an INVALID instruction is executed.
• Arbitrary Write (AW): An attacker can overwrite arbitrary storage data by accessing a mismanaged array object.
− Check if someone accesses storage data at a location that is larger than the length of the storage.
− Same bug oracle as Harvey [1].
• Requirement Violation (RV): The condition of a require statement is not satisfied.
− Check if a REVERT instruction is executed.
[1] V. Wüstholz and M. Christakis, “Harvey: A greybox fuzzer for smart contracts,” in Proceedings of the International Symposium on Foundations of Software Engineering: Industry Papers, 2020.
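The AF and RV oracles amount to scanning the executed instructions for a specific opcode. A minimal sketch, assuming the execution trace is available as a list of opcode names (a hypothetical format) and using the assumed helper name `check_trace`:

```python
def check_trace(opcodes):
    """Return the set of alarms raised by a list of executed opcode names."""
    alarms = set()
    for op in opcodes:
        if op == "INVALID":
            alarms.add("AF")  # assertion failure
        elif op == "REVERT":
            alarms.add("RV")  # requirement violation
    return alarms

print(check_trace(["PUSH1", "SLOAD", "INVALID"]))
```

This is what distinguishes an oracle from crash detection: the execution ends normally from the fuzzer's point of view, and the alarm comes from inspecting what was executed.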
Bug Oracles
• Block State Dependency (BD): Block states decide the ether transfer of a contract.
− Check if a block state (e.g., TIMESTAMP, NUMBER) can affect an ether transfer, tracing both direct and indirect taint flows.
• Control-Flow Hijack (CH): An attacker can arbitrarily control the destination of a JUMP or DELEGATECALL instruction.
− Raise an alarm if someone can set the destination contract of a DELEGATECALL to an arbitrary user contract.
− Raise an alarm if the destination of a JUMP instruction can be manipulated.
Bug Oracles
• Ether Leak (EL): A contract allows an arbitrary user to freely retrieve ether from the contract.
− Check if a normal user can gain ether by sending transactions to the contract, only when the transaction sequence has no preceding transaction from the deployer.
• Freezing Ether (FE): A contract can receive ether but has no means to send ether out.
− Check if there is no way to transfer ether to someone during the execution while the contract balance is greater than zero.
− Same bug oracle as ContractFuzzer [1].
[1] B. Jiang, Y. Liu, and W. K. Chan, “ContractFuzzer: Fuzzing smart contracts for vulnerability detection,” in Proceedings of the International Conference on Automated Software Engineering, 2018.
Bug Oracles
• Mishandled Exception (ME): A contract does not check for an exception when calling external functions or sending ether.
− Taint the return value of a CALL instruction and check if it flows into the predicate of a JUMPI instruction.
− If there is a return value that is not used by a JUMPI, we report an alarm.
• Multiple Send (MS): A contract sends out ether multiple times within one transaction. This is a specific case of DoS.
− Detect multiple ether transfers taking place in a single transaction.
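The ME and MS checks can be sketched over a simplified event trace of one transaction. The event format is hypothetical: each CALL carries the transferred value and a taint label for its return value, and each JUMPI lists the labels that reach its predicate. Real taint tracking on EVM bytecode is considerably more involved.

```python
def multiple_send(events):
    """MS: more than one ether transfer (CALL with value > 0) in one transaction."""
    transfers = [e for e in events if e["op"] == "CALL" and e["value"] > 0]
    return len(transfers) > 1

def mishandled_exception(events):
    """ME: some CALL return value never reaches the predicate of a JUMPI."""
    unchecked = set()
    for e in events:
        if e["op"] == "CALL":
            unchecked.add(e["ret"])        # taint label of the return value
        elif e["op"] == "JUMPI":
            unchecked -= set(e["taints"])  # labels flowing into the predicate
    return bool(unchecked)

tx = [
    {"op": "CALL", "value": 1, "ret": "r0"},
    {"op": "JUMPI", "taints": ["r0"]},
    {"op": "CALL", "value": 2, "ret": "r1"},
]
print(multiple_send(tx), mishandled_exception(tx))
```

In the example transaction both alarms fire: there are two ether transfers, and the return value r1 of the second CALL is never checked.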
Bug Oracles
• Integer Bug (IB): Integer overflows or underflows occur, and the result becomes an unexpected value.
− Check if the over/underflowed value is used in critical variables.
• Reentrancy (RE): A function in a victim contract is re-entered, which leads to a race condition on state variables.
− First, monitor if there is a cyclic call chain during an ether transfer.
− Then, use taint analysis to identify state variables that affect this ether transfer.
− Finally, report if such variables are updated after the transfer takes place.
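The first half of the IB check, detecting that an arithmetic result wrapped around, can be written down directly, since EVM words are 256 bits wide. The helper names are hypothetical, and the sketch does not cover the second half of the oracle, i.e. tracking whether the wrapped value is later used in a critical variable.

```python
WORD = 2 ** 256  # EVM words are 256 bits wide

def checked_add(a, b):
    """Return (wrapped result, overflow flag) for an unsigned 256-bit addition."""
    total = a + b
    return total % WORD, total >= WORD

def checked_sub(a, b):
    """Return (wrapped result, underflow flag) for an unsigned 256-bit subtraction."""
    return (a - b) % WORD, a < b

print(checked_add(WORD - 1, 1))
print(checked_sub(0, 1)[1])
```

Flagging every wrap-around would be noisy, which is presumably why the oracle additionally requires the value to reach a critical variable.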
Bug Oracles
• Suicidal Contract (SC): An arbitrary user can destroy a victim contract by running a SELFDESTRUCT instruction.
− Check if a normal user can execute a SELFDESTRUCT instruction and destroy the contract.
− Filter out sequences that have any preceding transaction from the deployer.
• Transaction Origin Use (TO): A contract relies on the origin of a transaction (i.e., tx.origin) for user authorization.
− Taint the return value of the ORIGIN instruction, and check if it flows into the predicate of a JUMPI instruction.
Implementation
• Static analysis module
− Used B2R2 [1] as a front-end for EVM bytecode.
− Wrote the main analysis logic in 1K lines of F# code.
• Fuzzing module
− Extended Eclipser [2] to support EVM bytecode.
− Used Nethermind [3] for the emulation of the bytecode.
[1] M. Jung et al., “B2R2: Building an efficient front-end for binary analysis,” NDSS BAR 2019
[2] J. Choi et al., “Grey-box Concolic Testing on Binary Code,” ICSE 2019
[3] “Nethermind,” https://github.com/NethermindEth/nethermind
Evaluation
• Q1. Can static & dynamic data-flow analyses improve fuzzing?
• Q2. Can Smartian outperform other testing tools for smart contracts?
• Q3. How does Smartian perform on a large-scale benchmark?
Experimental Setup
• Benchmarks
− Used the datasets from VeriSmart [1] and SmartBugs [2]
• Comparison targets
− Two fuzzers (sFuzz, ILF) and two symbolic executors (Mythril, Manticore)
• Environment
− Used a Docker container to run each tool on a single contract
[1] S. So et al., “VeriSmart: A highly precise safety verifier for Ethereum smart contracts,” S&P 2020
[2] T. Durieux et al., “Empirical review of automated analysis tools on 47,587 Ethereum smart contracts,” ICSE 2020
Impact of Data-Flow Analyses
• VeriSmart [1] benchmark: 58 real-world contracts with integer overflow CVEs
• Compare three different modes of Smartian
[1] S. So et al., “VeriSmart: A highly precise safety verifier for Ethereum smart contracts,” S&P 2020
What about Dynamic Analysis Only?
• VeriSmart [1] benchmark: 58 real-world contracts with integer overflow CVEs
• Compare four different modes of Smartian
[1] S. So et al., “VeriSmart: A highly precise safety verifier for Ethereum smart contracts,” S&P 2020
Comparison against Other Tools (1)
• Used a subset of the previous benchmark
• Compared against tools that support integer overflow detection (ILF: no support)
Comparison against Other Tools (2)
• SmartBugs [1] benchmark: contracts with labeled bugs
− Selected three bug classes: block state dependency, mishandled exception, reentrancy
More in the Paper
• More experimental results
− Coverage measurement
− Consideration of different bug oracles
− Large-scale experiment
Future Work
• Improving the precision of the static analysis
• Automatically inferring the ABI specification of a contract
• Applying our idea to other domains
Open Science
• Smartian is available at https://github.com/SoftSec-KAIST/Smartian
• We also release the artifacts for our evaluation
Questions?
