CodeQL
Mostafa Sattari

Agenda
● Introduction
● Installation
● Writing Queries
● Example

Intro
void fire_thrusters(double vectors[12]) {
  for (int i = 0; i < 12; i++) {
    ... vectors[i] ...
  }
}
double thruster[3] = ... ;
fire_thrusters(thruster);
● In C, array-typed parameters decay to pointer types.
● The declared size is ignored!
● There is no protection against passing a mismatched array.

Intro
● …to find all instances of the problem.
import cpp
from Function f, FunctionCall c, int i, int a, int b
where f = c.getTarget()
  and a = c.getArgument(i).getType().(ArrayType).getArraySize()
  and b = f.getParameter(i).getType().(ArrayType).getArraySize()
  and a < b
select c.getArgument(i),
  "Array of size " + a + " passed to $@, which expects an array of size " + b + ".",
  f, f.getName()
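To make the join performed by the query above more concrete, here is a rough Python model of it. This is only a sketch: the `Function` and `Call` records and the sample data are hypothetical stand-ins for what a CodeQL database would hold, not part of any CodeQL API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Function:
    name: str
    # Declared array size of each parameter (None if the parameter is not an array).
    param_sizes: Tuple[Optional[int], ...]


@dataclass(frozen=True)
class Call:
    target: Function
    # Array size of each argument expression (None if it is not an array).
    arg_sizes: Tuple[Optional[int], ...]


def mismatched_calls(calls: List[Call]) -> List[str]:
    """Report every argument whose array is smaller than the parameter expects."""
    messages = []
    for c in calls:
        f = c.target                      # f = c.getTarget()
        for a, b in zip(c.arg_sizes, f.param_sizes):
            # Only array-typed argument/parameter pairs take part, and only if a < b.
            if a is not None and b is not None and a < b:
                messages.append(
                    f"Array of size {a} passed to {f.name}, "
                    f"which expects an array of size {b}."
                )
    return messages


fire_thrusters = Function("fire_thrusters", (12,))
calls = [Call(fire_thrusters, (3,)), Call(fire_thrusters, (12,))]
print(mismatched_calls(calls))
```

Only the first call is reported, because it passes a 3-element array where 12 elements are expected.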

Intro
● CodeQL consists of:
  – QL: the query language of the CodeQL code analysis platform
  – CLI: runs queries
  – Libraries: QL libraries
  – Databases: contain everything needed to run the queries
● Used for variant analysis

Analysis overview

Intro - Tools
● Standalone CodeQL CLI
● Interactive Query Console (lgtm.com)
● IDE extensions
  – Eclipse
  – VSCode

Intro - CLI

Intro – Interactive Query Console

Intro – VSCode Extension

Installation
● What you need to run queries
  – CodeQL CLI tool
  – Query libraries
  – A database
● Install the CodeQL CLI
or
● Install VSCode and the CodeQL extension
  – Alternatively, Eclipse + the CodeQL extension
  – Add the CodeQL CLI to your environment
    ● ~/.config/Code/User/globalStorage/github.vscode-codeql/distribution1/codeql/codeql
    ● The path might vary on your machine

Installation
● Running codeql database create
● Importing a database from lgtm.com
$ codeql database create <database> --language=<language-identifier>
--language: cpp/csharp/go/java/python/javascript
--source-root: the root folder for the primary source files (default = current directory).
--command: for compiled languages only, the build commands that invoke the compiler.
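As a small illustration of how the options above fit together, the following Python sketch assembles the argument list for `codeql database create`. It only builds and prints the command; the database name `thrusters-db`, the source folder `src` and the build command `make` are made-up values for the example.

```python
import shlex
from typing import List, Optional


def build_create_command(database: str, language: str,
                         source_root: Optional[str] = None,
                         build_command: Optional[str] = None) -> List[str]:
    """Assemble the argument list for `codeql database create`."""
    argv = ["codeql", "database", "create", database, f"--language={language}"]
    if source_root is not None:
        # Root folder of the primary source files (defaults to the current directory).
        argv.append(f"--source-root={source_root}")
    if build_command is not None:
        # Only meaningful for compiled languages: the command that invokes the compiler.
        argv.append(f"--command={build_command}")
    return argv


# Hypothetical values, for illustration only.
argv = build_create_command("thrusters-db", "cpp", source_root="src", build_command="make")
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would run the command, assuming the `codeql` executable is on the PATH.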

Writing QL Queries
● QL
  – Logic programming language
  – Built up of logical formulas
  – Object-oriented
● Basic syntax
from /* ... variable declarations ... */
where /* ... logical formulas ... */
select /* ... expressions ... */
// Example:
from int x, int y
where x = 6 and y = 7
select x * y
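For readers coming from imperative languages, the from/where/select shape can be loosely compared to a filtered comprehension. The Python sketch below mirrors the example query; the finite `candidates` range is an assumption made for the sketch, since QL itself binds `x` and `y` through the equalities rather than by enumerating integers.

```python
# Hypothetical finite search space; QL binds x and y directly via `x = 6` and `y = 7`.
candidates = range(10)

results = [
    x * y                      # select: the expression to report
    for x in candidates        # from: variable declarations
    for y in candidates
    if x == 6 and y == 7       # where: the logical formula that must hold
]
print(results)  # [42]
```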

Writing QL Queries
● Python
import python
from Function f
where count(f.getAnArg()) > 7
select f
● Java
import java
from Parameter p
where not exists(p.getAnAccess())
select p
● JavaScript
import javascript
from Comment c
where c.getText().regexpMatch("(?si).*\\bTODO\\b.*")
select c
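The JavaScript query is meant to match comments that contain TODO as a whole word, using `\b` word boundaries in the pattern. The Python sketch below tries the same regular expression on a few sample strings; it assumes, as I understand `regexpMatch`, that the pattern must match the entire text, which is why it is wrapped in `.*` on both sides.

```python
import re

# Same pattern as in the query, written as a Python raw string.
# (?s) lets '.' match newlines, (?i) ignores case, \b is a word boundary.
TODO_PATTERN = re.compile(r"(?si).*\bTODO\b.*")


def is_todo_comment(text: str) -> bool:
    """Return True if the whole comment text matches the TODO pattern."""
    return TODO_PATTERN.fullmatch(text) is not None


for sample in ["// todo: fix\n later", "mastodon", "TODOS remain"]:
    print(repr(sample), is_todo_comment(sample))
```

Only the first sample matches: in the other two, the letters "todo" are embedded in a longer word, so the word boundaries fail.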

Writing QL Queries
● Formulas
<expression> <operator> <expression> // Comparison
<expression> instanceof <type> // Type check
<expression> in <range> // Range check
exists(<variable declarations> | <formula>)
forall(<variable declarations> | <formula 1> | <formula 2>)
  – Two formulas in the body: it holds if <formula 2> holds for all values that <formula 1> holds for.
forex(<variable declarations> | <formula 1> | <formula 2>)
  – Equivalent to forall(<vars> | <formula 1> | <formula 2>) and exists(<vars> | <formula 1> | <formula 2>)
● Aggregates
  – Common aggregates are count, max, min, avg (average) and sum.
from Person t
where t.getAge() = max(int i | exists(Person p | p.getAge() = i) | i)
select t
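The quantifiers and the max aggregate above can be approximated in Python with `any`, `all` and `max`. This is only a sketch of the semantics over a small, made-up list of `Person` records; the helper functions are illustrative and not part of QL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    age: int


people = [Person("Ann", 34), Person("Bob", 71), Person("Cy", 71)]


def exists(values, formula):
    """exists(v | formula): the formula holds for at least one value."""
    return any(formula(v) for v in values)


def forall(values, formula1, formula2):
    """forall(v | f1 | f2): f2 holds wherever f1 holds (vacuously true if f1 never holds)."""
    return all(formula2(v) for v in values if formula1(v))


def forex(values, formula1, formula2):
    """forex(v | f1 | f2): forall, plus at least one value satisfying both formulas."""
    return forall(values, formula1, formula2) and exists(
        values, lambda v: formula1(v) and formula2(v))


# Nobody is over 100, so forall holds vacuously while forex does not.
print(forall(people, lambda p: p.age > 100, lambda p: False))
print(forex(people, lambda p: p.age > 100, lambda p: False))

# The aggregate query: every person whose age equals the maximum age.
oldest = [p for p in people if p.age == max(q.age for q in people)]
print([p.name for p in oldest])
```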

Writing QL Queries
● Predicates
  – The name of a predicate always starts with a lowercase letter.
  – You can also define predicates with a result. In that case, the keyword predicate is replaced with the type of the result. This is like introducing a new argument, the special variable result. For example, int getAge() { result = ... } returns an int.
predicate southern(Person p) {
  p.getLocation() = "south"
}
from Person p
where southern(p)
select p
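One way to picture the two kinds of predicate is as a boolean test versus a relation that may produce zero, one or many results. The Python sketch below models that difference; the `Person` record, its `children` field and the `get_a_child` helper are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Person:
    name: str
    location: str
    children: Tuple[str, ...] = ()


def southern(p: Person) -> bool:
    """Predicate without a result: it either holds for p or it does not."""
    return p.location == "south"


def get_a_child(p: Person) -> Iterator[str]:
    """Predicate with a result: every yielded value is one possible `result`."""
    yield from p.children


people = [
    Person("Ann", "south", ("Bob", "Cy")),
    Person("Dan", "north"),
]

# from Person p where southern(p) select p
southerners = [p.name for p in people if southern(p)]
print(southerners)

# A predicate with a result can have several results, or none at all.
print(list(get_a_child(people[0])), list(get_a_child(people[1])))
```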

Writing QL Queries
● Classes
  – instanceof
● You might be tempted to think of the characteristic predicate as a constructor. However, this is not the case: it is a logical property which does not create any objects.
class Southerner extends Person {
  Southerner() { southern(this) }
}
from Southerner s
select s
class Child extends Person {
  /* the characteristic predicate */
  Child() { this.getAge() < 10 }
  /* a member predicate */
  override predicate isAllowedIn(string region) {
    region = this.getLocation()
  }
}
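The point that a characteristic predicate selects existing values rather than constructing new ones can be sketched in Python as a filter over an existing collection. The `Person` record and the sample data below are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    age: int
    location: str


people = [
    Person("Ann", 34, "south"),
    Person("Ben", 8, "north"),
    Person("Cleo", 6, "south"),
]


def is_child(p: Person) -> bool:
    """Plays the role of the characteristic predicate Child() { this.getAge() < 10 }."""
    return p.age < 10


def is_allowed_in(child: Person, region: str) -> bool:
    """Plays the role of the member predicate isAllowedIn(string region)."""
    return region == child.location


# The "class" Child is just the subset of existing persons for which is_child holds.
children = [p for p in people if is_child(p)]
print([c.name for c in children])

# No new objects were created: every child is one of the original persons.
print(all(any(c is p for p in people) for c in children))
```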

Writing QL Queries
● Annotations
  – abstract, final, override, private, ...
● Recursion
  – Transitive closure +
  – Reflexive transitive closure *
● Name Resolution
  – Qualified references (import examples.security.MyLibrary)
  – Selections (<module_expression>::<name>)
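The difference between the two closure operators listed under Recursion is whether zero steps are allowed. A minimal Python sketch over a made-up parent-to-children relation (the names and the `plus`/`star` helpers are illustrative only):

```python
from typing import Dict, Set

# Hypothetical relation: each person maps to the set of their direct children.
children_of: Dict[str, Set[str]] = {
    "Ann": {"Bob"},
    "Bob": {"Cy", "Di"},
    "Cy": set(),
    "Di": set(),
}


def plus(relation: Dict[str, Set[str]], start: str) -> Set[str]:
    """Transitive closure (+): everything reachable in one or more steps."""
    reached: Set[str] = set()
    frontier = set(relation.get(start, set()))
    while frontier:
        node = frontier.pop()
        if node not in reached:
            reached.add(node)
            frontier |= relation.get(node, set())
    return reached


def star(relation: Dict[str, Set[str]], start: str) -> Set[str]:
    """Reflexive transitive closure (*): zero or more steps, so start is included."""
    return {start} | plus(relation, start)


print(sorted(plus(children_of, "Ann")))   # descendants of Ann
print(sorted(star(children_of, "Cy")))    # Cy has no children, but * still includes Cy
```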

Variant analysis
● Control flow analysis (CFA) lets you inspect how the different parts of the source code are executed, and in which order. It is useful for finding vulnerable code paths that are only executed under unlikely circumstances.
● Data flow analysis (DFA) tracks data from a source, where it enters an application, to a sink, where the data is used in a potentially harmful way if it is not sanitized along the way.
● Taint tracking applies data flow analysis to untrusted (tainted) data that is under partial or full control of a user. The tainted data is tracked from the source through method calls and variable assignments, including containers and class members, to a sink.
● Range analysis (or bounds analysis) investigates which values a variable can hold, and which values it will never hold. This is useful information in various lines of investigation.
● Semantic code search lets you quickly interrogate a codebase and identify areas of interest for further investigation. It is valuable for identifying methods with a particular signature, or variables that may contain credentials.
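As a toy illustration of taint tracking with a sanitizer, the Python sketch below walks a made-up straight-line program and reports the sinks that receive tainted data. The program representation and the operation names are assumptions for this sketch; real analyses work on a full data flow graph rather than a list of steps.

```python
from typing import List, Set, Tuple

# Hypothetical program: each step is (target variable, operation, operand variables).
Step = Tuple[str, str, Tuple[str, ...]]

program: List[Step] = [
    ("name", "source", ()),           # untrusted data enters here
    ("path", "concat", ("name",)),    # taint propagates through the assignment
    ("safe", "sanitize", ("path",)),  # the sanitizer removes the taint
    ("out1", "sink", ("path",)),      # tainted data reaches a sink
    ("out2", "sink", ("safe",)),      # sanitized data reaches a sink
]


def tainted_sinks(steps: List[Step]) -> List[str]:
    """Return the sinks that are reached by unsanitized data from a source."""
    tainted: Set[str] = set()
    hits: List[str] = []
    for target, op, operands in steps:
        if op == "source":
            tainted.add(target)
        elif op == "sanitize":
            tainted.discard(target)
        elif op == "sink":
            if any(o in tainted for o in operands):
                hits.append(target)
        elif any(o in tainted for o in operands):
            tainted.add(target)       # any other operation passes taint along
        else:
            tainted.discard(target)   # overwritten with untainted data
    return hits


print(tainted_sinks(program))
```

Only `out1` is reported, because `out2` receives the sanitized value.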

Variant analysis - Modules
● semmle.code.cpp.dataflow.DataFlow
  – isSource: defines where data may flow from
  – isSink: defines where data may flow to
  – hasFlow: performs the analysis
● semmle.code.cpp.dataflow.TaintTracking
  – isSanitizerGuard: optional, restricts the taint flow

Variant analysis
● Analyzing data flow in C/C++
import cpp
import semmle.code.cpp.dataflow.TaintTracking
class MyTaintTrackingConfiguration extends TaintTracking::Configuration {
  MyTaintTrackingConfiguration() { this = "MyTaintTrackingConfiguration" }
  override predicate isSource(DataFlow::Node source) {
    ...
  }
  override predicate isSink(DataFlow::Node sink) {
    ...
  }
}

Variant analysis - Example
import cpp
import semmle.code.cpp.dataflow.DataFlow
class EnvironmentToFileConfiguration extends DataFlow::Configuration {
  EnvironmentToFileConfiguration() { this = "EnvironmentToFileConfiguration" }
  override predicate isSource(DataFlow::Node source) {
    exists(Function getenv |
      source.asExpr().(FunctionCall).getTarget() = getenv and
      getenv.hasQualifiedName("getenv")
    )
  }
  override predicate isSink(DataFlow::Node sink) {
    exists(FunctionCall fc |
      sink.asExpr() = fc.getArgument(0) and
      fc.getTarget().hasQualifiedName("fopen")
    )
  }
}
from Expr getenv, Expr fopen, EnvironmentToFileConfiguration config
where config.hasFlow(DataFlow::exprNode(getenv), DataFlow::exprNode(fopen))
select fopen, "This 'fopen' uses data from $@.", getenv, "call to 'getenv'"
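Conceptually, the configuration above asks whether some path in the data flow graph leads from a `getenv` call to the first argument of `fopen`. The Python sketch below models that as a breadth-first search; the graph, the node labels and the helper functions are all made up for illustration and do not reflect how CodeQL represents data flow internally.

```python
from collections import deque
from typing import Dict, List

# Hypothetical data flow graph: each node maps to the nodes its value flows into.
flow_edges: Dict[str, List[str]] = {
    'getenv("HOME")': ["home"],
    "home": ["path"],
    "path": ["fopen arg 0"],
    "argv[1]": ["mode"],
    "mode": ["fopen arg 1"],
}


def is_source(node: str) -> bool:
    """Plays the role of isSource: calls to getenv."""
    return node.startswith("getenv(")


def is_sink(node: str) -> bool:
    """Plays the role of isSink: the first argument of fopen."""
    return node == "fopen arg 0"


def has_flow(source: str, sink: str) -> bool:
    """Plays the role of hasFlow: is the sink reachable from the source?"""
    if not (is_source(source) and is_sink(sink)):
        return False
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for nxt in flow_edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(has_flow('getenv("HOME")', "fopen arg 0"))
print(has_flow("argv[1]", "fopen arg 0"))
```

The first check succeeds because the environment value flows into the file name; the second fails because `argv[1]` is not a source under this configuration.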

Recaps
● Almost all materials are borrowed from Semmle.com
  – https://help.semmle.com/QL/learn-ql/index.html
  – https://help.semmle.com/QL/ql-training/cpp/intro-ql-cpp.html
  – https://marketplace.visualstudio.com/items?itemName=github.vscode-codeql
● Get help
  – https://discuss.lgtm.com/latest
  – https://stackoverflow.com/questions/tagged/semmle-ql

Thank you

Semmle Codeql
