Hermes:
Towards Representative
Benchmarks
Collaborative work by many PhDs and Post Docs!
Dr. Michael Eichberg (mail@michael-eichberg.de)

Technische Universität Darmstadt
http://www.opal-project.de/Hermes.html
https://twitter.com/@TheOpalProject
https://gitter.im/OPAL-Project/Hermes
OOPSLA 2000
“ The benchmarks in our evaluation are
meant to be representative of real
applications and they include four SPECjvm
benchmarks plus three other large, object-
oriented benchmarks. ”
Empirical Software Engineering 2011
“ For our study, we consider the API
documentation of packages from the
official Java Development Kit, the Eclipse
project, and from the Apache foundation.
[…] By choosing libraries that cover a wide
range of use cases we try to alleviate the
risk that the study is biased towards a
particular domain. The decision is based on
the evidential observation that the level
and the quality of their documentation is
often high.”
FSE 2015
“ In the third research question, we looked at
the performance […] on realistic
programs. We selected five realistic apps
from different sources.”
[Bar chart: size/execution time of analysis for Reflection, BookStore, a2z, add.me.fast, and pipefilter; axis from 0 to 300]
Why?
FSE 2019
“ We evaluate IT on three well-known
benchmark suites— DaCapo 2006,
Dacapo-9.12-MR1-bach and ScalaBench.
Additionally, we use two real-world
performance bug datasets.…”
Qualitas Corpus, XCorpus, …
It’s huge, but… is it representative?
ant  informa  lucene
antlr  ireport  marauroa
aoi  itext  maven
argouml  ivatagroupware  megamek
aspectj  jFin_DateMath  mvnforum
axion  jag  myfaces_core
azureus  james  nakedobjects
batik  jasml  nekohtml
c_jdbc  jasperreports  netbeans
castor  javacc  openjms
cayenne  jboss  oscache
checkstyle  jchempaint  picocontainer
cobertura  jedit  pmd
collections  jena  poi
colt  jext  pooka
columba  jfreechart  proguard
compiere  jgraph  quartz
derby  jgraphpad  quickserver
displaytag  jgrapht  quilt
drawswf  jgroups  roller
drjava  jhotdraw  rssowl
eclipse_SDK  jmeter  sablecc
emma  jmoney  sandmark
exoportal  joggplayer  springframework
findbugs  jparse  squirrel_sql
fitjava  jpf  struts
freecol  jre  tapestry
freecs  jrefactory
Properties of the Qualitas Corpus (Sept. 2013)
• No use of JavaFX (available since 2008)
• Only one project (Hibernate) made use of Java 7 features
• (There is a lot of redundancy.)
Feature-oriented
Benchmarks
• Securibench

• Pointerbench

• …
Features are
not distributed
uniformly!
Judge: Identifying,
Understanding, and
Evaluating Sources of
Unsoundness in Call
Graphs
Michael Reif, Florian Kübler, Michael
Eichberg, Dominik Helm, Mira Mezini

ISSTA’19
WALA0-1-CFA, all configured with the FULL reflection option. WALA requires to specify packages to be excluded from the analysis. For the comparative analysis (Experiment 2 (see 4.3)) we excluded no package, whereas for the experiment related to RQ3 we use the predefined Java60RegressionExclusions to ensure termination. For all Soot call graphs (SootCHA, SootRTA, SootVTA, and SootSPARK [26]) we use the options: safe-forname and safe-newinstance. This options make Soot consider all types as instantiated when Class.forName or Class.newInstance is used. We could not use types-for-invoke due exceptions being thrown [41]. Furthermore, we use include-all to ensure that no packages are filtered. Our library test cases are additionally started with library:signature-resolution and all-reachable to make use of Soot’s capabilities to analyze library code. DOOPCI’s call graph is set to be context-insensitive with classical-reflection turned on. For OPALRTA, we use the standard configuration. All test cases w.r.t. libraries are started with the respective library entry points. We perform all experiments on a server with two Intel Xeon E5-2620 CPUs and 64 GB RAM.

4.2 Experiment 1

Our corpus for analyzing the prevalence of language and API features (RQ1) includes the XCorpus [13], the top 50 libraries from Maven Central [31] (from July 2018), the top 15 apps from Google’s Playstore (from January 2018), plus five Clojure [20], Groovy [32], Kotlin [16], and Scala [24] projects.

Table 2 visualizes the results using a heatmap. It shows the relative frequency of each feature (cf. Feature column) within each corpus. We include the OpenJDK column as a separate corpus because most corpus projects are built upon it and, hence, partially use its features. A feature’s relative frequency is color coded using a logarithmic scale as shown in the legend of Table 2. Slightly yellow boxes (■) identify unused features and red boxes (■) those found in 5% of all methods; we chose 5% because only 7 features occur in more than 5% of all methods. Features used in no corpus (e.g., Groovy invokedynamics, or the serialization of lambdas) and always soundly resolved features (e.g., standard poly-/monomorphic call) are not included.

■ All the API and language features supported by Java up to version 7 are used widely across all code bases.

The most frequently used feature that was introduced with Java 8 is the call of static interface methods (J8DIM6). 12% of all methods of the top 50 Maven projects use them; Scalatest [22] is responsible for ≈ 90% of all uses. Clojure and Android code have not yet adapted Java 8 call semantics. Other Java 8 features, e.g., MethodHandle constants, are rarely used; primarily by the Nashorn library.

■ Support for Java 8 is a must, given the frequent use of Java 8 call semantics features in modern code (J8DIMX), unless one analyzes only Android or Clojure code.

Serialization-related functionality (Ser3-7,9, ExtSer) and Java’s Reflection API (cf. TR, LRR, CSR) are both used with medium fre- […]
[Table 2 (heatmap, cell colors not recoverable): relative frequency of each feature per corpus. Columns: OpenJDK8, XCorpus, Top50Maven, Scala, Groovy, Kotlin, Clojure, Android. Legend: logarithmic (log2) color scale from 0% through 0.00006%, 0.0001%, …, 0.9448%, 1.8895% up to 5%. Rows: J8DIM1-6, JVMC1-5, Lambda1-7, LIB3-5, MR1-7, Native, Ser1, Ser3-9, ExtSer1-3, SI1-8, MH1.1, MH1.2, MH2-8, TR1-9, LRR1, LRR2, CSR1+CSR2, LRR3+CSR3, Unsafe1-7]
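The color coding described in the excerpt can be illustrated with a small sketch. It assumes that a feature's relative frequency is the share of methods using it and that each color step halves the frequency below the 5% cap, which matches the doubling values in the legend. The names `HeatmapScale`, `relativeFrequency`, and `bucket` are hypothetical and not taken from the paper or from Hermes.

```scala
object HeatmapScale {
  // Frequencies at or above this cap all map to the "red" bucket (5% in the excerpt).
  val cap: Double = 0.05

  // Share of methods that use a feature; 0.0 for an empty corpus.
  def relativeFrequency(methodsUsingFeature: Long, totalMethods: Long): Double =
    if (totalMethods == 0L) 0.0 else methodsUsingFeature.toDouble / totalMethods

  // Hypothetical bucket index: -1 = unused, 0 = at or above the cap,
  // k > 0 = number of halvings needed to get from the cap down to the frequency.
  def bucket(freq: Double): Int =
    if (freq <= 0.0) -1
    else if (freq >= cap) 0
    else math.ceil(math.log(cap / freq) / math.log(2.0)).toInt
}
```

With this encoding, the 12% reported for static interface method calls in the top 50 Maven projects falls into bucket 0, and an unused feature into bucket -1.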
Getting Representative and
relevant Benchmarks
1. Determine the population
2. Get a (manageable) representative subset
2.1. determine the relevant technical features
2.2. compute the subset…
2.2.1. for development and testing purposes
2.2.2. for a comprehensive evaluation
3. Make it publicly available!
Hermes
Assessment and Creation of Effective Test Corpora

(consisting of projects available as Java bytecode)
Facilitating the
Development and Evaluation of
Code Analyses
Idea
Prototyping
Development
Evaluation
Paper
OPAL
Hermes in a Nutshell
Corpus candidates → Hermes → Optimal corpus
Feature Queries
Manual and/or Automatic Selection using Choco Solver
General Purpose Queries
Existence of Bytecode Instructions
Class File Versions
Class Types
Trivial Reflection
OO Metrics (Fan-In/Fan-Out, …)
Field Access
Method w/o Returns
Call Graph Related
Method Types
Recursive Data Structures
Size of Inheritance Tree
API Usage
Feature Queries Relevant for
Constructing Call Graphs
Static Initializer
Lambdas
Unsafe API
Polymorphic Calls
Method Reference
Serialization
Java 8 Polymorphic Calls
Signature Polymorphic Methods
Non-Java bytecode
It’s Extensible!
Feature Queries for API Usage
Bytecode Instrumentation
Class Loader
GUI
Crypto
JDBC
Reflection
System
Thread
Unsafe
It’s Extensible!
Feature Queries
trait FeatureQuery {
  // …
  def apply[S](
    projectConfiguration: ProjectConfiguration,
    project: Project[S],
    rawClassFiles: Traversable[(da.ClassFile, S)]
  ): TraversableOnce[Feature[S]]
  // …
}
projectConfiguration: identifier, project JAR files, library JAR files, statistics
project: complete reified project information (classes, fields, methods, bodies, etc.)
rawClassFiles: raw class file information (e.g., for extracting information from the constant pool)
Result: list of detected features in the codebase (id, frequency of occurrence, (opt.) locations)
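To make the shape of such a query concrete, here is a minimal, self-contained sketch that counts class file versions, one of the general purpose queries listed earlier. It deliberately does not use OPAL: `ToyClassFile`, `ToyFeature`, `ToyFeatureQuery`, and the feature names are hypothetical stand-ins for the real `da.ClassFile`, `Feature[S]`, and `FeatureQuery` types, whose full API the slide elides.

```scala
object FeatureQuerySketch {
  // Hypothetical stand-ins for illustration only; these are NOT OPAL's types.
  final case class ToyClassFile(fqn: String, majorVersion: Int)
  final case class ToyFeature(id: String, count: Int, locations: List[String])

  trait ToyFeatureQuery {
    def featureIDs: Seq[String]
    def apply(classFiles: Seq[ToyClassFile]): Seq[ToyFeature]
  }

  // Reports one feature per class file version of interest.
  object ClassFileVersionQuery extends ToyFeatureQuery {
    // Major version 51 is Java 7 and 52 is Java 8 (JVM specification).
    private val versions: Seq[(Int, String)] =
      Seq(51 -> "Class File Java 7", 52 -> "Class File Java 8")

    val featureIDs: Seq[String] = versions.map(_._2)

    def apply(classFiles: Seq[ToyClassFile]): Seq[ToyFeature] =
      versions.map { case (major, id) =>
        val hits = classFiles.filter(_.majorVersion == major)
        // id, frequency of occurrence, and the (optional) locations
        ToyFeature(id, hits.size, hits.map(_.fqn).toList)
      }
  }
}
```

A real Hermes query would receive the reified project and raw class files instead of this toy list, but the result has the same structure: one entry per feature id with a count and optional locations.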
Feature
abstract case class Feature[S] private (
  id: String,
  count: Int,
  extensions: Chain[Location[S]]
) {
  …
}
id: the name of the feature.
count: how often the feature was found.
extensions: where the feature was found.
Hermes - Example
Constructing a Minimal Corpus
• Hidden Truths in Dead Software Paths @ FSE 15

• Original evaluation and development conducted on the
complete Qualitas Corpus

• The minimal corpus computed by Hermes using all available
queries consists of only 5 of the 100 projects in the
Qualitas Corpus

• Evaluation cut down from 16.77 minutes to 2.82 minutes
(~6x faster) while coverage is only 1.06% below the
original corpus
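The idea behind a minimal corpus can be sketched as a set cover problem: pick as few projects as possible so that every detected feature still occurs in at least one of them. According to the slides, Hermes performs the selection with the Choco solver; the sketch below swaps in a simple greedy approximation instead, so it is not the Hermes implementation and does not guarantee an optimal result. The name `MinimalCorpusSketch` and the input shape (project name mapped to the set of feature ids found in it) are assumptions.

```scala
object MinimalCorpusSketch {
  // Greedy set cover: repeatedly pick the project that covers the most
  // still-uncovered features; ties are broken by project name for determinism.
  def select(featuresByProject: Map[String, Set[String]]): List[String] = {
    @annotation.tailrec
    def loop(uncovered: Set[String], chosen: List[String]): List[String] =
      if (uncovered.isEmpty) chosen.reverse
      else {
        val (best, _) = featuresByProject.toSeq
          .map { case (project, features) => (project, (features & uncovered).size) }
          .sortBy { case (project, gain) => (-gain, project) }
          .head
        loop(uncovered -- featuresByProject(best), best :: chosen)
      }

    // Start with every feature that occurs in at least one project.
    loop(featuresByProject.values.flatten.toSet, Nil)
  }
}
```

For example, with projects `a` (features f1, f2), `b` (f2), and `c` (f3), the sketch selects `a` and `c` and drops the redundant `b`.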
Delphi
• Can help you to build a specifically targeted benchmark
• Hermes Data (i.e., all current feature queries) for all of Maven Central
• For details talk to Ben :-)
