Live API Documentation
Subramanian, S., Inozemtseva, L., & Holmes, R. (2014, May). Live API documentation.
In ICSE (pp. 643-652).
Presenter: Hossein Mobasher
Course: Software Evolution
Contents
• Introduction
• Previous works
• What’s new?
• Scenario
• Oracle Generation
• Problems
• Approach
• Example
• Evaluation
• Conclusion
Introduction
• APIs enable complex functionality to be used by client programs 
• Understanding how to use an API can be difficult 
• API documentation is often insufficient on its own.
• Writing documentation and keeping it up to date is very difficult 
• Developers ignore the documentation that does exist and declare that “code
is king”
Introduction (continued)
• Online sites fill the gap between traditional API documentation and
more example-based resources
• StackOverflow
• GitHub Gists
• Unfortunately, these two important classes of documentation (API
documentation and example-based resources) are independent of each other
• Baker links source code examples to API documentation
Previous works
• Earlier approaches identify source code references within non-code resources.
• These approaches have several limitations:
• Some systems explicitly ignored external references.
• Others returned only partially qualified names (PQNs), which are
insufficient for documentation linking.
• None of them worked for dynamically typed languages.
What’s new?
• A constraint-based, iterative approach for determining the fully
qualified names of code elements in source code snippets.
• A prototype tool that implements this approach and uses the results
to automatically create bidirectional links between documentation
and source code examples.
Scenario 1
• A code snippet posted to StackOverflow to assist a developer who
didn’t understand how to manipulate the state of History objects.
• Baker can uniquely link the bolded elements to the API.
• These are the elements for which it can determine a fully qualified name (FQN).
Scenario 2
• A code snippet from a developer who is trying to make a web app that can
take a photo and inject it into an element in an HTML document.
• Baker can also identify the API that the bolded elements come from.
Oracle Generation
• Baker’s oracle is key to its success.
• It is generally impossible to identify the FQNs of the code elements in a
snippet from the snippet alone.
• FQNs are essential to documentation linking tasks.
• The oracles are implemented as web services.
• This allows the Java and JS oracles to be updated dynamically by any user or program.
Oracle Generation (continued)
• Java Oracle
• Contains class, method, and field signatures.
• Uses Neo4j to represent the hierarchies between code elements that an
object-oriented language like Java offers.
• The oracle includes full type information in the database.
• The types of classes, fields, return types, and parameters.
• The Java oracle can be dynamically updated by adding an appropriate JAR file.
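To make the role of the Java oracle concrete, here is a minimal sketch of the kind of lookups it supports. It is an in-memory stand-in, not Baker's Neo4j-backed implementation: the class ToyOracle, its method names, and the org.example.browser.History entry are hypothetical and exist only for illustration. Only com.google.gwt.user.client.History and its method names come from the slides.

```python
from collections import defaultdict


class ToyOracle:
    """Hypothetical in-memory stand-in for the Java oracle (the real one uses Neo4j)."""

    def __init__(self):
        # simple class name -> set of fully qualified names (FQNs)
        self.types_by_simple_name = defaultdict(set)
        # FQN -> {method name -> set of parameter counts}
        self.methods_by_type = defaultdict(dict)

    def add_type(self, fqn, methods):
        """Register a type and its (method name, parameter count) pairs."""
        simple_name = fqn.rsplit(".", 1)[-1]
        self.types_by_simple_name[simple_name].add(fqn)
        for name, arity in methods:
            self.methods_by_type[fqn].setdefault(name, set()).add(arity)

    def candidates(self, simple_name):
        """Return every FQN that a simple name such as 'History' could mean."""
        return set(self.types_by_simple_name.get(simple_name, set()))

    def has_method(self, fqn, name, arity):
        """Check whether a type declares a method with this name and arity."""
        return arity in self.methods_by_type.get(fqn, {}).get(name, set())


oracle = ToyOracle()
oracle.add_type("com.google.gwt.user.client.History",
                [("addHistoryListener", 1), ("getToken", 0)])
# Hypothetical second type that shares the simple name "History".
oracle.add_type("org.example.browser.History", [("back", 0)])
```

Looking up "History" in this toy oracle returns both FQNs, which is the ambiguity that the linking step later has to resolve.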
Oracle Generation (continued)
• JS Oracle
• Is built by statically analyzing the source files of the libraries to be included.
• Uses Esprima to parse the source code of each library.
• Esprima returns a JSON representation of the AST.
• The JS oracle identifies all of the ‘FunctionExpression’ and ‘FunctionDeclaration’
nodes.
• Parsing problems:
• JavaScript is a dynamically typed language, so it is difficult to identify all method declarations
by static analysis of source code.
• JavaScript code is not annotated with visibility (e.g. public and private).
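The node-collection step can be sketched as a recursive walk over the JSON AST. The sketch below assumes the AST follows the ESTree node shapes that Esprima emits; the AST literal is hand-written for illustration (it corresponds roughly to `function on(a) {}` followed by `var off = function () {};`), and the helper name collect_function_nodes is hypothetical rather than part of Baker.

```python
import json

FUNCTION_NODE_TYPES = {"FunctionDeclaration", "FunctionExpression"}


def collect_function_nodes(node, found=None):
    """Walk a JSON AST and return (node type, function name or None) pairs."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("type") in FUNCTION_NODE_TYPES:
            identifier = node.get("id") or {}  # anonymous functions have id = null
            found.append((node["type"], identifier.get("name")))
        for child in node.values():
            collect_function_nodes(child, found)
    elif isinstance(node, list):
        for child in node:
            collect_function_nodes(child, found)
    return found


# Hand-written AST in the ESTree shape, used here instead of running a parser.
AST_JSON = """
{"type": "Program", "body": [
  {"type": "FunctionDeclaration",
   "id": {"type": "Identifier", "name": "on"},
   "params": [{"type": "Identifier", "name": "a"}],
   "body": {"type": "BlockStatement", "body": []}},
  {"type": "VariableDeclaration", "kind": "var", "declarations": [
    {"type": "VariableDeclarator",
     "id": {"type": "Identifier", "name": "off"},
     "init": {"type": "FunctionExpression", "id": null, "params": [],
              "body": {"type": "BlockStatement", "body": []}}}]}
]}
"""

functions = collect_function_nodes(json.loads(AST_JSON))
```

The function expression comes back without a name because its name lives on the enclosing variable declarator, which is one reason static analysis alone struggles to list every method a JavaScript library declares.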
Problems
• Parsing code snippets is more difficult than full files.
• Code snippets can be ambiguous.
• Kinds of ambiguity:
• Declaration Ambiguity
• External Reference Ambiguity
Approach
• Deductive Linking
• Handles declaration and external reference ambiguity.
• The goal is to identify the sole FQN that a given identifier can represent.
• Generates an AST for each code snippet.
• Uses information from the oracle to deduce facts about the AST.
Example (JavaBaker)
• History: 58 candidate types are recorded for History in the oracle.
• addHistoryListener: Test the 58 candidate types to see which ones contain a
method called addHistoryListener(…) that takes a single object parameter.
• This results in 4 candidate methods.
• The History node is also updated (reduced from 58 to 4 candidates).
• History.getToken: Test the 4 candidates, which are reduced from 4 to 2.
• …
• At the end, Baker iterates again.
• Baker can identify History as
com.google.gwt.user.client.History
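The narrowing in this example can be sketched as repeated filtering of a candidate set. This is a simplified illustration under stated assumptions, not Baker's implementation: it tracks a single identifier, uses three candidates instead of 58, and the org.example types, the METHODS_BY_TYPE table, and the function name narrow_candidates are all hypothetical.

```python
def narrow_candidates(candidates, observed_calls, methods_by_type):
    """Keep only the candidate FQNs that declare every observed (name, arity) call.

    The outer loop repeats until no constraint shrinks the set any further,
    mirroring the "iterate again" step on the slide.
    """
    remaining = set(candidates)
    changed = True
    while changed:
        changed = False
        for call in observed_calls:
            kept = {fqn for fqn in remaining
                    if call in methods_by_type.get(fqn, set())}
            # A call that no candidate declares is ignored instead of
            # emptying the candidate set.
            if kept and kept != remaining:
                remaining = kept
                changed = True
    return remaining


# Toy oracle data: only the GWT type name comes from the slides.
METHODS_BY_TYPE = {
    "com.google.gwt.user.client.History": {("addHistoryListener", 1), ("getToken", 0)},
    "org.example.a.History": {("addHistoryListener", 1)},
    "org.example.b.History": {("back", 0)},
}

# Calls seen on History in the snippet, as (method name, parameter count).
OBSERVED_CALLS = [("addHistoryListener", 1), ("getToken", 0)]

resolved = narrow_candidates(METHODS_BY_TYPE.keys(), OBSERVED_CALLS, METHODS_BY_TYPE)
```

With one identifier a single pass is already enough; repeating the loop matters when narrowing one node (for example a return type) adds constraints on another node, which this sketch does not model.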
Example (JSBaker)
• $(…).on(…)
• $ matches only jQuery.
• on matches jQuery’s on method (even though there are three on
methods in the oracle).
• useGetPicture
• The oracle doesn’t contain a result for it.
• Baker records this function as locally
defined, rather than as an external function.
• …
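The decisions in this example can be sketched as a small classification function. This is a guess at the shape of the logic rather than JSBaker's code: the oracle is reduced to a mapping from function name to the libraries that declare it, and the names classify_call, JS_ORACLE, libA, and libB are hypothetical.

```python
def classify_call(name, oracle, receiver_library=None):
    """Classify a called function as external, local, or ambiguous.

    oracle maps a function name to the set of libraries that declare it;
    receiver_library is the library already deduced for the call's receiver.
    """
    owners = oracle.get(name, set())
    if not owners:
        # Not in the oracle: treat it as defined locally in the snippet.
        return ("local", None)
    if receiver_library in owners:
        # The receiver's library disambiguates between several declarations.
        return ("external", receiver_library)
    if len(owners) == 1:
        return ("external", next(iter(owners)))
    return ("ambiguous", None)


# Toy oracle: three libraries declare `on`, as on the slide; libA and libB
# are placeholder names.
JS_ORACLE = {
    "$": {"jQuery"},
    "on": {"jQuery", "libA", "libB"},
}
```

Here `$` resolves to jQuery on its own, `on` resolves to jQuery only because its receiver is already known to be a jQuery object, and `useGetPicture` falls through to the local case.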
Evaluation
• Two research questions:
• Can Baker accurately identify API elements in code snippets?
• Does Baker work on a variety of systems, or is it limited to just a few libraries?
Linker Accuracy
• Precision is much more important than recall.
• Five Java systems (libraries) were chosen for analysis.
• Android / GWT / Hibernate / Joda Time / XStream
• 50 code elements were manually examined for each system to classify
the result returned by Baker as a:
• True Positive (TP)
• False Positive (FP)
• False Negative (FN)
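Precision and recall follow from these three counts in the standard way: precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch, where the counts in the usage lines are made up for illustration and are not the study's data:

```python
def precision(tp, fp):
    """Fraction of the links Baker returned that were correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of the correct links that Baker actually returned."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts, chosen only to exercise the formulas.
example_precision = precision(tp=8, fp=2)
example_recall = recall(tp=8, fn=2)
```

A false positive sends a reader to the wrong documentation page, while a false negative only leaves a link missing, which is why precision is weighted more heavily than recall here.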
Linker Accuracy (continued)
• Baker’s overall Java precision is 0.98 (98%) and recall is 0.83 (83%). Only exact
matches (cardinality = 1) were considered.
Linker Accuracy (continued)
• Baker’s overall JavaScript precision is 0.97 (97%) and recall is 0.96 (96%). Only exact
matches (cardinality = 1) were considered.
Example Diversity
• JavaBaker parsed 4000 source code examples.
• Identified over 30000 links to 4500 unique elements.
• JSBaker parsed 1000 source code examples.
• Identified over 10000 links to 500 unique elements.
Qualifying High-cardinality Match
• Linking methods may return more than one match when there isn’t
enough information to determine the FQN of a method or type.
• This is relatively rare.
• The graph shows the cardinality of the result
for each of the 4,000 snippets.
• JDK types and methods have been removed.
• The majority (69%) of elements can be
precisely identified.
Conclusion
• Maintaining API documentation is a challenging, time-consuming task.
• The documentation is frequently out of date.
• Baker automatically generates links between API documentation and
source code examples.
• Baker has high precision (0.97).
Questions?
