Feature Mining From a Collection of Software Product Variants

Rafat AL-msie’deen¹, Abdelhak D. Seriai¹, Marianne Huchard¹,
Christelle Urtado², Sylvain Vauttier² and Hamzeh Eyal Salman¹

¹ LIRMM / CNRS & Montpellier 2 University, Montpellier, France
{Al-msiedee, Abdelhak.Seriai, huchard, eyalsalman}@lirmm.fr
² LGI2P / Ecole des Mines d’Alès, Nîmes, France
{Christelle.Urtado, Sylvain.Vauttier}@mines-ales.fr

1 Reverse Engineering Software Product Lines
Just as car manufacturers offer a full range of cars with common
characteristics and numerous variants and options, software developers may
segment users’ needs and offer them a software family to choose from. Such a
software family is called a software product line (SPL) [1]. An SPL is
usually characterized by two sets of features: the features shared by all
products in the family, called the SPL’s commonalities, and the features
shared by some, but not all, products in the family, called the SPL’s
variability. These two sets define the mandatory and optional parts of the
SPL. Software product line engineering (SPLE) focuses on capturing the
commonalities and variabilities between several software products that belong
to the same family. To describe the possible combinations of optional
features more precisely (e.g., an optional feature might exclude another and
require a third one), SPLs are usually described with a de facto standard
formalism called a feature model. A feature model characterizes the whole
software family. It defines all valid feature sets, also called
configurations. Each valid configuration represents a specific product,
whether an existing product or a valid product-to-be.
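
To make the notion of a valid configuration concrete, here is a minimal sketch of a feature model with one "requires" and one "excludes" constraint. The feature names and constraints are hypothetical and chosen only for illustration; they are not taken from the paper or from any real product line.

```python
from itertools import combinations

# Hypothetical feature model (illustrative names, not from the paper).
MANDATORY = {"core", "diagram"}
OPTIONAL = {"logging", "cognitive", "activity"}
REQUIRES = {("cognitive", "logging")}    # selecting the first requires the second
EXCLUDES = {("cognitive", "activity")}   # the two cannot be selected together


def is_valid(configuration):
    """Return True if the feature set satisfies every constraint of the model."""
    selected = set(configuration)
    if not MANDATORY <= selected:
        return False  # a mandatory feature is missing
    if not selected <= MANDATORY | OPTIONAL:
        return False  # an unknown feature was selected
    if any(a in selected and b not in selected for a, b in REQUIRES):
        return False  # a "requires" constraint is violated
    if any(a in selected and b in selected for a, b in EXCLUDES):
        return False  # an "excludes" constraint is violated
    return True


def valid_configurations():
    """Enumerate all valid configurations by brute force (fine for tiny models)."""
    optional = sorted(OPTIONAL)
    result = []
    for size in range(len(optional) + 1):
        for subset in combinations(optional, size):
            candidate = MANDATORY | set(subset)
            if is_valid(candidate):
                result.append(frozenset(candidate))
    return result
```

In this toy model, "cognitive" excludes "activity" and requires "logging", which mirrors the kind of constraint mentioned above. Each element returned by valid_configurations() stands for one product, existing or yet to be built.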
Software product variants are seldom developed from scratch in a disciplined
way. Instead, ad hoc reuse techniques such as copy-paste-modify are applied
to the software’s code until the need arises to discipline development by
adopting an SPLE approach. The expected benefits are easier product
maintenance, easier system migration, and extracted features that may lead to
the production of new products. To capitalize on the existing code, reverse
engineering is needed, but manually analyzing the existing software product
variants to discover their features is time-consuming, error-prone, and
requires substantial effort. Automating feature mining from source code would
be of great help.
In the literature, surprisingly, the reverse engineering of features (or
feature models) from source code is seldom considered [2]. Existing
approaches mine features from a single software product variant, whereas we
think it is necessary to consider all available variants at once [3].
2 A Three-Step Process to Mine Features from Code
Feature location in object-oriented (OO) source code consists in identifying
the object-oriented building elements (OBEs) that implement a particular
feature across software product variants. The OBEs we consider are packages,
classes, attributes, methods and method bodies. We assume that a feature maps
to one and only one set of OBEs: each feature has a unique implementation
across the whole product family.
To mine features from the OO source code of software variants, we propose a
three-step process that relies on both Formal Concept Analysis (FCA) [4] and
Latent Semantic Indexing (LSI) [5]. Our approach:
1. extracts OBEs from each software product variant by parsing its code.
2. uses FCA to build a lattice from OBEs and software product variants that
hierarchically groups the OBEs into disjoint, minimal partitions. This
classification yields two OBE sets: common OBEs (shared by all variants and
found in the top node of the lattice) and variable OBEs (shared by several
but not all variants and found toward the bottom of the lattice).
3. clusters OBEs into features. Each OBE set is analyzed using LSI and FCA to
mine the mandatory and optional features based on the lexical similarity
between OBEs.
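
The partitioning in step 2 can be sketched as follows. This is a simplified illustration, not the authors' implementation: it groups OBEs by the exact set of variants that contain them, which gives the disjoint blocks an FCA lattice would introduce at its concepts, but it does not build the lattice order itself. The variant names and OBE names are hypothetical.

```python
from collections import defaultdict

# Hypothetical formal context: each variant (object) maps to the set of
# OBEs (attributes) found in its code. Names are illustrative only.
VARIANTS = {
    "v1": {"pkg.Core", "pkg.Core.run()", "pkg.Log"},
    "v2": {"pkg.Core", "pkg.Core.run()", "pkg.Log", "pkg.Export"},
    "v3": {"pkg.Core", "pkg.Core.run()", "pkg.Export"},
}


def partition_obes(variants):
    """Group OBEs by the exact set of variants (the extent) that contain them."""
    owners = defaultdict(set)
    for name, obes in variants.items():
        for obe in obes:
            owners[obe].add(name)
    blocks = defaultdict(set)
    for obe, names in owners.items():
        blocks[frozenset(names)].add(obe)
    return dict(blocks)


def common_and_variable(variants):
    """Split the blocks into common OBEs and variable OBE blocks."""
    blocks = partition_obes(variants)
    everyone = frozenset(variants)
    # OBEs present in every variant correspond to the top node of the lattice.
    common = blocks.get(everyone, set())
    # Every other block is shared by some, but not all, variants.
    variable = {extent: obes for extent, obes in blocks.items() if extent != everyone}
    return common, variable
```

Calling common_and_variable(VARIANTS) on this toy context puts the two pkg.Core elements in the common set and yields one variable block for pkg.Log (variants v1 and v2) and one for pkg.Export (variants v2 and v3). Each OBE lands in exactly one block, so the blocks are disjoint.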
We have implemented this three-step approach and evaluated its results on a
collection of ten ArgoUML products. The results showed that most of the
features were identified [6]. In future work, we plan to combine textual and
semantic similarity measures to determine feature implementations more
precisely. We also plan to use the mined common and variable features to
automate the construction of the feature model of the studied software
family.
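
The idea of clustering OBEs by lexical similarity (step 3) can be illustrated with the sketch below. It is deliberately simplified and is not LSI: it compares raw term-frequency vectors with cosine similarity and skips the singular value decomposition that LSI applies first. The identifier names, the 0.5 threshold, and the greedy grouping strategy are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokens(identifier):
    """Split an identifier such as 'drawClassDiagram' into lower-case words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(obes, threshold=0.5):
    """Greedy grouping: an OBE joins the first cluster that already holds a
    sufficiently similar member, otherwise it starts a new cluster."""
    vectors = {obe: Counter(tokens(obe)) for obe in obes}
    clusters = []
    for obe in obes:
        for group in clusters:
            if any(cosine(vectors[obe], vectors[other]) >= threshold for other in group):
                group.append(obe)
                break
        else:
            clusters.append([obe])
    return clusters
```

For instance, cluster(["exportDiagramPdf", "exportDiagramSvg", "logUserAction", "logSystemAction"]) groups the two export identifiers together and the two log identifiers together, because each pair shares two of its three terms. A faithful LSI version would first project the term vectors into a reduced latent space before measuring similarity.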
References
1. AL-Msie’deen, R., Seriai, A.D., Huchard, M., Urtado, C., Vauttier, S., Salman, H.E.: An approach to recover feature models from object-oriented source code. In: Actes de la Journée Lignes de Produits 2012, Lille, France (Novembre 2012) 15–26
2. AL-Msie’deen, R., Seriai, A.D., Huchard, M., Urtado, C., Vauttier, S., Salman, H.E.: Survey: reverse engineering feature model/features from different artefacts. http://www.lirmm.fr/Survey (2013) [Online; accessed 24-January-2013].
3. Dit, B., Revelle, M., Gethers, M., Poshyvanyk, D.: Feature location in source code: a taxonomy and survey. Journal of Software: Evolution and Process (2012) 53–95
4. Ganter, B., Wille, R.: Formal Concept Analysis, Mathematical Foundations. Springer-Verlag (1999)
5. Marcus, A., Maletic, J.I.: Recovering documentation-to-source-code traceability links using latent semantic indexing. In: Proceedings of the 25th International Conference on Software Engineering. ICSE ’03, Washington, DC, USA, IEEE Computer Society (2003) 125–135
6. AL-Msie’deen, R., Seriai, A.D., Huchard, M., Urtado, C., Vauttier, S., Salman, H.E.: ArgoUML case study. http://www.lirmm.fr/CaseStudy (2013) [Online; accessed 23-January-2013].
