Correlation between Packages and Security Vulnerabilities in Mobile Applications
ChandraSekhar Vuppalapati
Computer Engineering Department
San Jose State University
San Jose, United States
Chandrasekar.Vuppalapati@sjsu.edu
Sakshi Gangrade
Computer Engineering Department
San Jose State University
San Jose, United States
Sakshi.Gangrade@sjsu.edu
Prateek Panda
Appknox
XYSec Labs Pte. Ltd
Singapore
Prateek@appknox.com
Vinusha Anem
Indian Institute of Technology
Kharagpur, India
vinusha.anem15@gmail.com
Abstract— Mobile security or mobile application security has
become increasingly important in today’s mobile-driven world.
Of particular concern is the security of personal and business
information, now stored on smartphones. This concern has been
substantiated by the increase in targeted Advanced Persistent
Threat attacks on many businesses as well as governments. In
this paper, we develop analyses for finding the correlation
between the packages installed in mobile applications and
security issues found in them, by analyzing a dataset of 103,783
mobile apps comprising 3,227 distinct packages. The apps are
of a diverse nature, covering 42 categories as per the Google
Play Store. We find that severe vulnerabilities are present across
the entire Android app ecosystem, even in popular apps and
libraries. The findings are intended to raise awareness of the
rising security problem in mobile apps, among both the research
community and app developers, and aid them in detecting and
resolving the vulnerabilities.
Keywords – Advanced Persistent Threat, Application Security,
Big Data, Cybersecurity, Data Analysis, Data Mining, Mobile
Applications, Mobile Security, Pattern Matching, Security
I. INTRODUCTION
“I am convinced that there are only two types of
companies: those that have been hacked and those that will
be. And even they are converging into one category:
companies that have been hacked and will be hacked again
[1],” said Robert Mueller, former Director of the Federal Bureau of
Investigation. We rely on mobile apps in many aspects of
our lives, such as banking, online shopping, social
networking, services, health, and entertainment. Users store
their private information in these apps and conduct financial
transactions using them, but are often unaware of the
security risks they bring [2]. The importance of protecting
apps from adversaries cannot be overstated. A Gartner
report [3] predicted that more than 75% of mobile applications
would fail basic security tests, and that prediction appears to
be holding true.
Businesses, when launching their smartphone
applications in the application stores, often fail to
appreciate the severity of a security breach. Developers
are often focused on usability, design, and
features, while security takes a back seat. A recent report [4]
by mobile security company Appknox shows that 85% of the
top 100 banks in the Asia Pacific region are vulnerable to
hacks. If banks, which are believed to be pioneers in security,
are in this condition, the outlook is unlikely to be better for the
thousands of other businesses that are moving fast to mobile.
Security for mobile apps is complex and involves a
number of considerations that go beyond traditional desktop
and web apps. Mobile is still a very nascent ecosystem, and
its open and mobile nature is both a boon and a bane. This
research utilizes security data obtained by analyzing mobile
apps through the Appknox mobile security product, which
covers all of the OWASP Top Ten [5] mobile security threats
and more, as required by compliance standards such as PCI-DSS
[6], HIPAA [7], and SOX [8]. The research intends to
establish a relationship between the packages present in an
application and its level of security. We analyze a dataset of
103,783 Android apps acquired from the Google Play Store
across 42 different categories. We find that 95.26% of these
apps contain at least one high, medium or low vulnerability.
The trends indicate that severe vulnerabilities are present
throughout the Android ecosystem, including popular apps
and libraries.
The main goals of this paper are:
• Perform large-scale analysis of 103,783 mobile
apps developed on Android to quantify the
prevalence of 18 types of vulnerabilities in the
Android ecosystem.
• Analyze trends in these vulnerabilities and find that
vulnerabilities are present in all corners of the
Android ecosystem.
• Develop highly scalable analyses that can detect
which packages are more prone to the 18 types of
vulnerabilities.
II. MOBILE APP DATASET
The mobile applications considered for this
research span a total of 42 different categories as per
the Google Play Store. These have been combined into 36
categories by merging certain similar subcategories
into one. All 103,783 apps have been tested against 18 different
security vulnerabilities. We later construct analyses to
find and quantify which packages are susceptible to these
vulnerabilities in our dataset.
i. Categories
There are 48 categories of applications on the Google Play
Store, which are represented by this pie chart:
Figure 1: App distribution by categories
ii. Vulnerabilities
We cover the following 18 different security
vulnerabilities, including the OWASP Top Ten [5]
mobile threats and threats highlighted in the CAPEC
[9] database:
1. Unprotected Services
2. Improper Content Provider Permissions
3. Application Debug Enabled
4. Improper Custom Permission
5. Broken Trust Manager for SSL
6. Broken Hostname Verifier for SSL
7. Insecure SSL Socket Factory Implemented
8. Hostname Verifier Allows All Hostnames
9. App extends WebViewClient
10. Unused Permissions
11. Remote Code Execution through JavaScript Interface
12. Unprotected Activities
13. SQL Injection
14. Information in Shared Preference
15. Insufficient Transport Layer Protection
16. Derived Crypto Keys
17. Application Logs
18. Business Logic
III. ANALYSES
This section describes the procedure used to determine
the correlation between packages and vulnerabilities of
103,783 apps in the dataset.
i. Data Cleansing
Our data source was a JSON file of 103,783 apps
consisting of Category, Developer Name, App URL,
Title, Version, Vulnerabilities, Description of each
vulnerability, Packages and the Risk rating of each
vulnerability for all the apps.
To find the correlation between Packages and
Vulnerabilities, the data was cleaned by converting the
JSON file to an Excel file comprising the Category,
App name, Packages, Vulnerabilities, and the
Description of each vulnerability. The description of
each vulnerability is a text string that includes
the name of the package when the vulnerability is
caused by that package. If the description
does not contain a package name, the security
vulnerability detected is not caused by a package. Here
is an example of a vulnerability description for an app:
[{"description": "Custom TrustManager is implemented
in class com.flurry.sdk.eu implements naive certificate
check. This TrustManager breaks certificate validation!
Referenced in method com.flurry.sdk.eu->a"}].
In the mentioned example, the package
“com.flurry.sdk.eu” is responsible for causing the
vulnerability.
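The extraction step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the regular expression, the assumption that the offending class always follows the words "in class", and the helper name extract_package are all hypothetical.

```python
import json
import re

# Assumption: the offending class name follows the words "in class" and is a
# dotted identifier. Real descriptions may use other phrasings.
CLASS_PATTERN = re.compile(r"in class ([A-Za-z_][\w$]*(?:\.[A-Za-z_][\w$]*)+)")


def extract_package(description):
    """Return the dotted class/package name found in a description, or None."""
    match = CLASS_PATTERN.search(description)
    return match.group(1) if match else None


# The example description quoted in the text above.
raw = ('[{"description": "Custom TrustManager is implemented in class '
       'com.flurry.sdk.eu implements naive certificate check. This '
       'TrustManager breaks certificate validation! Referenced in method '
       'com.flurry.sdk.eu->a"}]')

records = json.loads(raw)
packages = [extract_package(record["description"]) for record in records]
print(packages)  # ['com.flurry.sdk.eu']
```

A description with no package name yields None, which corresponds to a vulnerability that is not attributed to a package.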
ii. Preprocessing
As the main aim of this study was to find the correlation
between packages and mobile vulnerabilities, we
filtered out the apps that have no vulnerability at all
or whose vulnerabilities are not caused by a package. We
also replaced multiple package entries with their
corresponding parent package. For example, from the
packages “com.XYZ.a.net” and “com.XYZ.b.net”,
the parent package was extracted as “XYZ”.
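One way to sketch the parent-package step is shown below. The paper gives only the single "XYZ" example, so the rule used here (skip generic leading components such as "com" and take the next one) and the names GENERIC_PREFIXES, parent_package, and collapse are assumptions made for illustration.

```python
# Assumption: these leading components carry no vendor information.
# The set is illustrative and not taken from the paper.
GENERIC_PREFIXES = {"com", "org", "net", "io"}


def parent_package(name):
    """Return the first component that is not a generic prefix."""
    for part in name.split("."):
        if part not in GENERIC_PREFIXES:
            return part
    return name  # nothing but generic components: keep the name unchanged


def collapse(packages):
    """Map packages to their parents and drop duplicates, keeping order."""
    parents = []
    for pkg in packages:
        parent = parent_package(pkg)
        if parent not in parents:
            parents.append(parent)
    return parents


print(collapse(["com.XYZ.a.net", "com.XYZ.b.net"]))  # ['XYZ']
```

Collapsing to one entry per parent keeps an SDK that ships many obfuscated subpackages from being counted several times for the same app.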
iii. Vulnerability Analysis
We applied MapReduce using Python to our
preprocessed data. Three functions, namely Mapper,
Shuffle, and Reducer, were defined. In our algorithm, the
packages are given as input to the Mapper function,
which emits each parent package (SDK) as a key-value
tuple. In the Shuffle function, the tuples are sorted
lexicographically and the values for each key are
appended to a list, after which the Reducer function
adds up the values for each parent package.
Algorithm:

    import re

    def mapper(doc_name, text):
        result = []
        for line in text:
            # remove leading and trailing whitespace
            line = line.strip()
            # remove the following symbols: !"§$%&/()=?*#[],.<>:;~_-
            line = re.sub(r'[!"§$%&/()=?*#\[\],.<>:;~_-]', "", line)
            # split the line into words on whitespace
            words = line.split()
            # insert the cleaned words into the results list
            for word in words:
                result.append((word, 1))
        return result

    def reducer(key, values):
        print("Reducer result -> %s: %d" % (key, sum(values)))

    def shuffle(words):
        # sort the (word, 1) tuples so that equal keys are adjacent
        sorted_keys = sorted(words)
        tmp = ""
        val_list = []
        for i in sorted_keys:
            if i[0] != tmp and tmp != "":
                # a new key starts: reduce the values of the previous key
                reducer(tmp, val_list)
                val_list = []
            tmp = i[0]
            val_list.append(i[1])
        # reduce the last key, if any input was given
        if val_list:
            reducer(tmp, val_list)
Figure 2: Code Snippet of Analysis Algorithm
The parent package with the highest count is the one most often
implicated in vulnerabilities in the dataset. Likewise, the counts
give the probability of vulnerability for each package used.
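The same count can be sketched more compactly with the standard library, and turned into a per-package share of apps. This is an illustrative alternative and not the authors' implementation: the function name package_shares, the toy input, and the choice to count a package at most once per app are assumptions.

```python
from collections import Counter


def package_shares(apps):
    """Rank parent packages by the number of apps in which they are blamed.

    apps: one list per app, holding the parent packages named in that
    app's vulnerability descriptions (hypothetical input layout).
    Returns (package, app_count, percent_of_apps) tuples, highest first.
    """
    total = len(apps)
    counts = Counter()
    for pkgs in apps:
        counts.update(set(pkgs))  # assumption: count each package once per app
    return [(pkg, n, 100.0 * n / total) for pkg, n in counts.most_common()]


# Toy data for illustration only; these are not values from the dataset.
sample = [["flurry", "XYZ"], ["flurry"], ["XYZ", "XYZ"], ["flurry"]]
for pkg, n, pct in package_shares(sample):
    print(f"{pkg}: {n} apps ({pct:.1f}%)")
```

Dividing by the number of apps is what turns a raw count into the kind of percentage plotted per package in the results.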
IV. RESULTS
In the dataset of 103,783 mobile apps, we found the
following:
• 95.26% of the apps have at least one vulnerability.
• 74.24% of the apps contain at least one vulnerability of
either medium or high risk.
• 13.63% of the apps have more than 4
vulnerabilities.
• 47.59% of the apps have high security risks,
• 15.30% have medium security risks, and
• 37.11% have low security risks.
The chart below shows the top vulnerabilities affecting Android app
security:
Figure 2: Top 5 vulnerabilities
Figure 3 displays the result of the MapReduce algorithm
in the form of a bar graph. It shows the top 20
vulnerable packages and the percentage of apps that are
affected by them. It can be observed that a highly popular
package such as Flurry, used for analytics, has a security issue that
affects over 13% of the apps that we examined; that is, more
than 13,400 applications are affected by the security issue
caused by this package.
Figure 3: Packages vs Vulnerability Percentage
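The relation between the percentages and app counts quoted above can be checked with a few lines of arithmetic. The helper names are hypothetical; the only inputs are the dataset size and the percentages reported in this section.

```python
TOTAL_APPS = 103_783  # dataset size reported in the paper


def apps_for_share(percent, total=TOTAL_APPS):
    """Number of apps corresponding to a percentage of the dataset."""
    return round(total * percent / 100)


def share_for_apps(count, total=TOTAL_APPS):
    """Percentage of the dataset corresponding to a number of apps."""
    return 100 * count / total


print(apps_for_share(13))                 # 13492
print(round(share_for_apps(13_400), 2))   # 12.91
print(apps_for_share(95.26))              # 98864
```

Read as lower bounds, the two Flurry figures are compatible: 13,400 apps is about 12.9% of the dataset, so a share above 13% implies roughly 13,492 apps or more.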
A total of 3,227 packages were found in our Android
applications dataset. All these applications and packages
were evaluated against the 18 security vulnerabilities
listed earlier.
V. CONCLUSION
In this study, we have systematically found the
correlation between packages and vulnerabilities as
determined through the security analysis of 103,783 Android
apps taken from the Google Play Store. We explored trends
in vulnerable apps and found that vulnerabilities are present
across the entire Android app ecosystem, including the most
popular libraries and apps. Our analyses provide evidence of
a pattern where certain vulnerable packages result in the
presence of particular security vulnerabilities. In this paper,
we have established this correlation and have also
identified the most vulnerable packages and the five most
commonly found security vulnerabilities. Knowledge of the
security vulnerabilities associated with different packages
will help developers build secure apps and keep businesses
safe by breaking the chain of Advanced Persistent Threats
(APTs) [10].
REFERENCES
[1] "Combating threats in the Cyber world: Outsmarting
terrorists, hackers, and spies," FBI, 2012. [Online].
Available:
https://archives.fbi.gov/archives/news/speeches/combating-threats-in-the-cyber-world-outsmarting-terrorists-hackers-and-spies
[2] Y.-D. Lin, C.-Y. Huang, and M. Wright, "IEEE Xplore
document - mobile application security," IEEE, Jun.
2014. [Online]. Available:
http://ieeexplore.ieee.org/document/6838873/
[3] Gartner, "Gartner says more than 75 percent of mobile
applications will fail basic security tests through 2015,"
2014. [Online]. Available:
http://www.gartner.com/newsroom/id/2846017
[4] "85% of the top mobile banking apps in the APAC
region fail basic security checks," in Appknox,
Appknox, 2016. [Online]. Available:
https://blog.appknox.com/85-of-the-top-mobile-banking-apps-in-the-apac-region-fail-basic-security-checks-appknox-banking-report/
[5] OWASP, "Projects/OWASP mobile security project -
Top Ten mobile risks," 2014. [Online]. Available:
https://www.owasp.org/index.php/Projects/OWASP_Mobile_Security_Project_-_Top_Ten_Mobile_Risks
[6] PCI Security Standards Council LLC, "PCI security,"
2006. [Online]. Available:
https://www.pcisecuritystandards.org/pci_security/
[7] U.S. Department of Health & Human Services,
"Summary of the HIPAA security rule," HHS.gov,
2013. [Online]. Available:
http://www.hhs.gov/hipaa/for-professionals/security/laws-regulations/index.html
[8] "The Sarbanes-Oxley act 2002," 2006. [Online].
Available: http://www.soxlaw.com/
[9] "Common attack pattern enumeration and classification
(CAPEC)," 2007. [Online]. Available:
https://capec.mitre.org/data/lists/283.html
[10] J. B. Thummala and B. Thummala, "Defending
advanced persistent threats - be better prepared to face
the worst," Infosecurity Magazine, 2016. [Online].
Available:
http://www.infosecurity-magazine.com/opinions/defending-advanced-persistent/
