A Proposed Model for Cybercrime Detection
Algorithm Using A Big Data Analytics
Hossam Abdel Rahaman
Dept of Computer and Information Sciences
Faculty of Statistical Studies and Research
Cairo University, Egypt
Hossam_mm7@yahoo.com
Abstract: Cybercrime is now part of day-to-day life, and the challenges of reducing and preventing it are becoming increasingly complex, so new techniques are needed to handle the vast amount of data involved. Traditional police activities mostly fall short in portraying the actual distribution of criminal activity and therefore contribute little to the appropriate allocation of police services. This paper describes methods for cybercrime prediction that use the Hadoop platform for big data analytics to examine the geographical zones that carry greater risk and lie outside conventional policing capabilities. The method uses a geographical cybercrime mapping algorithm to identify regions with a comparatively high number of cybercrime cases. It identifies clusters with a high number of cybercrime cases, which can in turn reveal cybercrime patterns. The estimation approach is enhanced by the processing capability of the Hadoop platform.
I. INTRODUCTION
Cybercrime is increasing, with growing dangers from online fraud and unethical hacking. As both cybersecurity threats and data volumes grow, organizations must prepare themselves to foresee and anticipate cybercrime. Cybercrime specialists use digital forensic tools to identify cybercrime incidents and recognize potential threats such as credit card fraud. Big data analytics enables companies to analyze the huge amount of information they collect during monetary transactions, which has become more significant because of the increased risk of cybercrime. Big data tools are being used to combat cybercrime attacks: big data analytics can help detect forgery and can facilitate digital forensic analysis. [1]
The use of K-Means algorithms to analyze data and predict where cybercrime is likely to happen is becoming more common in law enforcement. This practice is frequently referred to as predictive analysis, and police agencies apply it to support their cybercrime reduction efforts. [2]
The detection algorithm presented in this paper has three stages, as shown in Figure 1. The first phase is the geographic distribution analysis of cybercrime data, which identifies spatial clusters that have a greater risk of cybercrime. In the second phase, a K-Means clustering algorithm is used to determine the quality of each identified cluster. [3]
Figure 1. Predictive Process
This paper describes a cybercrime detection algorithm on the Hadoop platform for big data analytics that is intended to predict cybercrime that is likely to occur in the near future. It also gives a brief overview of several techniques used in analyzing large data sets to detect online fraud and unethical hacking. One aim of this study is to identify the model that best identifies online fraud cases. [4]
A. Problem Statement
Predictive big data analysis has not been broadly examined and studied from an objective, scientific perspective. Whereas the early experiences of police agencies that have either fully implemented or experimented with predictive policing techniques appear to be positive, the effect of predictive policing on cybercrime has yet to be definitively determined. This problem is difficult because the use of predictive analysis in policing is so recent that little objective research has been conducted on its cybercrime reduction applications. [5]
B. Challenges Of Research
• The different techniques and infrastructures that are used for recording data on cybercrime.
• The diverse techniques needed to analyze this expanding volume of cybercrime data with precision and efficiency.
• The available data are inconsistent and fragmented, which makes formal analysis increasingly difficult.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 18, No. 6, June 2020
146 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
• The increasing size of the data that has to be stored and analyzed.
C. Research Questions
The primary research question is as follows:
• Is predictive big data analytics an effective cybercrime control practice that can contribute to improved homeland security?
Secondary questions include the following:
• What is the relationship between predictive big data analysis and cybercrime reduction in cities that have implemented the practice?
• How do the quantity and quality of historical big data affect the relationship between predictive analysis and cybercrime reduction?
D. Objective of Research
The objectives of this research can be broken down into the following points:
• Reducing the incidence of cybercrime by using big data analytics to make cybercrime predictions in terms of time and space.
• Improving the efficiency of the Egypt Cybercrime Centre so that Internet users in Egypt become more aware of the emerging trends in cybercrime.
• Providing the proposed algorithm to the Egypt Cybercrime Centre to contribute to the fight against cybercrime in Egypt.
• Enabling the Egyptian police to use predictive big data analysis techniques to help reallocate resources towards homeland security missions.
II. BIG DATA
The world is becoming interconnected and digitalized, so the amount of data is exploding every minute. Managing the records of this data requires extremely powerful, actionable intelligence.
The problem begins during data acquisition, when the huge amount of data requires decisions about what data to store, what to discard, and how to store it so that the data remains reliable and accurate. Big data refers to datasets whose size is beyond the ability of typical database tools to store, manage, and analyze. It can be described as a massive volume of both structured and unstructured data that cannot be stored using traditional databases and that consists of billions to trillions of records collected from millions of people and from different sources. The sources of data may include the web, sales, customer contact centers, social media, and mobile data. [6]
Big data, as shown in Fig. 1, is a term associated with large datasets that are characterized by the volume, variety, velocity, and veracity of the data. Variability, value, and complexity are other features that are used to describe big data.
Volume: The large amount of stored data that can be collected and analyzed effectively.
Variety: The type of data, which may be structured or unstructured and includes log files, text, video, audio, and transactions.
Velocity: The rate at which data becomes available and changes for analysis.
Veracity: The integrity of the data and the extent to which it can be trusted when it is used to make decisions.
Fig. 1. Big Data Properties
In the context of financial banking transaction analysis, volume corresponds to the thousands of credit card transactions that happen every second of every day. Variety refers to the types of data used in transaction activities. Velocity refers to how quickly data can be processed for analytics. Veracity relates to analyzing the credit card transactions reliably enough to make decisions about them, with the aim of finding any fraudulent transactions. These factors are important for analyzing transactions to find fraud and for taking the needed action immediately to correct fraudulent transactions. [7]
A. Big Data Analytics
Big data analytics is a technology for extracting useful information, such as hidden value and relation rules, from huge volumes of data. When data volumes reach big data scale, parsing them for important information requires exceptionally effective data analytics. The domain of big data analytics is concerned with extracting from big data insights that are significant, previously unknown, implicit, and potentially useful. These insights have a direct effect on making useful decisions from the interpreted data. With the assistance of the right analytical tools, big data can be used to detect various kinds of fraud. [8]
In financial banking, an analytics tool performs the following activities:
• Collects data from enterprise sources.
• Performs deeper analytics on the data.
• Provides a fine-grained view of security information.
• Achieves real-time analysis of streaming data.
III. BIG DATA ANALYTICS IN CYBER CRIME
Big data analytics in cybercrime security includes the ability to collect massive amounts of digital information in order to analyze it, visualize it, and draw knowledge from it that makes it possible to foresee and stop cybercrime attacks. Detecting fraud rapidly requires real-time investigation of many structured and unstructured data sources. Fraud detection is one of the most visible uses of big data analytics. [9]
Most fraud is high-volume in nature, which gives analytics a great opportunity to identify patterns in high-volume data and suggest preventive action. Some of the techniques used to detect fraud require recognizing identical or repeating pattern matches of people, places, systems, and events. Compared to conventional approaches, big data analytics provides an efficient cybersecurity setting by separating what is “normal” from what is “abnormal”, that is, separating the patterns produced by authorized users from those created by suspicious or malicious users. [10]
By providing the means to discover changing patterns of malicious activity hidden deep in large volumes of organizational data, big data tools can enable businesses to better understand whether and how they have been attacked. Big data can be associated with the following fraud detection techniques: [11]
A. Descriptive Analytics - Unsupervised Learning
Descriptive analytics aims to find behavior that deviates from normal behavior, that is, to detect anomalies. These techniques learn from historical observations and do not require the observations to be labeled as fraudulent or non-fraudulent activity. [12]
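As an illustration of the unsupervised idea above, the following minimal Python sketch flags values that deviate strongly from the rest of an unlabeled sample using a z-score. The z-score rule, the function name `zscore_anomalies`, the threshold, and the sample amounts are assumptions made for this example; they are not taken from the proposed model.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=2.5):
    """Return the indices of values whose |z-score| exceeds the threshold.

    No fraud labels are needed: the "normal" behavior is estimated from
    the data itself (mean and population standard deviation).
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    # Hypothetical transaction amounts; the last one is unusually large.
    amounts = [10, 10, 10, 10, 10, 10, 10, 10, 10, 100]
    print(zscore_anomalies(amounts))
```

In small samples the largest attainable z-score is bounded, so the threshold has to be chosen with the sample size in mind.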
B. Predictive Analytics - Supervised Learning
Predictive analytics aims to learn from historical data and recover patterns that make it possible to distinguish between normal and fraudulent behavior. These analytics can be applied to detect fraud as well as to estimate the amount of fraud. [13]
C. Social Network Analytics
Social network analytics aims to extend the detection capability by detecting fraudulent behavior in a network of linked entities. It also finds relationships between entities by revealing specific patterns that indicate fraud. [14]
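One simple way to explore a network of linked entities is a breadth-first search that starts from entities already known to be fraudulent. The sketch below is only an illustration under that assumption; the function name `linked_to_fraud`, the hop limit, and the entity names are hypothetical.

```python
from collections import deque


def linked_to_fraud(edges, known_fraud, max_hops=2):
    """Return {entity: hop distance} for entities within max_hops of known fraud.

    edges is an iterable of (entity_a, entity_b) links, for example an
    account and a device that were used together.
    """
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    distance = {node: 0 for node in known_fraud}
    queue = deque(known_fraud)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


if __name__ == "__main__":
    links = [("acct1", "device9"), ("device9", "acct2"),
             ("acct2", "card7"), ("acct3", "card8")]
    print(linked_to_fraud(links, ["acct1"]))
```

Entities that share a device or card with a known fraudulent account surface at a small hop distance and can be prioritized for review.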
D. Big Data Detection Techniques
A formal digital forensic investigation cannot be launched until the important and significant data have been extracted from the entire data set. The focus here is on the different techniques that can help the digital forensic investigator analyze big data to find the underlying relationships among the data. Furthermore, these techniques help the investigator extract meaningful digital forensic evidence for detecting fraud from large datasets. [15]
IV. CRIME PREDICTION THEORY
The incidents of various types of cybercrime in the different states of Egypt for the year 2019 were taken as the input for the analysis. The data also contained the number of persons arrested in different age groups (under 18, 18 to 30, 30 to 45, 45 to 60, and above 60) in the different states and union territories. The different types of cybercrime and their descriptions are given in Table 1. Feature selection techniques are used to determine the important features, such as the prominent categories of cybercrime and the age groups of the people who are most involved in these crimes. The chosen features were normalized using the population attribute of each state in Egypt.
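The population normalization mentioned above can be sketched as a conversion of raw counts to rates per 100,000 residents. This is a minimal sketch: the scaling constant, the function name `per_capita_rates`, and the region names and figures are assumptions for illustration, since the paper does not state the exact normalization formula.

```python
def per_capita_rates(counts, populations, per=100_000):
    """Convert raw incident counts to incidents per `per` residents."""
    rates = {}
    for region, count in counts.items():
        population = populations[region]
        if population <= 0:
            raise ValueError(f"population for {region!r} must be positive")
        rates[region] = count * per / population
    return rates


if __name__ == "__main__":
    # Hypothetical regions and figures, used only to show the calculation.
    incidents = {"Region A": 50, "Region B": 50}
    residents = {"Region A": 1_000_000, "Region B": 250_000}
    print(per_capita_rates(incidents, residents))
```

Two regions with the same raw count can end up with very different rates, which is why normalization precedes feature selection and clustering.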
TABLE I. CYBERCRIME TYPE
Crime Type | Description
Manipulation of computerized source documents | Knowingly or intentionally concealing, destroying, or altering, or causing another person to conceal, destroy, or alter, any computer source code used for a computer, computer program, or computer network.
Hacking of computing systems | Finding weaknesses in a computer or computer network and misusing or exploiting them. Type 1: loss of or damage to a computer source or utility. Type 2: hacking.
Distribution of indecent content (electronic transmission) | Transmitting indecent content through the Internet, e-mail, and cell phones (SMS).
Compliance failure (orders of certifying authority) | Failure of the license holder to comply when issuing a digital signature.
Unauthorized access (attempt to access protected computing systems) | Accessing computer software programs or software sources that have security vulnerabilities, without legal permission.
Obtaining a license or digital signature certificate by deception | Obtaining, or attempting to obtain or maintain, a license by willful misrepresentation or fraudulent representation.
Publishing a false digital signature certificate | A digital signature certifies the identity of a person. Publishing false signatures is similar to the crime of impersonation.
False digital signature certificate | -
Breach of confidentiality and privacy | A security violation in which the confidentiality of some data is lost.
A. Cybercrime spatial data analysis
The number of persons arrested in the various age groups was the input for our analysis. This cybercrime data is distributed across 28 states in Egypt and is normalized using the population of each state and union territory. A feature selection technique was applied to determine the contribution of the chosen features towards cybercrime activities. The attributes with higher ranks were considered further in our analysis. The attributes and their F scores are given in Table 2. Three of the nine types of cybercrime, and the age groups 18-30 and 30-45 among the various age groups considered, occupied the higher ranks. They were therefore considered further in our analysis for the prediction of relevant patterns. [16]
TABLE II. FEATURE
Attribute | I-Score
Hacking | 0.7532
Obscene Publication | 0.2412
Failure of compliance | 0.5201
Age (18-30) | 0.7421
Age (30-45) | 0.6102
X-means clustering applied to the chosen features resulted in three clusters. The K-means clustering algorithm was then applied with k = 3 to find the cluster patterns. [17]
The geospatial distribution of cybercrime in Egypt is shown in Figure 2. Areas marked in red are regions where the occurrence of cybercrime incidents is high, yellow marks regions with a nominal level of cybercrime, and blue marks areas with very few cybercrime incidents. Visualization and analysis of the crime patterns result in meaningful inferences.
Figure 2. Cybercrimes in Egypt
From Table 3, we observe that the majority of the Egyptian territories fall into the third cluster, where the crime pattern, or occurrence of cybercrime, is none. Cluster 2 represents regions with a high occurrence of cybercrime, where the people involved belong to the age group 35 to 45. Cluster 1 represents regions with an average occurrence of crime, where the people involved also include the age group 18 to 30.
TABLE III. MAJORITY OF THE EGYPT TERRITORIES
Intensity of Crime: Crime1 (Hacking) | Intensity of Crime: Crime2 (Obscene publication) | Intensity of Crime: Crime3 (Failure of compliance) | Arrested Persons: Age (18 to 35) | Arrested Persons: Age (35 to 45)
Average | None | Average | Average | None
Average | High | High | Very low | High
None | None | None | None | None
V. PROPOSED MODEL FOR CYBER-CRIME PREDICTION
A. Collection of cybercrime dataset
A variety of cybercrime data should be collected in order to predict the cybercrime class in the banking sector through the analysis of cybercrime patterns. This data has to be collected from various news feeds, articles, blogs, and police department websites on the Internet. The collected cybercrime data is stored in a cybercrime database for further handling. [18]
B. Pre-processing of cybercrime dataset
The cybercrime dataset stored in the cybercrime database has to be pre-processed before data mining processes are applied to it, because pre-processing removes noisy data and missing values.
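A minimal sketch of such a pre-processing step is given below. It assumes that each record is a Python dictionary and treats "noise" simply as stray whitespace and duplicate records; the field names and the function name `preprocess` are hypothetical and do not describe the actual cybercrime database schema.

```python
def preprocess(records, required_fields):
    """Drop records with missing required values and remove duplicates.

    String values are stripped of surrounding whitespace, and duplicates
    are detected case-insensitively on the required fields.
    """
    cleaned = []
    seen = set()
    for record in records:
        values = [record.get(field) for field in required_fields]
        if any(v is None or (isinstance(v, str) and not v.strip()) for v in values):
            continue  # missing value: skip the record
        normalized = [v.strip() if isinstance(v, str) else v for v in values]
        key = tuple(v.lower() if isinstance(v, str) else v for v in normalized)
        if key in seen:
            continue  # duplicate of an earlier record
        seen.add(key)
        cleaned.append(dict(zip(required_fields, normalized)))
    return cleaned


if __name__ == "__main__":
    raw = [
        {"type": "Hacking", "region": "Cairo", "age": 25},
        {"type": " hacking ", "region": "Cairo", "age": 25},  # duplicate
        {"type": "Fraud", "region": None, "age": 40},         # missing region
    ]
    print(preprocess(raw, ["type", "region", "age"]))
```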
Figure 3. Proposed Model for Cybercrime Prediction
C. Data mining Techniques
Data mining processes and algorithms are applied to the pre-processed data to identify or forecast fraud through knowledge discovery from abnormal patterns. Data mining has gained recognition in combating cybercrime and financial fraud: it helps solve problems in the banking sector by discovering patterns, relationships, and links that are hidden in the business information accumulated in the crime databases. [19]
D. Association Rule mining
Based on frequently occurring cybercrime patterns, association rule mining generates rules for the cybercrime dataset. These rules help society's decision makers take preventive action. The procedure comprises the following steps:
a) Determining the frequently occurring item sets within the cybercrime database.
b) Recognizing patterns in program execution and customer behavior as association rules, which is known as intrusion detection.
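The two steps above can be sketched with a small level-wise (Apriori-style) search for frequent item sets followed by rule generation. This is an illustrative sketch, not the paper's implementation: the support and confidence thresholds, the function names, and the item names are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    size = 1
    candidates = [frozenset([i]) for i in items]
    while candidates:
        level = {}
        for candidate in candidates:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        result.update(level)
        # Apriori property: a larger itemset can only be frequent if all of
        # its subsets one item smaller are frequent.
        size += 1
        pool = sorted({i for s in level for i in s})
        candidates = [
            frozenset(c) for c in combinations(pool, size)
            if all(frozenset(sub) in level for sub in combinations(c, size - 1))
        ]
    return result


def association_rules(itemsets, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in itemsets.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                confidence = support / itemsets[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical incident records, each a set of observed attributes.
    records = [{"phishing", "card_fraud"}, {"phishing", "card_fraud"},
               {"phishing"}, {"hacking"}]
    found = frequent_itemsets(records, min_support=0.5)
    for rule in association_rules(found, min_confidence=0.9):
        print(rule)
```

The brute-force support counting is adequate for a small example; on a Hadoop cluster the counting step would typically be distributed across nodes.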
E. Clustering
Clustering divides a set of records or items into a number of groups. Clustering is suggested for discovering relationships between cybercrimes and criminals that share previously unknown common characteristics. Clustering techniques are used for discovering fraud in the banking sector. Clustering is regarded as unsupervised learning because its classes are not defined in advance and the grouping of data is carried out without supervision. [20] The K-Means partitioning algorithm is used for clustering cybercrime datasets because of its simplicity and low computational complexity. First, the data items are grouped into a specified number (k) of clusters. The mean value of each cluster is computed from the distances between its objects. An iterative relocation method is then used to improve the partitions by moving items from one cluster to another. The iterations are carried out until convergence occurs. [21]
F. K-Means Algorithm
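The iterative relocation procedure of K-Means can be sketched in plain Python as follows. This is a minimal single-machine sketch of the standard algorithm, not the Hadoop implementation used in the paper; the function names, the optional `init` argument, and the sample points are assumptions for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Cluster points into k groups; return (centroids, assignments).

    init optionally supplies the k starting centroids; otherwise k points
    are sampled at random (different seeds may give different clusters).
    """
    centroids = list(init) if init is not None else random.Random(seed).sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


if __name__ == "__main__":
    # Hypothetical two-feature points (for example, normalized crime rates).
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
            (20, 0), (20, 1), (21, 0)]
    centers, labels = kmeans(data, k=3, init=[(0, 0), (10, 10), (20, 0)])
    print(centers, labels)
```

Because the result depends on the starting centroids, the algorithm is often run several times with different initializations and the partition with the lowest within-cluster distance is kept.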
G. Classification
Classification is the most frequently used data mining technique. It uses a set of pre-classified cases to build a model that can classify instances on a large scale. The classification technique establishes a relationship between a dependent variable and an independent variable by mapping the data points. Within a given dataset, classification is used to determine the group to which each data instance belongs. Classification is also used to build models of unknown patterns and to assess future outcomes on the basis of previous decisions. Automatic credit authorization is one of the major procedures in the banking sector and financial organizations. Fraud can be prevented by building a sound assessment of credit approvals with a classification model based on decision trees, run on a platform such as Apache Hadoop.
H. Influenced Association Classification
To achieve higher accuracy, associative classification is a novel and improved method that integrates association rule mining with the classification of the prediction model. This method is used to find the links and associations among item sets. Association rule mining comes under unsupervised learning since it does not involve any class attribute for rule extraction. The two steps used to extract association rules are: [22] [23]
a) Classes are generated from the cybercrime data set based on the association rules.
b) The dataset classification is examined against the class labels.
The steps implemented in the Influenced Association Classifier are summarized below:
a) Pre-process the cybercrime dataset so that further mining processes can be carried out on it.
b) To reflect the importance of each attribute in the prediction model, assign every attribute a weight within a range. Attributes of greater significance are allocated the maximum weight (0.9) and attributes of lesser significance are allocated the minimum weight (0.1).
The Influenced Association Rule Mining algorithm is applied to the pre-processed cybercrime data set to discover interesting patterns. Influenced Association Classification uses weighted support and confidence, and the rules generated by this process are known as Classification Association Rules (CARs).
The extracted Classification Association Rules are stored in the rule base. Whenever a new cybercrime record is added, the CARs predict its class label from the rule base.
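The paper does not give the formulas for weighted support and confidence, so the sketch below uses one common formulation as an assumption: each record is weighted by the mean weight of its items, the weighted support of an item set is the share of total record weight carried by the records that contain it, and weighted confidence is the ratio of two weighted supports. The function names, item names, and weights are hypothetical.

```python
def transaction_weight(record, weights):
    """Mean weight of the items in one record (assumed formulation)."""
    return sum(weights[item] for item in record) / len(record)


def weighted_support(itemset, records, weights):
    """Share of total record weight carried by records containing itemset."""
    itemset = frozenset(itemset)
    total = sum(transaction_weight(r, weights) for r in records)
    covered = sum(transaction_weight(r, weights) for r in records if itemset <= r)
    return covered / total


def weighted_confidence(antecedent, consequent, records, weights):
    """Weighted support of the whole rule divided by that of its antecedent."""
    base = weighted_support(antecedent, records, weights)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return weighted_support(both, records, weights) / base


if __name__ == "__main__":
    # Hypothetical attribute weights in the 0.1 to 0.9 range used in the text.
    weights = {"hacking": 0.9, "age_18_30": 0.5, "obscene_publication": 0.1}
    records = [{"hacking", "age_18_30"}, {"hacking", "age_18_30"},
               {"hacking"}, {"obscene_publication"}]
    print(weighted_support({"hacking"}, records, weights))
    print(weighted_confidence({"hacking"}, {"age_18_30"}, records, weights))
```

With this formulation the weighted confidence never exceeds 1, because every record that contains the whole rule also contains its antecedent.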
I. Cyber Crime Prediction using Apache Hadoop
For classification problems in cybercrime prediction analysis, the Apache Hadoop decision tree technique is more rigorous and more precise. Its two steps are: [24]
a) Construction of the tree.
b) Validation of the built tree over the cybercrime data set.
The technique uses a pruning method in the construction of the tree. Pruning reduces the size of the tree by removing the parts that lead to poor prediction performance. The proposed technique classifies the data until the classification is complete and provides the highest accuracy over the cybercrime training data. It also balances precision and flexibility. The technique is an extended version of the C4.5 decision tree. It produces the classifier output in the form of rule sets and decision trees. The rule sets are straightforward to understand and easy to use within an application. [18]
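Since the technique is described as an extension of C4.5, the sketch below illustrates the gain ratio criterion that C4.5 uses to choose the attribute to split on during tree construction. It covers only this one step and is not the paper's implementation; the function names, attribute names, and records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on attribute, divided by split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical labeled transactions.
    rows = [
        {"channel": "online", "weekday": "mon", "fraud": "yes"},
        {"channel": "online", "weekday": "tue", "fraud": "yes"},
        {"channel": "branch", "weekday": "mon", "fraud": "no"},
        {"channel": "branch", "weekday": "tue", "fraud": "no"},
    ]
    for name in ("channel", "weekday"):
        print(name, gain_ratio(rows, name, "fraud"))
```

The attribute with the highest gain ratio is chosen for the split; pruning is applied afterwards to remove branches that do not improve prediction.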
J. Experimental Settings
The K-Mean cluster consists of six data nodes, which take slave roles only, and one name node, which takes both a slave role and a master role in our system. The details of these nodes are listed in Table 4. We set the number of replicas to 6 since there are 6 nodes in total in this cluster.
TABLE IV. K-MEAN CLUSTER COMPOSITION
Node | Instance Type | CPU | Memory | Storage | Private IP
Node1 | M1 Medium | Core i7 | 8 GB | 500 GB | 10.1.1.2
Node2 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.3
Node3 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.4
Node4 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.5
Node5 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.6
Node6 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.7
We used the same configuration (Dili WM, 2013).
VI. PERFORMANCE ANALYSIS
This section monitors and evaluates Apache Hadoop performance in three cases:
• Without using Apache Hadoop.
• At the beginning of using Apache Hadoop.
• After a certain period (one month) of using Apache Hadoop.
The evaluation process takes the following parameters into consideration:
• Requests returned from the Apache Storage.
• Requests returned from the Apache Storage without verification.
• Requests returned from the Apache Storage, updating a file in cache.
• Requests returned from the Apache Storage after verifying that they have not changed.
A. Performance Metrics
The main categories of performance metrics are:
a) Apache Storage performance: how requested Web objects were returned, whether from the Storage or from the network.
b) Traffic: the amount of network traffic, by date, sent through Apache Hadoop, including both Web and non-Web traffic.
c) Daily traffic: the average network traffic through Apache Hadoop at various times during the day, including both Web and non-Web traffic.
d) Response time: how Apache Hadoop responded to HTTP requests during the reporting period.
e) Communication failures: the failures that Apache Hadoop encountered when communicating with other computers during the reporting period.
f) Dropped packets: the number of dropped network packets during the reporting period. Users that had the most dropped packets are listed first.
g) Queue length: the Queue Length counter shows how many threads are ready in the processor queue but not currently able to use the processor (Spark Streaming Programming Guide).
B. Types of Requests
We want to know the file types that occur most often in the application server. Knowing the characteristics of the log files by file type gives some indication of whether a document will change or not.
TABLE V. TYPE OF REQUEST
C. Apache Storage Performance
The Storage performance results for each of the log files are shown below. The percentage of requests returned from storage without verification is high: about 38% of all requests result in a request returned from Apache Hadoop without verification, which is consistent with previously published results. Earlier work reported that only 15% to 32% of Apache logs result in requests returned from Storage without verification. We also note that unknown objects returned from the Apache Storage were detected.
TABLE VI. STORAGE PERFORMANCE RESULTS
Status | Requests | % of Total Requests | Total Bytes
Objects returned from Apache Storage | 21251 | 59.30 % | 622.73 MB
Objects returned from Apache Storage without verification | 14110 | 38.20 % | 29.93 MB
Objects returned from Storage after verifying that they have not changed | 587 | 1.40 % | 0.98 MB
Information not available | 337 | 1.00 % | 47.94 KB
Unknown objects returned from the Apache Storage | 61 | 0.20 % | 15.71 KB
Total | 327251 | 100.00 % | 675.52 MB
D. Traffic
The results for average network traffic through Apache Hadoop at various times during the day, at the beginning of using Apache Hadoop and after a certain time of using Apache Hadoop, are given in the table below.
At the beginning of using Apache Hadoop:
Requests | Average Processing Time | Total Bytes | Cache Hit Ratio
1811 | 141.00 sec | 4.52 GB | 0.00 %
1617 | 131.40 sec | 8.95 MB | 0.00 %
1535 | 122.10 sec | 8.23 MB | 0.00 %
1816 | 125.20 sec | 8.71 MB | 0.00 %
After a certain time of using Apache Hadoop:
Requests | Average Processing Time | Total Bytes | Cache Hit Ratio
6186 | 52.80 sec | 2.87 GB | 1.00 %
6246 | 59.31 sec | 35.89 MB | 0.00 %
5844 | 61.29 sec | 35.27 MB | 0.00 %
6103 | 57.00 sec | 34.19 MB | 0.00 %
The results indicate that the average processing time for handling a request is reduced by 43% after a certain time of using Apache Hadoop, because Apache Hadoop keeps the previously visited pages and returns them directly to the client without wasting time asking the Storage server each time.
a) Traffic by Time of day
The following table summarizes average network traffic through Apache Hadoop at various times during the day.
TABLE VII. TRAFFIC BY TIME OF DAY
Time Interval | Average Requests Per Second | Average Bytes Per Second | Average Response Time for Apache Requests | Average Response Time for Non-Apache Requests
00:00 | 14.4 | 66.48 KB | - | 54.20 sec
00:15 | 18.1 | 12.14 MB | - | 57.80 sec
00:30 | 15.8 | 11.47 MB | 0.00 sec | -
00:45 | 16.1 | 82.44 KB | Unknown | 66.10 sec
01:00 | 15.1 | 64.19 KB | 0.00 sec | -
01:15 | 17.2 | 68.21 KB | 0.00 sec | -
01:30 | 15.5 | 91.32 KB | 0.00 sec | -
b) Dropped Packets
The results below show the users who had the highest number of dropped network packets during the reporting period. Users that had the most dropped packets are listed first. We can observe that the percentage of dropped packets decreases over time. We also detected two unknown users with IP addresses outside the network range (172.31.0.1 and 172.31.0.2).
TABLE VIII. DROPPED PACKETS
TABLE IX. DROPPED PACKETS AFTER USING APACHE
User | Dropped Packets (at the beginning of using Apache Hadoop) | % of Total Dropped Packets (at the beginning of using Apache Hadoop) | Dropped Packets (after a certain time of using Apache Hadoop) | % of Total Dropped Packets (after a certain time of using Apache Hadoop)
10.1.1.13 | 35887 | 23.30% | 682 | 11.80%
10.1.1.14 | 34871 | 22.50% | 662 | 11.10%
172.31.0.2 | 32817 | 21.50% | Unknown | Unknown
172.31.0.1 | 30618 | 20.20% | Unknown | Unknown
10.1.1.12 | 4832 | 3.00% | ---- | 13.30%
10.1.1.20 | 2310 | 1.50% | 223 | 4.10%
10.1.1.23 | 2301 | 1.50% | 221 | 3.70%
An algorithm is used to detect unknown or previously unseen network IP addresses. The technique not only detects known network IP addresses but can also detect the unknown objects returned from Apache Storage. The technique is a two-step process. In the first step, features are extracted from the known datasets; this step plays a vital role, not only in representing the target concept but also in speeding up the learning and classification/detection processes. In the second step, appropriate machine learning techniques are selected and trained for the detection/classification of abnormal behavior.
VII. VALIDATION AND VERIFICATION
The big data analysis required more visual inspection and manual execution than the other components. We tested the solutions described above through several activities and measured the output of these activities.
a) Tools Used
Apache Hadoop logs: The Hadoop log was used to determine the amount and size of data sent by clients. It was also used to generate the emulation of client requests.
b) SCOM Reports
System Center Operations Manager (SCOM) is a storage log analysis tool that allows the user to pull various pieces of data from its log files, such as the number of requests, bytes transferred, and hosts contacted. We used it to reveal some characteristics of the log files used in the performance analysis.
VIII. CONCLUSION
The proposed work focuses on cybercrime prediction by crime mapping with recorded data using the latest technology. The model helps the security authorities reduce cybercrime and improves network performance. Many network anomaly detection techniques are designed based on the availability of data instances. Many anomaly detection techniques have been developed specifically for certain application domains, while others are more generic. This paper presents a cascaded algorithm that uses K-Means algorithms for big data anomaly detection. The proposed algorithm is used to detect the anomalies present in supervised and unsupervised data sets. The model also helps the authorities in the investigation of crimes. Using big data analytics with the clustering approach reduces the investigation time and helps in retrieving hidden information.
REFERENCES
[1] Cameron S. D. Brown, “Investigating and Prosecuting Cyber Crime: Forensic Dependencies and Barriers to Justice”, International Journal of Cyber Criminology, (2015).
[2] Spalevic Z., “Cyber Security as a global challenge today”, Singidunum Journal of Applied Sciences, (2014).
[3] Najafabadi M., Villanustre F., Khoshgoftaar T., Seliya N., Wald R., and Muharemagic E., “Deep learning applications and challenges in big data analytics”, Journal of Big Data, (2015).
[4] Gupta P., Tyagi N., “An Approach towards Big Data – A Review”, International Conference on Computing, Communication and Automation (IEEE), (2015).
[5] Tahir S., Waseem I., “Big Data – An Evolving Concern for Forensic Investigators”, IEEE Transactions, (2015).
[6] Chen X., Lin X., “Big Data Deep Learning: Challenges and Perspectives”, IEEE Access, Vol. 2, (2014).
[7] Magoulas R., Lorica B., “Introduction to Big Data”, Release 2.0, Issue 11, (Feb 2009).
[8] M. I. Pramanik, Raymond Y. K. Lau, Wei T. Yue, Yunming Ye, and Chunping Li, “Big data analytics for security and criminal investigations”, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 7, no. 4, pp. 1-19, (2017).
[9] Chung-Hsien Yu, Max W. Ward, Melissa Morabito, Wei Ding, “Crime forecasting using data mining techniques”, International Conference on Data Mining Workshops, IEEE, (2011).
[10] Manjeet Rege and Raymond Blanch K. Mbah, “Machine Learning for Cyber Defense and Attack”, DATA ANALYTICS 2018: The Seventh International Conference on Data Analytics, IARIA, ISBN: 978-1-61208-681-1, pp. 73-78, (2018).
[11] Tariq M., Uzma A., “Security Analytics: Big Data Analytics for Cyber security A Review of Trends, Techniques and Tools”, 2nd National Conference on Information Assurance (NCIA), (2013).
[12] Palak G., Nidhi T., “An Approach towards Big Data – A Review”, International Conference on Computing, Communication and Automation (IEEE), (2015).
[13] Giri T., Anjan G., “A Survey on Data Science Technologies & Big Data Analytics”, International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 6, Issue 2, (Feb 2016).
[14] Dean J., Ghemawat S., “MapReduce: Simplified data processing on large clusters”, Communications of the ACM, vol. 51, pp. 107-113, (2008).
[15] Siddaraju, Sowmya C., Rashmi K., Rahul M., “Efficient Analysis of Big Data Using Map Reduce Framework”, International Journal of Recent Development in Engineering and Technology, Vol. 2, (2014).
[16] Aksoy, S., “K–Nearest Neighbor Classifier and Distance Functions,”
Technical Report, Department of Computer Engineering, Bilkent
University (February 2008)
[17] A. Reyes, R. Brittson, K. O’Shea, and J. Steele, Cyber Crime
Investigations: Bridging the Gaps Between Security Professionals,
Law Enforcement, and Prosecutors. Elsevier Science, 2011.
[18] D. Quick and K.-K. R. Choo, “Impacts of increasing volume of digital
forensic data: A survey and future research challenges,” Digital 
Investigation, vol. 11, no. 4, pp. 273 – 294, 2014.
[19] R. Rowlingson, “A ten step process for forensic readiness,”
International Journal of Digital Evidence, vol. 2, no. 3, pp. 1–28,
2004.
[20] A. Guarino, “Digital forensics as a big data challenge,” in ISSE 2013
Securing Electronic Business Processes. Springer, 2013, pp. 197–203.
[21] P. Dhaka and R. Johari, “Crib: Cyber crime investigation, data
archival and analysis using big data tool,” in 2016 International
Conference on Computing, Communication and Automation
(ICCCA), April 2016, pp. 117–121.
[22] H. Van Beek, E. van Eijk, R. van Baar, M. Ugen, J. Bodde, and A.
Siemelink, “Digital forensics as a service: Game on,” Digital
Investigation, vol. 15, pp. 20–38, 2015.
[23] Alessandro G, “Digital Forensics as a Big Data Challenge”, ISSE
Securing Electronic Business Processes, (2013).
[24] Katarina G, Michael H, Wilson A. Higashino, David S. Allison,
and Miriam A. M. Capretz, “Challenges for MapReduce in Big Data”,
Proc. of the 10th World Congress on Services, (2014).
AUTHORS PROFILE
Hossam Abdel Rahman Mohamed:
Holds a doctoral degree in computer science from Cairo
University, Computer and Information
Technology Dept. His current position is IT
Director at Bek Group.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 18, No. 6, June 2020
153 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
A proposed model_for_cybercrime_detectio

  • 1. A Proposed Model for Cybercrime Detection Algorithm Using A Big Data Analytics Hossam Abdel Rahaman Dept of Computer and Information Sciences Faculty of Statistical Studies and Research Cairo University, Egypt Hossam_mm7@yahoo.com Abstract— Cybercrime today is evolving as part of our day-to- day lives, and The challenges of cybercrime reduction and prevention are becoming increasingly complex, that needs a new technique to handle the vast amount of this data, The capabilities of the traditional activities of police mostly drop brief in portraying the original division of criminal activities, hence contribute less to the appropriate allotment of police services. In this paper, methods are described for cybercrime Prediction, by using the Hadoop technique for big data analytics, through examining the geological zones which incorporate more noteworthy chance and exterior the conventional policing capabilities. The used method makes the utilize of a topographical cybercrime mapping algorithm to distinguish regions that have generally high cases of cybercrime. This method will identify exceedingly cases of cybercrime clusters which assist can show the patterns of cybercrime. the estimation approach is enhanced by the processing capability of the Hadoop platform. Keywords-component; formatting; style; styling; insert (key words) I. INTRODUCTION Cybercrimes are getting increased with expanding dangers through online fraud and unscrupulous hacking. With both cyber safety threats and data increasing, the organizations must be prepared to prepare themselves with foreseeing and anticipating cybercrime. the specialists of cybercrime are using digital Forensic tools to identify cybercrime episodes and recognize any potential threats like credit card frauds. 
Big data analytics is empowering companies to analyze the gigantic sum of information they collect amid the monetary transactions; cybercrime could be a greater significance nowadays due to the increased risk of cybercrime. Big data tools are being utilized to combat cybercrime attacks. big data analytics can offer to detect forgery and can facilitate digital forensic analysis. [1] The utilize of K-Mean algorithms to analyze the data and predict where cybercrime is likely to happen is getting to be more common in law authorization. Frequently referred to as predictive analysis, which gets to be the police agency's successes to cybercrime reduction efforts by applying the predictive investigation. [2] The detection algorithm presented in this paper has three stages as appeared in Figure 1. The first phase is the distribution geographic of cybercrime data analysis which identifies spatial clusters that have a greater risk of cybercrime. In the second phase a K-Mean clustering algorithm that utilized to determine the quality of each identified cluster. [3] Figure 1. Predictive Process This paper delineates a cybercrime detection algorithm on the Hadoop platform in big data analytics that will be able to predict the near likely cybercrime. also, a brief overview is made about several techniques utilized in analyzing big data to detect online fraud and unethical hacking by analyzing large sets of data. One aim of this study is to identify the model that best identifies online fraud cases. [4] A. Problem Statement The predictive of big data analysis has not been broadly examined and studied from an objective, perspective scientific. Whereas beginning experiences by the police agencies that have either fully implemented or experimented with predictive policing techniques appear to be positive, predictive policing’s affecting on cybercrime has yet to be definitively determined. 
this problem is troublesome because the utilize of predictive analysis in policing is so modern that small objective research has been conducted on its cybercrime reduction applications. [5] B. Challenges Of Research • The distinctive techniques and infrastructures that are used for recording data on cybercrime. • The diverse techniques that can analyze with precision and efficiency for this expanding volume of data on cybercrime. • The accessible data are inconsistent and fragmented are making the task increasingly difficult formal analysis International Journal of Computer Science and Information Security (IJCSIS), Vol. 18, No. 6, June 2020 146 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 2. • Increasing the size of the data that has to be stored and analyzed. C. Research Questions The research questions as following: • Dose the predictive big data analytics an effective cybercrime control practices that can contribute to improved homeland security? Secondary questions include the following: • What is the relationship between predictive big data analysis and cybercrime reduction in cities that have implemented the practice? • How does the quantity and quality of historical big data affect the relationship between predictive analysis and cybercrime reduction? D. 1.4. Objective of Research The objective of this research can clearly be broken down into two Points as the following: • Reducing the incidence of cybercrime by big data analytics to make cybercrime predictions in terms of time and space. • Efficiency improvement of the Egypt Cybercrime Centre to empower the users of the Internet in Egypt that They would able to be more sensitized about the emerging trends on cybercrime. Cybercrime today is evolving as part of our day-to-day lives. • Aim the proposed algorithm for Egypt Cybercrime Centre to contribute the fight against cybercrime in Egypt. • Aim to use predictive big data analysis techniques by Egyptian police to contribution reallocate resources towards homeland security missions. II. BIG DATA The world is becoming interconnected digitalized so that the amount of data has been detonating every minute. To manage the records of this data, it requires extremely powerful action intelligence. The problem begins during data acquisition when a huge amount of data requires us to make decisions about what data to be a store, what to discard, and how to store so that data can be kept reliable and accurate. Big data refers to datasets whose capacity is beyond the ability of the typical database, store, oversee, manage, and analyze. 
It can be described as a massive volume of both structured and unstructured data that can’t be stored using traditional databases, which consists of billions to trillions of records that are collected from millions of people all from different sources. The sources of data may come from the web, sales, customer contact center, social media, mobile data. [6] Big data as shown in fig 1, is a term associated related to expansive datasets that come into existence with the features of volume, variety, velocity, and veracity of data. Data variability, value, and complexity are some other features that are used with big data. Volume: The large amount of data stored that can be collected and analyzed effectively. Variety: Type of data that may be structured, unstructured, log files, text, video, audio, and transactions. Velocity: Rate of the speed of data available and data change for analysis. Veracity: Related to data integrity and extend of trust in the data to confidently that use it to make decisions. Fig1, Big Data Properties In the context of financial banking transaction analysis, volume corresponds with the thousands of credit card transactions that happen every second in every day. Variety refers to the type of data that is used in transaction activities. Velocity refers to how to speed data that can be processed for analytics. Veracity related to analyzing the credit card transactions to make decisions on it with the aim of finding fraudulent transactions if any. These factors are important for analyzing transactions to find fraud and taking the needed action immediately to correct the fraudulent transactions. [7] A. Big Data Analytics The technology of Big data analytics is useful information such as a hidden value, and a relation rule from huge data. When the data volumes reach big data extents, parsing it for important data requires exceptionally effective data analytics. 
The domain of Big Data Analytics is concerned with the extraction of value from big data which are significant, previously unknown, implicit, and potentially useful. These experiences have a direct effect on making a useful decision from the interpreted data. With the assistance of the right analytical tools, and big data can detect various frauds. [8] in financial banking the analytics tools perform the following activities: • Collects data from some of the enterprise sources. • Performs more profound analytics on the data. • Provides a fine view of security information. International Journal of Computer Science and Information Security (IJCSIS), Vol. 18, No. 6, June 2020 147 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 3. • Achieves real-time analysis of streaming data. III. BIG DATA ANALYTICS IN CYBER CRIME Big Data Analytics in cybercrime security includes the ability to collect massive amounts of digital information to analyze, visualize, and draw knowledge that can make it conceivable to foresee and halt cybercrime assaults. to detecting fraud rapidly, requires real-time investigation of many structured and unstructured data sources. detection of fraud is one of the most visible uses in big data analytics. [9] Most of the frauds are high-volume in nature. So, a great opportunity is given for analytics to identify patterns from high volume data and suggest preventive action. some of the techniques utilized to detect frauds require recognizing identical/repeating pattern matches of people, places, systems, and events. Compared to conventional approaches, huge data analytics gives a proficient cybersecurity setting by isolating what is “normal” from what is “abnormal”, isolating the designs produced by authorized clients from those created by suspicious or malicious clients. [10] By providing means to discover changing patterns of malicious activities hidden profound in large volumes of organizations data, big data tools can undoubtedly enable businesses to better understand if and how they have been attacked. Big Data can be associated with the following fraud detection techniques: [11] A. Descriptive Analytics - Unsupervised Learning Descriptive analytics are target to finding the behavior that deviates from normal behavior or to detecting anomalies. These techniques learn from observations historical and not require observations as a fraudulent or non-fraudulent activity. [12] B. Predictive Analytics - Supervised Learning Predictive analytics are target to learn from historical data that recovers patterns that allow permitting the contrast between normal and fraudulent behavior. These analytics can be applied to detect fraud as well as to estimate the amount of fraud. 
[13]

C. Social Network Analytics

Social network analytics aims to extend detection capability by identifying fraudulent behavior in a network of linked entities. It also finds the relationships between entities by revealing specific patterns that indicate fraud. [14]

D. Big Data Detection Techniques

A formal digital forensic investigation cannot be launched until the important and significant data have been extracted from the entire data set. The focus is on the techniques that help the digital forensic investigator analyze big data and find the underlying relationships among the data. These techniques also help the investigator extract meaningful digital forensic evidence for detecting fraud in large datasets. [15]

IV. CRIME PREDICTION THEORY

The incidents of various types of cybercrime in the different states of Egypt for the year 2019 were taken as the input for the analysis. The data also contained the number of persons arrested in each age group (below 18, 18 to 30, 30 to 45, 45 to 60, and above 60) in the different states and union territories. The types of cybercrime and their descriptions are given in Table I. Feature selection techniques were used to determine the important features, such as the prominent categories of cybercrime and the age groups of the people most involved in these crimes. The chosen features were normalized using the population attribute of each state in Egypt.

TABLE I. CYBERCRIME TYPE

Crime Type | Description
-----------|------------
Manipulation of computer source documents | Knowingly or intentionally concealing, destroying, or altering, or causing another to conceal, destroy, or alter, any computer source code used for a computer, computer program, or computer network.
Hacking computing systems | Finding weaknesses in a computer or computer network and misusing and exploiting them. Type 1: loss of or damage to a computer source or utility. Type 2: hacking.
Distribution of indecent content (transmission in electronic form) | Transmitting indecent content through the Internet, e-mail, and cell phones (SMS).
Compliance failure (orders of the certifying authority) | Failure of the licensee in issuing a digital signature.
Unauthorized access (attempt to access protected computing systems) | Access, without legal permission, to computer software programs or software sources that have security vulnerabilities.
Obtaining a license or digital signature certificate by deception | An individual obtains, or attempts to obtain or maintain, a license by willful misrepresentation or fraudulent representation.
Publishing a false digital signature certificate | A digital signature authenticates the identity of a person. Publishing false signatures is similar to the crime of impersonation.
False digital signature certificate |
Breach of confidentiality and privacy | A breach of confidentiality is a security violation in which the confidentiality of some data is lost.

A. Cybercrime spatial data analysis

The number of persons arrested in each age group was the input for our analysis. The cybercrime data are distributed across 28 states in Egypt and were normalized using the population of each state and union territory. The feature selection technique was applied to determine the contribution of the chosen features to cybercrime activity, and the attributes with higher ranks were considered further in our analysis. The attributes and their F-scores are given in Table II. Three of the nine types of cybercrime and the age groups 18-30 and 30-45 occupied the higher ranks among the attributes considered, so they were carried forward for the prediction of relevant patterns. [16]

TABLE II. FEATURE

Attribute | F-Score
----------|--------
Hacking | 0.7532
Obscene publication | 0.2412
Failure of compliance | 0.5201
Age (18-30) | 0.7421
Age (30-45) | 0.6102

X-means clustering applied to the chosen features resulted in three clusters. The K-means clustering algorithm was then applied with k = 3 to find the cluster patterns. [17] The geospatial distribution of cybercrimes in Egypt is shown in Figure 2: areas marked in red are regions where the occurrence of cybercrime incidents is high, yellow marks regions with nominal cybercrime occurrence, and blue marks areas with very few incidents. Visualizing and analyzing the crime patterns yields meaningful inferences.

Figure 2. Cybercrimes in Egypt

From Table III, we observe that the majority of the Egyptian territories fall into the third cluster, where the crime pattern shows no occurrence of cybercrime. Cluster 2 represents regions with high cybercrime occurrence, where the people involved are in the age group 35 to 45. Cluster 1 represents regions with average crime occurrence, where the people involved include the age group 18 to 30.

TABLE III. MAJORITY OF THE EGYPT TERRITORIES

Cluster | Intensity of crime: Crime 1 (Hacking) | Intensity of crime: Crime 2 (Obscene publication) | Intensity of crime: Crime 3 (Failure of compliance) | Arrested persons: Age 18 to 35 | Arrested persons: Age 35 to 45
--------|---------|---------|---------|---------|---------
1 | Average | None | Average | Average | None
2 | Average | High | High | Very low | High
3 | None | None | None | None | None

V. PROPOSED MODEL FOR CYBER-CRIME PREDICTION

A. Collection of cybercrime dataset

A variety of cybercrime data should be collected so that the cybercrime class in the banking sector can be predicted by analyzing cybercrime patterns.
This data has to be collected from news feeds, articles, blogs, and police department websites on the internet. The collected cybercrime data is stored in a cybercrime database for further handling. [18]

B. Pre-processing of cybercrime dataset

The cybercrime dataset stored in the cybercrime database has to be pre-processed before data mining processes are applied to it, because pre-processing removes noisy data and lost or missing values.

Figure 3. Proposed Model for Cyber Crime Prediction

C. Data mining Techniques

Data mining processes and algorithms are applied to the pre-processed data to identify or forecast fraud through knowledge discovery from abnormal patterns. Data mining has gained recognition in combating cybercrime and financial fraud: it helps solve problems in the banking sector by discovering patterns, relationships, and links that are hidden in the business information accumulated in the crime databases. [19]
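As one illustration of the data mining step, the K-means clustering that Section IV applies with k = 3 can be sketched as follows. This is a minimal sketch and not the author's implementation: the function name kmeans, the sample feature vectors, and the choice of the first k points as initial centroids are assumptions made for illustration only.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's K-means on a list of equal-length numeric tuples.

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic; real runs usually use random or k-means++ seeding.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return labels, centroids


# Hypothetical population-normalized features for six regions:
# (hacking rate, failure-of-compliance rate).
regions = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0),
           (0.2, 0.1), (5.1, 4.9), (9.8, 0.3)]
labels, centroids = kmeans(regions, k=3)
print(labels)  # -> [0, 1, 2, 0, 1, 2]
```

Seeding from the first k points is chosen here only so that the result is reproducible; on real crime data the clusters can depend on the seeding, so several restarts are usually compared.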
D. Association Rule mining

Association rule mining derives rules from the frequently occurring patterns in the cybercrime dataset. The resulting rules help decision makers take preventive action. The procedure comprises the following steps:
a) Determine the frequently occurring item sets in the cybercrime database.
b) Identify patterns in program execution and user behavior as association rules, known as intrusion detection.

E. Clustering

Clustering divides a set of records or items into a number of groups. It is suggested for discovering relationships between cybercrimes and criminal characteristics that share previously unknown common features, and clustering techniques are used to discover fraud in the banking sector. Clustering is regarded as unsupervised learning because its classes are not defined in advance and the data is grouped without supervision. [20] The K-means partitioning algorithm is used to cluster the cybercrime datasets because of its simplicity and low computational complexity. First, the data items are grouped into a specified number (k) of clusters. The mean value of each cluster is computed from the objects assigned to it. An iterative relocation method then improves the partitions by moving items from one cluster to another. The iterations are repeated until convergence. [21]

F. K-Means Algorithm

G. Classification

Classification is the most frequently used data mining technique. It uses a set of pre-classified cases to build a model that can classify instances on a large scale. The classification technique relates a dependent variable to an independent variable by mapping the data points. Within the given dataset, classification determines the group to which each data instance belongs.
Classification is used to build models of unknown patterns and to assess prospects on the basis of previous decisions. Automatic credit authorization is a major procedure in the banking sector and in financial organizations. Fraud can be prevented by building a sound assessment of credit approvals using a classification model based on decision trees built with the Apache Hadoop technique.

H. Influenced Association Classification

To achieve higher accuracy, associative classification is a novel and improved method that integrates association rule mining with the classification of the prediction model. The method is used to find the links and associations among item sets. Associative classification is a form of supervised learning, since it uses the class attribute for rule extraction. The two steps used to extract association rules are: [22] [23]
a) Classes are generated from the cybercrime dataset based on the association rules.
b) The classification of the dataset is examined against the class labels.

The steps implemented in the Influenced Association Classifier are summarized below:
a) Pre-process the cybercrime dataset so that further mining can be performed on it.
b) To reflect its importance in the prediction model, assign every attribute a weight within a range: attributes of greater significance are given the maximum weight (0.9) and attributes of lesser significance the minimum weight (0.1).

The Influenced Association Rule Mining algorithm is applied to the pre-processed cybercrime dataset to discover interesting patterns. Influenced Association Classification uses weighted support and confidence, and the rules generated by this process are known as Classification Association Rules (CARs). The extracted CARs are stored in the rule base.
Whenever a new cybercrime record is added, the CARs predict its class label from the rule base.
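The weighting of attributes and the rule-base lookup described in Section H can be sketched as follows. The paper does not give formulas for weighted support and confidence, so this sketch assumes one simple definition: the weighted support of a rule is its ordinary support multiplied by the mean weight of the attributes in its antecedent. The function names, weights, thresholds, and records are all hypothetical and serve only to illustrate the idea.

```python
from itertools import combinations


def mine_cars(records, weights, min_wsupport=0.1, min_confidence=0.6, max_len=2):
    """Mine classification association rules (CARs) of the form items -> label.

    records: list of (set_of_items, class_label) pairs.
    weights: item -> weight between 0.1 and 0.9.
    Assumed definition (not taken from the paper):
      weighted support = mean antecedent weight * fraction of records
      that contain the antecedent and carry the label.
    """
    n = len(records)
    items = sorted({i for itemset, _ in records for i in itemset})
    labels = sorted({label for _, label in records})
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            ante = set(antecedent)
            covered = [label for itemset, label in records if ante <= itemset]
            if not covered:
                continue
            mean_weight = sum(weights[i] for i in antecedent) / size
            for label in labels:
                hits = covered.count(label)
                wsupport = mean_weight * hits / n
                confidence = hits / len(covered)
                if wsupport >= min_wsupport and confidence >= min_confidence:
                    rules.append((frozenset(ante), label, wsupport, confidence))
    # Rule base ordered by confidence, then by weighted support.
    rules.sort(key=lambda r: (r[3], r[2]), reverse=True)
    return rules


def predict(rule_base, itemset, default=None):
    """Return the label of the strongest rule whose antecedent matches."""
    for antecedent, label, _, _ in rule_base:
        if antecedent <= itemset:
            return label
    return default


# Hypothetical records: attribute items and a crime-intensity class label.
records = [
    ({"hacking", "age_18_30"}, "high"),
    ({"hacking", "age_18_30"}, "high"),
    ({"hacking", "age_30_45"}, "high"),
    ({"obscene_publication", "age_30_45"}, "low"),
    ({"obscene_publication", "age_18_30"}, "low"),
]
weights = {"hacking": 0.9, "obscene_publication": 0.1,
           "age_18_30": 0.5, "age_30_45": 0.5}

rule_base = mine_cars(records, weights)
print(predict(rule_base, {"hacking", "age_30_45"}))  # -> high
```

With these hypothetical weights, rules built only on the low-weight attribute fall below the weighted-support threshold and never enter the rule base, which is the effect the weighting is meant to have.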
I. Cyber Crime Prediction using Apache Hadoop

For classification problems in cybercrime prediction analysis, the Apache Hadoop technique is more precise. Its two steps are: [24]
a) Formation of the tree.
b) Validation of the built tree over the cybercrime dataset.

The Apache Hadoop technique uses a pruning method in the construction of the tree. Pruning reduces the size of the tree by removing the parts that lead to poor prediction performance. The proposed Apache Hadoop technique classifies the data until the categorization is complete and gives the highest accuracy on the cybercrime training data. It also balances precision and flexibility. The Apache Hadoop technique is an extended version of the C4.5 decision tree. It produces the classifier output in the form of rule sets and decision trees. The rule sets are easy to understand and easy to use within an application. [18]

J. Experimental Settings

The K-Mean cluster consists of six data nodes, which act as slaves only, and one name node, which acts as both slave and master in our system. The details of these nodes are listed in Table IV. In addition, we set the number of replicas to 6, since there are 6 nodes in total in this cluster.

TABLE IV. K-MEAN CLUSTER COMPOSITION

Node | Instance Type | CPU | Memory | Storage | Private IP
-----|---------------|-----|--------|---------|-----------
Node1 | M1 Medium | Core i7 | 8 GB | 500 GB | 10.1.1.2
Node2 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.3
Node3 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.4
Node4 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.5
Node5 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.6
Node6 | M1 Small | Core i5 | 4 GB | 200 GB | 10.1.1.7

We used the same configuration as (Dili WM, 2013).

VI. PERFORMANCE ANALYSIS

This section monitors and evaluates Apache Hadoop performance in three cases:
• Without using Apache Hadoop.
• At the beginning of using Apache Hadoop.
• After a certain period (one month) of using Apache Hadoop.

The following parameters are considered in the evaluation process:
• Requests returned from the Apache Storage.
• Requests returned from Apache Storage without verification.
• Requests returned from the Apache Storage, updating a file in cache.
• Requests returned from Apache Storage after verifying that they have not changed.

A. Performance Metrics

The main categories of performance metrics are:
a) Apache Storage Performance: how requested Web objects were returned, from the Storage or from the network.
b) Traffic: the amount of network traffic, by date, sent through Apache Hadoop, including both Web and non-Web traffic.
c) Daily traffic: average network traffic through Apache Hadoop at various times during the day, including both Web and non-Web traffic.
d) Response Time: how Apache Hadoop responded to HTTP requests during the reporting period.
e) Communication failures: the failures Apache Hadoop encountered when communicating with other computers during the reporting period.
f) Dropped Packets: the number of dropped network packets during the reporting period. Users that had the most dropped packets are listed first.
g) Queue Length: the Queue Length counter shows how many threads are ready in the processor queue but not currently able to use the processor. (Spark Streaming Programming Guide)

B. Types of Requests

We want to know the file types that occur most often in the application server. Knowing the characteristics of the log files by file type gives some indication of whether a document will change or not.

TABLE V. TYPE OF REQUEST

C. Apache Storage Performance

The Storage performance results for each of the log files are shown below. The percentage of requests returned from storage without verification is high: about 38% of all requests are returned from Apache Hadoop without verification, which is consistent with previously published results reporting that only 15% to 32% of Apache log entries were requests returned from Storage without verification. Unknown objects returned from the Apache Storage were also detected.

TABLE VI. STORAGE PERFORMANCE RESULTS

Status | Requests | % of Total Requests | Total Bytes
-------|----------|---------------------|------------
Objects returned from Apache Storage | 21251 | 59.30 % | 622.73 MB
Objects returned from Apache Storage without verification | 14110 | 38.20 % | 29.93 MB
Objects returned from Storage after verifying that they have not changed | 587 | 1.40 % | 0.98 MB
Information not available | 337 | 1.00 % | 47.94 KB
Unknown objects returned from the Apache Storage | 61 | 0.20 % | 15.71 KB
Total | 327251 | 100.00 % | 675.52 MB

D. Traffic

The results for average network traffic through Apache Hadoop at various times during the day, at the beginning of using Apache Hadoop and after a certain time of using it, are in the table below. The results indicate that the average processing time for handling a request is reduced by 43% after a certain time of using Apache Hadoop, because Apache Hadoop caches the previously visited pages and returns them directly to the client without spending time asking the Storage server each time.

Beginning: Requests | Beginning: Average Processing Time | Beginning: Total Bytes | Beginning: Cache Hit Ratio | After: Requests | After: Average Processing Time | After: Total Bytes | After: Cache Hit Ratio
------|------------|---------|--------|------|-----------|----------|-------
1811 | 141.00 sec | 4.52 GB | 0.00 % | 6186 | 52.80 sec | 2.87 GB | 1.00 %
1617 | 131.40 sec | 8.95 MB | 0.00 % | 6246 | 59.31 sec | 35.89 MB | 0.00 %
1535 | 122.10 sec | 8.23 MB | 0.00 % | 5844 | 61.29 sec | 35.27 MB | 0.00 %
1816 | 125.20 sec | 8.71 MB | 0.00 % | 6103 | 57.00 sec | 34.19 MB | 0.00 %

("Beginning" = at the beginning of using Apache Hadoop; "After" = after a certain time of using Apache Hadoop.)

a) Traffic by Time of day

The following table summarizes average network traffic through Apache Hadoop at various times during the day.

TABLE VII. TRAFFIC BY TIME OF DAY

Time Interval | Average Requests Per Second | Average Bytes Per Second | Average Response Time for Apache Requests | Average Response Time for Non-Apache Requests
------|------|----------|----------|----------
00:00 | 14.4 | 66.48 KB | - | 54.20 sec
00:15 | 18.1 | 12.14 MB | - | 57.80 sec
00:30 | 15.8 | 11.47 MB | 0.00 sec | -
00:45 | 16.1 | 82.44 KB | Unknown | 66.10 sec
01:00 | 15.1 | 64.19 KB | 0.00 sec | -
01:15 | 17.2 | 68.21 KB | 0.00 sec | -
01:30 | 15.5 | 91.32 KB | 0.00 sec | -

b) Dropped Packets

The result below shows the users who had the highest number of dropped network packets during the reporting period. Users that had the most dropped packets are listed first. The percentage of dropped packets decreases over time. Two unknown users were also detected using IPs outside the network range (172.31.0.1 and 172.31.0.2).

TABLE VIII. DROPPED PACKETS
TABLE IX. DROPPED PACKETS AFTER USING APACHE

User | Beginning: Dropped Packets | Beginning: % of Total Dropped Packets | After: Dropped Packets | After: % of Total Dropped Packets
-----------|-------|---------|---------|--------
10.1.1.13 | 35887 | 23.30% | 682 | 11.80%
10.1.1.14 | 34871 | 22.50% | 662 | 11.10%
172.31.0.2 | 32817 | 21.50% | Unknown | Unknown
172.31.0.1 | 30618 | 20.20% | Unknown | Unknown
10.1.1.12 | 4832 | 3.00% | ---- | 13.30%
10.1.1.20 | 2310 | 1.50% | 223 | 4.10%
10.1.1.23 | 2301 | 1.50% | 221 | 3.70%

An algorithm is used to detect the two unknown, previously unseen network IPs. The technique detects not only the known network IPs but also the unknown objects returned from Apache storage. It is a two-step process. In the first step, features are extracted from the known datasets; this step plays a vital role, both in representing the target concept and in speeding up the learning and classification/detection processes. In the second step, appropriate machine learning techniques are selected and trained to detect and classify abnormal behavior.
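The observation that two users fall outside the network range can be checked mechanically. The sketch below uses Python's standard ipaddress module. The 10.1.1.0/24 range is an assumption inferred from the node and user addresses listed in the tables, and the function name out_of_range_users is hypothetical; it is not the detection algorithm the paper refers to.

```python
import ipaddress


def out_of_range_users(user_ips, network="10.1.1.0/24"):
    """Return the user IPs that do not belong to the expected network.

    Assumption: the cluster and its clients live in 10.1.1.0/24, inferred
    from the addresses in the tables; adjust this to the real range.
    """
    net = ipaddress.ip_network(network)
    return [ip for ip in user_ips if ipaddress.ip_address(ip) not in net]


# User addresses as listed in the dropped-packets table.
users = ["10.1.1.13", "10.1.1.14", "172.31.0.2", "172.31.0.1",
         "10.1.1.12", "10.1.1.20", "10.1.1.23"]
print(out_of_range_users(users))  # -> ['172.31.0.2', '172.31.0.1']
```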
VII. VALIDATION AND VERIFICATION

The big data analysis required more visual inspection and manual execution than the other components. We tested the solutions described above through several activities and measured the output of these activities.

a) Tools Used
Apache Hadoop logs: the Hadoop log was used to determine the amount and size of the data sent by clients. It was also used to generate the emulation of client requests.

b) SCOM Reports
System Center Operations Manager (SCOM) is a storage log analysis tool that lets the user pull various pieces of data from its log files, such as the number of requests, bytes transferred, and hosts contacted. We used it to reveal some characteristics of the log files used in the performance analysis.

VIII. CONCLUSION

The proposed work focuses on cybercrime prediction by crime mapping with recorded data using the latest technology. The model helps the security authorities reduce cybercrime and improves network performance. Many network anomaly detection techniques are designed around the availability of data instances. Many anomaly detection techniques have been developed specifically for certain application domains, while others are more generic. This paper presents a cascaded algorithm that uses the K-means algorithm for big data anomaly detection. The proposed algorithm is used to detect the anomalies present in supervised and unsupervised data sets. The model also helps the authorities in the investigation of crimes. Using big data analytics with the clustering approach reduces the investigation time and helps in retrieving hidden information.

REFERENCES

[1] Cameron S. D. Brown, "Investigating and Prosecuting Cyber Crime: Forensic Dependencies and Barriers to Justice," International Journal of Cyber Criminology, 2015.
[2] Spalevic Z., "Cyber Security as a Global Challenge Today," Singidunum Journal of Applied Sciences, 2014.
[3] Najafabadi M., Villanustre F., Khoshgoftaar T., Seliya N., Wald R., and Muharemagic E., "Deep learning applications and challenges in big data analytics," Journal of Big Data, 2015.
[4] Gupta P., Tyagi N., "An Approach towards Big Data - A Review," International Conference on Computing, Communication and Automation (IEEE), 2015.
[5] Tahir S., Waseem I., "Big Data - An Evolving Concern for Forensic Investigators," IEEE Transactions, 2015.
[6] Chen X., Lin X., "Big Data Deep Learning: Challenges and Perspectives," IEEE Access, vol. 2, 2014.
[7] Magoulas R., Lorica B., "Introduction to Big Data," Release 2.0, issue 11, Feb. 2009.
[8] M. I. Pramanik, Raymond Y. K. Lau, Wei T. Yue, Yunming Ye, and Chunping Li, "Big data analytics for security and criminal investigations," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 7, no. 4, pp. 1-19, 2017.
[9] Chung-Hsien Yu, Max W. Ward, Melissa Morabito, Wei Ding, "Crime forecasting using data mining techniques," International Conference on Data Mining Workshops, IEEE, 2011.
[10] Manjeet Rege and Raymond Blanch K. Mbah, "Machine Learning for Cyber Defense and Attack," DATA ANALYTICS 2018: The Seventh International Conference on Data Analytics, IARIA, 2018, ISBN 978-1-61208-681-1, pp. 73-78.
[11] Tariq M., Uzma A., "Security Analytics: Big Data Analytics for Cyber security - A Review of Trends, Techniques and Tools," 2nd National Conference on Information Assurance (NCIA), 2013.
[12] Palak G., Nidhi T., "An Approach towards Big Data - A Review," International Conference on Computing, Communication and Automation (IEEE), 2015.
[13] Giri T., Anjan G., "A Survey on Data Science Technologies & Big Data Analytics," International Journal of Advanced Research in Computer Science and Software Engineering, vol. 6, issue 2, Feb. 2016.
[14] Dean J., Ghemawat S., "MapReduce: Simplified data processing on large clusters," Communications of the ACM, vol. 51, pp. 107-113, 2008.
[15] Siddaraju, Sowmya C., Rashmi K., Rahul M., "Efficient Analysis of Big Data Using Map Reduce Framework," International Journal of Recent Development in Engineering and Technology, vol. 2, 2014.
[16] Aksoy S., "K-Nearest Neighbor Classifier and Distance Functions," Technical Report, Department of Computer Engineering, Bilkent University, February 2008.
[17] A. Reyes, R. Brittson, K. O'Shea, and J. Steele, Cyber Crime Investigations: Bridging the Gaps Between Security Professionals, Law Enforcement, and Prosecutors. Elsevier Science, 2011.
[18] D. Quick and K.-K. R. Choo, "Impacts of increasing volume of digital forensic data: A survey and future research challenges," Digital Investigation, vol. 11, no. 4, pp. 273-294, 2014.
[19] R. Rowlingson, "A ten step process for forensic readiness," International Journal of Digital Evidence, vol. 2, no. 3, pp. 1-28, 2004.
[20] A. Guarino, "Digital forensics as a big data challenge," in ISSE 2013 Securing Electronic Business Processes. Springer, 2013, pp. 197-203.
[21] P. Dhaka and R. Johari, "Crib: Cyber crime investigation, data archival and analysis using big data tool," in 2016 International Conference on Computing, Communication and Automation (ICCCA), April 2016, pp. 117-121.
[22] H. Van Beek, E. van Eijk, R. van Baar, M. Ugen, J. Bodde, and A. Siemelink, "Digital forensics as a service: Game on," Digital Investigation, vol. 15, pp. 20-38, 2015.
[23] Alessandro G., "Digital Forensic as a Big Data Challenge," ISSE Securing Electronic Business Processes, 2013.
[24] Katarina G., Michael H., Wilson A. Higashino, A., David S. Allison, and Miriam A. Capretz M., "Challenges for MapReduce in Big Data," Proc. of the 10th 2014 World Congress on Services, 2014.
AUTHORS PROFILE

Hossam Abdel Rahman Mohamed holds a doctoral degree in computer science from Cairo University, Computer and Information Technology Dept. He is currently IT Director at Bek Group.

International Journal of Computer Science and Information Security (IJCSIS), Vol. 18, No. 6, June 2020. https://sites.google.com/site/ijcsis/ ISSN 1947-5500