International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 7 Issue 3, May-June 2023 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
@ IJTSRD | Unique Paper ID – IJTSRD57439 | Volume – 7 | Issue – 3 | May-June 2023 Page 694
A Survey of Methods for Spotting Spammers on Twitter
Hareesha Devi, Pankaj Verma, Ankit Dhiman
Department of Computer Science, Arni University Kathgarh, Indora, Himachal Pradesh, India
ABSTRACT
The explosive expansion of social networking sites as a
means of information sharing, management,
communication, and storage has attracted hackers who
abuse the Web to exploit security flaws for their own
nefarious ends. Every day, fake internet accounts are
created and genuine ones are compromised. Online social
networks (OSNs) are rife with impersonators, phishers,
scammers, and spammers who are difficult to spot.
Spammers are users who send unsolicited communications
to a large audience with the objective of advertising a
product, enticing victims to click on harmful links, or
infecting users' systems purely for financial gain. Many
studies have been conducted to identify spam profiles in
OSNs. In this paper, we discuss the methods currently in
use to identify spam Twitter users. Spammers may be
identified using user-based features, content-based
features, or a combination of both. The paper summarises
the traits, methodologies, detection rates, and limitations
(if any) of existing work on identifying spam profiles,
primarily on Twitter.
KEYWORDS: Twitter, legitimate users, online social networks (OSNs),
spammers
How to cite this paper: Hareesha Devi | Pankaj Verma | Ankit Dhiman,
"A Survey of Methods for Spotting Spammers on Twitter", Published in
International Journal of Trend in Scientific Research and Development
(ijtsrd), ISSN: 2456-6470, Volume-7 | Issue-3, June 2023, pp.694-702,
URL: www.ijtsrd.com/papers/ijtsrd57439.pdf
Copyright © 2023 by author(s) and International Journal of Trend in
Scientific Research and Development Journal. This is an Open Access
article distributed under the terms of the Creative Commons Attribution
License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
INTRODUCTION
A social networking site, according to Boyd et al.
[5], enables users to (a) create a profile, (b)
befriend a list of other users, and (c) examine and
navigate their own and other users' buddy lists.
Through the use of Web 2.0 technology, these
online social networks (OSNs) enable user
interaction. These social networking sites are
expanding quickly and altering how individuals
communicate with one another. These websites
have transformed in less than 8 years from a
specialised area of online activity to a phenomenon
that attracts millions of internet users. Online
communities bring people with similar interests
together, making it simpler for them to stay in
touch with one another. Sixdegrees.com, the first
social networking site, launched in 1997, and
makeoutclub.com followed in 2000.
Sixdegrees.com had a short lifespan and quickly
faded, while newer websites like MySpace,
LinkedIn, Bebo, Orkut, Twitter, etc. found success.
Facebook, a very well-known website, was
introduced in 2004 [5] and rapidly
rose to fame throughout the globe. OSNs' greater
user numbers make them more appealing targets
for spammers and malevolent users. On social
media websites, spam can take many forms and is
difficult to identify. Anyone who has used the
Internet has encountered spam of some kind,
whether it is in emails, forums, newsgroups, etc.
Spam [18] is defined as the practice of sending
unsolicited bulk messages over electronic
messaging systems. OSNs have grown in
popularity and are now used as a platform for spam
distribution. Spammers want to send product ads to
users who are not connected to them. Some
spammers post URLs that lead to phishing
websites where users' sensitive information is
stolen. The detection of spam profiles in OSNs has
been the subject of numerous papers. However, no
review paper that consolidates the available
research has yet been published in this area. The
purpose of our paper is to examine the academic
research and work that have been done in this area
by various scholars and to highlight potential
directions for future research. The methods for
identifying spammers on Twitter have been
researched, compared, and presented in this study.
The paper is organised as follows: the approach
used to conduct this review is described in Section
2, which is followed by a
briefing on security issues in OSNs in Section 3.
Spammers are defined in Section 4 along with their
motivations; the introduction to Twitter and its
risks is given in Section 5; the purpose of this
survey study is covered in Section 6; the properties
that can be used for detection purposes are covered
in Section 7; A comparative examination of the
research produced by various researchers is
reviewed in Section 8; new researchers are given
research recommendations in Section 9; and the
review is concluded in Section 10.
METHODOLOGY
To survey the current methods for detecting spam
profiles in OSNs, we conducted a systematic
review using a principled approach, searching the
major computer science research databases - IEEE
Xplore, ACM Digital Library, Springer Link,
Google Scholar, and Science Direct - for relevant
topics. We concentrated exclusively on studies
published after 2009: social networks were not
conceptualised until 1997 [1] and only gained
widespread acceptance afterwards, with Facebook,
introduced in 2004 [1], quickly gaining popularity.
It took some time for people to become
accustomed to using these networks for
communication, and only then did the networks
become attractive targets for attack. Over 60
papers were found by searching the five major
databases mentioned above. After reviewing all of
the paper titles and abstracts, only papers deemed
appropriate for the current investigation were
selected. In total, 21 papers were chosen for
evaluation after publications whose titles and
abstracts related to spam message detection and
other unrelated areas were eliminated. The papers
have been categorised mainly by the criteria they
use to identify spammers.
Through this survey, we attempt to assemble a list
of the social networking papers we have read on
identifying spam accounts on Twitter. The list is
probably not comprehensive, but it lends shape to
the ongoing study of identifying social network
spammers. After reading this survey, new
researchers will find it simple to assess what
research has been done, when, and how the current
work may be extended to improve spam detection.
Wherever appropriate, we include details on the
methodology used, the dataset used, the features
for spammer detection, and the efficacy of the
strategies employed by different authors.
The papers discuss, in particular, the ramifications
of spammers' interactions with members of social
networks as well as current methods for identifying
them.
SECURITY ISSUES IN OSNs
Online social networking sites (OSNs) are
susceptible to security and privacy problems due to
the volume of user data that these sites process
daily. Social networking site users are vulnerable
to a range of attacks:
1. Viruses: spammers utilise social networks as a
distribution channel [19] for dangerous files that
infect users' systems.
2. Phishing attacks: by pretending to be a reliable
third party, attackers obtain users' sensitive
information [30].
3. Spam: users of social networks are bombarded
with spam messages by spammers [11].
4. Sybil (fake) attacks: an attacker creates a number
of false identities and poses as a real user in the
system to undermine the reputation of trustworthy
users in the network [20].
5. Social bots: a group of fictitious personas made
to capture user information [32].
6. Cloning and identity-theft attacks: perpetrators
reconstruct the profile of an already-existing user
on the same network or across many networks in
an effort to deceive the cloned user's friends [23].
Attackers gain access to victims' information if the
friends accept the requests sent by these cloned
identities.
These attacks place a heavy burden on both users
and systems.
TYPES OF SPAMMERS
The fraudulent users known as spammers put
social networks' security and users' privacy at risk
by tainting the data shared by legitimate users.
Spammers typically fall into one of the following
categories [22]:
1. Phishers: people who act normally but are
actually out to steal the personal information of
other real users.
2. Fake Users: users who spoof real users' profiles
in order to distribute spam to their friends or other
network users.
3. Promoters: people who spread harmful links in
advertisements or other promotional materials in
an effort to collect victims' personal data.
Spammers' motivations include promoting
pornography, spreading malware, launching
phishing attacks, and harming the reputation of the
system.
TWITTER AS AN OSN
Introduction
Twitter is a social networking website with 500
million active users [14] as of today who share
information. It was first launched on March 21,
2006 [14]. Twitter's logo is a chirping bird, hence
the name of the website. Users exchange short
posts called "tweets", messages of up to 140
characters that anyone can send or read. These
tweets are
public by default and visible to all those who are
following the tweeter. Users share these tweets
which may contain news, opinions, photos, videos,
links, and messages. Following is the standard
terminology used in Twitter and relevant to our
work:
Tweets [3]: A Twitter message that is no longer
than 140 characters.
Followers and Followings [3]: Followers are the
users who follow a specific user, while followings
are the users that the user follows.
Retweet [3]: A tweet that has been forwarded to a
user's entire following.
Hashtags [3]: The # sign is used to annotate
keywords or subjects in a tweet so that search
engines may quickly find them.
Mention [3]: You can include replies and mentions
of other users in tweets by using the @ sign in
front of their usernames.
Lists [3]: Twitter offers a tool for grouping the
persons you follow into lists.
Direct Message [3]: Also known as a DM, this
refers to Twitter's mechanism for direct messaging
users to communicate privately.
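As a rough illustration of this terminology, the entities in a tweet's text (hashtags, @mentions, URLs) can be pulled out with simple regular expressions. The following Python sketch is ours; Twitter's own entity-parsing rules are more involved (Unicode word boundaries, t.co link wrapping, etc.):

```python
import re

def parse_tweet(text):
    """Extract hashtags, @mentions, and URLs from a tweet's text.

    Simplified illustration only; real Twitter entity parsing
    is considerably stricter.
    """
    hashtags = re.findall(r"#(\w+)", text)
    mentions = re.findall(r"@(\w+)", text)
    urls = re.findall(r"https?://\S+|www\.\S+", text)
    return {"hashtags": hashtags, "mentions": mentions, "urls": urls}

tweet = "RT @alice check this out http://example.com #deal #free"
print(parse_tweet(tweet))
```

Counts of exactly these entities reappear later as content-based detection features.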
According to Twitter policy [16], signs of spam
profiles include behaviours like following a lot of
users quickly, posting mostly links, using popular
hashtags (#) when posting unrelated information,
and repeatedly posting other users' tweets as one's
own. By tweeting to @spam, users have the option
to report spammy profiles to Twitter. However, the
Twitter policy [16] does not make clear whether
moderators use user reports or automated
processes to look for these behaviours, though it is
assumed that both approaches are used.
Threats on Twitter
1. Spammed tweets [13]: Twitter only allows
tweets of up to 140 characters, but despite this
restriction, cybercriminals have found a way to
make the most of it by crafting succinct but
compelling tweets that include links to free
vouchers, job postings, and other promotions.
2. Malware downloads [13]: cybercriminals have
used Twitter to share tweets with links to websites
where malware can be downloaded. Twitter worms
have transmitted direct messages and even
malware attacking both Windows and Mac
operating systems, including FAKEAV and
backdoor [13] programmes. KOOBFACE [13], a
piece of social-media malware that attacked both
Facebook and Twitter, has the worst reputation.
3. Twitter bots [13]: Online criminals frequently
utilise Twitter to run and manage botnets.
These botnets threaten the security and privacy
of the users by controlling their accounts.
Social Implications of OSNs
In addition to the typical issues that social
networking sites bring for users, such as spamming,
phishing assaults, malware infestations, social bots,
viruses, etc., the biggest challenge is maintaining
the security and confidentiality of private data.
Twitter policy states that every account can follow
up to 2,000 users; beyond that threshold, the
number of additional accounts a user may follow is
constrained by the number of followers the account
has.
Social networking websites are created with the
intention of making information readily available
and accessible to others. But tragically,
cybercriminals exploit this information, which is
readily accessible, to launch focused assaults.
Attackers can easily find a means to gain access to
a user's account in order to gather more information
and use that information to gain access to the user's
other accounts and the accounts of their friends.
MOTIVATING REVIEW
Social networks have been a target for spammers
due to the simplicity of information sharing and the
ability to stay up to date with current subjects. It
can be challenging to identify such fraudulent
individuals in OSNs because spammers are well-
aware of the methods available to identify them.
To make money, spammers can utilise OSNs as an
ideal platform to pose as legitimate users and
attempt to convince innocent users to click on
harmful posts. The most crucial
area being researched by numerous experts is how
to identify such people in order to safeguard the
network and protect users' private information. In
order to quickly evaluate the work that has been
done in this field, researchers will find this paper to
be of great assistance.
FEATURES DISTINGUISHING SPAMMERS
& NON-SPAMMERS IN TWITTER
The papers analysed in this study are shown in
Table 1, along with the type of features that were
utilised to identify spam Twitter profiles. Spam and
non-spam profiles can be distinguished by either
user-based or content-based characteristics. In any
social network, user-based features are the
characteristics of the user's profile and behaviour,
whereas content-based features are the
characteristics of the text that users publish.
Table 1 Features for the detection of spam profiles
Attributes used for detection of spam profiles:
1. User-based features: demographic and behavioural attributes such as profile information, follower and
following counts, followers-to-followings ratio, reputation, account age, average time between tweets,
posting habits, idle hours, tweet frequency, etc. [33,12,34,3,26]
2. Content-based features: the number of hashtags (#), the number of URLs in tweets, @mentions,
retweets, spam words, HTTP links, trending topics, duplicate tweets, etc. [33,7,11,25]
3. Both user-based and content-based features [1,22,24,27,29,2,4]
4. Other features, such as graph connectedness or distance: graph-based features, neighbour-based
features, interaction-based features, social links, social activities, and the Markov clustering method
[21,9,28,33,23,6]
Function of the aforementioned features in identifying spam profiles in accordance with Twitter rules [16]:
1. Number of followers: spammers have fewer followers.
2. Number of followings: spammers frequently follow a lot of users.
3. Followers/followings ratio: spammers have a ratio of less than 1.
4. Reputation: the ratio of followers to the sum of followers and followings. Spammers have low
reputation.
5. Account age: calculated from the current date and the account's creation date. Since spammers
typically create fresh accounts, their accounts tend to be young.
6. Average time between posts: in order to attract attention, spammers send out many tweets in quick
succession.
7. Posting-time behaviour: spammers frequently post at predetermined times, such as early in the
morning or late at night when real users aren't using social media.
8. Idle hours: spammers keep sending messages to cut down on their idle time.
9. Tweet frequency: to attract other users' attention, spammers tweet more frequently and at unusual
hours.
10. Number of hashtags (#): spammers post numerous unrelated updates tagged with the most popular
topics on Twitter to entice genuine users to read their tweets.
11. URLs: spammers frequently tweet a large number of URLs to dangerous websites.
12. @mentions: in order to avoid being found, spammers use as many @usernames of unknown
individuals as possible in their tweets.
13. Retweets: reposts of a tweet that contain the "RT @" marker; spammers frequently use it in their
tweets.
14. Spam words: the majority of spammers' tweets contain spam words.
15. HTTP links: tweets made by spammers contain many www or http:// links.
16. Duplicate tweets: spammers frequently use many @usernames in their tweets to post identical tweets.
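Several of the user-based features above (rules 1-5) are simple ratios that can be computed directly from a profile's counts. The following Python sketch illustrates this; the field names are ours, not the Twitter API's exact schema:

```python
from datetime import date

def user_features(followers, followings, created_on, today=None):
    """Compute a few of the user-based features listed above.

    `followers`/`followings` are counts from the user's profile;
    `created_on` is the account's creation date.
    """
    today = today or date.today()
    # Rule 3: spammers tend to have a followers/followings ratio < 1.
    ratio = followers / followings if followings else 0.0
    # Rule 4: reputation = followers / (followers + followings); low for spammers.
    total = followers + followings
    reputation = followers / total if total else 0.0
    # Rule 5: spam accounts tend to be young.
    account_age_days = (today - created_on).days
    return {"ff_ratio": ratio,
            "reputation": reputation,
            "account_age_days": account_age_days}

# A profile following many users but followed by few scores a low
# reputation, consistent with rules 1-4 above.
print(user_features(followers=30, followings=2000,
                    created_on=date(2023, 1, 1), today=date(2023, 6, 1)))
```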
EXISTING METHODS FOR DETECTION OF SPAM PROFILES IN TWITTER
Researchers have employed a variety of strategies to identify the spam profiles in distinct OSNs. As
Twitter is used to discuss and disseminate information about trending topics in real time rather than just as
a social communication platform, we are concentrating primarily on the work that has been done to identify
spammers on Twitter. Table 2 summarises the reviewed papers on the identification of spammers
on Twitter.
In 2010, Alex Hai Wang [1] made significant progress in the area of spam profile detection using both
user- and content-based features. To find suspicious Twitter users, a prototype spam detection system has
been presented. To investigate the "follower" and "friend" relationships, a directed social graph model has
been put forth. Using a Bayesian classification technique, content-based characteristics and user-based
features have been employed to make spam detection easier in accordance with Twitter's spam policy. The
performance of numerous traditional classification techniques, including Decision Trees, Support Vector
Machines (SVM), Naive Bayesian, and Neural Networks, has been compared using standard evaluation
measures, and among all of them, the Bayesian classifier has been found to perform the best. The algorithm
attained a 93.5% accuracy and an 89% precision across the 2,000 users in the crawling dataset and the 500
users in the test dataset. This method's limitation is that it was only evaluated on a very small dataset of 500
individuals by taking into account their 20 most recent tweets.
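As an illustration of the kind of Bayesian classification Wang applied (a hand-rolled sketch of the general technique, not the paper's actual implementation or feature set), a minimal Naive Bayes over binary spam indicators might look like:

```python
import math

class BernoulliNB:
    """A tiny Bernoulli Naive Bayes with Laplace smoothing."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n = len(X[0])
        self.prior = {c: y.count(c) / len(y) for c in self.classes}
        self.p = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            # P(feature_j = 1 | class c), Laplace-smoothed
            self.p[c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                         for j in range(n)]
        return self

    def predict(self, x):
        def log_post(c):
            s = math.log(self.prior[c])
            for j, v in enumerate(x):
                pj = self.p[c][j]
                s += math.log(pj if v else 1 - pj)
            return s
        return max(self.classes, key=log_post)

# Toy binary features (our own): [many_urls, trend_hijacking, low_reputation]
X = [[1, 1, 1], [1, 0, 1], [0, 0, 0], [0, 1, 0]]
y = ["spam", "spam", "ham", "ham"]
clf = BernoulliNB().fit(X, y)
print(clf.predict([1, 0, 1]))  # classified as "spam"
```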
When Lee et al. [22] installed social honeypots made up of real profiles, they were able to identify
suspicious users, and their bot gathered proof of spam by crawling the profile of the user who sent the
unsolicited friend requests and URLs on Twitter and MySpace. Spammers have been identified using
characteristics of profiles such as their posting habits, content, and friend information to build machine
learning classifiers. Following investigation, profiles of users who contacted these social honeypots on
Twitter and MySpace via unsolicited friend requests have been gathered. Spammers have been identified
using the LIBSVM classifier. The approach's validation on two separate dataset combinations—10%
spammers+90% non-spammers and 10% non-spammers+90% spammers, is one of its strong points. A
drawback of the approach is that only small datasets have been utilised for validation.
Based on the content of tweets and user-based attributes, Benevenuto et al. [7] identified spammers. The
following tweet content attributes are used: the quantity of hashtags per word, the quantity of URLs per
word, the quantity of words per tweet, the quantity of characters per tweet, the quantity of hashtags per
tweet, the quantity of numeric characters in the text, the quantity of users mentioned in each tweet, and the
quantity of times the tweet has been retweeted. The features that set spammers apart from non-spammers
include the percentage of tweets that contain URLs, the percentage of tweets that contain spam words, and
the average amount of words that are hashtags on the tweets. 54 million Twitter users have been crawled,
and 1065 users have been manually classified as spammers and non-spammers. Spammers and non-
spammers have been separated using a supervised machine-learning classifier, namely SVM. The
system's detection accuracy is 87.6%, with only 3.6% of non-spammers incorrectly categorised.
Sending a message to "@spam" on Twitter enables users to report spam accounts to the company. Gee et al.
[12] took advantage of this property and used a classification technique to find spam profiles. Spam and
regular user profiles have been gathered using "@spam" and the Twitter API, respectively. The collected
data was first represented in JSON before being converted to a CSV matrix, with users as rows and
features as columns. The CSV files were first trained using the Naive Bayes algorithm, with a 27% error
rate; the SVM algorithm was then used, with an error rate of 10%. Spam profile detection accuracy is
89.3%. A limitation of this approach is that the features used for detection are not very technical and
precision is also low (89.3%); the authors suggest that aggressive deployment of any such system should
be done only if precision exceeds 99%.
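The JSON-to-CSV matrix step can be sketched as follows; the record fields here are hypothetical stand-ins, not the Twitter API's exact user-object schema:

```python
import csv
import io
import json

# Hypothetical user records as an API might return them in JSON.
records_json = """[
  {"screen_name": "u1", "followers": 10, "followings": 1500, "tweets": 4000},
  {"screen_name": "u2", "followers": 300, "followings": 280, "tweets": 900}
]"""

def json_to_csv(records_json, feature_names):
    """Flatten JSON user objects into a users-by-features CSV matrix,
    mirroring the JSON-to-CSV step described by Gee et al. [12]:
    one row per user, one column per feature."""
    records = json.loads(records_json)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["screen_name"] + feature_names)
    for r in records:
        writer.writerow([r["screen_name"]] + [r[f] for f in feature_names])
    return out.getvalue()

print(json_to_csv(records_json, ["followers", "followings", "tweets"]))
```

The resulting matrix is then what a classifier such as Naive Bayes or SVM is trained on.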
McCord et al. [24] employed content-based features such as the number of URLs, replies/mentions,
retweets, and hashtags, as well as user-based features like the number of friends and followers. Spam
profiles on Twitter have been identified using classifiers including Random Forest, Support Vector
Machine (SVM), Naive Bayesian, and K-Nearest Neighbour. The Random Forest classifier, which yields
the best results ahead of the SMO, Naive Bayesian, and K-NN classifiers, has been validated on 1000
users with 95.7% precision and 95.7% accuracy. One limitation is that the reputation feature gave
incorrect results for the considered dataset, failing to distinguish between spammers and non-spammers, a
consequence of the unbalanced dataset used (Random Forest is typically chosen for unbalanced datasets).
Finally, the approach has only been validated on a small sample.
Using two distinct features, URL rate and interaction rate, Lin et al. [28] identified persistent spam
accounts on Twitter. Many different indicators, including the number of followers, number of followings,
followers-to-followings ratio, tweet content, number of hashtags, URL links, etc., have been utilised by the
majority of publications to identify spam accounts. However, according to this study, these features are
not very effective at spotting spammers; hence, only straightforward yet useful features like URL rate and
interaction rate have been employed for identification. The URL rate is the ratio of tweets containing
URLs to all tweets, while the interaction rate is the ratio of tweets that interact with other users. The
Twitter API was used to crawl 26,758 accounts, and J48 classifier analysis was performed on 816 long-
surviving accounts with an accuracy rate of 86%. The approach's limitation is that only two variables were
used to detect spam profiles; hence, if spammers maintain low URL rates and low interaction rates, the
system will not function as planned.
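Both of Lin et al.'s features reduce to simple ratios over a user's tweet history. A Python sketch, under our assumption that an @mention marks an interaction:

```python
def url_rate(tweets):
    """Fraction of a user's tweets that contain a URL (Lin et al. [28])."""
    if not tweets:
        return 0.0
    with_url = sum(1 for t in tweets
                   if "http://" in t or "https://" in t or "www." in t)
    return with_url / len(tweets)

def interaction_rate(tweets):
    """Fraction of tweets that interact with other users; approximated
    here as tweets containing an @mention."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if "@" in t) / len(tweets)

tweets = ["buy now http://spam.example",
          "free stuff www.spam.example",
          "hello @friend",
          "just thoughts"]
print(url_rate(tweets), interaction_rate(tweets))
```

As the paper's limitation notes, a spammer who keeps both ratios low slips past a detector built on these two values alone.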
There are two different kinds of spammer detection systems, according to Amit A. et al. [2]: one is URL-
centric, which relies on identifying fraudulent URLs, and the other is user-centric, which is based on
user-related features such as the followers/followings ratio. The method used in this research is a hybrid one
tweets, 15 new features have been proposed to catch spammers. Spammers' tweet campaigns and methods
have also been researched. A dataset from Twitter with 500K users and another with 110,789 individuals
were both used. Bait-oriented features, which highlight the strategies used by spammers to get victims to
click on harmful links, include mentions to non-followers, trend hijacking, and trend intersection with well-
known trends. Tweet interval variation, tweet volume variation, ratio of tweet interval variation to tweet
volume variation, and tweeting sources are examples of behavioural characteristics. Duplicate URLs,
duplicate domain names, and IP/domain ratio are examples of URL characteristics. Dissimilarity of tweet
content, similarity of tweets, and URL and tweet similarity are all examples of content entropy properties.
Follower/following ratio and the profile's description language dissimilarity are aspects of the profile. Then,
using the Weka tool, all of these features were gathered from both malicious and benign users and fed into
four supervised learning algorithms: Decision Tree, Random Forest, Bayes Network, and Decorate. With
Decorate's classifier, which produces the best results, 93.6% of spammers have been found. It has been
demonstrated that this method performs better than Twitter's spammer detection strategy. However, this
method has only been tested on 31,808 individuals, whereas Twitter is taking into account millions of users.
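One of the content-entropy ideas, tweet-to-tweet similarity, can be approximated with average pairwise word overlap. This is a rough stand-in for the paper's measure, not its actual definition:

```python
from itertools import combinations

def avg_pairwise_similarity(tweets):
    """Average pairwise Jaccard word-overlap between a user's tweets.

    Spam campaigns tend to repeat near-identical tweets, pushing
    this value up relative to organic accounts.
    """
    sets = [set(t.lower().split()) for t in tweets]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

print(avg_pairwise_similarity(["win a prize now",
                               "win a prize today",
                               "totally unrelated post"]))
```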
A technique to identify abusive users who publish offensive content, including dangerous URLs,
pornographic URLs, and phishing links, drive regular users away from social networks, and violate their
privacy has been presented by Chakraborty et al. [4]. The algorithm has two steps: the first checks a user's
profile for offensive content before a friend request is sent to another user, and the second checks the
similarity of two profiles. After these two phases, the recommendation determines whether the user
should accept the friend invitation. This has been tested with a 5000-user Twitter dataset that was
gathered using the REST API. Timing, content, and profile-based criteria are all taken into account when
distinguishing between abusive and non-abusive users. SVM, Decision Tree, Random Forest, and Naive
Bayesian classifiers have been employed. SVM outperforms all other classifiers, and the model operates
at an accuracy of 89%.
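The profile-similarity phase could be implemented, for example, as a cosine similarity between numeric profile-feature vectors; the paper's exact measure may differ, and the feature choices below are illustrative:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two numeric profile-feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Illustrative profile vectors: [url_rate, reputation, tweets_per_day]
requester = [0.9, 0.1, 120.0]
receiver = [0.05, 0.8, 3.0]
print(cosine_similarity(requester, receiver))
```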
New features were used by Yang et al. [6] to identify spammers on Twitter. A number of evasion
strategies used by spammers have been discussed. Ten new detection features have been proposed,
including three graph-based features, three neighbour-based features, three automation-based features, and
one timing-based feature. These features are difficult and expensive for spammers to evade, because
evading them would require techniques spammers do not currently use and would demand more time,
money, and resources. With the help of classifiers like Random Forest, Decision Tree, Decorate, and
Bayesian Network, 18 features (eight already existing and ten new) have been examined for detection
purposes. The Bayesian classifier's accuracy of 88.6% is the best. This method's limitations are that very
little data has been crawled and that only a specific sort of spammer is found, with a low detection rate
(few spammers are found in the dataset).
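As an example of a graph-based feature in this spirit (not necessarily one of the paper's ten), the local clustering coefficient of a user's neighbourhood is hard for a spammer to manipulate because it depends on links among the neighbours, which the spammer does not control:

```python
def local_clustering(adj, node):
    """Local clustering coefficient of `node` in an undirected graph,
    given as an adjacency dict of sets: the fraction of the node's
    neighbour pairs that are themselves connected."""
    nbrs = adj.get(node, set())
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in nbrs for b in nbrs
                if a < b and b in adj.get(a, set()))
    return 2 * links / (k * (k - 1))

adj = {
    "u": {"a", "b", "c"},
    "a": {"u", "b"},
    "b": {"u", "a"},
    "c": {"u"},
}
print(local_clustering(adj, "u"))
```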
RESEARCH DIRECTIONS
During the survey, it became pretty clear that there has been a lot of work done to identify spam profiles in
various OSNs. Even so, the detection rate can be improved by switching up the method and using more
substantial features as the determining factor. The following are a few findings from the survey:
1. Twitter has millions of active users, and this number is growing, yet almost all authors used a
relatively limited testing dataset to evaluate the effectiveness of their methodology. Therefore, in
order to evaluate the effectiveness of any strategy, the testing dataset must be expanded.
2. A multivariate model must be developed.
3. A technique that can identify all types of spammers must be developed.
4. It is necessary to test the methods using various mixtures of spammers and non-spammers.
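Direction 4 can be supported with a small helper that draws evaluation sets at chosen spammer/non-spammer ratios (a sketch in the spirit of the 10%/90% splits of Lee et al. [22]; sampling with replacement keeps it simple):

```python
import random

def make_mixture(spammers, legit, spam_fraction, size, seed=0):
    """Draw an evaluation set with a chosen spammer/non-spammer ratio."""
    rng = random.Random(seed)
    n_spam = round(size * spam_fraction)
    sample = ([rng.choice(spammers) for _ in range(n_spam)] +
              [rng.choice(legit) for _ in range(size - n_spam)])
    rng.shuffle(sample)
    return sample

# Toy labelled pools: (user_id, label) with label 1 = spammer.
spammers = [("s%d" % i, 1) for i in range(50)]
legit = [("u%d" % i, 0) for i in range(500)]
mix = make_mixture(spammers, legit, spam_fraction=0.10, size=200)
print(sum(label for _, label in mix))  # 20 spammers in a 200-user set
```

Sweeping `spam_fraction` over several values exposes how sensitive a detector's precision is to class imbalance.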
Table 2 Outline of techniques used for the detection of spammers

Author | Metrics Used | Methodology Used | Dataset Used | Results
Alex Hai Wang [1] | Graph-based and content-based | Compared Naive Bayesian, Neural Network, SVM and Decision Tree | Validated on 500 Twitter users with 20 recent tweets | Naive Bayesian giving highest accuracy: 93.5%
Lee et al. [22] | User-based | Compared Decorate, SimpleLogistic, FT, LogitBoost, RandomSubSpace, Bagging, J48, LibSVM | Validated on 1000 Twitter users | Decorate giving highest accuracy: 88.98%
Benevenuto et al. [7] | User-based and content-based | SVM | Validated on 1065 Twitter users | Accuracy: 87.6% (with user-based and content-based features) and 84.5% (with only user-based features)
Gee et al. [12] | User-based | Compared Naive Bayesian, SVM | Validated on 450 Twitter users with 200 recent tweets | Accuracy: 89.6%
McCord et al. [24] | User-based and content-based | Compared Random Forest, SVM, Naive Bayesian, K-NN | Validated on 1000 Twitter users with 100 recent tweets | Random Forest giving highest accuracy: 95.7%
Lin et al. [28] | URL rate, interaction rate | J48 | Validated on 400 Twitter users | Precision: 86%
Amit A. et al. [2] | Introduced 15 new features | Compared Random Forest, Decision Tree, Decorate, Naive Bayesian | Validated on 31,808 Twitter users | Accuracy: 93.6%
Chakraborty et al. [4] | User-based, content-based | Compared Random Forest, SVM, Naive Bayesian, Decision Tree | Trained on 5000 Twitter users with 200 recent tweets | SVM giving highest accuracy: 89%
Yang et al. [6] | 18 features (8 existing & 10 new features introduced) | Compared Random Forest, Decision Tree, Decorate, Naive Bayesian | Validated on two datasets: 5000 users and then 3500 users with 40 recent tweets | Bayesian giving highest accuracy: 88.6%

CONCLUSION
Researchers have created and employed a variety
of techniques to identify spammers on various
social networks. As can be inferred from the
publications examined, the majority of the work
has been done using classification approaches like
SVM, Decision Tree, Naive Bayesian, and Random
Forest. User-based features, content-based
features, or a combination of both have been used
for detection. A few authors additionally
introduced new detection features. Most of the
methods were tried with only a single combination
of spammers and non-spammers and were
validated on very limited datasets. In comparison
to employing solely user-based or content-based
characteristics, combining features for the
detection of spammers has demonstrated improved
performance in terms of accuracy, precision,
recall, etc.

REFERENCES
[1] Don't Follow Me: Spam Detection in Twitter,
Proceedings of the 2010 International
Conference, Pages 1-10, 26-28 July 2010,
IEEE.
[2] Amit A. Amleshwaram, Narasimha Reddy,
Sandeep Yadav, Guofei Gu, Chao Yang,
CATS: Characterizing Automation of Twitter
Spammers, Texas A&M University, 2013,
IEEE.
[3] Anshu Malhotra, Luam Totti, Wagner Meira Jr., Ponnurangam Kumaraguru, Virgílio Almeida, Studying User Footprints in Different Online Social Networks, International Conference on Advances in Social Networks Analysis and Mining, 2012, IEEE/ACM.
[5] Ayon Chakraborty, Jyotirmoy Sundi, Som
Satapathy, SPAM: A Framework for Social
Profile Abuse Monitoring.
[6] Boyd, D. M., Ellison, N. B. (2007), Social network sites: Definition, history, and scholarship, Journal of Computer-Mediated Communication, 13(1), article 11, http://jcmc.indiana.edu/vol13/issue1/boyd.ellison.html
[7] Chao Yang, Robert Chandler Harkreader, Guofei Gu, Die Free or Live Hard? Empirical Evaluation and New Design for Fighting Evolving Twitter Spammers, RAID'11 Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection, Pages 318-337, 2011, Springer-Verlag, Berlin, Heidelberg.
[8] Fabricio Benevenuto, Gabriel Magno, Tiago Rodrigues, and Virgilio Almeida, Detecting Spammers on Twitter, CEAS 2010: Seventh Annual Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference, July 2010, Washington, US.
[9] Fact Sheet 35: Social Networking Privacy:
How to be Safe, Secure and Social
[10] Faraz Ahmed, Muhammad Abulaish, An MCL-Based Approach for Spam Profile Detection in Online Social Networks, IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, 2012.
[11] Georgios Kontaxis, Iasonas Polakis, Sotiris
Ioannidis and Evangelos P. Markatos,
Detecting Social Network Profile Cloning, 3rd
International Workshop on Security and Social
Networking, 2011, IEEE.
[12] Gianluca Stringhini, Christopher Kruegel,
Giovanni Vigna, Detecting Spammers on
Social Networks, University of California,
Santa Barbara, Proceedings of the 26th Annual
Computer Security Applications Conference,
ACSAC ’10, Austin, Texas USA, pages 1-9,
Dec. 6-10, 2010, ACM.
[13] Grace Gee, Hakson Teh, Twitter Spammer Profile Detection, 2010.
[14] http://about-threats.trendmicro.com/us/webattack - Information regarding Twitter threats.
[15] http://en.wikipedia.org/wiki/Twitter - Information about Twitter.
[16] http://expandedramblings.com/index.php/march-2013-by-the-numbers-a-few-amazing-twitter-stats - Statistics of Twitter.
[17] http://help.twitter.com/forums/26257/entries/1831 - The Twitter Rules.
[18] http://twittnotes.com/2009/03/2000-following-limit-on-twitter.html - The 2000 Following Limit Policy on Twitter.
[19] http://www.spamhaus.org/consumer/definition - Spam definition.
[20] J. Baltazar, J. Costoya, and R. Flores, “The real face of Koobface: The largest Web 2.0 botnet explained,” Trend Micro Threat Research, 2009.
[21] J. Douceur, “The sybil attack,” Peer-to-peer Systems, pp. 251–260, 2002.
[22] D. Irani, M. Balduzzi, D. Balzarotti, E. Kirda, and C. Pu, “Reverse social engineering attacks in online social networks,” Detection of Intrusions and Malware, and Vulnerability Assessment, pp. 55–74, 2011.
[23] Jonghyuk Song, Sangho Lee and Jong Kim, Spam Filtering in Twitter using Sender-Receiver Relationship, RAID'11 Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection, vol. 6961, Pages 301-317, 2011, Springer, Heidelberg.
[24] Kyumin Lee, James Caverlee, Steve Webb, Uncovering Social Spammers: Social Honeypots + Machine Learning, Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 435–442, ACM, New York, 2010.
[25] Leyla Bilge, Thorsten Strufe, Davide Balzarotti, Engin Kirda, All Your Contacts Are Belong to Us: Automated Identity Theft Attacks on Social Networks, International World Wide Web Conference Committee (IW3C2), WWW 2009, April 20–24, 2009, Madrid, Spain, ACM.
[27] M. McCord, M. Chuah, Spam Detection on
Twitter Using Traditional Classifiers,
ATC’11, Banff, Canada, Sept 2-4, 2011,
IEEE.
[28] Manuel Egele, Gianluca Stringhini,
Christopher Kruegel, and Giovanni Vigna,
COMPA: Detecting Compromised Accounts
on Social Networks.
[29] Marcel Flores, Aleksandar Kuzmanovic,
Searching for Spam: Detecting Fraudulent
Accounts via Web Search, LNCS 7799, pp.
208–217, 2013. Springer-Verlag Berlin
Heidelberg 2013.
[30] Mauro Conti, Radha Poovendran, Marco
Secchiero, FakeBook: Detecting Fake
Profiles in On-line Social Networks,
IEEE/ACM International Conference on
Advances in Social Networks Analysis and
Mining, 2012.
[31] Po-Ching Lin, Po-Min Huang, A Study of
Effective Features for Detecting Long-
surviving Twitter Spam Accounts, Advanced
Communication Technology (ICACT), 15th
International Conference on 27-30 Jan. 2013,
IEEE.
[32] Sangho Lee and Jong Kim,
WARNINGBIRD: Detecting Suspicious
URLs in Twitter Stream, 19th Network and
Distributed System Security Symposium
(NDSS), San Diego, California, USA,
February 5-8, 2012.
[33] T. Jagatic, N. Johnson, M. Jakobsson, and F.
Menczer, “Social phishing,”
Communications of the ACM , vol. 50, no.
10, pp. 94–100, 2007.
[34] Vijay A. Balasubramaniyan, Arjun
Maheswaran, Viswanathan Mahalingam,
Mustaque Ahamad, H. Venkateswaran, A
Crow or a Blackbird? Using True Social
Network and Tweeting Behavior to Detect
Malicious Entities in Twitter, 2002, ACM
[35] Y. Boshmaf, I. Muslukhov, K. Beznosov, and M. Ripeanu, “The socialbot network: when bots socialize for fame and money,” in Proceedings of the 27th Annual Computer Security Applications Conference, ACM, 2011, pp. 93–102.
[36] Yin Zhu, Xiao Wang, Erheng Zhong, Nathan N. Liu, He Li, Qiang Yang, Discovering Spammers in Social Networks, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.
[37] Zhi Yang, Christo Wilson, Xiao Wang, Tingting Gao, Ben Y. Zhao, and Yafei Dai, Uncovering Social Network Sybils in the Wild, Proceedings of the 11th ACM/USENIX Internet Measurement Conference (IMC’11), 2011.
[39] Verma, P., Khanday, A. M. U. D., Rabani, S.
T., Mir, M. H., & Jamwal, S. (2019). Twitter
sentiment analysis on Indian government
project using R. Int J Recent Technol Eng,
8(3), 8338-41.
[40] Verma, P., & Jamwal, S. (2020). Mining
public opinion on Indian Government
policies using R. Int. J. Innov. Technol.
Explor. Eng.(IJITEE), 9(3).
[41] Thakur, N., Choudhary, A., & Verma, P.
Machine Learning Algorithms-A Systematic
Review.
[42] Verma, P., & Mahajan, S. A Systematic Review of Techniques to Spot Spammers on Twitter.
[43] Thakur, M., & Verma, P. A Review of
Computer Network Topology and Analysis
Examples.
[44] Kumar, A., Guleria, A., & Verma, P. Internet
of Things (IoT) and Its Applications: A
Survey Paper.

  • 1. International Journal of Trend in Scientific Research and Development (IJTSRD) Volume 7 Issue 3, May-June 2023 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470 @ IJTSRD | Unique Paper ID – IJTSRD57439 | Volume – 7 | Issue – 3 | May-June 2023 Page 694 A Survey of Methods for Spotting Spammers on Twitter Hareesha Devi, Pankaj Verma, Ankit Dhiman Department of Computer Science, Arni University Kathgarh, Indora, Himachal Pradesh, India ABSTRACT Social networking sites' explosive expansion as a means of information sharing, management, communication, storage, and management has attracted hackers who abuse the Web to take advantage of security flaws for their own nefarious ends. Every day, forged internet accounts are compromised. Online social networks (OSNs) are rife with impersonators, phishers, scammers, and spammers who are difficult to spot. Users who send unsolicited communications to a large audience with the objective of advertising a product, entice victims to click on harmful links, or infect users' systems only for financial gain are known as spammers. Many studies have been conducted to identify spam profiles in OSNs. In this essay, we have discussed the methods currently in use to identify spam Twitter users. User-based, content-based, or a combination of both features could be used to identify spammers. The current paper gives a summary of the traits, methodologies, detection rates, and restrictions (if any) for identifying spam profiles, primarily on Twitter. 
KEYWORDS: Twitter, legitimate users, online social networks (OS's), and spammers How to cite this paper: Hareesha Devi | Pankaj Verma | Ankit Dhiman "A Survey of Methods for Spotting Spammers on Twitter" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456- 6470, Volume-7 | Issue-3, June 2023, pp.694-702, URL: www.ijtsrd.com/papers/ijtsrd57439.pdf Copyright © 2023 by author (s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0) INTRODUCTION A social networking site, according to Boyd et al. [5,] enables users to (a) create a profile, (b) befriend a list of other users, and (c) examine and navigate their own and other users' buddy lists. Through the use of Web 2.0 technology, these online social networks (OSNs) enable user interaction. These social networking sites are expanding quickly and altering how individuals communicate with one another. These websites have transformed in less than 8 years from a specialised area of online activity to a phenomenon that attracts millions of internet users. Online communities bring people with similar interests together, making it simpler for them to stay in touch with one another. Sixdegrees.com was the first social networking site to launch in 1997, and makeoutclub.com followed in 2000. Sixdegrees.com and while new websites like MySpace, LinkedIn, Bebo, Orkut, Twitter, etc. found success. Facebook, a very well-known website, was introduced in 2004 [5] and rapidly rose to fame throughout the globe. OSNs' greater user numbers make them more appealing targets for spammers and malevolent users. On social media websites, spam can take many forms and is difficult to identify. 
Anyone who has used the Internet has encountered spam of some kind, whether it is in emails, forums, newsgroups, etc. Spam [18] is defined as the practise of sending unsolicited bulk messages over electronic messaging systems. OSNs have grown in popularity and are now used as a platform for spam distribution. Spammers want to send product ads to users who are not connected to them. Some spammers post URLs that similar website had a short lifespan and quickely faded, lead to phishing websites where users’ sensitive information is stolen. The detection of spam profiles in OSNs has been the subject of numerous papers. However, no review paper that consolidates the available research has yet been published in this sector. The purpose of our paper is to examine the academic research and work that have been done in this area by various scholars and to highlight the potential directions for future research. The methods for identifying spammers on Twitter have been researched and compared in this study, along with their presentation. The format of this essay is as follows: The approach used to conduct this review is described in Section 2, which is followed by a IJTSRD57439
  • 2. International Journal of Trend in Scientific Research and Development @ www.ijtsrd.com eISSN: 2456-6470 @ IJTSRD | Unique Paper ID – IJTSRD57439 | Volume – 7 | Issue – 3 | May-June 2023 Page 695 briefing on security issues in OSNs in Section 3. Spammers are defined in Section 4 along with their motivations; the introduction to Twitter and its risks is given in Section 5; the purpose of this survey study is covered in Section 6; the properties that can be used for detection purposes are covered in Section 7; A comparative examination of the research produced by various researchers is reviewed in Section 8; new researchers are given research recommendations in Section 9; and the review is concluded in Section 10. METHODOLOGY After conducting a systematic review using a principled approach and searching major research databases for computer science like IEEE Xplore, ACM Digital Library, Springer Link, Google Scholar, and Science Direct for relevant topics, the current methods for detecting spam profiles in OSNs were surveyed. We concentrated exclusively on studies published after 2009 since social networks were not conceptualised until 1997 [1], and only afterwards did they gain widespread acceptance. Then, in 2004 [1], Facebook was introduced, and it quickly gained popularity. As a result, it took some time for people to become accustomed to using these networks for communication, which is why these networks have been attacked. Over 60 papers were found after searching the five major databases mentioned above. After reviewing all of the paper titles and abstracts, the papers that will be reviewed for this survey were chosen. Only papers that were deemed appropriate for the current investigation were selected. 21 papers in total have been chosen for evaluation after publications with titles and abstracts relating to spam message detection and other unrelated areas were eliminated. 
The majority of the criteria used to identify spammers have been used to categorise the papers. Through this essay, we're attempting to assemble a list of social networking papers we've read about identifying spam accounts on Twitter. The list is probably not comprehensive, but it lends shape to the ongoing study on identifying social network spammers. After reading this survey study, new researchers will find it simple to assess what research has been done, when, and how the current work may be expanded to improve spam detection. Every time it was appropriate, we included details on the methodology used, the dataset used, the features for spammer detection, and the efficacy of the strategies employed by different writers. The papers discuss, in particular, the ramifications of spammers' interactions with members of social networks as well as current methods for identifying them. SECURITY ISSUES IN OSNs Online social networking sites (OSNs) are susceptible to security and privacy problems due to the volume of user data that these sites process daily. Social networking site users are vulnerable to a range of attacks: 1. Viruses - spammers utilise social networks as a distribution channel [19] for dangerous files to infect users' systems. 2. Phishing attacks: By pretending to be a reliable third party, users' sensitive information is obtained [30]. 3. Users of social networks are bombarded with spam messages by spammers [11]. 4. Sybil (fake) attack - attacker creates a number of false identities and poses as a real user in the system to undermine the reputation of trustworthy users in the network [20]. 5. Social bots, a group of fictitious personas made to capture user information [32]. 6. Attacks involving cloning and identity theft, in which perpetrators construct a profile of an already-existing user on the same network or across many networks in an effort to deceive the cloned user's friends [23]. 
Attackers will gain access to victims' information if they allow the friend requests provided by these cloned identities. Users and systems are overextended by these attacks. TYPES OF SPAMMERS The fraudulent users known as spammers put social networks' security and users' privacy at risk by tainting the data shared by legal users. One of the following categories best describes spammers [22]: 1. Phishers are people who act normally but are actually out to steal the personal information of other real users. 2. Fake Users: These are users who spoof real users' profiles in order to distribute spam to their friends or other network users. 3. Promoters: These are people that spread harmful links in advertisements or other promotional materials to other people in an effort to collect their personal data. Spammers' motivations: Promote pornography, spread malware, launch phishing attacks, and harm the reputation of the system.
  • 3. International Journal of Trend in Scientific Research and Development @ www.ijtsrd.com eISSN: 2456-6470 @ IJTSRD | Unique Paper ID – IJTSRD57439 | Volume – 7 | Issue – 3 | May-June 2023 Page 696 TWITTER AS AN OSN Introduction Twitter is a social networking website with 500 million active users [14] as of today who share information. It was first introduced on March 21, 2006 [14]. Twitter's logo is a chirping bird, hence the name of the website. Users can access it to exchange frequent information called “tweets” which are messages of up to 140 characters long that anyone can send or read. These tweets are public by default and visible to all those who are following the tweeter. Users share these tweets which may contain news, opinions, photos, videos, links, and messages. Following is the standard terminology used in Twitter and relevant to our work: Tweets [3]: A Twitter message that is no longer than 140 characters. Followers and Followings [3]: Followers are users who a specific user is following, while Followings are people that a user is following. Retweet [3]: A tweet that has been forwarded to a user's entire following. Hashtags [3]: The # sign is used to annotate keywords or subjects in a tweet so that search engines may quickly find them. Mention [3]: You can include replies and mentions of other users in tweets by using the @ sign in front of their usernames. Lists [3]: Twitter offers a tool for grouping the persons you follow into lists. Direct Message [3]: Also known as a DM, this refers to Twitter's mechanism for direct messaging users to communicate privately. According to Twitter policy [16], signs of spam profiles include metrics like following a lot of users quickly,1 posting mostly links, using popular hashtags (#) when posting unrelated information, and repeatedly posting other users' tweets as your own. By tweeting to @spam, users have the option to report spammy profiles to Twitter. 
However, the Twitter policy [16] does not make it clear whether managers utilise user reports or automated processes to look for these circumstances, despite the fact that it is assumed that both approaches are used. Threats on Twitter 1. Spammed Tweets [13]: Twitter only allows users to post tweets with a maximum of 140 characters, but despite this restriction, cybercriminals have discovered a way to make the most of it by creating succinct but compelling tweets that include links to promotions for free vouchers, job postings, or other promotions. 2. Downloads of malware [13]: Cybercriminals have shared tweets with links to websites where malware can be downloaded using Twitter.The Twitter worms that transmitted direct messages and even malware that attacked both Windows and Mac operating systems include FAKEAV and backdoor[13] programmes. KOOBFACE [13], a piece of social media virus that attacked both Facebook and Twitter, has the worst reputation. 3. Twitter bots [13]: Online criminals frequently utilise Twitter to run and manage botnets. These botnets threaten the security and privacy of the users by controlling their accounts. Social Implications of OSNs In addition to the typical issues that social networking sites bring for users, such as spamming, phishing assaults, malware infestations, social bots, viruses, etc., the biggest challenge is maintaining the security and confidentiality of private data. Twitter policy states that if an account has more than 2,000 followers, this amount is constrained by the number of followers the account has. Social networking websites are created with the intention of making information readily available and accessible to others. But tragically, cybercriminals exploit this information, which is readily accessible, to launch focused assaults. 
Attackers can easily find a means to gain access to a user's account in order to gather more information and use that information to gain access to the user's other accounts and the accounts of their friends. MOTIVATING REVIEW Social networks have been a target for spammers due to the simplicity of information sharing and the ability to stay up to date with current subjects. It can be challenging to identify such fraudulent individuals in OSNs because spammers are well- aware of the methods available to identify them. For the purpose of collecting money, spammers can utilise OSNs as the ideal platform to pose as legitimate users and attempt to convince innocent users to click on harmful posts. The most crucial area being researched by numerous experts is how to identify such people in order to safeguard the network and protect users' private information. In order to quickly evaluate the work that has been done in this field, researchers will find this paper to be of great assistance.
  • 4. International Journal of Trend in Scientific Research and Development @ www.ijtsrd.com eISSN: 2456-6470 @ IJTSRD | Unique Paper ID – IJTSRD57439 | Volume – 7 | Issue – 3 | May-June 2023 Page 697 FEATURES DISTINGUISHING SPAMMERS & NON-SPAMMERS IN TWITTER The papers analysed in this study are shown in Table 1, along with the type of features that were utilised to identify spam Twitter profiles. Spam and non-spam profiles can be distinguished by either user-based or content-based characteristics. In any social network, user-based features are the characteristics of the user's profile and behaviour, whereas content-based features are the characteristics of the text that users publish. Table 1 Features for the detection of spam profiles Attributes used for detection of spam profiles User based features: which contain demographic information such as profile information, follower and following numbers, followers-to-followers ratio, reputation, account age, average time between tweets, posting habits, idle hours, tweet frequency, etc.[33,12,34,3,26] Content based features: among them are the quantity of hashtags (#), the quantity of URLs in tweets, @ mentions, retweets, spam terms, HTTP links, trending topics, duplicate tweets, etc.[33,7,11,25] User based and content based both [1,22,24,27,29,2,4] Any additional features, such as graph connectedness or pictorial distance: Graph-based features, neighbor-based features, interaction-based features, social links, social activities, and Markov clustering method [21,9,28,33,23,6] Function of the aforementioned features in identifying spam profiles in accordance with Twitter rules [16]: 1. The quantity of followers—spammers have fewer followers. 2. The amount of followers—Spammers frequently follow a lot of users. 3. Followers/Following Ratio: Spammers have a ratio of less than 1. 4. The ratio of followers to the total of followers and followings is referred to as reputation. Spammers are well-known. 5. 
Account age is calculated using the current date and the account's inception date. Since spammers typically create fresh accounts, this feature is less useful to them. 6. Average time between posts - In order to attract attention, spammers send out more tweets quickly. 7. Posting Time Behaviour: Spammers frequently post at predetermined times, such as early in the morning or late at night when real users aren't using social media. 8. Idle hours: Spammers continue to send messages to cut down on their idle time. 9. Tweet frequency: To attract other users' attention, spammers tweet more frequently and at unusual hours. 10. The quantity of hashtags (#) used by spammers to entice genuine users to read their tweets by posting numerous unrelated updates to the most popular topics on Twitter. 11. URLs: Spammers frequently tweet a big number of URLs to dangerous websites. 12. @mentions: In order to avoid being found, spammers use as many @usernames of unknown individuals as possible in their tweets. 13. Retweets are replies to any tweet that contain the @RT symbol, and spammers frequently utilise @RT in their tweets.less free time. 14. Spam Words – The majority of spammers' tweets contain spam words. 15. HTTP links - Tweets made by spammers contain the most www or http:// characters. 16. Duplicate tweets: Spammers frequently use many @usernames in their tweets to post identical tweets. EXISTING METHODS FOR DETECTION OF SPAM PROFILES IN TWITTER Researchers have employed a variety of strategies to identify the spam profiles in distinct OSNs. As Twitter is used to discuss and disseminate information about trending topics in real time rather than just as a social communication platform, we are concentrating primarily on the work that has been done to identify spammers on Twitter. The summary of the papers that were looked at about the identification of spammers on Twitter is shown in Table 2.
In 2010, Alex Hai Wang [1] made significant progress in spam-profile detection using both user-based and content-based features. A prototype spam-detection system was presented to find suspicious Twitter users, and a directed social-graph model was proposed to investigate the "follower" and "friend" relationships. Content-based and user-based features were fed to a Bayesian classification technique to detect spam in accordance with Twitter's spam policy. The performance of several traditional classifiers, including Decision Trees, Support Vector Machines (SVM), Naive Bayes, and Neural Networks, was compared using standard evaluation measures, and the Bayesian classifier performed best. The method attained 93.5% accuracy and 89% precision on a crawled dataset of 2,000 users with a test set of 500 users. Its limitation is that it was evaluated only on a very small dataset of 500 users, taking into account their 20 most recent tweets.

Lee et al. [22] deployed social honeypots consisting of genuine-looking profiles to identify suspicious users; their bot gathered evidence of spam by crawling the profiles of users who sent unsolicited friend requests and URLs on Twitter and MySpace. Profile characteristics such as posting habits, content, and friend information were used to build machine-learning classifiers. Profiles of users who contacted these social honeypots on Twitter and MySpace via unsolicited friend requests were then investigated, and spammers were identified using the LIBSVM classifier.
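Wang's Bayesian approach can be illustrated with a minimal Gaussian Naive Bayes classifier over numeric user/content features. This is a sketch of the general technique, not Wang's implementation, and the toy feature rows (followers/followings ratio, URLs per tweet) are synthetic.

```python
import math

class GaussianNB:
    """Minimal Gaussian Naive Bayes, in the spirit of the Bayesian
    classifier Wang [1] found to perform best (illustrative only)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.stats, self.priors = {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.priors[c] = len(rows) / len(X)
            feats = []
            for col in zip(*rows):  # per-feature mean and variance
                mu = sum(col) / len(col)
                var = max(sum((v - mu) ** 2 for v in col) / len(col), 1e-9)
                feats.append((mu, var))
            self.stats[c] = feats
        return self

    def predict(self, x):
        def log_posterior(c):
            ll = math.log(self.priors[c])
            for v, (mu, var) in zip(x, self.stats[c]):
                # Gaussian log-likelihood of feature value v under class c
                ll += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
            return ll
        return max(self.classes, key=log_posterior)

# Synthetic rows: [followers/followings ratio, URLs per tweet]; 1 = spammer.
X = [[0.02, 0.9], [0.05, 0.8], [2.0, 0.1], [1.5, 0.0]]
y = [1, 1, 0, 0]
clf = GaussianNB().fit(X, y)
```

A low ratio combined with a high URL rate is then classified as spam, which matches the intuition behind rules 3 and 11 above.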
One strength of the approach is its validation on two different dataset compositions: 10% spammers + 90% non-spammers, and 90% spammers + 10% non-spammers. Its drawback is that fairly small datasets were used for validation.

Benevenuto et al. [7] identified spammers based on tweet content and user-based attributes. The tweet-content attributes used are: the number of hashtags per word, URLs per word, words per tweet, characters per tweet, hashtags per tweet, numeric characters in the text, users mentioned in each tweet, and the number of times each tweet has been retweeted. The features that set spammers apart from non-spammers include the fraction of tweets containing URLs, the fraction of tweets containing spam words, and the average number of hashtag words per tweet. 54 million Twitter users were crawled, and 1,065 users were manually labelled as spammers or non-spammers. Spammers and non-spammers were separated using supervised machine learning, namely an SVM classifier. The system's detection accuracy is 87.6%, with only 3.6% of non-spammers incorrectly classified.

Sending a message to "@spam" on Twitter enables users to report spam accounts to the company. Gee et al. [12] exploited this facility and applied a classification technique to find spam profiles. Spam and regular user profiles were gathered via "@spam" and the Twitter API, respectively. The collected data was first represented in JSON and then exported as a CSV matrix, with users as rows and features as columns. The CSV files were first trained with the Naive Bayes algorithm, yielding a 27% error rate; an SVM was then used, reducing the error rate to 10%. The spam-profile detection accuracy is 89.3%.
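Per-tweet content attributes of the kind used by Benevenuto et al. [7] can be sketched as a small extraction routine. The URL pattern and the sample tweet below are illustrative assumptions, not the paper's exact definitions.

```python
import re

def tweet_content_features(tweet: str) -> dict:
    """Extract per-tweet content features similar to those of
    Benevenuto et al. [7] (illustrative sketch, not their code)."""
    words = tweet.split()
    n_words = max(len(words), 1)  # avoid division by zero on empty tweets
    hashtags = [w for w in words if w.startswith("#")]
    mentions = [w for w in words if w.startswith("@")]
    urls = re.findall(r"https?://\S+|www\.\S+", tweet)
    return {
        "hashtags_per_word": len(hashtags) / n_words,
        "urls_per_word": len(urls) / n_words,
        "words": len(words),
        "chars": len(tweet),
        "hashtags": len(hashtags),
        "digits": sum(ch.isdigit() for ch in tweet),
        "mentions": len(mentions),
    }

feats = tweet_content_features("Win $500 now!! http://spam.example #free #win #cash")
```

Averaging such per-tweet vectors over a user's timeline yields the user-level fractions (tweets with URLs, tweets with spam words) that the paper reports as the most discriminative.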
A limitation of this approach is that the features used for detection are not very technical, and the precision, 89.3%, is also modest; the authors therefore suggest that a system be deployed aggressively only if its precision exceeds 99%.

McCord et al. [24] employed content-based features such as the number of URLs, replies/mentions, retweets, and hashtags, as well as user-based features such as the number of friends and followers. Spam profiles on Twitter were identified using classifiers including Random Forest, SVM, Naive Bayes, and K-Nearest Neighbour. The Random Forest classifier, which yields the best results ahead of the SMO, Naive Bayes, and K-NN classifiers, was validated on 1,000 users with 95.7% precision and 95.7% accuracy. The approach's limitations are that the reputation feature gave incorrect results on the dataset considered, failing to distinguish spammers from non-spammers; that the dataset was unbalanced (Random Forest is typically chosen for unbalanced datasets); and that the approach was validated only on a small sample.

Using two features, URL rate and interaction rate, Lin et al. [28] identified long-surviving spam accounts on Twitter. Most publications use many indicators, such as the number of followers, the number of followings, the followers-to-followings ratio, tweet content, the number of hashtags, URL links, etc., to identify spam accounts. According to this study, however, these features are not very effective at spotting spammers, so only two simple yet useful features, URL rate and interaction rate, were employed for identification. The URL rate is the ratio of tweets containing URLs to all tweets, and the interaction rate is the ratio of tweets that interact with other users to all tweets. The Twitter API was used to crawl 26,758 accounts, and J48 classifier analysis was performed on 816 long-surviving accounts, with an accuracy of 86%. The approach's limitation is that only two variables are used to detect spam profiles; if spammers keep both their URL rate and their interaction rate low, the system will not work as intended.

According to Amit A. et al. [2], there are two kinds of spammer-detection systems: URL-centric systems, which rely on identifying fraudulent URLs, and user-centric systems, which are based on user-related features such as the followers/followings ratio. The method in this research is a hybrid that considers both kinds of properties. Along with an alert system to identify spam tweets, 15 new features are proposed to catch spammers; spammers' tweet campaigns and methods were also studied. Two Twitter datasets were used, one with 500K users and another with 110,789 users. Bait-oriented features, which capture the strategies spammers use to get victims to click on harmful links, include mentions of non-followers, trend hijacking, and intersection with well-known trends. Behavioural features include tweet-interval variation, tweet-volume variation, the ratio of tweet-interval variation to tweet-volume variation, and tweeting sources. URL features include duplicate URLs, duplicate domain names, and the IP/domain ratio.
Content-entropy features include the dissimilarity of tweet content, the similarity of tweets, and URL-tweet similarity. Profile features include the followers/followings ratio and the dissimilarity of the profile's description language. All of these features were extracted from both malicious and benign users and fed, using the Weka tool, into four supervised learning algorithms: Decision Tree, Random Forest, Bayes Network, and Decorate. The Decorate classifier produces the best results, detecting 93.6% of spammers, and the method has been shown to outperform Twitter's own spammer-detection strategy. However, it has only been tested on 31,808 users, whereas Twitter serves millions.

Chakraborty et al. [4] presented a technique to identify abusive users who publish offensive content, including dangerous URLs, pornographic URLs, and phishing links, drive regular users away from social networks, and violate their privacy. The algorithm has two steps: the first checks a user's profile for offensive content before a friend request is sent to another user, and the second checks the similarity of the two profiles. Whether the user should accept the friend invitation is then left to the recommendation produced by these two phases. The technique was tested on a 5,000-user Twitter dataset gathered via the REST API. Timing-, content-, and profile-based criteria are all taken into account to distinguish abusive from non-abusive users. SVM, Decision Tree, Random Forest, and Naive Bayes classifiers were employed; SVM outperforms the others, and the model operates at 89% accuracy.

Yang et al. [6] used new features to identify spammers on Twitter and discussed a number of evasion strategies used by spammers.
Ten new detection features were proposed: three graph-based, three neighbor-based, three automation-based, and one timing-based. These features are expensive and difficult to evade because they target behaviour that spammers do not currently disguise, and evading them would require more time, money, and resources. Eighteen features in total, eight existing and ten new, were examined for detection with classifiers including Random Forest, Decision Tree, Decorate, and Bayesian Network. The Bayesian classifier's accuracy of 88.6% is the best. The method's limitations are that very little data was crawled and that only a specific sort of spammer is found, with a low detection rate relative to the number of spammers in the dataset.

RESEARCH DIRECTIONS

The survey makes clear that a great deal of work has been done to identify spam profiles in various OSNs. Even so, the detection rate can be improved by varying the method and using more substantial features as determining factors. A few findings from the survey:
1. Twitter has millions of active users, and the number is growing, yet almost all authors evaluated their methodology on a relatively small testing dataset. To properly evaluate the effectiveness of any strategy, the testing dataset must be expanded.
2. A multivariate model needs to be developed.
3. A technique that can identify all types of spammers needs to be developed.
4. The methods need to be tested on various mixtures of spammers and non-spammers.
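Among the surveyed features, the two measures of Lin et al. [28] are particularly lightweight to compute. A sketch under stated assumptions: URL detection here is a plain substring check, and "interaction" is approximated by the presence of an @mention, which may differ from the paper's exact operationalisation.

```python
def url_rate(tweets):
    """URL rate as defined by Lin et al. [28]: the fraction of a user's
    tweets that contain a URL (simple substring check; illustrative)."""
    if not tweets:
        return 0.0
    with_url = sum(1 for t in tweets
                   if "http://" in t or "https://" in t or "www." in t)
    return with_url / len(tweets)

def interaction_rate(tweets):
    """Fraction of tweets that interact with another user, approximated
    here as tweets containing an @mention (an assumption, not the
    paper's exact definition)."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if "@" in t) / len(tweets)

sample = ["check this http://x.example", "hello @friend",
          "just thoughts", "buy now http://y.example"]
```

A high URL rate combined with a low interaction rate flags the long-surviving spam accounts the paper targets; the limitation noted above is that spammers who keep both values low evade this pair of features.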
Table 2 Outline of techniques used for the detection of spammers

Author | Metrics Used | Methodology Used | Dataset Used | Results
Alex Hai Wang [1] | Graph-based and content-based | Compared Naive Bayesian, Neural Network, SVM and Decision Tree | Validated on 500 Twitter users with 20 recent tweets | Naive Bayesian giving highest accuracy: 93.5%
Lee et al. [22] | User-based | Compared Decorate, SimpleLogistic, FT, LogiBoost, RandomSubSpace, Bagging, J48, LibSVM | Validated on 1000 Twitter users | Decorate giving highest accuracy: 88.98%
Benevenuto et al. [7] | User-based and content-based | SVM | Validated on 1065 Twitter users | Accuracy 87.6% (user-based and content-based features); 84.5% (user-based features only)
Gee et al. [12] | User-based | Compared Naive Bayesian, SVM | Validated on 450 Twitter users with 200 recent tweets | Accuracy 89.6%
McCord et al. [24] | User-based and content-based | Compared Random Forest, SVM, Naive Bayesian, K-NN | Validated on 1000 Twitter users with 100 recent tweets | Random Forest giving highest accuracy: 95.7%
Lin et al. [28] | URL rate, interaction rate | J48 | Validated on 400 Twitter users | Precision 86%
Amit A. et al. [2] | 15 new features introduced | Compared Random Forest, Decision Tree, Decorate, Naive Bayesian | Validated on 31,808 Twitter users | Accuracy 93.6%
Chakraborty et al. [4] | User-based, content-based | Compared Random Forest, SVM, Naive Bayesian, Decision Tree | Trained on 5000 Twitter users with 200 recent tweets | SVM giving highest accuracy: 89%
Yang et al. [6] | 18 features (8 existing and 10 new) | Compared Random Forest, Decision Tree, Decorate, Naive Bayesian | Validated on two datasets: 5000 users, then 3500 users with 40 recent tweets | Bayesian giving highest accuracy: 88.6%

CONCLUSION

Researchers have created and employed a variety of techniques to identify spammers on various social networks. As the publications examined show, most of the work has been done using classification approaches such as SVM, Decision Tree, Naive Bayes, and Random Forest. User-based features, content-based features, or a combination of both have been used for detection, and a few authors additionally introduced new detection features. All of the methods were tried with a single combination of spammers and non-spammers and were validated on very limited datasets. Combining features for the detection of spammers has demonstrated improved performance, in terms of accuracy, precision, recall, etc., compared with using user-based or content-based characteristics alone.

REFERENCES
[1] Don't Follow Me: Spam Detection in Twitter, Proceedings of the 2010 International Conference, pp. 1-10, 26-28 July 2010, IEEE.
[2] Amit A. Amleshwaram, Narasimha Reddy, Sandeep Yadav, Guofei Gu, Chao Yang, CATS: Characterizing Automation of Twitter Spammers, Texas A&M University, 2013, IEEE.
[3] Anshu Malhotra, Luam Totti, Wagner Meira Jr., Ponnurangam Kumaraguru, Virgílio Almeida, Studying User Footprints in Different Online Social Networks, International Conference on Advances in Social Networks Analysis and Mining, 2012, IEEE/ACM.
[5] Ayon Chakraborty, Jyotirmoy Sundi, Som Satapathy, SPAM: A Framework for Social Profile Abuse Monitoring.
[6] Boyd, D., Ellison, N. B. (2007), Social network sites: Definition, history, and scholarship, Journal of Computer-Mediated Communication, 13(1), article 11, http://jcmc.indiana.edu/vol13/issue1/boyd.ellison.html
[7] Chao Yang, Robert Chandler Harkreader, Guofei Gu, Die Free or Live Hard? Empirical Evaluation and New Design for Fighting Evolving Twitter Spammers, RAID'11 Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection, pp. 318-337, 2011, Springer-Verlag, Berlin, Heidelberg, ACM.
[8] Fabricio Benevenuto, Gabriel Magno, Tiago Rodrigues, and Virgilio Almeida, Detecting Spammers on Twitter, CEAS 2010 Seventh Annual Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference, July 2010, Washington, US.
[9] Fact Sheet 35: Social Networking Privacy: How to be Safe, Secure and Social.
[10] Faraz Ahmed, Muhammad Abulaish, SMIEEE, An MCL-Based Approach for Spam Profile Detection in Online Social Networks, IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, 2012.
[11] Georgios Kontaxis, Iasonas Polakis, Sotiris Ioannidis and Evangelos P. Markatos, Detecting Social Network Profile Cloning, 3rd International Workshop on Security and Social Networking, 2011, IEEE.
[12] Gianluca Stringhini, Christopher Kruegel, Giovanni Vigna, Detecting Spammers on Social Networks, University of California, Santa Barbara, Proceedings of the 26th Annual Computer Security Applications Conference, ACSAC '10, Austin, Texas, USA, pp. 1-9, Dec. 6-10, 2010, ACM.
[13] Grace Gee, Hakson Teh, Twitter Spammer Profile Detection, 2010.
[14] http://about-threats.trendmicro.com/us/webattack - Information regarding Twitter threats.
[15] http://en.wikipedia.org/wiki/Twitter - Information on Twitter.
[16] http://expandedramblings.com/index.php/march-2013-by-the-numbers-a-few-amazing-twitter-stats - Statistics of Twitter.
[17] http://help.twitter.com/forums/26257/entries/1831 - The Twitter Rules.
[18] http://twittnotes.com/2009/03/2000-following-limit-on-twitter.html - The 2000 Following Limit Policy on Twitter.
[19] http://www.spamhaus.org/consumer/definition - Spam Definition.
[20] J. Baltazar, J. Costoya, and R. Flores, "The real face of Koobface: The largest Web 2.0 botnet explained," Trend Micro Threat Research, 2009.
[21] J. Douceur, "The Sybil attack," Peer-to-Peer Systems, pp. 251-260, 2002.
[22] D. Irani, M. Balduzzi, D. Balzarotti, E. Kirda, and C. Pu, "Reverse social engineering attacks in online social networks," Detection of Intrusions and Malware, and Vulnerability Assessment, pp. 55-74, 2011.
[23] Jonghyuk Song, Sangho Lee and Jong Kim, Spam Filtering in Twitter using Sender-Receiver Relationship, RAID'11 Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection, vol. 6961, pp. 301-317, 2011, Springer, Heidelberg, ACM.
[24] Kyumin Lee, James Caverlee, Steve Webb, Uncovering Social Spammers: Social Honeypots + Machine Learning, Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 435-442, ACM, New York, 2010.
[25] Leyla Bilge, Thorsten Strufe, Davide Balzarotti, Engin Kirda, All Your Contacts Are Belong to Us: Automated Identity Theft Attacks on Social Networks, International World Wide Web Conference Committee (IW3C2), WWW 2009, April 20-24, 2009, Madrid, Spain, ACM.
[27] M. McCord, M. Chuah, Spam Detection on Twitter Using Traditional Classifiers, ATC'11, Banff, Canada, Sept. 2-4, 2011, IEEE.
[28] Manuel Egele, Gianluca Stringhini, Christopher Kruegel, and Giovanni Vigna, COMPA: Detecting Compromised Accounts on Social Networks.
[29] Marcel Flores, Aleksandar Kuzmanovic, Searching for Spam: Detecting Fraudulent Accounts via Web Search, LNCS 7799, pp. 208-217, Springer-Verlag, Berlin, Heidelberg, 2013.
[30] Mauro Conti, Radha Poovendran, Marco Secchiero, FakeBook: Detecting Fake Profiles in On-line Social Networks, IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2012.
[31] Po-Ching Lin, Po-Min Huang, A Study of Effective Features for Detecting Long-surviving Twitter Spam Accounts, 15th International Conference on Advanced Communication Technology (ICACT), 27-30 Jan. 2013, IEEE.
[32] Sangho Lee and Jong Kim, WARNINGBIRD: Detecting Suspicious URLs in Twitter Stream, 19th Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 5-8, 2012.
[33] T. Jagatic, N. Johnson, M. Jakobsson, and F. Menczer, "Social phishing," Communications of the ACM, vol. 50, no. 10, pp. 94-100, 2007.
[34] Vijay A. Balasubramaniyan, Arjun Maheswaran, Viswanathan Mahalingam, Mustaque Ahamad, H. Venkateswaran, A Crow or a Blackbird? Using True Social Network and Tweeting Behavior to Detect Malicious Entities in Twitter, 2002, ACM.
[35] Y. Boshmaf, I. Muslukhov, K. Beznosov, and M. Ripeanu, "The socialbot network: when bots socialize for fame and money," in Proceedings of the 27th Annual Computer Security Applications Conference, ACM, 2011, pp. 93-102.
[36] Yin Zhu, Xiao Wang, Erheng Zhong, Nathan N. Liu, He Li, Qiang Yang, Discovering Spammers in Social Networks, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence.
[37] Zhi Yang, Christo Wilson, Xiao Wang, Tingting Gao, Ben Y. Zhao, and Yafei Dai, Uncovering Social Network Sybils in the Wild, Proceedings of the 11th ACM/USENIX Internet Measurement Conference (IMC'11), 2011.
[39] Verma, P., Khanday, A. M. U. D., Rabani, S. T., Mir, M. H., & Jamwal, S. (2019). Twitter sentiment analysis on Indian government project using R. Int. J. Recent Technol. Eng., 8(3), 8338-8341.
[40] Verma, P., & Jamwal, S. (2020). Mining public opinion on Indian Government policies using R. Int. J. Innov. Technol. Explor. Eng. (IJITEE), 9(3).
[41] Thakur, N., Choudhary, A., & Verma, P. Machine Learning Algorithms: A Systematic Review.
[42] Verma, P., & Mahajan, S. A Systematic Review of Techniques to Spot Spammers on Twitter.
[43] Thakur, M., & Verma, P. A Review of Computer Network Topology and Analysis Examples.
[44] Kumar, A., Guleria, A., & Verma, P. Internet of Things (IoT) and Its Applications: A Survey Paper.