A Presentation on
“Fake User Detection”
SUBMITTED BY
Mahendra Nath Dwivedi
Roll No:- 502202216004
Enroll No.:- AA/3522
Department of Computer Science & Engineering
Central College of Engineering and Management
GUIDED BY
Mr. Abhishek Badholia
DEPT. OF COMPUTER SCIENCE &
ENGINEERING
CONTENT
• INTRODUCTION
- INTRODUCTION OF PROJECT
- REVIEW ANALYSIS OF AMAZON.COM
- BEHAVIOR FEATURES OF SPAMMERS
• LITERATURE REVIEW
• PROBLEM IDENTIFICATION
• METHODOLOGY
• RESULT AND FUTURE SCOPE
• REFERENCES
INTRODUCTION
With the development of the Internet, people increasingly express their
views and opinions on the Web. They write reviews and other opinions on
e-commerce sites, forums, and blogs. Product manufacturers also use these
reviews to identify problems with their products and to gather competitive
intelligence about their competitors. Unfortunately, the importance of
reviews also creates a strong incentive for spam, which contains
falsely positive or maliciously negative opinions.
Review (Comment) Analysis of Website
The table shows selected mobile phone reviews from the Amazon website. For
the mobile phone topic, reviews 1 and 2 are relevant, and review 1 is the
most relevant of all. It is hard to decide how relevant review 3 is to the
topic. In addition, reviews 4 and 5 partly plagiarize review 3, and review 6
is an advertisement. Only two of the six reviews are relevant to the mobile
phone topic. Fake reviews not only increase the cost of making a decision
but also reduce the accuracy of that decision.
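The plagiarism case described for reviews 3, 4 and 5 suggests comparing review texts with each other. The following is a minimal sketch, assuming word-set Jaccard similarity as the measure and a hypothetical threshold of 0.8; neither choice comes from the slides, and the function names are illustrative only.

```python
import re

def tokenize(text):
    """Lowercase a review and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard_similarity(review_a, review_b):
    """Share of distinct words the two reviews have in common (0.0 to 1.0)."""
    a, b = tokenize(review_a), tokenize(review_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def near_duplicates(reviews, threshold=0.8):
    """Return index pairs whose similarity meets the (assumed) threshold."""
    pairs = []
    for i in range(len(reviews)):
        for j in range(i + 1, len(reviews)):
            if jaccard_similarity(reviews[i], reviews[j]) >= threshold:
                pairs.append((i, j))
    return pairs

sample = [
    "Great phone, the battery lasts all day",
    "great phone the battery lasts all day",
    "Visit our store for cheap watches",
]
print(near_duplicates(sample))  # [(0, 1)]
```

Word-set overlap ignores word order and punctuation, so it catches lightly edited copies but not paraphrases; a real system would likely need a stronger text measure.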
BEHAVIOR FEATURES OF SPAMMERS
1) Star User
2) Deviation Rate
3) Bias Rate
4) Review Similarity Rate
5) Review Relevancy Rate
6) Content Length
7) Illustration
8) Burst Review
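The slides name these features without defining them. As an illustration only, the sketch below computes three of them under assumed definitions: deviation rate as the mean gap between a user's stars and each product's average, content length as the average word count, and burst as the largest number of reviews posted within a short window. The record layout `(user, product, stars, day, text)` and all function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Each review is (user, product, stars, day, text); this layout is an assumption.
def deviation_rate(reviews, user):
    """Mean absolute gap between the user's stars and each product's average stars."""
    by_product = defaultdict(list)
    for _, product, stars, _, _ in reviews:
        by_product[product].append(stars)
    gaps = [abs(stars - mean(by_product[product]))
            for u, product, stars, _, _ in reviews if u == user]
    return mean(gaps) if gaps else 0.0

def average_content_length(reviews, user):
    """Average number of words per review written by the user."""
    lengths = [len(text.split()) for u, _, _, _, text in reviews if u == user]
    return mean(lengths) if lengths else 0.0

def max_burst(reviews, user, window_days=1):
    """Largest number of reviews the user posted within any window of window_days."""
    days = sorted(day for u, _, _, day, _ in reviews if u == user)
    best, start = 0, 0
    for end in range(len(days)):
        while days[end] - days[start] >= window_days:
            start += 1
        best = max(best, end - start + 1)
    return best

reviews = [
    ("u1", "p1", 5, 1, "best phone ever"),
    ("u1", "p2", 5, 1, "best case ever"),
    ("u1", "p3", 5, 1, "buy now"),
    ("u2", "p1", 2, 3, "battery drains within a few hours"),
    ("u3", "p1", 2, 9, "screen cracked after one week of use"),
]
print(deviation_rate(reviews, "u1"), max_burst(reviews, "u1"))
```

In this toy data, user `u1` posts three short 5-star reviews on the same day, which is the kind of pattern these features are meant to expose.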
Types of Spammers
Types of Review Spams
Three basic types of review spam exist [6]:
Type 1 (Untruthful review spam): Fictitious positive reviews are given to
products in order to promote them, and unreasonably negative reviews
are given to competing products to harm their reputation among
consumers. In this way untruthful reviews mislead consumers into
believing the spam.
Type 2 (Reviews with brand mentions): These reviews focus only on the brand.
They comment on the manufacturer, the seller, or the brand name
alone. Such reviews are biased and can easily be spotted because they do not
talk about the product and only mention brand names.
Type 3 (Non-reviews): These reviews are either junk, with no relation to the
product, or are used purely for advertising. They take
two forms:
i. text written for marketing purposes, and
ii. irrelevant text or random write-ups.
Rule Based Classification Of Spammers
METHODOLOGY
In this section we discuss the proposed framework in detail. The
proposed spam detection and blocking framework consists of the following
modules:
•Feature Discretization
•Negative Set Extraction
•Expectation Maximization (EM) Algorithm
•Blocking of Users
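The slides do not say how feature discretization is performed. The sketch below assumes simple equal-width binning, which is one common option and not necessarily the method used in the project; the function name and the bin count of 3 are illustrative.

```python
def equal_width_bins(values, n_bins=3):
    """Map each numeric value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values identical: everything falls into one bin
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

deviation_rates = [0.0, 0.1, 0.5, 0.9, 1.0]
print(equal_width_bins(deviation_rates))  # [0, 0, 1, 2, 2]
```

Turning continuous scores into a few discrete levels (for example low, medium, high) lets later steps count how often each level occurs among spammers and among unlabeled users.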
•Rating Deviation from Mean Agreement
•Filter Mean Target Difference
•Group Filter Mean Variance
•Target Model Focus
The algorithm for negative set extraction is presented below.
Algorithm: Negative Set Extraction
Input:  P → positive set of spammers
        U → unlabeled set of users
Output: RN → reliable negative set
RN <- N initially
RN_Extract (P, U)
    For each feature do
        Calculate
    End for
    For each feature (in decreasing order) do
        Remove instances consists of from
        If Size(RN) is close enough to P then
            Return RN
        End if
    End for
End
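Several symbols in the extraction algorithm are missing from the slide, so it cannot be reproduced exactly. The sketch below is one plausible reading, under stated assumptions: each (feature, value) pair is scored by how much more frequent it is in P than in U, RN starts as a copy of U, and "close enough" means the size of RN is at most `target_ratio` times the size of P. The scoring rule, the starting set, the stopping rule, and all names are assumptions.

```python
from collections import Counter

def extract_reliable_negatives(P, U, target_ratio=1.0):
    """P, U: lists of dicts mapping feature name -> discretized value.
    Returns the users of U that remain after removing spammer-like instances."""
    if not P or not U:
        return list(U)

    def frequencies(users):
        counts = Counter((f, v) for user in users for f, v in user.items())
        return {key: c / len(users) for key, c in counts.items()}

    freq_p, freq_u = frequencies(P), frequencies(U)
    # Assumed score: how much more common a feature value is among known spammers.
    scores = {key: freq_p[key] - freq_u.get(key, 0.0) for key in freq_p}
    ranked = sorted((k for k in scores if scores[k] > 0),
                    key=lambda k: scores[k], reverse=True)

    RN = list(U)
    target = target_ratio * len(P)
    for feature, value in ranked:  # decreasing order of score
        RN = [user for user in RN if user.get(feature) != value]
        if len(RN) <= target:      # assumed meaning of "close enough to P"
            break
    return RN

P = [{"dev": "high", "burst": "high"}, {"dev": "high", "burst": "low"}]
U = [{"dev": "high", "burst": "high"}, {"dev": "low", "burst": "low"},
     {"dev": "low", "burst": "low"}, {"dev": "low", "burst": "high"}]
RN = extract_reliable_negatives(P, U)
print(len(RN))  # 3
```

In the toy data only `dev == "high"` is more frequent among spammers than among unlabeled users, so the single unlabeled user with that value is removed and the other three form RN.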
BLOCK DIAGRAM OF GENERATING A LIST OF SPAMMERS
LITERATURE REVIEW
To detect spam reviews, scholars have carried out related research using
the techniques of data mining and natural language processing. This project
builds on the work of these researchers.
REVIEW OF PAPERS:-
1. The paper entitled “Fake Review Detection from a Product Review Using
Modified Method of Iterative Computation Framework” by Eka Dyar Wahyuni and
Arif Djunaidy was published by EDP Sciences (DOI: 10.1051/conf/matec5803003).
The honesty value of a review is measured using text mining and opinion
mining techniques. The experimental results show that the proposed system
identifies fake reviews with better accuracy than the iterative
computation framework (ICF) method. The drawback of this method is that some
processes need to be optimized so that it can detect a fake review in a
short amount of time.
2. The paper entitled “Spammers Detection from Product Reviews: A Hybrid Model”
was published by IEEE in 2015 (ISSN 1550-4786, DOI: 10.1109/ICDM.2015.73).
This paper focuses on detecting hidden spam users based on product
reviews. The literature contains many studies suggesting diverse
methods for spammer detection. This paper proposes a principled hybrid learning
model called hPSD that combines both user features and user-product relations for
spammer detection. The three essential components of hPSD (feature
discretization, reliable negative set extraction, and the hybrid learning
scheme) are each elaborated.
3. The paper entitled “Mining the Peanut Gallery: Opinion Extraction and Semantic
Classification of Product Reviews” was published by Kushal Dave of NEC
Laboratories America.
An opinion mining tool would process a set of search results for a given
item, generating a list of product attributes (quality, features, etc.) and aggregating
opinions about each of them (poor, mixed, good). The authors begin by identifying the unique
properties of this problem and develop a method for automatically distinguishing
between positive and negative reviews. A number of issues make this problem
difficult: rating inconsistency, ambivalence and comparison, sparse data, and
skewed distribution.
PROBLEM IDENTIFICATION
The main problem is to identify spam reviews among genuine reviews. A review
posted by any user may or may not be spam. Consider a user, Alice, who
constantly posts reviews of books from a publisher “X”, which has published
many books. Alice posts well-written, seemingly genuine reviews for
publisher “X”: Alice purchases most of the books of “X” and reviews each one.
Looking at these posts, the algorithm may conclude that Alice is a genuine
user and that Alice's comments are genuine too.
In fact, Alice has been hired by publisher “X” to post reviews and gives
good, 5-star ratings to the books of “X”. The difficulty therefore lies in
identifying users who appear to be genuine but are not.
Fig. 3.1. Alice's Behaviour of Reviewing Books
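One way to surface a case like Alice's is to measure how strongly a user's reviews concentrate on a single publisher and how uniformly positive they are. The sketch below is only an illustration of that idea; the function name, the `(publisher, stars)` record layout, and the choice of statistics are assumptions, not part of the presented method.

```python
from collections import Counter

def publisher_focus(user_reviews):
    """user_reviews: list of (publisher, stars) pairs for a single user.
    Returns (share of reviews going to the most-reviewed publisher,
             share of 5-star ratings among those reviews)."""
    if not user_reviews:
        return 0.0, 0.0
    counts = Counter(publisher for publisher, _ in user_reviews)
    top_publisher, top_count = counts.most_common(1)[0]
    five_star = sum(1 for publisher, stars in user_reviews
                    if publisher == top_publisher and stars == 5)
    return top_count / len(user_reviews), five_star / top_count

alice = [("X", 5), ("X", 5), ("X", 5), ("X", 4), ("Y", 3)]
print(publisher_focus(alice))  # (0.8, 0.75)
```

High values on both numbers do not prove that a user is paid, since loyal readers look similar, so such a signal would have to be combined with other features.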
Percentage of Users Being Spammer and Ham
Lastly, the users are classified into spam and non-spam categories. The
probabilities of a user falling into the spam and non-spam categories are
presented in the figure. In our dataset, 49% of users are classified as spam
and 51% as non-spam, so the dataset is flooded with spam users. These users
need to be blocked so that they cannot further affect the reviews and comments.
Fig. Probability of Spammers and Non-Spammers
Users Blocking
Once the spam users have been identified, they are blocked. The
blocking stage is depicted in the figure.
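The blocking stage can be sketched as a simple threshold on each user's estimated spam probability. The threshold of 0.5, the function names, and the in-memory set used as a blocklist are all assumptions made for illustration.

```python
def block_spammers(spam_probabilities, blocklist=None, threshold=0.5):
    """spam_probabilities: dict of user id -> estimated probability of being a spammer.
    Returns the updated set of blocked user ids."""
    blocked = set(blocklist or ())
    blocked.update(user for user, p in spam_probabilities.items() if p >= threshold)
    return blocked

def can_post_review(user, blocked):
    """A blocked user may no longer post reviews or comments."""
    return user not in blocked

blocked = block_spammers({"alice": 0.93, "bob": 0.12})
print(blocked)  # {'alice'}
```

A stricter threshold blocks fewer genuine users by mistake at the cost of letting more spammers through.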
The results produced by the EM (Expectation Maximization) algorithm with 6 features are
compared with those of the base paper, which uses a larger number of features. The figure shows the
comparison between the proposed and existing approaches.
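The slides do not describe the EM step itself. The sketch below shows one common way to use EM in this setting, assuming a naive Bayes model over discretized features: known spammers P and reliable negatives RN keep fixed labels, and the spam probabilities of the unlabeled users U are re-estimated in each iteration. The model choice, the smoothing constant `alpha`, and all names are assumptions.

```python
import math

def em_spam_probabilities(P, RN, U, n_iter=10, alpha=1.0):
    """P: known spammers, RN: reliable negatives, U: unlabeled users;
    each user is a dict of feature -> discrete value.
    Returns the estimated spam probability of every user in U."""
    users = P + RN + U
    features = sorted({f for user in users for f in user})
    values = {f: sorted({user[f] for user in users if f in user}, key=str)
              for f in features}

    def fit(weighted):  # weighted: list of (user, probability of being spam)
        total = sum(w for _, w in weighted)
        prior = (total + alpha) / (len(weighted) + 2 * alpha)
        cond = {}
        for label in (1, 0):
            mass = total if label == 1 else len(weighted) - total
            for f in features:
                for v in values[f]:
                    hit = sum((w if label == 1 else 1 - w)
                              for user, w in weighted if user.get(f) == v)
                    cond[(label, f, v)] = (hit + alpha) / (mass + alpha * len(values[f]))
        return prior, cond

    def posterior(user, prior, cond):
        log_spam, log_ham = math.log(prior), math.log(1 - prior)
        for f, v in user.items():
            log_spam += math.log(cond[(1, f, v)])
            log_ham += math.log(cond[(0, f, v)])
        m = max(log_spam, log_ham)  # subtract the maximum for numerical stability
        spam, ham = math.exp(log_spam - m), math.exp(log_ham - m)
        return spam / (spam + ham)

    labeled = [(u, 1.0) for u in P] + [(u, 0.0) for u in RN]
    prior, cond = fit(labeled)  # initial model from P and RN only
    probs = [posterior(u, prior, cond) for u in U]
    for _ in range(n_iter):
        prior, cond = fit(labeled + list(zip(U, probs)))  # M-step
        probs = [posterior(u, prior, cond) for u in U]    # E-step
    return probs

P = [{"dev": "high", "burst": "high"}] * 3
RN = [{"dev": "low", "burst": "low"}] * 3
U = [{"dev": "high", "burst": "high"}, {"dev": "low", "burst": "low"}]
probs = em_spam_probabilities(P, RN, U)
print(probs[0] > 0.5, probs[1] < 0.5)  # True True
```

In the toy data the first unlabeled user matches the known spammers on both features and receives a high spam probability, while the second matches the reliable negatives and receives a low one.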
Future scope of work
In this project, the majority of the work concerns the spammer
detection technique. The major drawback of this work is that it uses only one
dataset. Future work might use multiple datasets in order to analyse
attackers on other websites too.
References
Jindal N. and Liu B. 2007. Analyzing and Detecting Review Spam, Seventh IEEE International Conference on Data
Mining.
Dixit S. and Agrawal A.J. 2013. Review Spam Detection, International Journal of Computational
Linguistics and Natural Language Processing, Vol. 2, Issue 6, ISSN 2279-0756.
Gera T., Thakur D. and Singh J. 2015. BILD Testing for Spotting Out Suspicious Reviews, Suspicious Reviewers and
Group Spammers, International Conference on Communication Systems and Network Technologies(CSNT.2015.138).
Liang D., Liu X. and Shen H. 2014. Detecting Spam Reviewers by Combing Reviewer Feature and Relationship,
International Conference on Informative and Cybernetics for Computational Social Systems (ICCSS).
Mukherjee A., Kumar A., Liu B., Wang J. and Ghosh R. 2013. Spotting Opinion Spammers using Behavioral Footprints.
Mukherjee A., Glance N. and Liu B. 2012 . Spotting Fake Reviewer Groups in Consumer Reviews.
Wang G., Xie S., Liu B. and Philip S. Yu 2011. Review Graph based Online Store Review Spammer Detection, IEEE
International Conference on Data Mining(ICDM) .
Zhang X., Xiong G., Zhu F. and Dong X. 2016. A Method of SMS Spam Filtering Based on AdaBoost Algorithm, World
Congress on Intelligent Control and Automation (WCICA).
