COMMERCE FAKE PRODUCT REVIEWS MONITORING AND DETECTION SYSTEM
Abstract
Online consumer reviews play an important role in helping consumers judge the quality
and authenticity of products on e-commerce platforms. However, the persistent presence of fake
reviews has significantly affected the operation and development of these platforms.
In this study, we develop a novel supervised probabilistic method to
detect fake reviews by utilizing the difference in the distribution of non-fraudulent reviews and
that of fake reviews. Specifically, we first derive the univariate distributions of several unique
features (linguistic, behavioral, and interrelationship features). We then integrate these
distributions into two mixed distributions according to their labels to represent the overall
difference between non-fraudulent reviews and fake reviews. Next, we randomly generate
synthetic review data points with different labels from the above mixed distributions. Finally,
we train a Multilayer Perceptron model by using these synthetic review data to obtain a
classifier. We conducted experiments to test the model on several original real-world
review datasets. The numerical results indicate that the proposed supervised method
outperforms several well-known sampling models and fake review detection methods in terms
of classification accuracy. Moreover, we extend the proposed method to handle scenarios
with only small samples of raw review data. This study contributes to the literature by exploiting
the difference in the distribution of non-fraudulent reviews and that of fraudulent reviews,
which can improve the accuracy of fake review detection for online platforms.
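The sampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three features, their toy values, the choice of a normal distribution for every feature, and the independent sampling of each feature are all assumptions made for the example. The abstract does not specify the distribution families or how they are mixed. Training the Multilayer Perceptron on the resulting synthetic set is left out.

```python
import random
from statistics import NormalDist

# Toy labeled feature vectors (hypothetical values, not from any real dataset).
# Each row: (review_length_in_words, rating_deviation, reviews_per_day)
genuine = [(120, 0.4, 1.0), (95, 0.6, 1.2), (150, 0.3, 0.8), (110, 0.5, 1.1)]
fake = [(40, 1.8, 4.0), (35, 2.1, 5.5), (50, 1.6, 3.8), (45, 2.0, 4.6)]


def fit_univariate(rows):
    """Fit one normal distribution per feature column (an assumed family)."""
    columns = list(zip(*rows))
    return [NormalDist.from_samples(col) for col in columns]


def sample_synthetic(dists, n, rng):
    """Draw n synthetic feature vectors, sampling each feature independently."""
    return [tuple(rng.gauss(d.mean, d.stdev) for d in dists) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
genuine_dists = fit_univariate(genuine)
fake_dists = fit_univariate(fake)

# Labeled synthetic training set: 0 = genuine, 1 = fake.
synthetic = [(x, 0) for x in sample_synthetic(genuine_dists, 100, rng)]
synthetic += [(x, 1) for x in sample_synthetic(fake_dists, 100, rng)]
rng.shuffle(synthetic)
```

The list `synthetic` could then be passed to any classifier; the abstract names a Multilayer Perceptron for that role.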
Existing System
Detecting fake reviews on e-commerce platforms is critical for maintaining credibility
and trust among users. A supervised general mixed probability approach offers a robust method
to sift through reviews and identify potentially fraudulent ones. By combining
machine learning algorithms with probabilistic models, this approach analyzes various
features within reviews to distinguish between genuine and fake content.
The system employs a supervised learning framework, using a labeled dataset to train the model.
Features such as sentiment, linguistic patterns, reviewer behavior, review timing, and product
information are combined into a comprehensive feature set. These features are then
processed through a mixed probability model, which combines the strengths of different
probabilistic techniques, such as Bayesian methods or Hidden Markov Models, to assess the
likelihood of a review being authentic or deceptive.
By employing a mixed probability approach, this system can handle diverse types of fake
reviews and adapt to evolving strategies used by malicious actors. Additionally, continuous
model retraining and adaptation help it stay current with new trends in fraudulent review practices.
The goal of this approach is not only to detect fake reviews accurately but also to provide e-commerce
platforms with a scalable and adaptable solution for maintaining the integrity of their review
systems, fostering a trustworthy environment for both consumers and businesses.
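As one possible reading of "assess the likelihood of a review being authentic or deceptive" with a Bayesian method, the sketch below computes a posterior probability with a naive Bayes rule over two features. The feature choice, the per-class normal distributions, and the class priors are hypothetical values picked for illustration. The independence assumption between features is also a simplification, and nothing here is taken from the system described.

```python
from statistics import NormalDist

# Hypothetical per-class feature distributions (mean, standard deviation).
# Features: review length in words, reviews posted per day by the reviewer.
CLASS_MODELS = {
    "genuine": [NormalDist(120, 30), NormalDist(1.0, 0.5)],
    "fake": [NormalDist(45, 15), NormalDist(4.5, 1.5)],
}
PRIORS = {"genuine": 0.8, "fake": 0.2}  # assumed class priors


def posterior_fake(features):
    """Return P(fake | features) under a naive independence assumption."""
    joint = {}
    for label, dists in CLASS_MODELS.items():
        likelihood = 1.0
        for value, dist in zip(features, dists):
            likelihood *= dist.pdf(value)
        joint[label] = PRIORS[label] * likelihood
    total = sum(joint.values())
    return joint["fake"] / total


print(posterior_fake((40, 5.0)))   # short review from a very active account
print(posterior_fake((130, 1.0)))  # long review from an occasional reviewer
```

With these made-up parameters the first review receives a posterior close to 1 and the second a posterior close to 0. In practice the parameters would be estimated from the labeled dataset.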
Drawbacks of the Existing System
• Data Dependence: This approach relies heavily on labeled datasets for training.
Obtaining and maintaining a large and diverse labeled dataset can be challenging and
costly. Moreover, the model's effectiveness may decrease if the dataset does not
adequately represent evolving fraudulent review tactics.
• Feature Engineering Complexity: Extracting relevant features from reviews requires
sophisticated natural language processing (NLP) techniques. Designing and
engineering these features can be complex and computationally intensive. Additionally,
the model's performance depends strongly on the quality and relevance of these features.
• Adaptability to New Techniques: Fraudulent review strategies evolve over time, and
new methods constantly emerge. The model may struggle to adapt quickly to these
changes and requires frequent updates and retraining to remain effective.
• Resource Intensive: Implementing and maintaining a mixed probability approach can
be computationally demanding, which may pose challenges for smaller e-commerce
platforms with limited resources.
Proposed System
• Data Collection: Reviews are gathered into a dataset that must be both diverse and
labeled.
• Preprocessing and Feature Engineering: The data are preprocessed, and various
features (linguistic, behavioral, temporal) are selected for model training.
• Supervised Learning Framework: The mixed probability approach is built on Bayesian
classifiers, Hidden Markov Models, or ensemble methods.
• Model Training and Evaluation: The model is trained, validated, and evaluated
using appropriate metrics.
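The evaluation step is not tied to specific metrics in the text. A common choice for fake review detection is accuracy together with precision, recall, and F1 on the "fake" class. The sketch below computes these from scratch, and the label vectors are invented for the example.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = fake, 0 = genuine.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Precision matters here because a low value means many genuine reviews are being flagged as fake. Recall measures how many fake reviews are actually caught.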
Algorithm
• Sentiment Analysis: Using algorithms like VADER (Valence Aware Dictionary and
sEntiment Reasoner) or supervised machine learning models to determine sentiment
polarity.
• NLP Techniques: Leveraging techniques like word embeddings (Word2Vec, GloVe)
or language models (BERT, GPT) for semantic understanding.
• Linguistic Features: Analyzing word frequency, syntactic patterns, or grammar
structures.
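To make the linguistic feature item concrete, here is a small sketch that extracts a few surface features from review text using only the standard library. The particular features (type-token ratio, first-person pronoun ratio, exclamation count) and the pronoun list are illustrative assumptions. The text above does not say which linguistic features the system uses, and sentiment scoring with VADER or embeddings from Word2Vec or BERT would need external libraries that are not shown here.

```python
import re
from collections import Counter

# Assumed pronoun list for illustration; not taken from the source.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "our", "us"}


def linguistic_features(text):
    """Return simple word-frequency based features for one review."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    n = len(tokens)
    return {
        "word_count": n,
        # Share of distinct words; low values indicate repetitive wording.
        "type_token_ratio": len(counts) / n if n else 0.0,
        "first_person_ratio": (sum(counts[w] for w in FIRST_PERSON) / n
                               if n else 0.0),
        "exclamation_count": text.count("!"),
    }


features = linguistic_features("Best product ever! I love it, I really love it!")
print(features)
```

Each review becomes one such dictionary, and the values can be stacked into the feature vectors that the classifier consumes.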
Advantages
• Incorporates Various Features: Leverages linguistic, temporal, and behavioral
attributes within reviews, offering a comprehensive assessment for identifying
fraudulent patterns.
• Comprehensive Feature Set: Utilizes diverse features such as sentiment analysis,
linguistic patterns, reviewer behavior, and temporal information, improving the
accuracy of detecting fake reviews.
• Mixed Probability Models: Combines different probabilistic techniques, allowing the
system to adapt to emerging fraudulent review strategies over time.
• Robust Classification: Considers multiple dimensions, minimizing misclassification
of genuine reviews as fake, thus reducing false alarms.
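The behavioral and temporal attributes mentioned above could be derived per reviewer along the following lines. The record layout (reviewer id, date, star rating), the sample records, and the four features are hypothetical. They are meant only to show the kind of signal, such as bursts of same-day five-star reviews, that these attributes can capture.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical review records: (reviewer_id, review_date, star_rating).
reviews = [
    ("u1", date(2024, 1, 5), 5),
    ("u1", date(2024, 1, 5), 5),
    ("u1", date(2024, 1, 5), 5),
    ("u2", date(2024, 1, 3), 4),
    ("u2", date(2024, 2, 10), 2),
]


def behavioral_features(records):
    """Aggregate per-reviewer activity and rating statistics."""
    per_day = defaultdict(Counter)
    ratings = defaultdict(list)
    for reviewer, day, rating in records:
        per_day[reviewer][day] += 1
        ratings[reviewer].append(rating)

    features = {}
    for reviewer, stars in ratings.items():
        days = per_day[reviewer]
        features[reviewer] = {
            "review_count": len(stars),
            "max_reviews_per_day": max(days.values()),
            # Number of calendar days between first and last review, inclusive.
            "active_span_days": (max(days) - min(days)).days + 1,
            "share_five_star": stars.count(5) / len(stars),
        }
    return features


per_reviewer = behavioral_features(reviews)
print(per_reviewer)
```

In this toy data, reviewer `u1` posts three five-star reviews on a single day, while `u2` posts two mixed reviews over more than a month.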
Hardware Specification
• Processor : Intel Core i3 processor
• RAM : 4 GB
• Hard Disk : 500 GB
Software Specification
• Operating System : Windows 10/11
• Front End : Java Swing
• Back End : MySQL Server
• IDE Tools : Eclipse
• Browser : Microsoft Edge

