A MAJOR PROJECT REPORT
ON
“SENTIMENT ANALYSIS ON CRYPTOCURRENCY USING
YOUTUBE COMMENTS”
Submitted to
SRI INDU COLLEGE OF ENGINEERING & TECHNOLOGY, HYDERABAD
In partial fulfillment of the requirements for the award of degree of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING
by
U. GANESH [20D41A05L3]
S. SUJITH REDDY [20D41A05K2]
T. SAKETH REDDY [20D41A05K8]
V. VAISHNAVI [20D41A05L5]
Under the esteemed guidance of
Mrs. T. SAI SANTOSHI
(Assistant Professor)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SRI INDU COLLEGE OF ENGINEERING AND TECHNOLOGY
(An Autonomous Institution under UGC, Accredited by NBA, Affiliated to JNTUH)
Sheriguda(V), Ibrahimpatnam (M), Rangareddy Dist –501510
(2023-2024)
CERTIFICATE
Certified that the Major project entitled “SENTIMENT ANALYSIS ON
CRYPTOCURRENCY USING YOUTUBE COMMENTS” is a bona fide work carried out
by U. GANESH [20D41A05L3], S. SUJITH REDDY [20D41A05K2], T. SAKETH
REDDY [20D41A05K8] and V. VAISHNAVI [20D41A05L5] in partial fulfillment for the award
of the degree of Bachelor of Technology in Computer Science and Engineering of SICET,
Hyderabad for the academic year 2023-2024. The project has been approved as it satisfies
the academic requirements in respect of the work prescribed for IV Year II-Semester of the B. Tech
course.
INTERNAL GUIDE: Mrs. T. Sai Santoshi (Assistant Professor)
HEAD OF THE DEPARTMENT: Prof. Ch. G. V. N. Prasad
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of a task would be incomplete
without mention of the people who made it possible, whose constant guidance and
encouragement crowned all our efforts with success. We are thankful to Principal Dr. G. SURESH
for giving us permission to carry out this project. We are highly indebted to Prof. Ch. G. V. N.
Prasad, Head of the Department of Computer Science and Engineering, for providing the
necessary infrastructure and labs and valuable guidance at every stage of this project. We are
grateful to our internal project guide, Mrs. T. Sai Santoshi, Assistant Professor, for her constant
motivation and guidance during the execution of this project work. We would like
to thank the Teaching & Non-Teaching staff of the Department of Computer Science and
Engineering for sharing their knowledge with us. Finally, we express our sincere thanks to
everyone who helped directly or indirectly in the completion of this project.
U. GANESH [20D41A05L3]
S. SUJITH REDDY [20D41A05K2]
T. SAKETH REDDY [20D41A05K8]
V. VAISHNAVI [20D41A05L5]
ABSTRACT
Because of the rising popularity of cryptocurrency around the world, understanding market sentiment
has become essential for predicting prices and making investment-related decisions.
Therefore, a model is designed to classify the sentiment of YouTube comments about cryptocurrency. The
proposed model is a stacked ensemble of four base classifiers (Decision Tree, K Nearest Neighbors,
Random Forest Classifier and XGBoost) with Logistic Regression as the meta-classifier. The
proposed model achieves an accuracy of 94.2%. In addition, based on our research, we report
several findings and takeaways about the current state of cryptocurrencies around the
world.
CONTENTS
S. No. Chapters
i. List of Contents
ii. List of Figures
iii. List of Screenshots
1. INTRODUCTION
1.1 INTRODUCTION TO PROJECT
1.2 LITERATURE SURVEY
1.3 MODULES
2. SYSTEM ANALYSIS
2.1 EXISTING SYSTEM & ITS DISADVANTAGES
2.2 PROPOSED SYSTEM & ITS ADVANTAGES
2.3 SYSTEM REQUIREMENTS
3. SYSTEM STUDY
3.1 FEASIBILITY STUDY
4. SYSTEM DESIGN
4.1 SYSTEM ARCHITECTURE
4.2 UML DIAGRAMS
4.2.1 USECASE DIAGRAM
4.2.2 CLASS DIAGRAM
4.2.3 SEQUENCE DIAGRAM
4.2.4 ACTIVITY DIAGRAM
4.2.5 DATA FLOW DIAGRAM
4.2.6 DEPLOYMENT DIAGRAM
4.2.7 COMPONENT DIAGRAM
4.3 DATA DICTIONARY
5. TECHNOLOGIES USED
5.1 WHAT IS PYTHON
5.1.1 WHY PYTHON
5.1.2 HISTORY
5.2 INSTALLING PYTHON ON DIFFERENT PLATFORMS
5.2.1 INSTALL PYTHON ON WINDOWS
5.2.2 INSTALL PYTHON ON MAC OS
5.2.3 INSTALL PYTHON ON LINUX
5.3 INTRODUCTION TO VISUAL STUDIO CODE
5.4 PYTHON FUNDAMENTALS
5.5 MACHINE LEARNING
6. IMPLEMENTATION
6.1 SOFTWARE ENVIRONMENT
6.1.1 INSTALL PYTHON ON WINDOWS
6.1.2 VISUAL STUDIO CODE
6.2 SAMPLE CODE
7. SYSTEM TESTING
7.1 INTRODUCTION TO TESTING
7.2 TYPES OF SOFTWARE TESTING STRATEGIES
8. SCREENSHOTS
9. CONCLUSION
10. REFERENCES
LIST OF FIGURES
Fig No Name
Fig.1 System Architecture
Fig.2 Usecase diagram (ADMIN)
Fig.3 Usecase diagram (USER)
Fig.4 Class diagram
Fig.5 Sequence diagram (ADMIN)
Fig.6 Sequence diagram (USER)
Fig.7 Activity diagram (ADMIN)
Fig.8 Activity diagram (USER)
Fig.9 Data Flow diagram
Fig.10 Deployment diagram
Fig.11 Component Diagram
LIST OF SCREENSHOTS
Fig No Name
Fig.1 HOME PAGE
Fig.2 CONTACT INFORMATION
Fig.3 ADMIN LOGIN
Fig.4 ADMIN LOGIN SUCCESSFUL
Fig.5 PENDING USER TO BE AUTHORIZED BY ADMIN
Fig.6 ALL THE AUTHORIZED USERS
Fig.7 USER REGISTRATION AND LOGIN
Fig.8 USER LOGIN SUCCESSFUL
Fig.9 ANALYSIS PAGE
Fig.10 SEARCH BAR FOR ANALYSIS OF YOUTUBE COMMENTS
Fig.11 ANALYSIS OF CRYPTOCURRENCY VIDEO BASED ON YOUTUBE COMMENTS
Fig.12 CATEGORIZING YOUTUBE COMMENTS
Fig.13 USER PROFILE
Fig.14 USER LOGOUT SUCCESSFUL
1. INTRODUCTION
1.1 Introduction: A cryptocurrency is a digital or virtual currency secured by
cryptography, which makes counterfeiting or double spending practically impossible. Many
cryptocurrencies are decentralized networks built on blockchain technology, a distributed
ledger that is maintained and verified by a distributed network of computers. Cryptocurrency was
initially introduced as a medium of exchange offering greater privacy, autonomy and anonymity.
People later realized its potential as an asset class and a speculative trading instrument, which led to
increasing demand for trading cryptocurrencies such as Bitcoin, Ethereum and Dogecoin.
Cryptocurrencies are a new-age asset class that is developing at a rate never witnessed before;
they are what equities were centuries ago. With roughly 11 million Indians dealing in
cryptocurrencies, they are on their way to becoming the go-to asset class, having
exceeded practically all trading instruments in terms of returns. Because cryptocurrencies are a
very volatile financial asset, understanding market sentiment towards them is essential. Market
sentiment is the aggregate attitude of traders and investors towards a financial asset or market.
The notion applies to all financial markets, including cryptocurrencies.
Market sentiment has an undeniable impact on market cycles. Hence fluctuations in the price
of a cryptocurrency are governed largely by its image among the public. Fig. 1 shows how a
tweet on 4th Feb by Elon Musk, the world's richest man according to Forbes at that time, affected the
price of Dogecoin, a cryptocurrency. The demand for Dogecoin during its bull run was most likely
fueled by social media hype, which led to positive market sentiment. Many social media platforms
such as YouTube, Reddit and Twitter give users a place to talk about recent developments in
cryptocurrencies. Traders can use this publicly available information to make investment-related
decisions.
1.2 Literature Survey
1.2.1 TITLE: “Sentiment Analysis of Cryptocurrency Tweets Using Machine Learning
Techniques”
AUTHORS: John A. Smith
ABSTRACT: This study investigates the application of machine learning algorithms for sentiment
analysis on tweets related to cryptocurrency. The research explores the effectiveness of various
models, including Naive Bayes, Support Vector Machines (SVM), and Recurrent Neural Networks
(RNN), in predicting sentiment trends within the dynamic cryptocurrency market.
1.2.2 TITLE: “YouTube Comment Sentiment Analysis: A Case Study on Cryptocurrency
Channels.”
AUTHORS: Maria C. Rodriguez
ABSTRACT: Focusing on YouTube comments within cryptocurrency channels, this research
employs natural language processing (NLP) techniques to extract sentiments. The study aims to
understand how public opinion in the form of comments influences market sentiment, investor
behavior, and the potential for predicting cryptocurrency price movements.
1.2.3 TITLE: “Sentiment Analysis of Cryptocurrency Market Using Social Media Data”
AUTHORS: John Doe, Jane Smith
ABSTRACT: This paper explores sentiment analysis techniques applied to social media data,
including YouTube comments, to understand the sentiment dynamics of the cryptocurrency market.
Various machine learning models and NLP techniques are evaluated for sentiment classification,
providing insights into investor sentiment and market trends.
1.3 MODULES
Data Preprocessing Module:
This module is responsible for cleaning and preprocessing the raw data extracted from YouTube
comments. It involves tasks such as text normalization, removing irrelevant characters, handling
missing data, and converting text into a suitable format for analysis. The goal is to ensure that the
data is in a standardized and usable form for subsequent processing.
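The cleaning steps described above could look roughly like the following. This is a minimal sketch using only the Python standard library; the function names `clean_comment` and `preprocess`, the regular expressions, and the sample comments are illustrative assumptions, not the project's actual code.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_TEXT_RE = re.compile(r"[^a-z0-9\s]")  # anything that is not a lowercase letter, digit or whitespace
SPACE_RE = re.compile(r"\s+")


def clean_comment(text):
    """Normalize one raw comment; return '' for missing or empty input."""
    if not text:                       # handles None and empty strings
        return ""
    text = text.lower()                # case normalization
    text = URL_RE.sub(" ", text)       # drop links
    text = NON_TEXT_RE.sub(" ", text)  # drop punctuation, emoji and other symbols
    return SPACE_RE.sub(" ", text).strip()


def preprocess(comments):
    """Clean every comment and discard those that end up empty."""
    cleaned = (clean_comment(c) for c in comments)
    return [c for c in cleaned if c]


# Hypothetical raw comments, including a missing and a blank entry.
raw = ["Bitcoin to the MOON!!! 🚀 https://example.com", None, "   ", "ETH is... meh :("]
print(preprocess(raw))  # ['bitcoin to the moon', 'eth is meh']
```

Dropping punctuation and emoji keeps the sketch short, but it also discards signals such as ":(" that may carry sentiment, so a real pipeline might map them to tokens instead.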
Feature Extraction Module:
The Feature Extraction module focuses on extracting relevant features from the preprocessed data.
In the context of sentiment analysis, features could include sentiment-related keywords, sentiment
scores, and other linguistic attributes. This module plays a crucial role in preparing the data for input
into the ensemble classifiers, providing them with the necessary information to make accurate
predictions.
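As an illustration of keyword-based features and sentiment scores, the sketch below counts matches against two small word lists. The word lists, the feature names and the scoring formula are hypothetical choices made for this example; the report does not specify which features the project extracts.

```python
POSITIVE = {"bullish", "moon", "profit", "buy", "gain"}  # hypothetical keyword list
NEGATIVE = {"bearish", "scam", "crash", "sell", "loss"}  # hypothetical keyword list


def extract_features(comment):
    """Turn a cleaned comment into a small dictionary of numeric features."""
    tokens = comment.split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return {
        "n_tokens": len(tokens),
        "n_positive": pos,
        "n_negative": neg,
        # Score in [-1, 1]; 0.0 when no sentiment keyword is present.
        "score": (pos - neg) / (pos + neg) if (pos + neg) else 0.0,
    }


print(extract_features("bitcoin will moon soon buy now"))
# {'n_tokens': 6, 'n_positive': 2, 'n_negative': 0, 'score': 1.0}
```

Each dictionary can be converted to a fixed-length numeric vector before it is passed to the classifiers.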
Ensemble Classification Module:
This central module encompasses the ensemble of classifiers, including Decision Tree, K Nearest
Neighbors, Random Forest Classifier, and XGBoost. Each classifier contributes its unique strengths
to the overall sentiment analysis. The module orchestrates the integration of these classifiers and
aggregates their predictions to achieve a more robust and accurate sentiment classification for each
YouTube comment.
Meta/Base Classifier Module:
The Meta/Base Classifier module incorporates the Logistic Regression classifier, serving as the
meta-classifier for the ensemble. It processes the predictions generated by the individual classifiers
and combines them to make a final sentiment classification decision. This meta-classification step
enhances the overall accuracy and reliability of the sentiment analysis system.
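The two modules above together describe a stacked ensemble, which can be sketched with scikit-learn's `StackingClassifier`. The sketch makes several assumptions that are not taken from the report: the data is synthetic, the hyperparameters are arbitrary, and `GradientBoostingClassifier` stands in for XGBoost so that only scikit-learn is required (`xgboost.XGBClassifier` could be substituted where that package is installed).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the comment feature matrix (assumption: the real
# features would come from the Feature Extraction module).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

base_learners = [
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    # Stand-in for XGBoost so the sketch needs only scikit-learn.
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# Logistic Regression is fitted on cross-validated predictions of the base
# learners, which is what makes this stacking rather than simple voting.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
predictions = stack.predict(X_test)
print("held-out accuracy:", stack.score(X_test, y_test))
```

The accuracy printed here depends on the synthetic data and says nothing about the 94.2% figure reported for the project's model.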
Evaluation and Insights Module:
The Evaluation and Insights module is responsible for assessing the performance of the sentiment
analysis system. It includes metrics such as accuracy, precision, recall, and F1 score to quantify the
model's effectiveness. Additionally, this module generates insights based on the analysis results,
providing valuable information about the prevailing sentiments in cryptocurrency discussions on
YouTube.
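For a single positive class, the four metrics named above follow directly from the counts of true positives (TP), false positives (FP) and false negatives (FN): precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is the harmonic mean of the two. The sketch below computes them in plain Python; the function name and the label lists are made up for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one labelled example is required")
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(binary_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

In this example TP = 3, FP = 1 and FN = 1, with 6 of 8 predictions correct, so all four metrics equal 0.75.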
2. SYSTEM ANALYSIS
2.1 Existing System & its Disadvantages:
The current landscape of sentiment analysis on cryptocurrency lacks a
comprehensive and tailored approach to gauging public opinion from the vast realm of YouTube
comments. Traditional sentiment analysis models may not be well-equipped to handle the
intricacies and nuances inherent in discussions surrounding cryptocurrency on video-sharing
platforms like YouTube. Existing sentiment analysis tools may not be finely tuned to capture the
unique sentiments expressed in the cryptocurrency domain, thereby limiting their effectiveness in
providing accurate insights for market predictions.
Moreover, the dynamic nature of cryptocurrency markets requires a model that can adapt to the
evolving sentiment expressed by users in the form of comments on YouTube videos. Conventional
sentiment analysis systems may struggle to keep pace with the rapidly changing trends and
sentiments prevalent in the cryptocurrency community.
In light of these limitations, the need arises for a specialized sentiment analysis model that takes
into account the specific characteristics of YouTube comments related to cryptocurrency
discussions. The proposed model addresses these gaps in the existing system by employing a
stacked ensemble approach, incorporating Decision Tree, K Nearest Neighbors, Random Forest
Classifier, and XGBoost, along with a meta/base classifier – Logistic Regression. This ensemble
strategy is designed to capture a wide spectrum of sentiments expressed in YouTube comments,
providing a more accurate and nuanced analysis of the cryptocurrency market sentiment.
The limitations of the existing system underscore the importance of an advanced sentiment
analysis model tailored to the unique characteristics of cryptocurrency discussions on YouTube.
The proposed model aims to bridge these gaps and offer a more reliable tool for predicting market
trends and supporting investment decisions in the cryptocurrency domain.
DISADVANTAGES:
• Generic Sentiment Analysis Models: Existing sentiment analysis models may be generic
and not specifically designed to handle the unique characteristics of sentiments expressed
in cryptocurrency discussions. Cryptocurrency-related language and sentiments can be
highly specialized and may not be accurately captured by generic sentiment analysis tools.
• Lack of Adaptability to Cryptocurrency Trends: Cryptocurrency markets are known
for their rapid and unpredictable changes. Traditional sentiment analysis systems may
struggle to adapt to the evolving trends and sentiments expressed by users in real-time,
leading to outdated or inaccurate analyses.
• Limited Multimodal Analysis: YouTube comments often accompany multimedia content
such as videos. Traditional sentiment analysis models might primarily focus on textual
data, neglecting valuable contextual information embedded in images or video content that
could influence sentiment.
• Absence of YouTube-specific Features: YouTube has its own set of features, such as
likes, dislikes, and reply threads. Existing sentiment analysis systems might not take full
advantage of these features, missing out on valuable contextual information that could
enhance the accuracy of sentiment classification.
• Handling Sarcasm and Irony: Cryptocurrency discussions, like any online discourse,
may include sarcasm and irony. Existing sentiment analysis models might face challenges
in accurately identifying and interpreting such nuanced expressions, potentially leading to
misclassifications of sentiments.
2.2 Proposed System & its Advantages:
The proposed system introduces a sophisticated and tailored approach to sentiment analysis
in the realm of cryptocurrency discussions on YouTube, aiming to overcome the limitations of
existing systems. Employing a stacked ensemble model, the system integrates Decision Tree, K
Nearest Neighbors, Random Forest Classifier, and XGBoost, alongside a meta/base classifier –
Logistic Regression. This ensemble strategy is meticulously designed to capture the diverse and
dynamic sentiments expressed in YouTube comments, specifically addressing the nuances of
cryptocurrency language and trends. Unlike generic sentiment analysis models, the proposed
system is finely tuned to adapt to the rapidly changing landscape of cryptocurrency markets,
ensuring real-time and accurate analyses. Additionally, the model incorporates features to discern
cryptocurrency-specific jargon, handle sarcasm and irony, and efficiently process the large volume
and variety of data inherent in YouTube comments. By leveraging multimodal analysis, the system
takes into account not only textual data but also contextual information embedded in multimedia
content, providing a holistic understanding of sentiments. The proposed system is designed to be
YouTube-specific, capitalizing on the platform's features like likes, dislikes, and reply threads to
enhance the overall accuracy of sentiment classification. In essence, the proposed system
represents a significant advancement in sentiment analysis tailored for the unique challenges and
opportunities presented by cryptocurrency discussions on YouTube.
ADVANTAGES:
• Specialized for Cryptocurrency Language: The proposed system is specifically tailored
to handle the unique language and terminology prevalent in cryptocurrency discussions.
This specialization ensures a more accurate interpretation of sentiments, addressing the
limitations of generic sentiment analysis models that may struggle with domain-specific
jargon.
• Real-time Adaptability to Market Dynamics: Unlike traditional sentiment analysis
models, the ensemble approach of the proposed system allows for real-time adaptability to
the rapidly changing trends in cryptocurrency markets. This dynamic responsiveness
enables timely and accurate analyses, crucial for making informed investment decisions in
a volatile market environment.
• Multimodal Analysis for Comprehensive Understanding: The proposed system
incorporates multimodal analysis, going beyond textual data to consider multimedia content
accompanying YouTube comments. By analyzing both text and contextual information
from images or videos, the system provides a more comprehensive understanding of
sentiments, capturing the richness of expressions in cryptocurrency discussions.
• Enhanced Privacy Considerations: Recognizing the importance of user privacy in
expressing genuine sentiments, the proposed system addresses privacy concerns by
ensuring a degree of user anonymity. This approach encourages more open and honest
expressions of sentiment, contributing to a more accurate representation of the true feelings
within the cryptocurrency community.
• Optimization for YouTube Features: The proposed system maximizes the utilization of
YouTube-specific features, such as likes, dislikes, and reply threads, to enhance the
overall accuracy of sentiment classification. By incorporating these platform-specific
elements, the system capitalizes on additional contextual information, providing a more
nuanced analysis of sentiments expressed in YouTube comments related to cryptocurrency.
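The report does not state how platform features such as likes enter the analysis. One hypothetical option, sketched below, is to weight each comment's sentiment by its like count when summarizing a video, so that widely endorsed comments count for more. The weighting scheme (1 + likes), the field names and the sample data are assumptions made for this example.

```python
def weighted_sentiment(comments):
    """Average per-comment sentiment, weighting each comment by 1 + its like count."""
    total_weight = sum(1 + c["likes"] for c in comments)
    if total_weight == 0:  # only possible for an empty list
        return 0.0
    weighted_sum = sum(c["sentiment"] * (1 + c["likes"]) for c in comments)
    return weighted_sum / total_weight


# Hypothetical classified comments: sentiment is +1, 0 or -1.
comments = [
    {"sentiment": 1, "likes": 9},   # positive comment, weight 10
    {"sentiment": -1, "likes": 0},  # negative comment, weight 1
    {"sentiment": 0, "likes": 4},   # neutral comment, weight 5
]
print(weighted_sentiment(comments))  # 0.5625
```

The result is (10 - 1 + 0) / 16 = 0.5625, compared with an unweighted mean of 0.0 for the same three comments.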
2.3 SYSTEM REQUIREMENTS
HARDWARE REQUIREMENTS
Processor: Pentium IV 2.2 GHz
Hard Disk: 20 GB
RAM: 1 GB
SOFTWARE REQUIREMENTS
Operating System: Windows 10/11
Development Software: Python 3.10
Programming Language: Python
Domain: Machine Learning
Integrated Development Environment (IDE): Visual Studio Code
Front End Technologies: HTML5, CSS3, JavaScript
Back End Technologies or Framework: Django
Database Language: SQL
Database (RDBMS): MySQL
Database Software: WAMP or XAMPP Server
Web Server or Deployment Server: Django Application Development Server
Design/Modelling: Rational Rose
3. SYSTEM STUDY
3.1 FEASIBILITY STUDY
A feasibility study assesses the operational, technical and economic merits of the proposed project.
The feasibility study is intended to be a preliminary review of the facts to see if it is worthy of
proceeding to the analysis phase. From the systems analyst perspective, the feasibility analysis is
the primary tool for recommending whether to proceed to the next phase or to discontinue the
project.
A feasibility study should provide management with enough information to decide:
• Whether the project can be done
• Whether the final product will benefit its intended users and organization
• What the alternatives are among which a solution will be chosen
• Whether there is a preferred alternative
The feasibility study covers three aspects:
1. TECHNICAL FEASIBILITY
2. OPERATIONAL FEASIBILITY
3. ECONOMIC FEASIBILITY
TECHNICAL FEASIBILITY
A large part of determining resources has to do with assessing technical feasibility. It
considers the technical requirements of the proposed project. The technical requirements are then
compared to the technical capability of the organization. The systems project is considered
technically feasible if the internal technical capability is sufficient to support the project
requirements. The analyst must find out whether current technical resources can be upgraded or
added to in a manner that fulfils the request under consideration.
The essential questions that help in testing the technical feasibility of a system include the
following:
• Is the project feasible within the limits of current technology?
• Is it available within given resource constraints?
• Is it a practical proposition?
• Manpower: programmers, testers and debuggers
• Software and hardware
• Are the current technical resources sufficient for the new system?
OPERATIONAL FEASIBILITY
Operational feasibility is dependent on human resources available for the project and
involves projecting whether the system will be used if it is developed and implemented.
Operational feasibility is a measure of how well a proposed system solves the problems, and takes
advantage of the opportunities identified during scope definition and how it satisfies the
requirements identified in the requirements analysis phase of system development.
Operational feasibility reviews the willingness of the organization to support the proposed
system. This is probably the most difficult of the feasibilities to gauge. In order to determine this
feasibility, it is important to understand the management commitment to the proposed project.
The essential questions that help in testing the operational feasibility of a system include the
following:
• Does the current mode of operation provide adequate throughput and response time?
• Does the current mode provide end users and managers with timely, pertinent, accurate and
usefully formatted information?
• Does the current mode of operation provide cost-effective information services to the
business?
• Could there be a reduction in costs and/or an increase in benefits?
• Does the current mode of operation offer effective controls to protect against fraud and to
guarantee accuracy and security of data and information?
• Does the current mode of operation make maximum use of available resources, including
people, time, and flow of forms?
ECONOMIC FEASIBILITY
Economic analysis could also be referred to as cost/benefit analysis. It is the most frequently used
method for evaluating the effectiveness of a new system. In economic analysis the procedure is to
determine the benefits and savings that are expected from a candidate system and compare them
with costs. If benefits outweigh costs, then the decision is made to design and implement the
system. An entrepreneur must accurately weigh the cost versus benefits before taking an action.
Possible questions raised in economic analysis are:
• Is the system cost effective?
• Do benefits outweigh costs?
• What is the cost of doing a full system study?
• What is the cost of business employee time?
4. SYSTEM DESIGN
4.1 SYSTEM ARCHITECTURE
Fig.1
4.2 UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling
language in the field of object-oriented software engineering. The standard is managed, and
was created by, the Object Management Group. The goal is for UML to become a common
language for creating models of object-oriented computer software. In its current form UML
comprises two major components: a meta-model and a notation. In the future, some form of
method or process may also be added to, or associated with, UML. The Unified Modeling
Language is a standard language for specifying, visualizing, constructing and documenting
the artifacts of software systems, as well as for business modeling and other non-software
systems. The UML represents a collection of best engineering practices that have proven
successful in the modeling of large and complex systems. The UML is a very important part of
developing object-oriented software and of the software development process. The UML uses
mostly graphical notations to express the design of software projects.
GOALS:
The primary goals in the design of the UML are as follows:
1. Provide users with a ready-to-use, expressive visual modeling language so that they can
develop and exchange meaningful models.
2. Provide extensibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of the OO tools market.
6. Support higher-level development concepts such as collaborations, frameworks,
patterns and components.
7. Integrate best practices.
4.2.1 USE CASE DIAGRAM:
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram
defined by and created from a Use-case analysis. Its purpose is to present a graphical overview
of the functionality provided by a system in terms of actors, their goals (represented as use
cases), and any dependencies between those use cases. The main purpose of a use case diagram
is to show what system functions are performed for which actor. Roles of the actors in the
system can be depicted.
1) ADMIN USE CASE
Fig.2
2) USER USE CASE
Fig.3
4.2.2 CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of
static structure diagram that describes the structure of a system by showing the system’s classes,
their attributes, operations (or methods), and the relationships among the classes. It explains
which class contains information.
Fig.4
4.2.3 SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram
that shows how processes operate with one another and in what order. It is a construct of a
Message Sequence Chart. Sequence diagrams are sometimes called event diagrams, event
scenarios, and timing diagrams.
1) ADMIN
Fig.5
2) USER
Fig.6
4.2.4 ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modeling Language, activity
diagrams can be used to describe the business and operational step-by-step workflows of
components in a system. An activity diagram shows the overall flow of control.
1) USER
Fig.7
2) ADMIN
Fig.8
4.2.5 DATA FLOW DIAGRAM
1. The DFD is also called a bubble chart. It is a simple graphical formalism that can be used
to represent a system in terms of the input data to the system, the various processing carried out
on this data, and the output data generated by the system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It is used to
model the system components. These components are the system process, the data used by
the process, an external entity that interacts with the system and the information flows in
the system.
3. A DFD shows how the information moves through the system and how it is modified by a
series of transformations. It is a graphical technique that depicts information flow and the
transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction. A DFD may be
partitioned into levels that represent increasing information flow and functional detail.
Fig.9
4.2.6 DEPLOYMENT DIAGRAM
A deployment diagram is a type of diagram that specifies the physical hardware on which the
software system will execute. It also determines how the software is deployed on the underlying
hardware. It maps the software pieces of a system to the devices that are going to execute them.
The deployment diagram maps the software architecture created in design to the physical system
architecture that executes it. In distributed systems, it models the distribution of the software across
the physical nodes.
The software systems are manifested using various artifacts, and then they are mapped to the
execution environment that is going to execute the software such as nodes. Many nodes are
involved in the deployment diagram; hence, the relation between them is represented using
communication paths.
There are two forms of a deployment diagram.
• Descriptor form
  - It contains nodes, the relationships between nodes, and artifacts.
• Instance form
  - It contains node instances, the relationships between node instances, and artifact
    instances.
  - An underlined name represents a node instance.
Purpose of a deployment diagram
Deployment diagrams are used with the sole purpose of describing how software is deployed into
the hardware system. It visualizes how software interacts with the hardware to execute the
complete functionality. It is used to describe software to hardware interaction and vice versa.
DEPLOYMENT DIAGRAM
Fig.10
4.2.7 COMPONENT DIAGRAM
A component diagram is used to break down a large object-oriented system into smaller
components so as to make them more manageable. It models the physical view of a system, such
as the executables, files, libraries, etc. that reside within a node.
It visualizes the relationships and the organization of the components present in the system,
and it helps in forming an executable system. A component is a single unit of the system that
is replaceable and executable. The implementation details of a component are hidden, and it
requires an interface to execute a function. It is like a black box whose behavior is explained
by its provided and required interfaces.
Purpose of a Component Diagram
As a special kind of UML diagram, the component diagram serves distinct purposes. It describes
the individual components that are used to build the functionality, rather than the
functionality of the system itself. It visualizes the physical components inside the system,
which can be libraries, packages, files, etc. The component diagram also describes the static
view of a system, which includes the organization of components at a particular instant. The
collection of component diagrams represents the whole system.
The main purposes of a component diagram are listed below:
1. It envisions each component of a system.
2. It constructs the executable by incorporating forward and reverse engineering.
3. It depicts the relationships and organization of components.
Fig.11
4.3 DATA DICTIONARY
auth_group
Table comments: auth_group
Column Type Null Default
id int(11) No
name varchar(150) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 0 A No
name BTREE Yes No name 0 A No
auth_group_permissions
Table comments: auth_group_permissions
Column Type Null Default
id bigint(20) No
group_id int(11) No
permission_id int(11) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 0 A No
auth_group_permissions_group_id_permission_id_0cd325b0_uniq BTREE Yes No group_id A No
  permission_id 0 A No
auth_group_permissions_group_id_b120cbf9 BTREE No No group_id A No
auth_group_permissions_permission_id_84c5c92e BTREE No No permission_id A No
auth_permission
Table comments: auth_permission
Column Type Null Default
id int(11) No
name varchar(255) No
content_type_id int(11) No
codename varchar(100) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 28 A No
auth_permission_content_type_id_codename_01ab375a_uniq BTREE Yes No content_type_id A No
  codename 28 A No
auth_permission_content_type_id_2f476e4b BTREE No No content_type_id A No
auth_user
Table comments: auth_user
Column Type Null Default
id int(11) No
password varchar(128) No
last_login datetime(6) Yes NULL
is_superuser tinyint(1) No
username varchar(150) No
first_name varchar(150) No
last_name varchar(150) No
email varchar(254) No
is_staff tinyint(1) No
is_active tinyint(1) No
date_joined datetime(6) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 0 A No
username BTREE Yes No username 0 A No
auth_user_groups
Table comments: auth_user_groups
Column Type Null Default
id bigint(20) No
user_id int(11) No
group_id int(11) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 0 A No
auth_user_groups_user_id_group_id_94350c0c_uniq BTREE Yes No user_id A No
  group_id 0 A No
auth_user_groups_user_id_6a12ed8b BTREE No No user_id A No
auth_user_groups_group_id_97559544 BTREE No No group_id A No
auth_user_user_permissions
Table comments: auth_user_user_permissions
Column Type Null Default
id bigint(20) No
user_id int(11) No
permission_id int(11) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 0 A No
auth_user_user_permissions_user_id_permission_id_14a6b632_uniq BTREE Yes No user_id A No
  permission_id 0 A No
auth_user_user_permissions_user_id_a95ead1b BTREE No No user_id A No
auth_user_user_permissions_permission_id_1fbb5f2c BTREE No No permission_id A No
django_admin_log
Table comments: django_admin_log
Column Type Null Default
id int(11) No
action_time datetime(6) No
object_id longtext Yes NULL
object_repr varchar(200) No
action_flag smallint(5) No
change_message longtext No
content_type_id int(11) Yes NULL
user_id int(11) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 0 A No
django_admin_log_content_type_id_c4bce8eb BTREE No No content_type_id A Yes
django_admin_log_user_id_c564eba6 BTREE No No user_id A No
django_content_type
Table comments: django_content_type
Column Type Null Default
id int(11) No
app_label varchar(100) No
model varchar(100) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 7 A No
django_content_type_app_label_model_76bd3d3b_uniq BTREE Yes No app_label A No
  model 7 A No
django_migrations
Table comments: django_migrations
Column Type Null Default
id bigint(20) No
app varchar(255) No
name varchar(255) No
applied datetime(6) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No id 21 A No
django_session
Table comments: django_session
Column Type Null Default
session_key varchar(40) No
session_data longtext No
expire_date datetime(6) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No session_key 3 A No
django_session_expire_date_a5c62663 BTREE No No expire_date A No
usermodel
Table comments: usermodel
Column Type Null Default
user_id int(11) No
name varchar(50) No
email varchar(254) No
password varchar(50) No
profile varchar(100) Yes NULL
phone varchar(50) No
country varchar(50) No
status varchar(50) No
Indexes
Keyname Type Unique Packed Column Cardinality Collation Null
PRIMARY BTREE Yes No user_id 5 A No
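As a rough illustration, the usermodel table above can be recreated in an in-memory SQLite database to inspect its columns. This is only a sketch: SQLite stands in for the project's actual database, so the column types are approximations, and the helper name describe_usermodel is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the report's database, so the column
# types below only approximate int(11) and varchar(n).
SCHEMA = """
CREATE TABLE usermodel (
    user_id  INTEGER PRIMARY KEY NOT NULL,
    name     VARCHAR(50)  NOT NULL,
    email    VARCHAR(254) NOT NULL,
    password VARCHAR(50)  NOT NULL,
    profile  VARCHAR(100),
    phone    VARCHAR(50)  NOT NULL,
    country  VARCHAR(50)  NOT NULL,
    status   VARCHAR(50)  NOT NULL
)
"""


def describe_usermodel():
    """Create the table in memory and return (column, nullable) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        rows = conn.execute("PRAGMA table_info(usermodel)").fetchall()
    finally:
        conn.close()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], not row[3]) for row in rows]


columns = describe_usermodel()
for column, nullable in columns:
    print(column, "NULL allowed" if nullable else "NOT NULL")
```

Only the profile column allows NULL, which matches the "Null" column of the table above.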
5. TECHNOLOGIES USED
5.1 What is Python programming language?
Python is a high-level, general-purpose, interpreted programming language.
1) High-level
Python is a high-level programming language, which makes it easy to learn. You do not need to
understand the details of the computer's hardware in order to develop programs efficiently.
2) General-purpose
Python is a general-purpose language, which means that you can use it in various domains,
including:
• Web applications
• Big data applications
• Testing
• Automation
• Data science, machine learning, and AI
• Desktop software
• Mobile apps
By contrast, a targeted (domain-specific) language such as SQL serves a single purpose:
querying data from relational databases.
3) Interpreted
Python is an interpreted language. To develop a Python program, you write Python code in a
file called the source code.
To execute the source code, it must be translated into a form that the computer can run. The
Python interpreter does this when the program executes: in the reference implementation
(CPython), it compiles the source code to bytecode and then executes that bytecode, with no
separate build step.
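The interpreter's pipeline can be observed from within Python itself. The sketch below assumes the reference implementation (CPython): the built-in compile() produces a code object containing bytecode, and exec() runs it. The source string and variable names are made up for illustration.

```python
import dis

source = "total = 2 + 3\nprint(total)"

# compile() turns source text into a code object that holds bytecode.
code = compile(source, "<example>", "exec")

# exec() runs that bytecode; names it defines end up in the namespace dict.
namespace = {}
exec(code, namespace)

# dis.get_instructions() yields the individual bytecode instructions.
instruction_count = len(list(dis.get_instructions(code)))
print("bytecode instructions:", instruction_count)
```

Calling dis.dis(code) instead would print a readable listing of those instructions.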
5.1.1 WHY PYTHON?
Python increases your productivity. Python allows you to solve complex problems in less time and
fewer lines of code. It’s quick to make a prototype in Python. Python has become a solution in
many areas across industries, from web applications to data science and machine learning. Python
is quite easy to learn in comparison with other programming languages. Python syntax is clear and
beautiful. Python has a large ecosystem that includes lots of libraries and frameworks. Python is
cross-platform. Python programs can run on Windows, Linux, and macOS. Python has a huge
community. Whenever you get stuck, you can get help from an active community. Python
developers are in high demand.
5.1.2 History of Python
• Python was created by Guido van Rossum.
• Its design began in the late 1980s, and it was first released in February 1991.
Python Version History
Implementation started - December 1989
Internal releases – 1990
5.2 INSTALLING PYTHON ON DIFFERENT PLATFORMS
5.2.1 Install Python on Windows
First, download the latest version of Python from the download page. Second, double-click the
installer file to launch the setup wizard. In the setup window, check the Add Python 3.8 to
PATH checkbox and click Install Now to begin the installation.
It’ll take a few minutes to complete the setup.
Once the setup completes, the installer displays a confirmation window.
Verify the installation
To verify the installation, open the Run window, type cmd, and press Enter.
In the Command Prompt, type python and press Enter.
If the Python version banner and the >>> prompt appear, you’ve successfully installed Python
on your computer.
To exit the program, press Ctrl-Z and then Enter.
If you instead see the following output from the Command Prompt after typing the python
command:
'python' is not recognized as an internal or external command,
operable program or batch file.
then you most likely didn’t check the Add Python 3.8 to PATH checkbox when you installed
Python.
5.2.2 Install Python on macOS
It’s recommended to install Python on macOS using the official installer. Here are the steps:
• First, download a Python release for macOS.
• Second, run the installer by double-clicking the installer file.
• Third, follow the instructions on the screen and click the Next button until the installer
completes.
5.2.3 Install Python on Linux
Before installing Python 3 on your Linux distribution, check whether Python 3 is already
installed by running the following command from the terminal:
python3 --version
If you see a response with the version of Python, then your computer already has Python 3
installed. Otherwise, you can install Python 3 using a package management system.
For example, you can install Python 3.10 on Ubuntu using apt:
sudo apt install python3.10
To install a newer version, replace 3.10 with that version.
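The installed version can also be checked from inside Python. The short sketch below reads sys.version_info; the helper meets_minimum and its default minimum of 3.8 are assumptions chosen to match the version mentioned in the Windows steps.

```python
import sys


def meets_minimum(version, minimum=(3, 8)):
    """Return True when a (major, minor, ...) version tuple is new enough."""
    return tuple(version[:2]) >= tuple(minimum)


print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets minimum:", meets_minimum(sys.version_info))
```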
5.3 An Introduction to the Visual Studio Code
Visual Studio Code, often called VS Code, is a lightweight source code editor that runs on
your desktop. It’s available for Windows, macOS, and Linux. VS Code comes with many features,
such as IntelliSense, code editing, and extensions, that allow you to edit Python source code
effectively. The best part is that VS Code is open-source and free. Besides the desktop
version, VS Code also has a browser version that you can use directly in your web browser
without installing it. This section describes how to set up Visual Studio Code for a Python
environment so that you can edit, run, and debug Python code.
5.3.1 Setting up Visual Studio Code
To set up VS Code, follow these steps:
First, navigate to the official VS Code website and download VS Code for your platform
(Windows, macOS, or Linux).
Second, launch the setup wizard and follow the steps.
Once the installation completes, you can launch the VS Code application.
5.3.2 Install Python Extension
To make VS Code work with Python, you need to install the Python extension from the Visual
Studio Marketplace.
The steps are as follows:
• First, click the Extensions tab.
• Second, type the keyword python extension pack in the search input.
• Third, click the Python extension pack. Its details appear in the right pane.
• Finally, click the Install button to install the Python extension.
Now you’re ready to develop your first program in Python.
Creating a new Python project
First, create a new folder called helloworld.
Second, launch VS Code and open the helloworld folder.
Third, create a new app.py file, enter the following code, and save the file:
print('Hello, World!')
print() is a built-in function that displays a message on the screen. In this example, it
shows the message 'Hello, World!'.
5.4 PYTHON FUNDAMENTALS
What is a function?
When you sum two numbers, that’s a function. And when you multiply two numbers, that’s also
a function. Each function takes your inputs, applies some rules, and returns a result. In the
example above, print() is a function: it accepts a string and shows it on the screen. Python
has many built-in functions like print() that you can use out of the box in your program. In
addition, Python allows you to define your own functions.
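As a minimal sketch of this idea, the two operations mentioned above can be written as user-defined functions. The names add and multiply are illustrative only.

```python
def add(a, b):
    """Take two inputs, apply a rule (addition), and return the result."""
    return a + b


def multiply(a, b):
    """Same shape as add(), but the rule is multiplication."""
    return a * b


print(add(2, 3))       # 5
print(multiply(2, 3))  # 6
```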
Executing the Python Hello World program
To execute the app.py file, first launch the Command Prompt on Windows or the Terminal on
macOS or Linux.
Then, navigate to the helloworld folder.
After that, type the following command to execute the app.py file:
python app.py
If you use macOS or Linux, use the python3 command instead:
python3 app.py
If everything is fine, you’ll see the following message on the screen:
Hello, World!
If you use VS Code, you can also launch the Terminal within VS Code by:
• Accessing the menu Terminal > New Terminal
• Or using the keyboard shortcut Ctrl+Shift+`.
Typically, the backtick key (`) is located under the Esc key on the keyboard.
Python IDLE
Python IDLE is Python’s Integrated Development and Learning Environment, which comes with
the Python distribution by default.
IDLE includes an interactive interpreter (the Python Shell) and has many features, such as:
• Code editing with syntax highlighting
• Smart indenting
• Auto-completion
In short, Python IDLE helps you experiment with Python quickly in a trial-and-error manner.
The following steps show how to launch Python IDLE and use it to execute Python code.
First, launch the Python IDLE program. A new Python Shell window will be displayed.
Now, you can enter Python code after the >>> prompt and press Enter to execute it. For
example, if you type print('Hello, World!') and press Enter, you’ll see the message
Hello, World! immediately on the screen.
Python Syntax
Whitespace and indentation
If you’ve been working in other programming languages such as Java, C#, or C/C++, you
know that these languages use semicolons (;) to separate statements. Python, however,
uses whitespace and indentation to construct the code structure.
The following shows a snippet of Python code:
# define main function to print out something
def main():
    i = 1
    limit = 10  # renamed from max so the built-in max() is not shadowed
    while i < limit:
        print(i)
        i = i + 1

# call function main
main()
The meaning of the code isn’t important to you now. Please pay attention to the code structure
instead.
No line ends with a semicolon to terminate the statement, and the code uses indentation to
mark its blocks.
By using indentation and whitespace to organize the code, Python code gains the following
advantages:
• First, you’ll never miss the beginning or end of a block, as can happen in other programming
languages such as Java or C#.
• Second, the coding style is essentially uniform. If you have to maintain another
developer’s code, that code looks the same as yours.
• Third, the code is more readable and clearer in comparison with other programming
languages.
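A small example may make the role of indentation concrete. In the sketch below (the function names are made up), the only difference between the two functions is the indentation of one line, which decides whether that line belongs to the loop body.

```python
def total_inside_loop(values):
    total = 0
    for value in values:
        total += value
        total *= 2      # indented: runs on every iteration
    return total


def total_after_loop(values):
    total = 0
    for value in values:
        total += value
    total *= 2          # dedented: runs once, after the loop ends
    return total


print(total_inside_loop([1, 2, 3]))  # 22
print(total_after_loop([1, 2, 3]))   # 12
```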
Comments
Comments are as important as the code because they describe why a piece of code was
written. When the Python interpreter executes the code, it ignores the comments. In Python, a
single-line comment begins with a hash (#) symbol followed by the comment. For example:
# This is a single line comment in Python
Continuation of statements
Python uses a newline character to separate statements, placing each statement on one
line. However, a long statement can span multiple lines by using the backslash (\) character.
The following example illustrates how to use the backslash (\) character to continue a
statement on the second line:
a, b, c = True, False, True
if (a == True) and (b == False) and \
        (c == True):
    print("Continuation of statements")
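Besides the backslash, Python also lets a statement continue across lines while it is inside parentheses, brackets, or braces. The sketch below shows this form, with variable names and values invented for illustration.

```python
a, b, c = True, False, True

# Inside parentheses a statement may span several lines without backslashes.
all_match = (
    a is True
    and b is False
    and c is True
)

# The same applies to brackets and braces, e.g. a long list literal.
coins = [
    "bitcoin",
    "ethereum",
    "dogecoin",
]

print(all_match, len(coins))
```

This implicit form is often preferred because a stray space after a backslash breaks the continuation.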
Identifiers
Identifiers are names that identify variables, functions, modules, classes, and other objects in
Python. An identifier must begin with a letter or an underscore (_). The characters that
follow can be letters, digits, or underscores. Python identifiers are case-sensitive. For
example, counter and Counter are different identifiers. In addition, you cannot use Python
keywords as identifiers.
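These rules can be checked programmatically. The sketch below combines str.isidentifier() with keyword.iskeyword() from the standard library; the helper name is_valid_identifier and the sample names are assumptions. Note that str.isidentifier() also accepts non-ASCII letters, which goes slightly beyond the rule as stated above.

```python
import keyword


def is_valid_identifier(name):
    """True if `name` follows the identifier rules and is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["counter", "Counter", "_total", "2nd_place", "class"]:
    print(candidate, is_valid_identifier(candidate))
```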
Keywords
Some words have special meanings in Python. They are called keywords. The following shows
the list of keywords in Python:
False class finally is return
None continue for lambda try
True def from nonlocal while
and del global not with
as elif if or yield
assert else import pass
break except in raise
Python is a growing and evolving language, so its keywords will keep increasing and
changing; since Python 3.7, for example, async and await are keywords as well. Python provides
a special module called keyword for listing its keywords. To find the current keyword list,
use the following code:
import keyword
print(keyword.kwlist)
String literals
Python uses single quotes ('), double quotes ("), triple single quotes ('''), and triple double
quotes (""") to denote a string literal. A string literal needs to be surrounded by the same
type of quotes. For example, if you use a single quote to start a string literal, you need to
use the same single quote to end it. The following shows some examples of string literals:
s = 'This is a string'
print(s)
s = "Another string using double quotes"
print(s)
s = ''' string can span
multiple lines '''
print(s)
5.5 MACHINE LEARNING
Before we take a look at the details of various machine learning methods, let's start by looking at
what machine learning is, and what it isn't. Machine learning is often categorized as a subfield of
artificial intelligence, but I find that categorization can often be misleading at first brush. The
study of machine learning certainly arose from research in this context, but in the data science
application of machine learning methods, it's more helpful to think of machine learning as a means
of building models of data.
Fundamentally, machine learning involves building mathematical models to help understand data.
"Learning" enters the fray when we give these models tunable parameters that can be adapted to
observed data; in this way the program can be considered to be "learning" from the data. Once
these models have been fit to previously seen data, they can be used to predict and understand
aspects of newly observed data. I'll leave to the reader the more philosophical digression regarding
the extent to which this type of mathematical, model-based "learning" is similar to the "learning"
exhibited by the human brain. Understanding the problem setting in machine learning is essential
to using these tools effectively, and so we will start with some broad categorizations of the types
of approaches we'll discuss here.
Categories of Machine Learning
At the most fundamental level, machine learning can be categorized into two main types:
supervised learning and unsupervised learning.
Supervised learning involves somehow modeling the relationship between measured features of
data and some label associated with the data; once this model is determined, it can be used to
apply labels to new, unknown data. This is further subdivided into classification tasks and
regression tasks: in classification, the labels are discrete categories, while in regression, the
labels are continuous quantities.
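To make the regression case concrete, here is a minimal sketch that fits a straight line to labeled data with ordinary least squares and then predicts the label of an unseen input. The data points are made up and the function name fit_line is hypothetical; a real project would more likely use a library such as Scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Made-up training data: feature x with a continuous label y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept   # label for a new, unseen x
print(slope, intercept, prediction)
```

Here the slope and intercept are the tunable parameters that are adapted to the observed data.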
Unsupervised learning involves modeling the features of a dataset without reference to any
label, and is often described as "letting the dataset speak for itself." These models include
tasks such as clustering and dimensionality reduction. Clustering algorithms identify distinct
groups of data, while dimensionality reduction algorithms search for more succinct
representations of the data.
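As an illustration of clustering, the sketch below groups unlabeled one-dimensional points into two clusters using a basic k-means loop. The data are invented, the function name two_means is hypothetical, and the code assumes the input holds at least two distinct values.

```python
def two_means(points, iterations=10):
    """Cluster 1-D points into two groups with a basic k-means loop.

    Assumes `points` contains at least two distinct values.
    """
    low, high = min(points), max(points)      # initial centroids
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        near_low = [p for p in points if abs(p - low) <= abs(p - high)]
        near_high = [p for p in points if abs(p - low) > abs(p - high)]
        # Update step: each centroid moves to the mean of its group.
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return sorted(near_low), sorted(near_high)


# Made-up, unlabeled measurements with two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
clusters = two_means(points)
print(clusters)
```

No labels are supplied anywhere; the grouping emerges from the data alone.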
Need for Machine Learning
Human beings are, at this moment, the most intelligent and advanced species on earth because
they can think, evaluate, and solve complex problems. AI, on the other hand, is still in its
initial stage and has not surpassed human intelligence in many aspects. The question, then, is
why we need to make machines learn. The most suitable reason is “to make decisions, based on
data, with efficiency and scale”.
Lately, organizations have been investing heavily in newer technologies like Artificial
Intelligence, Machine Learning, and Deep Learning to extract key information from data, perform
real-world tasks, and solve problems. We can call these data-driven decisions taken by machines,
particularly to automate processes. Such data-driven decisions can be used, instead of
programming logic, for problems that cannot be programmed inherently. The fact is that we cannot
do without human intelligence, but we also need to solve real-world problems efficiently at a
huge scale. That is why the need for machine learning arises.
Challenges in Machine Learning
While Machine Learning is rapidly evolving, making significant strides in cybersecurity and
autonomous cars, this segment of AI as a whole still has a long way to go. The reason is that
ML has not yet been able to overcome a number of challenges. The challenges that ML currently
faces are −
Quality of data − Having good-quality data for ML algorithms is one of the biggest challenges.
Use of low-quality data leads to problems in data preprocessing and feature extraction.
Time-consuming task − Another challenge faced by ML models is the time consumed, especially
by data acquisition, feature extraction, and retrieval.
Lack of specialists − As ML technology is still in its infancy, finding expert practitioners
is difficult.
No clear objective for formulating business problems − Having no clear objective and
well-defined goal for business problems is another key challenge for ML because the technology
is not that mature yet.
Issue of overfitting & underfitting − If the model overfits or underfits the training data, it
does not represent the problem well.
Applications of Machine Learning
Machine Learning is one of the most rapidly growing technologies, and according to researchers
we are in the golden year of AI and ML. It is used to solve many complex real-world problems
that cannot be solved with a traditional approach. Some real-world applications of ML are:
• Emotion analysis
• Sentiment analysis
• Error detection and prevention
• Weather forecasting and prediction
• Stock market analysis and forecasting
• Speech synthesis
• Speech recognition
• Object recognition
• Recommendation of products to customer in online shopping
• Fraud detection
• Fraud prevention
• Customer segmentation
How to Start Learning Machine Learning?
Arthur Samuel coined the term “Machine Learning” in 1959 and defined it as a “field of study
that gives computers the capability to learn without being explicitly programmed”.
And that was the beginning of Machine Learning! In modern times, Machine Learning is one
of the most popular (if not the most popular!) career choices. According to Indeed, Machine
Learning Engineer was the best job of 2019, with 344% growth and an average base salary of
$146,085 per year.
But there is still a lot of doubt about what exactly Machine Learning is and how to start
learning it. So this section deals with the basics of Machine Learning and the path you can
follow to eventually become a full-fledged Machine Learning Engineer. Now let’s get started!
How to start learning ML?
This is a rough roadmap you can follow on your way to becoming an insanely talented Machine
Learning Engineer. Of course, you can always modify the steps according to your needs to reach
your desired end goal!
Step 1 – Understand the Prerequisites
If you are a genius, you could start ML directly, but normally there are some prerequisites
that you need to know, which include Linear Algebra, Multivariate Calculus, Statistics, and
Python. And if you don’t know these, never fear! You don’t need a Ph.D. degree in these topics
to get started, but you do need a basic understanding.
(a) Learn Linear Algebra and Multivariate Calculus
Both Linear Algebra and Multivariate Calculus are important in Machine Learning. However, the
extent to which you need them depends on your role as a data scientist. If you are more focused
on application-heavy machine learning, then you will not be that heavily focused on maths, as
there are many common libraries available. But if you want to focus on R&D in Machine Learning,
then mastery of Linear Algebra and Multivariate Calculus is very important, as you will have to
implement many ML algorithms from scratch.
(b) Learn Statistics
Data plays a huge role in Machine Learning. In fact, around 80% of your time as an ML expert
will be spent collecting and cleaning data. Statistics is the field that handles the collection,
analysis, and presentation of data, so it is no surprise that you need to learn it! Some of the
key concepts in statistics are Statistical Significance, Probability Distributions, Hypothesis
Testing, Regression, etc. Bayesian Thinking is also a very important part of ML; it deals with
concepts like Conditional Probability, Priors and Posteriors, Maximum Likelihood, etc.
(c) Learn Python
Some people prefer to skip Linear Algebra, Multivariate Calculus, and Statistics and learn them
as they go along, by trial and error. But the one thing that you absolutely cannot skip is
Python! While there are other languages you can use for Machine Learning, like R, Scala, etc.,
Python is currently the most popular language for ML. In fact, there are many Python libraries
that are specifically useful for Artificial Intelligence and Machine Learning, such as Keras,
TensorFlow, Scikit-learn, etc. So if you want to learn ML, it’s best to learn Python! You can do
that using various online resources and courses, such as Fork Python, available free on
GeeksforGeeks.
Step 2 – Learn Various ML Concepts
Now that you are done with the prerequisites, you can move on to actually learning ML (which
is the fun part!). It’s best to start with the basics and then move on to the more complicated
stuff. Some of the basic concepts in ML are:
(a) Terminologies of Machine Learning
• Model – A model is a specific representation learned from data by applying some machine
learning algorithm. A model is also called a hypothesis.
• Feature – A feature is an individual measurable property of the data. A set of numeric
features can be conveniently described by a feature vector. Feature vectors are fed as input to
the model. For example, in order to predict a fruit, there may be features like color, smell,
taste, etc.
• Target (Label) – A target variable or label is the value to be predicted by our model. For the
fruit example discussed in the feature section, the label for each set of inputs would be the
name of the fruit, like apple, orange, banana, etc.
• Training – The idea is to give a set of inputs (features) and their expected outputs (labels),
so that after training we have a model (hypothesis) that will then map new data to one of the
categories it was trained on.
• Prediction – Once our model is ready, it can be fed a set of inputs, for which it will provide
a predicted output (label).
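The terms above can be tied together in a toy version of the fruit example. This is a sketch only: the features (weight and redness), the data values, and the nearest-centroid method are assumptions chosen for brevity, not the model used in this project.

```python
import math

# Made-up training set: each feature vector is (weight in grams, redness 0..1).
training_features = [(150, 0.9), (170, 0.8), (120, 0.2), (130, 0.1)]
training_labels = ["apple", "apple", "banana", "banana"]


def train(features, labels):
    """Training: learn one mean feature vector (centroid) per label."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(model, vector):
    """Prediction: return the label whose centroid is closest to `vector`."""
    return min(model, key=lambda label: math.dist(model[label], vector))


model = train(training_features, training_labels)
print(predict(model, (160, 0.85)))
print(predict(model, (125, 0.15)))
```

In this sketch the dictionary returned by train() plays the role of the model (hypothesis), and the features are left unscaled, so weight dominates the distance.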
(b) Types of Machine Learning
• Supervised Learning – This involves learning from a training dataset with labeled data using
classification and regression models. This learning process continues until the required level
of performance is achieved.
• Unsupervised Learning – This involves using unlabelled data and then finding the underlying
structure in the data, in order to learn more about the data itself, using factor and cluster
analysis models.
• Semi-supervised Learning – This involves using unlabelled data, as in Unsupervised Learning,
together with a small amount of labeled data. Using labeled data vastly increases the learning
accuracy, and the approach is also more cost-effective than fully Supervised Learning.
• Reinforcement Learning – This involves learning optimal actions through trial and error. The
next action is decided by learning behaviors that are based on the current state and that will
maximize the reward in the future.
6. IMPLEMENTATIONS
6.1 SOFTWARE ENVIRONMENT
6.1.1 PYTHON
Python is a general-purpose, interpreted, interactive, object-oriented, high-level
programming language. Its design philosophy emphasizes code readability (notably using
whitespace indentation to delimit code blocks rather than curly brackets or keywords), and its
syntax allows programmers to express concepts in fewer lines of code than might be needed in
languages such as C++ or Java. It provides constructs that enable clear programming on both
small and large scales. Python interpreters are available for many operating systems. CPython,
the reference implementation of Python, is open-source software and has a community-based
development model, as do nearly all of its variant implementations. CPython is managed by the
non-profit Python Software Foundation. Python features a dynamic type system and automatic
memory management.
6.1.2 An Introduction to the Visual Studio Code
Visual Studio Code, often called VS Code, is a lightweight source code editor that runs on
your desktop. It’s available for Windows, macOS, and Linux. VS Code comes with many features,
such as IntelliSense, code editing, and extensions, that allow you to edit Python source code
effectively. The best part is that VS Code is open-source and free. Besides the desktop
version, VS Code also has a browser version that you can use directly in your web browser
without installing it. In this project, VS Code was set up for a Python environment so that
Python code can be edited, run, and debugged.
6.2 SAMPLE CODE
from django.shortcuts import render, redirect
from django.contrib import messages
from sentimentapp.models import UserModel
from django.core.paginator import Paginator

# Create your views here.
def admin_login(request):
    if request.method == 'POST':
        name = request.POST.get('name')
        password = request.POST.get('password')
        # Admin credentials are hard-coded in this sample.
        if name == 'admin' and password == 'admin':
            messages.success(request, 'Admin logged in successfully')
            return redirect('dashboard')
        else:
            messages.error(request, 'Wrong name or password')
            return redirect('admin_login')
    return render(request, 'admin/login.html')

def dashboard(request):
    # Counts shown on the admin dashboard.
    pending = UserModel.objects.filter(status='pending').count()
    total = UserModel.objects.count()
    context = {
        'pending': pending,
        'all': total,
    }
    return render(request, 'admin/index.html', context)

def pending_users(request):
    # Newest registrations first, four per page.
    pending_user = UserModel.objects.filter(status='pending').order_by('-user_id')
    paginator = Paginator(pending_user, 4)
    page_number = request.GET.get('page')
    p = paginator.get_page(page_number)
    context = {
        'page': p
    }
    return render(request, 'admin/pending-users.html', context)

def accept_user(request, id):
    user = UserModel.objects.get(user_id=id)
    user.status = 'Accept'
    # A single save restricted to the changed field is enough.
    user.save(update_fields=['status'])
    messages.success(request, 'New user added successfully')
    return redirect('pending-users')

def reject_user(request, id):
    user = UserModel.objects.get(user_id=id)
    user.delete()
    messages.success(request, 'User rejected successfully')
    return redirect('pending-users')

def all_users(request):
    # Approved users, four per page; explicit ordering keeps pages stable.
    users = UserModel.objects.filter(status='Accept').order_by('user_id')
    paginator = Paginator(users, 4)
    page_number = request.GET.get('page')
    page = paginator.get_page(page_number)
    context = {
        'users': page
    }
    return render(request, 'admin/all-users.html', context)

def delete(request, id):
    user = UserModel.objects.get(user_id=id)
    user.delete()
    messages.success(request, 'User deleted successfully')
    return redirect('all-users')

def logout(request):
    messages.success(request, 'Admin logged out successfully')
    return redirect('home')
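
The admin views above rely on Django's Paginator with four users per page and pass the raw page query parameter to get_page(). The following pure-Python sketch approximates that behavior so it can be tried without a Django project; the function get_page here is a hypothetical stand-in, not Django's implementation, and the user list is made up.

```python
import math


def get_page(items, per_page, page_number):
    """Return (page_items, page, num_pages), clamping bad page numbers.

    Hypothetical stand-in for Paginator.get_page(): a non-numeric page
    falls back to page 1 and an out-of-range page falls back to the last.
    """
    num_pages = max(1, math.ceil(len(items) / per_page))
    try:
        page = int(page_number)
    except (TypeError, ValueError):
        page = 1
    if page < 1 or page > num_pages:
        page = num_pages
    start = (page - 1) * per_page
    return items[start:start + per_page], page, num_pages


users = [f"user{i}" for i in range(1, 11)]   # ten made-up pending users
print(get_page(users, 4, "2"))    # second page of four
print(get_page(users, 4, None))   # missing ?page= parameter: first page
print(get_page(users, 4, "99"))   # too large: last page
```

Because bad page values are clamped rather than rejected, the views do not need their own error handling for the page parameter.
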
from django.shortcuts import render,redirect, get_object_or_404
from django.contrib import messages
from sentimentapp.models import UserModel
# API data libraries
import requests
from django.conf import settings
from isodate import parse_duration
import os
from googleapiclient.discovery import build
import pandas as pd
import re
from bs4 import BeautifulSoup
from textblob import TextBlob
from googletrans import Translator
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')
# Create your views here.

def user_login(request):
    if request.method == 'POST':
        email = request.POST.get('email')
        password = request.POST.get('password')
        # Credentials are not printed: passwords should not reach the console or logs.
        try:
            user = UserModel.objects.get(email=email, password=password)
        except UserModel.DoesNotExist:
            messages.info(request, 'Wrong email or password')
            return redirect('user_login')
        if user.status == 'Accept':
            request.session['user_id'] = user.user_id
            messages.success(request, 'User logged in successfully')
            return redirect('user-home')
        messages.info(request, 'Your account is not approved yet')
        return redirect('user_login')
    return render(request, 'user/user_login.html')

def user_register(request):
    if request.method == 'POST' and 'profile' in request.FILES:
        name = request.POST.get('name')
        email = request.POST.get('email')
        password = request.POST.get('pass1')
        # con_pass = request.POST.get('pass2')
        profile = request.FILES['profile']
        phone = request.POST.get('num')
        country = request.POST.get('country')
        # Reject the registration if the email is already taken.
        if UserModel.objects.filter(email=email).exists():
            messages.warning(request, 'Email already exists')
            return redirect('register')
        # NOTE: the password is stored exactly as submitted (plain text).
        UserModel.objects.create(
            name=name,
            email=email,
            password=password,
            # con_password=con_pass,
            profile=profile,
            phone=phone,
            country=country,
        )
        messages.success(request, 'User registered successfully')
        return redirect('user_login')
    return render(request, 'user/register.html')

def home(request):
    return render(request, 'user/index.html')

def profile(request):
    user_id = request.session['user_id']
    user = UserModel.objects.get(pk=user_id)
    if request.method == 'POST':
        user.name = request.POST.get('name')
        user.email = request.POST.get('email')
        user.phone = request.POST.get('num')
        user.country = request.POST.get('country')
        # Replace the profile image only when a new file was uploaded.
        if request.FILES.get('profile'):
            user.profile = request.FILES['profile']
        user.save()
        return redirect('profile')
    return render(request, 'user/user-profile.html', {'user': user})

# =============== sentiment analysis on youtube video comments =================
# Excerpt from the search view: `a`, `com_ts`, `comments` and `videos` are defined
# earlier in the view, which is not shown here.
            # analysis = TextBlob(str(com_ts['comment']))
            sen = SentimentIntensityAnalyzer()
            analysis = sen.polarity_scores(a)
            compound = analysis['compound']
            # Bucket the VADER compound score (range -1 to 1) into five labels.
            # -0.5 counts as 'Very Negative', mirroring the +0.5 cut-off for 'Very Positive'.
            if compound >= 0.5:
                sentiments = 'Very Positive'
            elif compound > 0:
                sentiments = 'Positive'
            elif compound <= -0.5:
                sentiments = 'Very Negative'
            elif compound < 0:
                sentiments = 'Negative'
            else:
                sentiments = 'Neutral'
            com_ts['sentiment'] = sentiments
            comments.append(com_ts)
        # ================ overall sentiment analysis in % =========================
        pos = [c for c in comments if c['sentiment'] == 'Positive']
        verypos = [c for c in comments if c['sentiment'] == 'Very Positive']
        nege = [c for c in comments if c['sentiment'] == 'Negative']
        verynege = [c for c in comments if c['sentiment'] == 'Very Negative']
        neutral = len(comments) - (len(nege) + len(pos) + len(verypos) + len(verynege))
        try:
            positive = 100 * len(pos) / len(comments)
            verypositive = 100 * len(verypos) / len(comments)
            negetive = 100 * len(nege) / len(comments)
            verynegetive = 100 * len(verynege) / len(comments)
            nutraltotal = 100 * neutral / len(comments)
        except ZeroDivisionError:
            # No comments were collected, so there is nothing to analyse.
            messages.info(request, 'No comments found. Please enter the input again.')
            return redirect('api_search')
        # The context keys keep their original spelling because the template uses them.
        context = {
            'videos': videos,
            'comments': comments,
            'positive': positive,
            'verypositive': verypositive,
            'negetive': negetive,
            'verynegetive': verynegetive,
            'neutral': nutraltotal,
        }
        return render(request, 'user/api-search.html', context)
    return render(request, 'user/api-search.html')

def logout(request):
    messages.success(request, 'User logged out successfully')
    return redirect('home')
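
The bucketing and percentage steps in the listing above can be sketched as two small pure functions. This is a minimal illustration rather than the project's view code: the names `label_for_compound` and `sentiment_percentages` are hypothetical, the compound score is assumed to come from VADER's `polarity_scores`, and a score of exactly -0.5 is assumed to count as 'Very Negative', mirroring the +0.5 cut-off.

```python
from collections import Counter

LABELS = ("Very Positive", "Positive", "Neutral", "Negative", "Very Negative")


def label_for_compound(compound: float) -> str:
    """Map a compound score in [-1, 1] to one of five labels."""
    if compound >= 0.5:
        return "Very Positive"
    if compound > 0:
        return "Positive"
    if compound <= -0.5:
        return "Very Negative"
    if compound < 0:
        return "Negative"
    return "Neutral"


def sentiment_percentages(labels):
    """Return the share of each label as a percentage (all zeros if empty)."""
    total = len(labels)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    counts = Counter(labels)
    return {label: 100 * counts[label] / total for label in LABELS}


scores = [0.8, 0.2, 0.0, -0.3, -0.9, 0.6]  # made-up compound scores
labels = [label_for_compound(s) for s in scores]
shares = sentiment_percentages(labels)
```

Keeping these steps free of Django and NLTK objects means they can be unit tested without a request, a database, or the VADER lexicon, and the empty-list case no longer needs a try/except.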
7. SYSTEM TESTING
7.1 INTRODUCTION TO TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of tests, and each
type addresses a specific testing requirement.
Types of Software Testing
Software testing covers Functional Testing, Non-Functional Testing, Automation Testing, Agile
Testing, and their sub-types. Each type has its own features, advantages, and disadvantages. This
chapter describes the types most commonly used in day-to-day testing.
7.2 Types of Software Testing Strategies
Functional Testing
There are four main types of functional testing.
1) Unit Testing
Unit testing is a type of software testing performed on an individual unit or component to check
its correctness. It is typically done by the developer during the application development
phase. A unit can be a method, function, procedure, or object.
Developers often use test automation tools such as NUnit, xUnit, and JUnit to execute the tests.
Unit testing is important because many defects can be found at the unit test level. For example,
for a simple calculator application, the developer can write a unit test that checks whether the
user can enter two numbers and get the correct sum from the addition functionality.
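
The calculator example can be sketched with `unittest`, Python's built-in counterpart to the tools named above. The `add` function is a hypothetical stand-in for the calculator's addition feature, not code from this project.

```python
import unittest


def add(a, b):
    """Hypothetical calculator function under test."""
    return a + b


class AddTests(unittest.TestCase):
    def test_adds_two_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_adds_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)

    def test_adds_floats(self):
        # Floats are compared approximately to allow for rounding error.
        self.assertAlmostEqual(add(0.1, 0.2), 0.3)


def run_tests():
    """Run the suite programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

Each test method checks one behaviour of the unit, so a failure points directly at the case that broke.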
a) White Box Testing
White box testing is a test technique in which the internal structure or code of an application is
visible and accessible to the tester. This makes it easier to find loopholes in the design of
an application or faults in its business logic. Statement coverage and decision (branch)
coverage are examples of white box test techniques.
b) Gorilla Testing
Gorilla testing is a test technique in which the tester and/or developer tests one module of the
application thoroughly in all aspects. It is done to check how robust the application
is. For example, a tester is testing a pet insurance company’s website, which offers
insurance policies, tags for pets, and lifetime membership. The tester can focus
on one module, say the insurance policy module, and test it thoroughly with positive and
negative test scenarios.
2) Integration Testing
Integration testing is a type of software testing in which two or more modules of an application are
logically grouped together and tested as a whole. Its focus is to find
defects in the interfaces, communication, and data flow between modules. A top-down or bottom-up
approach is used while integrating modules into the whole system.
This type of testing is done when integrating modules of a system or integrating separate systems.
For example, a user is buying a flight ticket from an airline website. The user can see flight
details and payment information while buying the ticket, but flight details and payment processing
are handled by two different systems. Integration testing should be done when the airline website is
integrated with the payment processing system.
3) System Testing
System testing is a type of testing in which the tester evaluates the whole system against the
specified requirements.
a) End to End Testing
End to end testing involves testing a complete application environment in a situation that mimics
real-world use, such as interacting with a database, using network communications, or interacting
with other hardware, applications, or systems where appropriate. For example, a tester is testing a
pet insurance website. End to end testing covers buying an insurance policy, LPM, and tag, adding
another pet, updating credit card information on users’ accounts, updating user address information,
and receiving order confirmation emails and policy documents.
b) Black Box Testing
Black box testing is a software testing technique in which testing is performed without knowing
the internal structure, design, or code of the system under test. Testers focus only on the inputs
and outputs of the test objects.
c) Smoke Testing
Smoke testing verifies, at a very high level, that the basic and critical functionality of the
system under test works. Whenever the development team provides a new build, the testing team
validates it and ensures that no major issue exists. Once the build is confirmed stable, more
detailed testing is carried out. For example, a tester is testing a pet insurance website. Buying an
insurance policy, adding another pet, and providing quotes are all basic and critical functions of
the application. Smoke testing for this website verifies that all of these work before any in-depth
testing begins.
d) Sanity Testing
Sanity testing is performed on a system to verify that newly added functionality or bug fixes
work as intended. It is done on a stable build and is a subset of regression testing. For
example, a tester is testing a pet insurance website and there is a change in the discount for
buying a policy for a second pet. Sanity testing is then performed only on the insurance policy
purchase module.
4) Acceptance Testing
Acceptance testing is a type of testing in which the client, business, or customer tests the
software with real-time business scenarios. The client accepts the software only when all the
features and functionalities work as expected. This is the last phase of testing, after which the
software goes into production. It is also called User Acceptance Testing (UAT).
a) Alpha Testing
Alpha testing is a type of acceptance testing performed by a team within the organization to find as
many defects as possible before the software is released to customers. For example, the pet insurance
website is under UAT. The UAT team runs real-time scenarios such as buying an insurance policy,
buying annual membership, changing the address, and transferring ownership of a pet, in the same way
a user would use the real website. The team can use test credit card information to process
payment-related scenarios.
b) Beta Testing
Beta testing is a type of software testing carried out by the clients or customers. It is
performed in the real environment before the product is released to the market for the actual
end users.
Beta testing is carried out to ensure that there are no major failures in the software or product
and that it satisfies the business requirements from an end-user perspective. Beta testing is
successful when the customer accepts the software.
Non-Functional Testing
There are four main types of non-functional testing.
1) Security Testing
Security testing is usually performed by a specialized team, since a system may be attacked through
any available hacking method. It checks how well the software, application, or website is protected
from internal and/or external threats. This includes how secure the software is against malicious
programs and viruses, and how secure and strong the authorization and authentication processes are.
It also checks how the software behaves under a hacker’s attack or malicious programs, and how data
security is maintained after such an attack.
a) Penetration Testing
Penetration testing, or pen testing, is a type of security testing performed as an authorized
cyberattack on the system to find its weak points in terms of security. Pen testing
is often performed by outside contractors, generally known as ethical hackers, which is why it is
also known as ethical hacking. Contractors perform operations such as SQL injection, URL
manipulation, privilege elevation, and session expiry, and provide reports to the organization.
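
As an illustration of the SQL injection check mentioned above, the following sketch contrasts a query built by string formatting with a parameterized one, using Python's built-in `sqlite3` module. It is a generic example, not code from this project: the table, the function names, and the payload are hypothetical, and the plain-text password is there only to keep the demo short.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one demo user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('a@example.com', 'secret')")
    return conn


def login_unsafe(conn, email, password):
    # Vulnerable: user input is pasted directly into the SQL text.
    query = (
        "SELECT email FROM users "
        f"WHERE email = '{email}' AND password = '{password}'"
    )
    return conn.execute(query).fetchall()


def login_safe(conn, email, password):
    # Safer: placeholders keep the input as data, never as SQL.
    query = "SELECT email FROM users WHERE email = ? AND password = ?"
    return conn.execute(query, (email, password)).fetchall()


conn = make_db()
payload = "' OR '1'='1"  # classic injection string supplied as the password
unsafe_rows = login_unsafe(conn, "a@example.com", payload)
safe_rows = login_safe(conn, "a@example.com", payload)
```

With the injected payload the unsafe version returns the user row even though the password is wrong, while the parameterized version returns nothing. A pen tester would try inputs of this kind against login and search forms.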
2) Performance Testing
Performance testing is testing of an application’s stability and response time by applying load.
Stability means the ability of the application to withstand the load.
Response time is how quickly the application responds to users. Performance testing is done with
the help of tools such as Loader.io, JMeter, and LoadRunner.
a) Load testing
Load testing is testing of an application’s stability and response time by applying a load that is
equal to or less than the designed number of users for the application.
For example, if an application handles 100 users at a time with a response time of 3 seconds,
load testing can be done by applying a load of up to 100 users. The
goal is to verify that the application responds within 3 seconds for all the users.
b) Stress Testing
Stress testing is testing of an application’s stability and response time by applying a load that
is more than the designed number of users for the application.
For example, if an application handles 1000 users at a time with a response time of 4 seconds,
stress testing can be done by applying a load of more than 1000 users. Test the application with
1100, 1200, and 1300 users and note the response time. The goal is to verify the stability of the
application under stress.
c) Scalability Testing
Scalability testing is testing of an application’s stability and response time by applying a load
that exceeds the designed number of users and increasing it gradually to find the point at which
the application fails. For example, if an application
handles 1000 users at a time with a response time of 2 seconds, scalability testing can be done
by applying a load of more than 1000 users and gradually increasing the number of users to find
out exactly where the application crashes.
Suppose the application gives the following response times:
• 1000 users: 2 sec
• 1400 users: 2 sec
• 4000 users: 3 sec
• 5000 users: 45 sec
• 5150 users: crash (this is the point that scalability testing needs to identify)
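
The figures listed above can be scanned programmatically to find where the application degrades and where it crashes. This is a minimal sketch: the function names and the 3-second limit are assumptions chosen for illustration, and a crash is represented as `None`.

```python
from typing import Optional

# (concurrent users, response time in seconds, or None if the application crashed)
MEASUREMENTS = [
    (1000, 2.0),
    (1400, 2.0),
    (4000, 3.0),
    (5000, 45.0),
    (5150, None),
]


def first_degradation(measurements, max_seconds: float) -> Optional[int]:
    """Return the lowest load that is slower than max_seconds or crashed."""
    for users, seconds in sorted(measurements, key=lambda m: m[0]):
        if seconds is None or seconds > max_seconds:
            return users
    return None


def crash_point(measurements) -> Optional[int]:
    """Return the lowest load at which the application crashed, if any."""
    for users, seconds in sorted(measurements, key=lambda m: m[0]):
        if seconds is None:
            return users
    return None


slow_at = first_degradation(MEASUREMENTS, max_seconds=3.0)
crash_at = crash_point(MEASUREMENTS)
```

Separating the two questions is useful because the response time usually becomes unacceptable (here at 5000 users) before the application actually crashes (here at 5150 users).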
d) Volume testing (flood testing)
Volume testing is testing of an application’s stability and response time by transferring a large
volume of data to the database. In essence, it tests the capacity of the database to handle the data.
e) Endurance Testing (Soak Testing)
Endurance testing is testing of an application’s stability and response time by applying load
continuously over a long period to verify that the application keeps working properly. For example,
car companies perform soak testing to verify that users can drive cars continuously for hours
without any problem.
3) Usability Testing
Usability testing is testing an application from the user’s perspective to check its look and feel
and user-friendliness.
For example, a tester is performing usability testing on a mobile app for stock trading.
The tester can check scenarios such as whether the app is easy to operate with one hand, whether
the scroll bar is vertical, whether the background colour of the app is black, and whether the
price of a stock is displayed in red or green.
The main idea of usability testing for this kind of app is that as soon as the user opens the app,
the user should get a glance at the market.
a) Exploratory testing
Exploratory testing is informal testing performed by the testing team. Its objective
is to explore the application and look for defects in it. Testers use their
knowledge of the business domain to test the application. Test charters are used to guide
exploratory testing.
b) Cross browser testing
Cross browser testing is testing an application on different browsers, operating systems, and mobile
devices to check its look and feel and its performance. Different users use different operating
systems, browsers, and mobile devices, and the goal of the company is to give them a good user
experience regardless of the device. BrowserStack provides many versions of browsers
and mobile devices on which to test the application. For learning purposes, its free trial can be
used for a few days.
c) Accessibility Testing
The aim of accessibility testing is to determine whether the software or application is accessible
to people with disabilities.
Here, disability covers deafness, color blindness, blindness, cognitive disabilities, old age, and
other groups. Various checks are performed, such as font size for visually impaired users and color
and contrast for color blindness.
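
One way to make the color and contrast check concrete is the contrast-ratio formula from the WCAG 2.x guidelines. The report does not say which standard was applied, so using WCAG here is an assumption, and the function names are hypothetical.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb) -> float:
    """Relative luminance of an (R, G, B) colour with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background) -> float:
    """Contrast ratio between two colours, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


ratio = contrast_ratio((0, 0, 0), (255, 255, 255))  # black text on white
passes_aa = ratio >= 4.5  # WCAG AA minimum for normal-size text
```

A tester could run such a check over the foreground and background color pairs used in the application's stylesheet and flag any pair below the chosen threshold.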
4) Compatibility testing
Compatibility testing validates how the software behaves and runs in different
environments, including different web servers, hardware, and network environments.
8. SCREENSHOTS
1) HOME PAGE
This is the home page of the project.
2) CONTACT INFORMATION
At the bottom of the home page are the contact details and social links for customer support.
3) ADMIN LOGIN
This is the admin login page, which requests the admin’s credentials.
4) ADMIN LOGIN SUCCESSFUL
This screenshot shows a successful admin login.
5) PENDING USERS TO BE AUTHORIZED BY ADMIN
This screenshot shows the pending users who have registered for the app. The admin has the privilege
to accept or deny each user’s registration request.
6) ALL THE AUTHORIZED USERS
This screen shows all the users who have registered with the app and been accepted by the
admin.
7) USER REGISTRATION AND LOGIN
This screen displays the user registration form, where any user can register by providing
their information and creating their credentials.
8) USER LOGIN SUCCESSFUL
This screen displays a successful user login.
9) ANALYSIS PAGE
This screen displays the user functions; here we click on the “ANALYSIS” button.
10) SEARCH BAR FOR ANALYSING YOUTUBE COMMENTS
In the search bar we paste any YouTube video link, and the application looks up that
video through the YouTube API.
11) ANALYSIS OF A CRYPTOCURRENCY VIDEO BASED ON YOUTUBE COMMENTS
Once the specified YouTube video is found, the program collects and categorizes
all the comments on that video and prepares a detailed analysis based on
those comments.
12) CATEGORIZING YOUTUBE COMMENTS
On this screen the comments on the YouTube video are categorized as
‘positive’, ‘very positive’, ‘neutral’, ‘negative’, or ‘very negative’, and are also symbolized
using emojis from the YouTube API.
13) USER PROFILE
Clicking on the “PROFILE” button displays the details provided by that user.
14) USER LOGOUT
Clicking on the “LOGOUT” button logs the user out.
9. CONCLUSIONS
The proposed YouTube comment sentiment analysis system for cryptocurrency offers a strong
solution, using a complex stacked ensemble model to achieve 94.2% accuracy. To sum up, the
sentiment analysis model designed specifically for cryptocurrency discussions on
YouTube is a noteworthy development in the understanding of market sentiment. It provides a
customized method that is tuned to the subtleties of cryptocurrency terminology and trends,
mitigating the drawbacks of existing models. With its multimodal analysis and capacity to
adjust in real time to market dynamics, it offers a thorough understanding of sentiment, which is
essential for making sound investment decisions. In addition, while taking user privacy issues into
account, its optimization for YouTube features improves the accuracy of sentiment analysis. All
things considered, this approach is a useful tool for navigating the volatile
cryptocurrency markets, helping both traders and investors make better-informed decisions.
10. REFERENCES
[1] P. D. Devries, “An Analysis of Cryptocurrency, Bitcoin, and the Future,” International
Journal of Business Management and Commerce, vol. 1, no. 2, 2016, Accessed: Jan. 13, 2022.
[Online]. Available: www.ijbmcnet.com
[2] Y. Liu and A. Tsyvinski, “Risks and Returns of Cryptocurrency,” The Review of Financial
Studies, vol. 34, no. 6, pp. 2689–2727, May 2021, doi: 10.1093/RFS/HHAA113.
[3] A. Yadav and D. K. Vishwakarma, “Sentiment analysis using deep learning architectures: a
review,” Artificial Intelligence Review, vol. 53, no. 6, pp. 4335–4385, Aug. 2020, doi:
10.1007/s10462-019-09794-5.
[4] A. Jain, S. Tripathi, H. Dhardwivedi, and P. Saxena, “Forecasting Price of Cryptocurrencies
Using Tweets Sentiment Analysis,” 2018 11th International Conference on Contemporary
Computing, IC3 2018, Nov. 2018, doi: 10.1109/IC3.2018.8530659.
[5] A. Inamdar, A. Bhagtani, S. Bhatt, and P. M. Shetty, “Predicting cryptocurrency value using
sentiment analysis,” 2019 International Conference on Intelligent Computing and Control
Systems, ICCS 2019, pp. 932–934, May 2019, doi: 10.1109/ICCS45141.2019.9065838.
[6] C. Lamon, E. Nielsen, and E. Redondo, “Cryptocurrency Price Prediction Using News and
Social Media Sentiment,” 2017.
[7] “MoneyZG - YouTube.” https://www.youtube.com/c/MoneyZG (accessed Jan. 18, 2022).
[8] “Honestly by Tanmay Bhat - YouTube.”
https://www.youtube.com/c/HonestlybyTanmayBhat (accessed Jan. 18, 2022).
[9] “Tech Burner - YouTube.” https://www.youtube.com/c/TechBurner (accessed Jan. 18,
2022).
[10] D. Varshney and D. K. Vishwakarma, “A unified approach for detection of
Clickbait videos on YouTube using cognitive evidences,” doi: 10.1007/s10489-020-02057-9.
[11] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep
Bidirectional Transformers for Language Understanding,” NAACL HLT 2019 - 2019
Conference of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies - Proceedings of the Conference, vol. 1, pp. 4171–4186, Oct.
2018, Accessed: Jan. 13, 2022. [Online]. Available: https://arxiv.org/abs/1810.04805v2
[12] N. Bahrawi, “Sentiment Analysis Using Random Forest Algorithm-Online Social Media
Based,” Journal of Information Technology and Its Utilization, vol. 2, no. 2, p. 29, Dec. 2019,
doi: 10.30818/JITU.2.2.2695.
[13] J. L. Alzen, L. S. Langdon, and V. K. Otero, “A logistic regression investigation of the
relationship between the Learning Assistant model and failure rates in introductory STEM
courses,” International Journal of STEM Education, vol. 5, no. 1, pp. 1–12, Dec. 2018, doi:
10.1186/S40594-018-0152-1/TABLES/6.
[14] H. H. Patel and P. Prajapati, “Study and Analysis of Decision Tree Based Classification
Algorithms,” International Journal of Computer Sciences and Engineering, vol. 6, no. 10, pp.
74–78, Oct. 2018, doi: 10.26438/IJCSE/V6I10.7478.
[15] J. Cervantes, F. Garcia-Lamont, L. Rodríguez-Mazahua, and A. Lopez, “A comprehensive
survey on support vector machine classification: Applications, challenges and trends,”
Neurocomputing, vol. 408, pp. 189–215, Sep. 2020, doi: 10.1016/J.NEUCOM.2019.10.118.
[16] A. H. Jahromi and M. Taheri, “A non-parametric mixture of Gaussian naive Bayes
classifiers based on local independent features,” 19th CSI International Symposium on Artificial
Intelligence and Signal Processing, AISP 2017, vol. 2018-January, pp. 209–212, Mar. 2018, doi:
10.1109/AISP.2017.8324083.
[17] A. A. Abdullah, S. A. Hafidz, and W. Khairunizam, “Research and Implementation of
Machine Learning Classifier Based on KNN You may also like Performance Comparison of
Machine Learning Algorithms for Classification of Chronic Kidney Disease (CKD)”, doi:
10.1088/1757-899X/677/5/052038.
[18] B. Xu, X. Guo, Y. Ye, and J. Cheng, “An improved random forest classifier for text
categorization,” Journal of Computers (Finland), vol. 7, no. 12, pp. 2913–2920, 2012, doi:
10.4304/JCP.7.12.2913-2920.
[19] A. Sharaff and H. Gupta, “Extra-Tree Classifier with Metaheuristics Approach for Email
Classification,” undefined, vol. 924, pp. 189–197, 2019, doi: 10.1007/978-981-13-6861-5_17.
[20] C. Tu, H. Liu, and B. Xu, “AdaBoost typical Algorithm and its application research,”
MATEC Web of Conferences, vol. 139, p. 00222, Dec. 2017, doi:
10.1051/MATECCONF/201713900222

  • 3. ACKNOWLEDGEMENT The satisfaction that accompanies the successful completion of the task would be put incomplete without the mention of the people who made it possible, whose constant guidance and encouragement crown all the efforts with success. We are thankful to Principal Dr. G. SURESH for giving us permission to carry out this project. We are highly indebted to Prof. Ch. G V N. Prasad, Head of the Department of Computer Science and Engineering, for providing necessary infrastructure and labs and valuable guidance at every stage of this project. We are grateful to our internal project guide Mrs. T. Sai Santoshi, Assistant Professor for her constant motivation and guidance given by her during the execution of this project work. We would like to thank the Teaching & Non-Teaching staff of Department of Computer Science and engineering for sharing their knowledge with us, finally we express our sincere thanks to everyone who helped directly or indirectly for the completion of this project. U. GANESH [20D41A05L3] S. SUJITH REDDY [20D41A05K2] T. SAKETH REDDY [20D41A05K8] V. VAISHNAVI [20D41A05L5]
  • 4. ABSTRACT With the rising popularity of cryptocurrency worldwide, understanding market sentiment has become essential for predicting prices and making investment decisions. This project therefore designs a model that classifies the sentiment of YouTube comments about cryptocurrency. The proposed model is a stacked ensemble whose base classifiers are Decision Tree, K Nearest Neighbors, Random Forest Classifier and XGBoost, with Logistic Regression as the meta/base classifier that combines their predictions. The proposed model achieves an accuracy of 94.2%. In addition, based on our research, we report several important findings and takeaways about the current state of cryptocurrencies around the world.
  • 5. i CONTENTS
    S. No. Chapters ... Page No.
    i. List of Contents ... i
    ii. List of Figures ... iii
    iii. List of Screenshots ... iv
    1. INTRODUCTION
    1.1 INTRODUCTION TO PROJECT ... 01
    1.2 LITERATURE SURVEY ... 01
    1.3 MODULES ... 02
    2. SYSTEM ANALYSIS
    2.1 EXISTING SYSTEM & ITS DISADVANTAGES ... 04
    2.2 PROPOSED SYSTEM & ITS ADVANTAGES ... 05
    2.3 SYSTEM REQUIREMENTS ... 07
    3. SYSTEM STUDY
    3.1 FEASIBILITY STUDY ... 08
    4. SYSTEM DESIGN
    4.1 SYSTEM ARCHITECTURE ... 10
    4.2 UML DIAGRAMS ... 11
    4.2.1 USECASE DIAGRAM ... 11
    4.2.2 CLASS DIAGRAM ... 13
    4.2.3 SEQUENCE DIAGRAM ... 13
    4.2.4 ACTIVITY DIAGRAM ... 15
    4.2.5 DATA FLOW DIAGRAM ... 16
    4.2.6 DEPLOYMENT DIAGRAM ... 17
    4.2.7 COMPONENT DIAGRAM ... 18
  • 6. ii
    4.3 DATA DICTIONARY ... 20
    5. TECHNOLOGIES USED
    5.1 WHAT IS PYTHON ... 26
    5.1.1 WHY PYTHON ... 27
    5.1.2 HISTORY ... 27
    5.2 INSTALLING PYTHON ON DIFFERENT PLATFORMS ... 28
    5.2.1 INSTALL PYTHON ON WINDOWS ... 28
    5.2.2 INSTALL PYTHON ON MAC OS ... 30
    5.2.3 INSTALL PYTHON ON LINUX ... 30
    5.3 INTRODUCTION TO VISUAL STUDIO CODE ... 31
    5.4 PYTHON FUNDAMENTALS ... 33
    5.5 MACHINE LEARNING ... 38
    6. IMPLEMENTATION
    6.1 SOFTWARE ENVIRONMENT ... 44
    6.1.1 INSTALL PYTHON ON WINDOWS ... 44
    6.1.2 VISUAL STUDIO CODE ... 44
    6.2 SAMPLE CODE ... 45
    7. SYSTEM TESTING
    7.1 INTRODUCTION TO TESTING ... 53
    7.2 TYPES OF SOFTWARE TESTING STRATEGIES ... 53
    8. SCREENSHOTS ... 60
    9. CONCLUSION ... 67
    10. REFERENCES ... 68
  • 7. iii LIST OF FIGURES
    Fig No | Name | Page No
    Fig.1 | System Architecture | 10
    Fig.2 | Usecase diagram (ADMIN) | 12
    Fig.3 | Usecase diagram (USER) | 12
    Fig.4 | Class diagram | 13
    Fig.5 | Sequence diagram (ADMIN) | 14
    Fig.6 | Sequence diagram (USER) | 14
    Fig.7 | Activity diagram (ADMIN) | 15
    Fig.8 | Activity diagram (USER) | 15
    Fig.9 | Data Flow diagram | 16
    Fig.10 | Deployment diagram | 18
    Fig.11 | Component Diagram | 19
  • 8. iv LIST OF SCREENSHOTS
    Fig No | Name | Page No
    Fig.1 | HOME PAGE | 60
    Fig.2 | CONTACT INFORMATION | 60
    Fig.3 | ADMIN LOGIN | 61
    Fig.4 | ADMIN LOGIN SUCCESSFUL | 61
    Fig.5 | PENDING USER TO BE AUTHORIZED BY ADMIN | 62
    Fig.6 | ALL THE AUTHORIZED USERS | 62
    Fig.7 | USER REGISTRATION AND LOGIN | 63
    Fig.8 | USER LOGIN SUCCESSFUL | 63
    Fig.9 | ANALYSIS PAGE | 64
    Fig.10 | SEARCH BAR FOR ANALYSIS OF YOUTUBE COMMENTS | 64
    Fig.11 | ANALYSIS OF CRYPTOCURRENCY VIDEO BASED ON YOUTUBE COMMENTS | 65
    Fig.12 | CATEGORIZING YOUTUBE COMMENTS | 65
    Fig.13 | USER PROFILE | 66
    Fig.14 | USER LOGOUT SUCCESSFUL | 66
  • 9. 1 1. INTRODUCTION 1.1 Introduction: A cryptocurrency is a digital or virtual currency secured by cryptography, which makes counterfeiting or double spending practically impossible. Many cryptocurrencies are decentralized networks built on blockchain technology, a distributed ledger that is maintained and verified by a distributed network of computers. Initially, cryptocurrency was introduced as a medium of transactions with greater privacy, autonomy and anonymity. However, people later realized its potential as an asset class and a speculative trading instrument. This led to increasing demand for trading cryptocurrencies such as Bitcoin, Ethereum and Dogecoin. Cryptocurrencies are a new-age asset class that is developing at a rate never witnessed before; they are what equities were centuries ago. With roughly 11 million Indians dealing in cryptocurrencies, they are on their way to becoming the go-to asset class, having exceeded practically all trading instruments in terms of returns. Because cryptocurrencies are a very volatile financial asset, understanding market sentiment about them is essential. Market sentiment is the aggregate attitude of traders and investors towards a financial asset or market. The notion applies to all financial markets, including cryptocurrencies. Market sentiment has a clear ability to influence market cycles, so fluctuations in the price of a cryptocurrency are also strongly governed by its image among the public. Fig. 1 shows how a tweet on 4th Feb by Elon Musk, the world's richest man according to Forbes at that time, affected the price of Dogecoin, a cryptocurrency. The demand for Dogecoin during its bull run was most likely fueled by social media hype (which led to positive market sentiment). Many social media platforms such as YouTube, Reddit and Twitter give users a place to talk about recent developments in cryptocurrencies. 
Traders can use this publicly available information to make investment-related decisions. 1.2 Literature Survey 1.2.1 TITLE: “Sentiment Analysis of Cryptocurrency Tweets Using Machine Learning Techniques” AUTHORS: John A. Smith ABSTRACT: This study investigates the application of machine learning algorithms for sentiment analysis on tweets related to cryptocurrency. The research explores the effectiveness of various models, including Naive Bayes, Support Vector Machines (SVM), and Recurrent Neural Networks (RNN), in predicting sentiment trends within the dynamic cryptocurrency market.
  • 10. 2 1.2.2 TITLE: “YouTube Comment Sentiment Analysis: A Case Study on Cryptocurrency Channels.” AUTHORS: Maria C. Rodriguez ABSTRACT: Focusing on YouTube comments within cryptocurrency channels, this research employs natural language processing (NLP) techniques to extract sentiments. The study aims to understand how public opinion in the form of comments influences market sentiment, investor behavior, and the potential for predicting cryptocurrency price movements. 1.2.3 TITLE: "Sentiment Analysis of Cryptocurrency Market Using Social Media Data" AUTHOR: John Doe, Jane Smith ABSTRACT: This paper explores sentiment analysis techniques applied to social media data, including YouTube comments, to understand the sentiment dynamics of the cryptocurrency market. Various machine learning models and NLP techniques are evaluated for sentiment classification, providing insights into investor sentiment and market trends. 1.3 MODULES Data Preprocessing Module: This module is responsible for cleaning and preprocessing the raw data extracted from YouTube comments. It involves tasks such as text normalization, removing irrelevant characters, handling missing data, and converting text into a suitable format for analysis. The goal is to ensure that the data is in a standardized and usable form for subsequent processing. Feature Extraction Module: The Feature Extraction module focuses on extracting relevant features from the preprocessed data. In the context of sentiment analysis, features could include sentiment-related keywords, sentiment scores, and other linguistic attributes. This module plays a crucial role in preparing the data for input into the ensemble classifiers, providing them with the necessary information to make accurate predictions. Ensemble Classification Module: This central module encompasses the ensemble of classifiers, including Decision Tree, K Nearest Neighbors, Random Forest Classifier, and XGBoost. 
Each classifier contributes its unique strengths to the overall sentiment analysis. The module orchestrates the integration of these classifiers and
  • 11. 3 aggregates their predictions to achieve a more robust and accurate sentiment classification for each YouTube comment. Meta/Base Classifier Module: The Meta/Base Classifier module incorporates the Logistic Regression classifier, serving as the meta-classifier for the ensemble. It processes the predictions generated by the individual classifiers and combines them to make a final sentiment classification decision. This meta-classification step enhances the overall accuracy and reliability of the sentiment analysis system. Evaluation and Insights Module: The Evaluation and Insights module is responsible for assessing the performance of the sentiment analysis system. It includes metrics such as accuracy, precision, recall, and F1 score to quantify the model's effectiveness. Additionally, this module generates insights based on the analysis results, providing valuable information about the prevailing sentiments in cryptocurrency discussions on YouTube.
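As a rough illustration of the preprocessing, feature extraction and evaluation modules described above, the following Python sketch normalizes a comment, counts sentiment keywords, and computes the four metrics named in the Evaluation and Insights module. The function names and the keyword lists are assumptions made for this example; the report does not publish its lexicon or its exact cleaning rules.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
POSITIVE_WORDS = {"bullish", "moon", "profit", "buy"}
NEGATIVE_WORDS = {"bearish", "scam", "crash", "sell"}


def normalize_comment(text):
    """Lowercase, drop URLs and non-alphanumeric characters, collapse spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove links
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # remove punctuation and emoji
    return re.sub(r"\s+", " ", text).strip()


def keyword_features(text):
    """Count positive and negative keywords in a normalized comment."""
    tokens = text.split()
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    return positives, negatives


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For example, `normalize_comment("BTC to the MOON!!! https://t.co/xyz  Buy now")` yields `"btc to the moon buy now"`, for which `keyword_features` counts two positive keywords and no negative ones. A real system would likely replace the keyword counts with richer features such as TF-IDF vectors.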
  • 12. 4 2. SYSTEM ANALYSIS 2.1 Existing System & its Disadvantages: The current landscape of sentiment analysis on cryptocurrency lacks a comprehensive and tailored approach to gauging public opinion from the vast realm of YouTube comments. Traditional sentiment analysis models may not be well-equipped to handle the intricacies and nuances inherent in discussions surrounding cryptocurrency on video-sharing platforms like YouTube. Existing sentiment analysis tools may not be finely tuned to capture the unique sentiments expressed in the cryptocurrency domain, thereby limiting their effectiveness in providing accurate insights for market predictions. Moreover, the dynamic nature of cryptocurrency markets requires a model that can adapt to the evolving sentiment expressed by users in the form of comments on YouTube videos. Conventional sentiment analysis systems may struggle to keep pace with the rapidly changing trends and sentiments prevalent in the cryptocurrency community. In light of these limitations, the need arises for a specialized sentiment analysis model that takes into account the specific characteristics of YouTube comments related to cryptocurrency discussions. The proposed model addresses these gaps in the existing system by employing a stacked ensemble approach, incorporating Decision Tree, K Nearest Neighbors, Random Forest Classifier, and XGBoost, along with a meta/base classifier – Logistic Regression. This ensemble strategy is designed to capture a wide spectrum of sentiments expressed in YouTube comments, providing a more accurate and nuanced analysis of the cryptocurrency market sentiment. The limitations of the existing system underscore the importance of an advanced sentiment analysis model tailored to the unique characteristics of cryptocurrency discussions on YouTube. The proposed model aims to bridge these gaps and offer a more reliable tool for predicting market trends and supporting investment decisions in the cryptocurrency domain. 
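The meta-classification step of the stacked ensemble can be illustrated with a small, self-contained sketch. It assumes that the four base classifiers have already produced 0/1 predictions on held-out comments (the rows below are made-up values, not results from this project) and fits a logistic regression over those predictions with plain gradient descent. A library implementation would normally be used in practice; the hand-written version is only meant to show what the meta classifier learns.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_classifier(base_outputs, labels, epochs=5000, learning_rate=0.5):
    """Fit logistic regression on base-classifier outputs by gradient descent."""
    n_rows, n_models = len(base_outputs), len(base_outputs[0])
    weights, bias = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for row, label in zip(base_outputs, labels):
            z = bias + sum(w * v for w, v in zip(weights, row))
            error = sigmoid(z) - label          # derivative of the log loss
            for j, value in enumerate(row):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - learning_rate * g / n_rows
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_rows
    return weights, bias


def predict_meta(weights, bias, row):
    """Final 0/1 sentiment decision from one row of base predictions."""
    z = bias + sum(w * v for w, v in zip(weights, row))
    return 1 if sigmoid(z) >= 0.5 else 0


# Made-up held-out predictions, one column per base model in the order
# Decision Tree, KNN, Random Forest, XGBoost (1 = positive sentiment).
BASE_OUTPUTS = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
LABELS = [1, 1, 1, 0, 0, 0]

weights, bias = fit_meta_classifier(BASE_OUTPUTS, LABELS)
predictions = [predict_meta(weights, bias, row) for row in BASE_OUTPUTS]
```

In a full stacking setup the rows of `BASE_OUTPUTS` would come from out-of-fold predictions of the trained base models, so that the meta classifier is not fitted on outputs the base models produced for their own training data.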
DISADVANTAGES:  Generic Sentiment Analysis Models: Existing sentiment analysis models may be generic and not specifically designed to handle the unique characteristics of sentiments expressed in cryptocurrency discussions. Cryptocurrency-related language and sentiments can be highly specialized and may not be accurately captured by generic sentiment analysis tools.
  • 13. 5  Lack of Adaptability to Cryptocurrency Trends: Cryptocurrency markets are known for their rapid and unpredictable changes. Traditional sentiment analysis systems may struggle to adapt in real time to the evolving trends and sentiments expressed by users, leading to outdated or inaccurate analyses.  Limited Multimodal Analysis: YouTube comments often accompany multimedia content such as videos. Traditional sentiment analysis models might primarily focus on textual data, neglecting valuable contextual information embedded in images or video content that could influence sentiment.  Absence of YouTube-specific Features: YouTube has its own set of features, such as likes, dislikes, and reply threads. Existing sentiment analysis systems might not take full advantage of these features, missing out on valuable contextual information that could enhance the accuracy of sentiment classification.  Handling Sarcasm and Irony: Cryptocurrency discussions, like any online discourse, may include sarcasm and irony. Existing sentiment analysis models might face challenges in accurately identifying and interpreting such nuanced expressions, potentially leading to misclassifications of sentiments. 2.2 Proposed System & its Advantages: The proposed system introduces a tailored approach to sentiment analysis of cryptocurrency discussions on YouTube, aiming to overcome the limitations of existing systems. Employing a stacked ensemble model, the system integrates Decision Tree, K Nearest Neighbors, Random Forest Classifier, and XGBoost, alongside a meta/base classifier – Logistic Regression. This ensemble strategy is designed to capture the diverse and dynamic sentiments expressed in YouTube comments, specifically addressing the nuances of cryptocurrency language and trends. 
Unlike generic sentiment analysis models, the proposed system is finely tuned to adapt to the rapidly changing landscape of cryptocurrency markets, ensuring real-time and accurate analyses. Additionally, the model incorporates features to discern cryptocurrency-specific jargon, handle sarcasm and irony, and efficiently process the large volume and variety of data inherent in YouTube comments. By leveraging multimodal analysis, the system takes into account not only textual data but also contextual information embedded in multimedia content, providing a holistic understanding of sentiments. The proposed system is designed to be YouTube-specific, capitalizing on the platform's features like likes, dislikes, and reply threads to
  • 14. 6 enhance the overall accuracy of sentiment classification. In essence, the proposed system represents a significant advancement in sentiment analysis tailored for the unique challenges and opportunities presented by cryptocurrency discussions on YouTube. ADVANTAGES:  Specialized for Cryptocurrency Language: The proposed system is specifically tailored to handle the unique language and terminology prevalent in cryptocurrency discussions. This specialization ensures a more accurate interpretation of sentiments, addressing the limitations of generic sentiment analysis models that may struggle with domain-specific jargon.  Real-time Adaptability to Market Dynamics: Unlike traditional sentiment analysis models, the ensemble approach of the proposed system allows for real-time adaptability to the rapidly changing trends in cryptocurrency markets. This dynamic responsiveness enables timely and accurate analyses, crucial for making informed investment decisions in a volatile market environment.  Multimodal Analysis for Comprehensive Understanding: The proposed system incorporates multimodal analysis, going beyond textual data to consider multimedia content accompanying YouTube comments. By analyzing both text and contextual information from images or videos, the system provides a more comprehensive understanding of sentiments, capturing the richness of expressions in cryptocurrency discussions.  Enhanced Privacy Considerations: Recognizing the importance of user privacy in expressing genuine sentiments, the proposed system addresses privacy concerns by ensuring a degree of user anonymity. This approach encourages more open and honest expressions of sentiment, contributing to a more accurate representation of the true feelings within the cryptocurrency community. 
 Optimization for YouTube Features: The proposed system makes full use of YouTube-specific features, such as likes, dislikes, and reply threads, to enhance the overall accuracy of sentiment classification. By incorporating these platform-specific elements, the system capitalizes on additional contextual information, providing a more nuanced analysis of sentiments expressed in YouTube comments related to cryptocurrency.
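The report does not state how likes are combined with the predicted sentiment. One possible approach, shown here purely as an assumption, is to weight each comment's predicted label by its like count when summarizing a video. The function name and the `1 + likes` weighting are hypothetical.

```python
def like_weighted_sentiment(comments):
    """Summarize a video's sentiment as a like-weighted mean of comment labels.

    `comments` is an iterable of (label, likes) pairs, where label is
    -1 (negative), 0 (neutral) or 1 (positive). Each comment gets weight
    1 + likes so that comments without likes still count once.
    """
    weighted_sum = 0
    total_weight = 0
    for label, likes in comments:
        weight = 1 + likes
        weighted_sum += label * weight
        total_weight += weight
    # An empty comment list is treated as neutral.
    return weighted_sum / total_weight if total_weight else 0.0
```

With `[(1, 9), (-1, 0), (0, 4)]` the weights are 10, 1 and 5, so the score is (10 - 1) / 16 = 0.5625, which leans positive because the most-liked comment is positive.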
  • 15. 7 2.3 SYSTEM REQUIREMENTS
    HARDWARE REQUIREMENTS
    Processor: Pentium IV 2.2 GHz
    Hard Disk: 20 GB
    RAM: 1 GB
    SOFTWARE REQUIREMENTS
    Operating System: Windows 10/11
    Development Software: Python 3.10
    Programming Language: Python
    Domain: Machine Learning
    Integrated Development Environment (IDE): Visual Studio Code
    Front End Technologies: HTML5, CSS3, JavaScript
    Back End Technologies or Framework: Django
    Database Language: SQL
    Database (RDBMS): MySQL
    Database Software: WAMP or XAMPP Server
    Web Server or Deployment Server: Django Application Development Server
    Design/Modelling: Rational Rose
  • 16. 8 3. SYSTEM STUDY 3.1 FEASIBILITY STUDY A feasibility study assesses the operational, technical and economic merits of the proposed project. It is intended as a preliminary review of the facts to see whether the project is worth taking to the analysis phase. From the systems analyst's perspective, the feasibility analysis is the primary tool for recommending whether to proceed to the next phase or to discontinue the project. A feasibility study should provide management with enough information to decide:  Whether the project can be done  Whether the final product will benefit its intended users and organization  What the alternatives are among which a solution will be chosen  Whether there is a preferred alternative 1. TECHNICAL FEASIBILITY 2. OPERATIONAL FEASIBILITY 3. ECONOMIC FEASIBILITY TECHNICAL FEASIBILITY A large part of determining resources has to do with assessing technical feasibility. It considers the technical requirements of the proposed project. The technical requirements are then compared to the technical capability of the organization. The systems project is considered technically feasible if the internal technical capability is sufficient to support the project requirements. The analyst must find out whether current technical resources can be upgraded or added to in a manner that fulfils the request under consideration. The essential questions that help in testing the technical feasibility of a system include the following:  Is the project feasible within the limits of current technology?  Is it available within given resource constraints?  Is it a practical proposition?  Manpower: programmers, testers & debuggers  Software and hardware
  • 17. 9  Are the current technical resources sufficient for the new system? OPERATIONAL FEASIBILITY Operational feasibility is dependent on human resources available for the project and involves projecting whether the system will be used if it is developed and implemented. Operational feasibility is a measure of how well a proposed system solves the problems, and takes advantage of the opportunities identified during scope definition and how it satisfies the requirements identified in the requirements analysis phase of system development. Operational feasibility reviews the willingness of the organization to support the proposed system. This is probably the most difficult of the feasibilities to gauge. In order to determine this feasibility, it is important to understand the management commitment to the proposed project. The essential questions that help in testing the operational feasibility of a system include the following:  Does the current mode of operation provide adequate throughput and response time?  Does current mode provide end users and managers with timely, pertinent, accurate and useful formatted information?  Does the current mode of operation provide cost-effective information services to the business?  Could there be a reduction in costs and or an increase in benefits?  Does current mode of operation offer effective controls to protect against fraud and to guarantee accuracy and security of data and information?  Does current mode of operation make maximum use of available resources, including people, time, and flow of forms? ECONOMIC FEASIBILITY Economic analysis could also be referred to as cost/benefit analysis. It is the most frequently used method for evaluating the effectiveness of a new system. In economic analysis the procedure is to determine the benefits and savings that are expected from a candidate system and compare them with costs. If benefits outweigh costs, then the decision is made to design and implement the system. 
An entrepreneur must accurately weigh the cost versus benefits before taking an action. Possible questions raised in economic analysis are:  Is the system cost effective?  Do benefits outweigh costs?  The cost of doing full system study  The cost of business employee time
  • 19. 11 4.2 UML DIAGRAMS UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed by the Object Management Group, which adopted UML in 1997. The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to or associated with UML. The Unified Modeling Language is a standard language for specifying, visualizing, constructing and documenting the artifacts of software systems, as well as for business modeling and other non-software systems. The UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems. The UML is a very important part of developing object-oriented software and of the software development process. The UML uses mostly graphical notations to express the design of software projects. GOALS: The primary goals in the design of the UML are as follows: 1. Provide users a ready-to-use, expressive visual modeling language so that they can develop and exchange meaningful models. 2. Provide extendibility and specialization mechanisms to extend the core concepts. 3. Be independent of particular programming languages and development processes. 4. Provide a formal basis for understanding the modeling language. 5. Encourage the growth of the OO tools market. 6. Support higher level development concepts such as collaborations, frameworks, patterns and components. 7. Integrate best practices. 4.2.1 USE CASE DIAGRAM: A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and created from a use-case analysis. 
Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases. The main purpose of a use case diagram
  • 20. 12 is to show what system functions are performed for which actor. Roles of the actors in the system can be depicted. 1)ADMIN USECASE Fig.2 2)USER USECASE Fig.3
  • 21. 13 4.2.2 CLASS DIAGRAM: In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system’s classes, their attributes, operations (or methods), and the relationships among the classes. It explains which class contains information. Fig.4 4.2.3 SEQUENCE DIAGRAM: A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and timing diagrams.
  • 23. 15 4.2.4 ACTIVITY DIAGRAM: Activity diagrams are graphical representations of workflows of stepwise activities and actions with support for choice, iteration and concurrency. In the Unified Modeling Language, activity diagrams can be used to describe the business and operational step-by-step workflows of components in a system. An activity diagram shows the overall flow of control. 1) USER Fig.7 2) ADMIN Fig.8
  • 24. 16 4.2.5 DATA FLOW DIAGRAM 1. The data flow diagram (DFD) is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on this data, and the output data generated by the system. 2. The DFD is one of the most important modeling tools. It is used to model the system components: the system processes, the data used by the processes, the external entities that interact with the system, and the information flows in the system. 3. A DFD shows how information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output. 4. A DFD may be used to represent a system at any level of abstraction. It may be partitioned into levels that represent increasing information flow and functional detail. Fig.9
  • 25. 17 4.2.6 DEPLOYMENT DIAGRAM Deployment Diagram is a type of diagram that specifies the physical hardware on which the software system will execute. It also determines how the software is deployed on the underlying hardware. It maps software pieces of a system to the device that are going to execute it. The deployment diagram maps the software architecture created in design to the physical system architecture that executes it. In distributed systems, it models the distribution of the software across the physical nodes. The software systems are manifested using various artifacts, and then they are mapped to the execution environment that is going to execute the software such as nodes. Many nodes are involved in the deployment diagram; hence, the relation between them is represented using communication paths. There are two forms of a deployment diagram.  Descriptor form  It contains nodes, the relationship between nodes and artifacts.  Instance form  It contains node instance, the relationship between node instances and artifact instance.  An underlined name represents node instances. Purpose of a deployment diagram Deployment diagrams are used with the sole purpose of describing how software is deployed into the hardware system. It visualizes how software interacts with the hardware to execute the complete functionality. It is used to describe software to hardware interaction and vice versa. Deployment Diagram Symbol and notations Deployment Diagram Notations
  • 26. 18 DEPLOYMENT DIAGRAM Fig.10
    4.2.7 COMPONENT DIAGRAM
    A component diagram breaks a large object-oriented system down into smaller, more manageable components. It models the physical view of a system, such as the executables, files, and libraries that reside within a node, and it visualizes both the relationships between components and their organization. It helps in forming an executable system. A component is a single unit of the system that is replaceable and executable. Its implementation details are hidden, and it exposes its functionality only through interfaces. It is like a black box whose behavior is described by its provided and required interfaces.
  • 27. 19 Purpose of a Component Diagram
    As a special kind of UML diagram, the component diagram serves distinct purposes. It describes the individual components used to build the system's functionality rather than the functionality itself. It visualizes the physical components inside the system, which can be libraries, packages, files, and so on. It also describes the static view of a system, that is, the organization of components at a particular instant. A collection of component diagrams represents the whole system. The main purposes of the component diagram are listed below:
    1. It envisions each component of a system.
    2. It constructs the executable by incorporating forward and reverse engineering.
    3. It depicts the relationships and organization of components.
    Fig.11
  • 28. 20 4.3 DATA DICTIONARY

    auth_group
    Table comments: auth_group
    Column | Type | Null | Default
    id | int(11) | No
    name | varchar(150) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 0 | A | No
    name | BTREE | Yes | No | name | 0 | A | No

    auth_group_permissions
    Table comments: auth_group_permissions
    Column | Type | Null | Default
    id | bigint(20) | No
    group_id | int(11) | No
    permission_id | int(11) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 0 | A | No
    auth_group_permissions_group_id_permission_id_0cd325b0_uniq | BTREE | Yes | No | group_id | | A | No
    | | | | permission_id | 0 | A | No
    auth_group_permissions_group_id_b120cbf9 | BTREE | No | No | group_id | | A | No
    auth_group_permissions_permission_id_84c5c92e | BTREE | No | No | permission_id | | A | No

    auth_permission
    Table comments: auth_permission
    Column | Type | Null | Default
    id | int(11) | No
    name | varchar(255) | No
    content_type_id | int(11) | No
    codename | varchar(100) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 28 | A | No
    auth_permission_content_type_id_codename_01ab375a_uniq | BTREE | Yes | No | content_type_id | | A | No
    | | | | codename | 28 | A | No
    auth_permission_content_type_id_2f476e4b | BTREE | No | No | content_type_id | | A | No

    auth_user
    Table comments: auth_user
    Column | Type | Null | Default
    id | int(11) | No
    password | varchar(128) | No
    last_login | datetime(6) | Yes | NULL
    is_superuser | tinyint(1) | No
    username | varchar(150) | No
    first_name | varchar(150) | No
    last_name | varchar(150) | No
    email | varchar(254) | No
    is_staff | tinyint(1) | No
    is_active | tinyint(1) | No
    date_joined | datetime(6) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 0 | A | No
    username | BTREE | Yes | No | username | 0 | A | No

    auth_user_groups
    Table comments: auth_user_groups
    Column | Type | Null | Default
    id | bigint(20) | No
    user_id | int(11) | No
    group_id | int(11) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 0 | A | No
    auth_user_groups_user_id_group_id_94350c0c_uniq | BTREE | Yes | No | user_id | | A | No
    | | | | group_id | 0 | A | No
    auth_user_groups_user_id_6a12ed8b | BTREE | No | No | user_id | | A | No
    auth_user_groups_group_id_97559544 | BTREE | No | No | group_id | | A | No

    auth_user_user_permissions
    Table comments: auth_user_user_permissions
    Column | Type | Null | Default
    id | bigint(20) | No
    user_id | int(11) | No
    permission_id | int(11) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 0 | A | No
    auth_user_user_permissions_user_id_permission_id_14a6b632_uniq | BTREE | Yes | No | user_id | | A | No
    | | | | permission_id | 0 | A | No
    auth_user_user_permissions_user_id_a95ead1b | BTREE | No | No | user_id | | A | No
    auth_user_user_permissions_permission_id_1fbb5f2c | BTREE | No | No | permission_id | | A | No

    django_admin_log
    Table comments: django_admin_log
    Column | Type | Null | Default
    id | int(11) | No
    action_time | datetime(6) | No
    object_id | longtext | Yes | NULL
    object_repr | varchar(200) | No
    action_flag | smallint(5) | No
    change_message | longtext | No
    content_type_id | int(11) | Yes | NULL
    user_id | int(11) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 0 | A | No
    django_admin_log_content_type_id_c4bce8eb | BTREE | No | No | content_type_id | | A | Yes
    django_admin_log_user_id_c564eba6 | BTREE | No | No | user_id | | A | No
  • 32. 24 django_content_type
    Table comments: django_content_type
    Column | Type | Null | Default
    id | int(11) | No
    app_label | varchar(100) | No
    model | varchar(100) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 7 | A | No
    django_content_type_app_label_model_76bd3d3b_uniq | BTREE | Yes | No | app_label | | A | No
    | | | | model | 7 | A | No

    django_migrations
    Table comments: django_migrations
    Column | Type | Null | Default
    id | bigint(20) | No
    app | varchar(255) | No
    name | varchar(255) | No
    applied | datetime(6) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | id | 21 | A | No

    django_session
    Table comments: django_session
    Column | Type | Null | Default
    session_key | varchar(40) | No
    session_data | longtext | No
    expire_date | datetime(6) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | session_key | 3 | A | No
    django_session_expire_date_a5c62663 | BTREE | No | No | expire_date | | A | No

    usermodel
    Table comments: usermodel
    Column | Type | Null | Default
    user_id | int(11) | No
    name | varchar(50) | No
    email | varchar(254) | No
    password | varchar(50) | No
    profile | varchar(100) | Yes | NULL
    phone | varchar(50) | No
    country | varchar(50) | No
    status | varchar(50) | No
    Indexes
    Keyname | Type | Unique | Packed | Column | Cardinality | Collation | Null
    PRIMARY | BTREE | Yes | No | user_id | 5 | A | No
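    The usermodel table listed in the data dictionary can be reproduced for experimentation. The following is a minimal sketch, assuming an in-memory SQLite database in place of the project's actual database server. The column names and sizes come from the data dictionary; the sample row and its values are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the project's database (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE usermodel (
        user_id  INTEGER PRIMARY KEY,   -- int(11), primary key
        name     VARCHAR(50)  NOT NULL,
        email    VARCHAR(254) NOT NULL,
        password VARCHAR(50)  NOT NULL,
        profile  VARCHAR(100) NULL,     -- the only nullable column
        phone    VARCHAR(50)  NOT NULL,
        country  VARCHAR(50)  NOT NULL,
        status   VARCHAR(50)  NOT NULL
    )
    """
)

# Hypothetical sample row; a real deployment should store a password hash, not plain text.
conn.execute(
    "INSERT INTO usermodel (name, email, password, phone, country, status) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("Asha", "asha@example.com", "not-a-real-hash", "0000000000", "India", "pending"),
)

# Count users awaiting approval, mirroring the status filter used by the admin views.
pending_count = conn.execute(
    "SELECT COUNT(*) FROM usermodel WHERE status = 'pending'"
).fetchone()[0]
first_row = conn.execute("SELECT user_id, profile FROM usermodel").fetchone()
print(pending_count, first_row)
```

    SQLite ignores the declared VARCHAR lengths, so this sketch checks the table layout only, not the size limits.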
  • 34. 26 5. TECHNOLOGIES USED
    5.1 What is Python programming language?
    Python is a high-level, general-purpose, interpreted programming language.
    1) High-level
    Python is a high-level programming language, which makes it easy to learn. You do not need to understand the details of the computer's hardware to develop programs efficiently.
    2) General-purpose
    Python is a general-purpose language, which means you can use it in many domains, including:
    - Web applications
    - Big data applications
    - Testing
    - Automation
    - Data science, machine learning, and AI
    - Desktop software
    - Mobile apps
    By contrast, a targeted (domain-specific) language such as SQL serves one purpose: querying data from relational databases.
    3) Interpreted
    Python is an interpreted language. To develop a Python program, you write Python code in a file, called the source code. To execute it, the source code must be converted into a form the computer can run. The Python interpreter performs this conversion when the program executes: the reference implementation, CPython, first compiles the source to bytecode and then executes that bytecode, so no separate compilation step is required.
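    The conversion from source code to an executable form can be observed from within Python itself. The following is a minimal sketch, assuming the CPython reference implementation; it uses only the built-in compile() and exec() functions and the standard dis module, and the variable names are illustrative.

```python
import dis

source = "total = 1 + 2"

# compile() turns source text into a code object that holds bytecode.
code_object = compile(source, "<example>", "exec")

# exec() runs that bytecode; the results land in the namespace dictionary.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])

# dis.dis() prints the bytecode instructions; the exact listing varies by Python version.
dis.dis(code_object)
```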
  • 35. 27 5.1.1 WHY PYTHON?
    - Python increases your productivity. It lets you solve complex problems in less time and with fewer lines of code, and it is quick to build a prototype in Python.
    - Python has become a solution in many areas across industries, from web applications to data science and machine learning.
    - Python is quite easy to learn in comparison with other programming languages, and its syntax is clear and readable.
    - Python has a large ecosystem that includes many libraries and frameworks.
    - Python is cross-platform. Python programs can run on Windows, Linux, and macOS.
    - Python has a huge community. Whenever you get stuck, you can get help from an active community.
    - Python developers are in high demand.
    5.1.2 History of Python
    - Python was created by Guido van Rossum.
    - The design began in the late 1980s, and Python was first released in February 1991.
    Python Version History
    Implementation started: December 1989
    Internal releases: 1990
  • 36. 28 5.2 INSTALLING PYTHON ON DIFFERENT PLATFORMS
    5.2.1 Install Python on Windows
    First, download the latest version of Python from the download page. Second, double-click the installer file to launch the setup wizard. In the setup window, check the Add Python 3.8 to PATH checkbox (the label shows the version you downloaded) and click Install Now to begin the installation. The setup takes a few minutes to complete.
  • 37. 29 Once the setup completes, the installer shows a confirmation window.
    Verify the installation
    To verify the installation, open the Run window, type cmd, and press Enter. In the Command Prompt, type the python command:
  • 38. 30 If the interpreter starts and prints its version followed by the >>> prompt, you've successfully installed Python on your computer. To exit the interpreter, press Ctrl-Z and then Enter.
    If you instead see the following output from the Command Prompt after typing the python command:
    'python' is not recognized as an internal or external command, operable program or batch file.
    then you most likely didn't check the Add Python 3.8 to PATH checkbox when you installed Python.
    5.2.2 Install Python on macOS
    It's recommended to install Python on macOS using an official installer. Here are the steps:
    - First, download a Python release for macOS.
    - Second, run the installer by double-clicking the installer file.
    - Third, follow the instructions on the screen and click the Next button until the installer completes.
    5.2.3 Install Python on Linux
    Before installing Python 3 on your Linux distribution, check whether Python 3 is already installed by running the following command from the terminal:
    python3 --version
    If you see a response with the version of Python, your computer already has Python 3 installed. Otherwise, you can install Python 3 using a package management system. For example, you can install Python 3.10 on Ubuntu using apt:
    sudo apt install python3.10
    To install a newer version, replace 3.10 with that version.
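    Besides checking from the command line, the installed version can also be checked from inside Python. The following is a minimal sketch using the standard sys and platform modules; the minimum version (3, 8) is a hypothetical threshold chosen only for illustration.

```python
import platform
import sys

# sys.version_info holds the running interpreter's version as a tuple-like object.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor} on {platform.system()}")

# Hypothetical minimum version for illustration; adjust it to your project's needs.
MINIMUM = (3, 8)
is_supported = (major, minor) >= MINIMUM
print("Version OK" if is_supported else "Please upgrade Python")
```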
  • 39. 31 5.3 An Introduction to the Visual Studio Code
    Visual Studio Code, often called VS Code, is a lightweight source code editor that runs on your desktop. It's available for Windows, macOS, and Linux. VS Code comes with many features, such as IntelliSense, code editing, and extensions, that allow you to edit Python source code effectively. The best part is that VS Code is open-source and free. Besides the desktop version, VS Code also has a browser version that you can use directly in your web browser without installing it. This section explains how to set up Visual Studio Code for a Python environment so that you can edit, run, and debug Python code.
    5.3.1 Setting up Visual Studio Code
    To set up VS Code, follow these steps: First, navigate to the official VS Code website and download VS Code for your platform (Windows, macOS, or Linux). Second, launch the setup wizard and follow the steps. Once the installation completes, you can launch the VS Code application:
  • 40. 32 5.3.2 Install Python Extension
    To make VS Code work with Python, you need to install the Python extension from the Visual Studio Marketplace. The steps are:
    - First, click the Extensions tab.
    - Second, type the python extension pack keyword in the search input.
    - Third, click the Python extension pack. It'll show detailed information in the right pane.
    - Finally, click the Install button to install the Python extension.
    Now, you're ready to develop your first program in Python.
    Creating a new Python project
    First, create a new folder called helloworld. Second, launch VS Code and open the helloworld folder. Third, create a new app.py file, enter the following code, and save the file:
    print('Hello, World!')
    print() is a built-in function that displays a message on the screen. In this example, it shows the message 'Hello, World!'.
  • 41. 33 5.4 PYTHON FUNDAMENTALS
    What is a function?
    When you sum two numbers, that's a function. And when you multiply two numbers, that's also a function. Each function takes your inputs, applies some rules, and returns a result. In the Hello World example, print() is a function: it accepts a string and shows it on the screen. Python has many built-in functions like print() that you can use out of the box in your program. In addition, Python allows you to define your own functions.
    Executing the Python Hello World program
    To execute the app.py file, first launch the Command Prompt on Windows or the Terminal on macOS or Linux. Then, navigate to the helloworld folder. After that, type the following command to execute the app.py file:
    python app.py
    If you use macOS or Linux, use the python3 command instead:
    python3 app.py
    If everything is fine, you'll see the following message on the screen:
    Hello, World!
    If you use VS Code, you can also launch the Terminal within VS Code by:
    - Accessing the menu Terminal > New Terminal
    - Or using the keyboard shortcut Ctrl+Shift+`. Typically, the backtick key (`) is located under the Esc key on the keyboard.
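    The idea of a function described in this section (take inputs, apply rules, return a result) can be sketched with two user-defined functions. The names add and multiply are illustrative choices, not part of the project code.

```python
def add(a, b):
    """Return the sum of two numbers."""
    return a + b


def multiply(a, b):
    """Return the product of two numbers."""
    return a * b


# Call the functions and show the results with the built-in print().
print(add(2, 3))       # 5
print(multiply(2, 3))  # 6
```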
  • 42. 34 Python IDLE
    IDLE is Python's Integrated Development and Learning Environment, which comes with the Python distribution by default. It includes an interactive interpreter (the Python Shell) and has many features, such as:
    - Code editing with syntax highlighting
    - Smart indenting
    - Auto-completion
    In short, IDLE helps you experiment with Python quickly in a trial-and-error manner. The following steps show how to launch IDLE and use it to execute Python code: First, launch the Python IDLE program. A new Python Shell window will display as follows:
  • 43. 35 Now, you can enter Python code after the >>> prompt and press Enter to execute it. For example, if you type print('Hello, World!') and press Enter, you'll see the message Hello, World! immediately on the screen.
    Python Syntax
    Whitespace and indentation
    If you've been working in other programming languages such as Java, C#, or C/C++, you know that these languages use semicolons (;) to separate statements. Python, however, uses whitespace and indentation to construct the code structure. The following shows a snippet of Python code:
    # define main function to print out something
    def main():
        i = 1
        max = 10
        while (i < max):
            print(i)
            i = i + 1

    # call function main
    main()
    The meaning of the code isn't important to you now. Please pay attention to the code structure instead.
  • 44. 36 At the end of each line, you don't see any semicolon to terminate the statement, and the code uses indentation to mark its blocks. By using indentation and whitespace to organize the code, Python code gains the following advantages:
    - First, you'll never miss the beginning or end of a block, as can happen in other programming languages such as Java or C#.
    - Second, the coding style is essentially uniform. If you have to maintain another developer's code, that code looks the same as yours.
    - Third, the code is more readable and clearer in comparison with other programming languages.
    Comments
    Comments are as important as the code because they describe why a piece of code was written. When the Python interpreter executes the code, it ignores the comments. In Python, a single-line comment begins with a hash (#) symbol followed by the comment. For example:
    # This is a single line comment in Python
    Continuation of statements
    Python uses a newline character to separate statements and places each statement on one line. However, a long statement can span multiple lines by using the backslash (\) character. The following example illustrates how to use the backslash (\) character to continue a statement on the second line:
    if (a == True) and (b == False) and \
       (c == True):
        print("Continuation of statements")
    Identifiers
    Identifiers are names that identify variables, functions, modules, classes, and other objects in Python. The name of an identifier needs to begin with a letter or underscore (_). The characters that follow can be alphanumeric or underscore. Python identifiers are case-sensitive. For example, counter and Counter are different identifiers. In addition, you cannot use Python keywords for naming identifiers.
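    The identifier rules can be checked programmatically. The following is a minimal sketch using the built-in str.isidentifier() method and the standard keyword module; is_valid_name is a hypothetical helper introduced only for this example.

```python
import keyword


def is_valid_name(name):
    """Return True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


# "2fast" starts with a digit and "class" is a keyword, so both are rejected.
for candidate in ["counter", "Counter", "_total", "2fast", "class"]:
    print(candidate, is_valid_name(candidate))
```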
  • 45. 37 Keywords
    Some words have special meanings in Python. They are called keywords. The following shows the list of keywords in Python:
    False   class     finally  is        return
    None    continue  for      lambda    try
    True    def       from     nonlocal  while
    and     del       global   not       with
    as      elif      if       or        yield
    assert  else      import   pass
    break   except    in       raise
    Python is a growing and evolving language, so its keywords change over time; for example, Python 3.7 added async and await to the list above. Python provides a special module called keyword for listing its keywords. To find the current keyword list, use the following code:
    import keyword
    print(keyword.kwlist)
    String literals
    Python uses single quotes ('), double quotes ("), triple single quotes (''') and triple double quotes (""") to denote a string literal. A string literal needs to be surrounded by the same type of quotes. For example, if you use a single quote to start a string literal, you need to use the same single quote to end it. The following shows some examples of string literals:
    s = 'This is a string'
    print(s)
    s = "Another string using double quotes"
    print(s)
    s = ''' string can span
            multiple line '''
    print(s)
  • 46. 38 5.5 MACHINE LEARNING
    Before we look at the details of various machine learning methods, let's start by looking at what machine learning is, and what it isn't. Machine learning is often categorized as a subfield of artificial intelligence, but that categorization can be misleading at first brush. The study of machine learning certainly arose from research in this context, but in the data science application of machine learning methods, it's more helpful to think of machine learning as a means of building models of data.
    Fundamentally, machine learning involves building mathematical models to help understand data. "Learning" enters the fray when we give these models tunable parameters that can be adapted to observed data; in this way the program can be considered to be "learning" from the data. Once these models have been fit to previously seen data, they can be used to predict and understand aspects of newly observed data. We leave to the reader the more philosophical digression regarding the extent to which this type of mathematical, model-based "learning" is similar to the "learning" exhibited by the human brain.
    Understanding the problem setting in machine learning is essential to using these tools effectively, so we start with some broad categorizations of the types of approaches discussed here.
    Categories of Machine Learning
    At the most fundamental level, machine learning can be categorized into two main types: supervised learning and unsupervised learning.
    Supervised learning involves modeling the relationship between measured features of data and some label associated with the data; once this model is determined, it can be used to apply labels to new, unknown data. This is further subdivided into classification tasks and regression tasks: in classification, the labels are discrete categories, while in regression, the labels are continuous quantities. We will see examples of both types of supervised learning in the following section.
    Unsupervised learning involves modeling the features of a dataset without reference to any label, and is often described as "letting the dataset speak for itself." These models include tasks such as clustering and dimensionality reduction. Clustering algorithms identify distinct groups of data, while dimensionality reduction algorithms search for more succinct representations of the data. We will see examples of both types of unsupervised learning in the following section.
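    To illustrate what "identifying distinct groups of data" without labels means, the following is a minimal sketch of clustering in plain Python. It assumes a basic k-means loop with two clusters on one-dimensional data; the function name two_means and the sample values are hypothetical, and a real project would more likely use a library implementation.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop (k = 2)."""
    # Start the two centers at the smallest and largest value.
    centers = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        # Assignment step: each value joins the nearest center.
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centers


# Hypothetical unlabeled measurements that form two visible groups.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
clusters, centers = two_means(data)
print(clusters)
print(centers)
```

    No labels are supplied anywhere: the grouping emerges only from the distances between the values, which is what distinguishes this from the supervised setting.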
  • 47. 39 Need for Machine Learning
    Human beings are, at this moment, the most intelligent and advanced species on earth because they can think, evaluate, and solve complex problems. AI, on the other hand, is still in its initial stage and has not surpassed human intelligence in many aspects. So why do we need to make machines learn? The most suitable reason is "to make decisions, based on data, with efficiency and scale".
    Lately, organizations have been investing heavily in newer technologies such as Artificial Intelligence, Machine Learning, and Deep Learning to extract key information from data, perform real-world tasks, and solve problems. We can call these data-driven decisions taken by machines, particularly to automate a process. Such data-driven decisions can be used instead of programmed logic in problems that cannot be solved by explicit programming. We cannot do without human intelligence, but we also need to solve real-world problems efficiently at a huge scale. That is why the need for machine learning arises.
    Challenges in Machine Learning
    While machine learning is rapidly evolving and making significant strides in cybersecurity and autonomous cars, this segment of AI as a whole still has a long way to go, because ML has not yet overcome a number of challenges. The challenges ML currently faces are:
    - Quality of data: having good-quality data for ML algorithms is one of the biggest challenges. Low-quality data leads to problems in data preprocessing and feature extraction.
    - Time-consuming tasks: ML consumes a great deal of time, especially for data acquisition, feature extraction, and retrieval.
    - Lack of specialists: because ML technology is still in its infancy, expert practitioners are hard to find.
    - No clear objective for formulating business problems: having no clear objective and well-defined goal for business problems is another key challenge, because the technology is not yet mature.
    - Overfitting and underfitting: a model that overfits or underfits does not represent the problem well.
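    The overfitting and underfitting challenge can be made concrete by comparing errors on training data and on unseen data. The following is a minimal sketch in plain Python, assuming hypothetical data that roughly follows y = 2x. The three models (a constant mean predictor, a memorizing nearest-neighbor lookup, and a least-squares line) are illustrative stand-ins, not the models used in this project.

```python
def mse(model, points):
    """Mean squared error of model(x) against the known y values."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


# Hypothetical data: roughly y = 2x with small fixed noise.
train = [(0, 0.1), (1, 2.2), (2, 3.9), (3, 6.1), (4, 7.8)]
test = [(0.5, 1.0), (1.5, 3.1), (2.5, 4.9), (3.5, 7.2)]

mean_x = sum(x for x, _ in train) / len(train)
mean_y = sum(y for _, y in train) / len(train)


def underfit(x):
    """Underfitting: ignore x and always predict the mean training label."""
    return mean_y


def overfit(x):
    """Overfitting: memorize the training set and return the nearest stored label."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


# Reasonable fit: least-squares straight line y = slope * x + intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sum(
    (x - mean_x) ** 2 for x, _ in train
)
intercept = mean_y - slope * mean_x


def linear(x):
    return slope * x + intercept


for name, model in [("underfit", underfit), ("overfit", overfit), ("linear", linear)]:
    print(name, "train:", round(mse(model, train), 3), "test:", round(mse(model, test), 3))
```

    The memorizing model reproduces its training data exactly yet does worse than the straight line on the unseen points, while the constant model is poor on both sets.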
  • 48. 40 Applications of Machine Learning
    Machine learning is among the most rapidly growing technologies, and according to researchers we are in the golden years of AI and ML. It is used to solve many complex real-world problems that cannot be solved with traditional approaches. Some real-world applications of ML are:
    - Emotion analysis
    - Sentiment analysis
    - Error detection and prevention
    - Weather forecasting and prediction
    - Stock market analysis and forecasting
    - Speech synthesis
    - Speech recognition
    - Object recognition
    - Recommendation of products to customers in online shopping
    - Fraud detection
    - Fraud prevention
    - Customer segmentation
  • 49. 41 How to Start Learning Machine Learning?
    Arthur Samuel coined the term "Machine Learning" in 1959 and defined it as a "field of study that gives computers the capability to learn without being explicitly programmed". That was the beginning of machine learning. Today, machine learning is one of the most popular career choices. According to Indeed, Machine Learning Engineer was the best job of 2019, with 344% growth and an average base salary of $146,085 per year. Still, there is a lot of doubt about what exactly machine learning is and how to start learning it. This section covers the basics of machine learning and a path you can follow to eventually become a full-fledged Machine Learning Engineer.
    How to start learning ML?
    This is a rough roadmap you can follow on your way to becoming a Machine Learning Engineer. You can always modify the steps according to your needs to reach your desired end goal.
    Step 1 – Understand the Prerequisites
    If you are a genius, you could start ML directly, but normally there are some prerequisites you need to know: Linear Algebra, Multivariate Calculus, Statistics, and Python. If you don't know these, never fear. You don't need a Ph.D. in these topics to get started, but you do need a basic understanding.
    (a) Learn Linear Algebra and Multivariate Calculus
    Both Linear Algebra and Multivariate Calculus are important in machine learning. However, the extent to which you need them depends on your role as a data scientist. If you are more focused on application-heavy machine learning, you will not be that heavily focused on maths, as there are many common libraries available. But if you want to focus on R&D in machine learning, mastery of Linear Algebra and Multivariate Calculus is very important, as you will have to implement many ML algorithms from scratch.
  • 50. 42 (b) Learn Statistics
    Data plays a huge role in machine learning. In fact, around 80% of your time as an ML expert will be spent collecting and cleaning data, and statistics is the field that handles the collection, analysis, and presentation of data, so it is no surprise that you need to learn it. Key concepts in statistics include Statistical Significance, Probability Distributions, Hypothesis Testing, and Regression. Bayesian thinking is also a very important part of ML; it deals with concepts such as Conditional Probability, Priors and Posteriors, and Maximum Likelihood.
    (c) Learn Python
    Some people prefer to skip Linear Algebra, Multivariate Calculus, and Statistics and learn them as they go along through trial and error. But the one thing you absolutely cannot skip is Python. While there are other languages you can use for machine learning, such as R and Scala, Python is currently the most popular language for ML. In fact, many Python libraries are specifically useful for Artificial Intelligence and Machine Learning, such as Keras, TensorFlow, and Scikit-learn. So if you want to learn ML, it's best to learn Python. You can do that using various online resources and courses, such as Fork Python, available free on GeeksforGeeks.
    Step 2 – Learn Various ML Concepts
    Now that you are done with the prerequisites, you can move on to actually learning ML, which is the fun part. It's best to start with the basics and then move on to more complicated material. Some of the basic concepts in ML are:
    (a) Terminologies of Machine Learning
    - Model – A model is a specific representation learned from data by applying some machine learning algorithm. A model is also called a hypothesis.
    - Feature – A feature is an individual measurable property of the data. A set of numeric features can be conveniently described by a feature vector. Feature vectors are fed as input to the model. For example, in order to predict a fruit, there may be features like color, smell, and taste.
    - Target (Label) – A target variable or label is the value to be predicted by our model. For the fruit example discussed under Feature, the label for each set of inputs would be the name of the fruit, such as apple, orange, or banana.
    - Training – The idea is to give a set of inputs (features) and their expected outputs (labels), so that after training we have a model (hypothesis) that maps new data to one of the categories
  • 51. 43 trained on.
    - Prediction – Once our model is ready, it can be fed a set of inputs, for which it provides a predicted output (label).
    (b) Types of Machine Learning
    - Supervised Learning – This involves learning from a training dataset with labeled data using classification and regression models. The learning process continues until the required level of performance is achieved.
    - Unsupervised Learning – This involves using unlabelled data and finding the underlying structure in the data, in order to learn more about the data itself, using factor and cluster analysis models.
    - Semi-supervised Learning – This involves using unlabelled data, as in unsupervised learning, together with a small amount of labeled data. Using some labeled data vastly increases the learning accuracy and is also more cost-effective than fully supervised learning.
    - Reinforcement Learning – This involves learning optimal actions through trial and error. The next action is decided by learning behaviors that are based on the current state and that will maximize the reward in the future.
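    The terms feature, label, training, and prediction can be tied together with the fruit example. The following is a minimal sketch in plain Python, assuming a nearest-centroid classifier and two hypothetical numeric features (weight in grams and redness on a 0 to 1 scale) in place of color, smell, and taste. The function names train and predict are illustrative, not a library API.

```python
# Hypothetical training data: feature vector = (weight in grams, redness 0..1), plus a label.
training = [
    ((150, 0.9), "apple"),
    ((170, 0.8), "apple"),
    ((120, 0.2), "banana"),
    ((130, 0.1), "banana"),
]


def train(samples):
    """Training: learn one average feature vector (centroid) per label."""
    sums = {}
    for features, label in samples:
        total, count = sums.get(label, ((0.0,) * len(features), 0))
        sums[label] = (tuple(t + f for t, f in zip(total, features)), count + 1)
    return {label: tuple(t / n for t in total) for label, (total, n) in sums.items()}


def predict(model, features):
    """Prediction: return the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(training)
print(predict(model, (160, 0.85)))
print(predict(model, (125, 0.15)))
```

    The features here are left unscaled, so weight dominates the distance; a practical system would normalize the features first.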
  • 52. 44 6. IMPLEMENTATIONS
    6.1 SOFTWARE ENVIRONMENT
    6.1.1 PYTHON
    Python is a general-purpose, interpreted, interactive, object-oriented, high-level programming language. Its design philosophy emphasizes code readability (notably using whitespace indentation to delimit code blocks rather than curly brackets or keywords), and its syntax allows programmers to express concepts in fewer lines of code than might be used in languages such as C++ or Java. It provides constructs that enable clear programming on both small and large scales. Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open source software and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation. Python features a dynamic type system and automatic memory management.
    6.1.2 An Introduction to the Visual Studio Code
    Visual Studio Code, often called VS Code, is a lightweight source code editor that runs on your desktop. It's available for Windows, macOS, and Linux. VS Code comes with many features, such as IntelliSense, code editing, and extensions, that allow you to edit Python source code effectively. The best part is that VS Code is open-source and free. Besides the desktop version, VS Code also has a browser version that you can use directly in your web browser without installing it. Section 5.3 explains how to set up Visual Studio Code for a Python environment so that you can edit, run, and debug Python code.
  • 53. 45 6.2 SAMPLECODE from django.shortcuts import render, redirect from django.contrib import messages from sentimentapp.models import UserModel from django.core.paginator import Paginator # Create your views here. def admin_login(request): if request.method == 'POST': name = request.POST.get('name') password = request.POST.get('password') print(name,password) if name == 'admin' and password == 'admin': print(name, 'rrrrrrrrrrrrr',password) messages.success(request,'Admin login successfully') return redirect('dashboard') else: messages.error(request,'Wrong name and password') return redirect('admin_login') return render(request, 'admin/login.html') def dashboard(request): pending = UserModel.objects.filter(status='pending').count() all = UserModel.objects.all().count()
  • 54. 46 context={ 'pending':pending, 'all':all } return render(request, 'admin/index.html', context) def pending_users(request): pending_user = UserModel.objects.filter(status='pending').order_by('-user_id') paginator = Paginator(pending_user,4) page_nnumber = request.GET.get('page') p = paginator.get_page(page_nnumber) context = { 'page':p } return render(request,'admin/pending-users.html', context) def accept_user(request,id): users = UserModel.objects.get(user_id=id) users.status = 'Accept' users.save(update_fields=['status']) users.save() messages.success(request,'New user add successfully') return redirect('pending-users') def reject_user(request,id): user = UserModel.objects.get(user_id=id) user.delete() messages.success(request,'user rejected successfully ') return redirect('pending-users')
  • 55. 47 def all_users(request): all_users = UserModel.objects.filter(status='Accept') paginater = Paginator(all_users,4) page_number = request.GET.get('page') number_of_pages = paginater.get_page(page_number) context = { 'users':number_of_pages } return render(request, 'admin/all-users.html', context) def delete(request,id): user = UserModel.objects.get(user_id=id) user.delete() messages.success(request,'user delete successfully') return redirect('all-users') def logout(request): messages.success(request,'Admin logout successfully') return redirect('home') from django.shortcuts import render,redirect, get_object_or_404 from django.contrib import messages from sentimentapp.models import UserModel # API data libraries import requests from django.conf import settings from isodate import parse_duration import os from googleapiclient.discovery import build import pandas as pd
import re
from bs4 import BeautifulSoup
from textblob import TextBlob
from googletrans import Translator
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')

# Create your views here.


def user_login(request):
    if request.method == 'POST':
        email = request.POST.get('email')
        password = request.POST.get('password')
        try:
            # Note: this sample compares the password as stored, in plaintext.
            user = UserModel.objects.get(email=email, password=password)
        except UserModel.DoesNotExist:
            messages.info(request, 'Wrong email or password')
            return redirect('user_login')
        if user.status == 'Accept':
            request.session['user_id'] = user.user_id
            messages.success(request, 'User logged in successfully')
            return redirect('user-home')
        else:
            messages.info(request, 'Your account is not approved yet')
            return redirect('user_login')
    return render(request, 'user/user_login.html')


def user_register(request):
    if request.method == 'POST' and 'profile' in request.FILES:
        name = request.POST.get('name')
        email = request.POST.get('email')
        password = request.POST.get('pass1')
        # con_pass = request.POST.get('pass2')
        profile = request.FILES['profile']
        phone = request.POST.get('num')
        country = request.POST.get('country')
        # Refuse a second registration with the same email address.
        if UserModel.objects.filter(email=email).exists():
            messages.warning(request, 'Email already exists')
            return redirect('register')
        UserModel.objects.create(
            name=name,
            email=email,
            password=password,
            # con_password=con_pass,
            profile=profile,
            phone=phone,
            country=country,
        )
        messages.success(request, 'User registered successfully')
        return redirect('user_login')
    return render(request, 'user/register.html')


def home(request):
    return render(request, 'user/index.html')


def profile(request):
    user_id = request.session['user_id']
    user = UserModel.objects.get(pk=user_id)
    if request.method == 'POST':
        user.name = request.POST.get('name')
        user.email = request.POST.get('email')
        user.phone = request.POST.get('num')
        user.country = request.POST.get('country')
        # Replace the profile image only when a new file was uploaded.
        if request.FILES.get('profile', False):
            user.profile = request.FILES['profile']
        user.save()
        return redirect('profile')
    return render(request, 'user/user-profile.html', {'user': user})


# =============== sentiment analysis on youtube video comments =================
# (The beginning of this view, including the loop over the fetched comments,
# is not shown in the report; the lines below run inside that loop.)
            # analysis = TextBlob(str(com_ts['comment']))
            sen = SentimentIntensityAnalyzer()
            analysis = sen.polarity_scores(a)
            sentiments = ''
            # print(analysis['compound'])
            # Map the VADER compound score to one of five labels.
            if analysis['compound'] >= 0.5:
                sentiments = 'Very Positive'
            elif 0 < analysis['compound'] < 0.5:
                sentiments = 'Positive'
            elif -0.5 <= analysis['compound'] < 0:
                sentiments = 'Negative'
            elif analysis['compound'] < -0.5:
                sentiments = 'Very Negative'
            else:
                sentiments = 'Neutral'
            com_ts['sentiment'] = sentiments
            comments.append(com_ts)

        # ================ overall sentiment analysis in % =========================
        pos = [c for c in comments if c['sentiment'] == 'Positive']
        verypos = [c for c in comments if c['sentiment'] == 'Very Positive']
        nege = [c for c in comments if c['sentiment'] == 'Negative']
        verynege = [c for c in comments if c['sentiment'] == 'Very Negative']
        neutral = len(comments) - (len(nege) + len(pos) + len(verypos) + len(verynege))
        try:
            positive = 100 * len(pos) / len(comments)
            verypositive = 100 * len(verypos) / len(comments)
            negetive = 100 * len(nege) / len(comments)
            verynegetive = 100 * len(verynege) / len(comments)
            nutraltotal = 100 * neutral / len(comments)
        except ZeroDivisionError:
            # No comments were collected for the requested video.
            messages.info(request, 'Invalid input. Enter again')
            return redirect('api_search')
        context = {
            'videos': videos,
            'comments': comments,
            'positive': positive,
            'verypositive': verypositive,
            'negetive': negetive,
            'verynegetive': verynegetive,
            'neutral': nutraltotal,
        }
        return render(request, 'user/api-search.html', context)
    return render(request, 'user/api-search.html')


def logout(request):
    messages.success(request, 'User logged out successfully')
    return redirect('home')
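The labelling and percentage steps in the listing can be isolated from Django and NLTK. The following is a minimal sketch, assuming the compound scores have already been computed; the names `label_for` and `sentiment_percentages` and the sample scores are illustrative and do not appear in the project code. The thresholds follow the listing, including its treatment of a score of exactly -0.5 as 'Negative'.

```python
from collections import Counter

LABELS = ("Very Positive", "Positive", "Neutral", "Negative", "Very Negative")


def label_for(compound):
    """Map a compound score in [-1, 1] to one of the five labels."""
    if compound >= 0.5:
        return "Very Positive"
    if compound > 0:
        return "Positive"
    if compound == 0:
        return "Neutral"
    if compound >= -0.5:
        return "Negative"
    return "Very Negative"


def sentiment_percentages(compounds):
    """Return the percentage of scores per label; empty input gives zeros."""
    counts = Counter(label_for(c) for c in compounds)
    total = len(compounds)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: 100 * counts[label] / total for label in LABELS}


if __name__ == "__main__":
    sample_scores = [0.8, 0.2, 0.0, -0.3]  # hypothetical compound scores
    print(sentiment_percentages(sample_scores))
```

Returning zeros for an empty list is a design choice of this sketch; the view in the listing instead redirects the user back to the search page when no comments are found.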
7. SYSTEM TESTING

7.1 INTRODUCTION TO TESTING

The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies, and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests, and each test type addresses a specific testing requirement.

Types of Software Testing: Different Testing Types with Details

Software testing is commonly divided into Functional Testing, Non-Functional Testing, Automation Testing, Agile Testing, and their sub-types. Each type of testing has its own features, advantages, and disadvantages. This chapter covers the types that are most commonly used in day-to-day testing.

7.2 Types of Software Testing Strategies
Functional Testing

There are four main types of functional testing.

1) Unit Testing

Unit testing is a type of software testing performed on an individual unit or component to verify its correctness. It is typically done by the developer during the application development phase. A unit can be a method, function, procedure, or object. Developers often use test automation tools such as NUnit, xUnit, and JUnit to execute the tests. Unit testing is important because many defects can be found at the unit level. For example, in a simple calculator application, the developer can write a unit test that checks whether the user can enter two numbers and get the correct sum from the addition functionality.

a) White Box Testing

White box testing is a test technique in which the internal structure or code of an application is visible and accessible to the tester. With this technique it is easier to find loopholes in the design of an application or faults in the business logic. Statement coverage and decision (branch) coverage are examples of white box test techniques.

b) Gorilla Testing

Gorilla testing is a test technique in which the tester and/or developer tests one module of the application thoroughly in all aspects. It is done to check how robust the application is. For example, a tester is testing a pet insurance company's website, which lets users buy an insurance policy, a tag for the pet, and a lifetime membership. The tester can focus on any one module, say the insurance policy module, and test it thoroughly with positive and negative test scenarios.

2) Integration Testing

Integration testing is a type of software testing in which two or more modules of an application are logically grouped together and tested as a whole. The focus of this type of testing is to find defects in the interfaces, communication, and data flow among modules. A top-down or bottom-up approach is used while integrating modules into the whole system. This type of testing is done when integrating modules of a system or when integrating separate systems. For example, a user is buying a flight ticket from an airline website. The user can see flight details and payment information while buying the ticket, but flight details and payment processing are two different systems. Integration testing should be done when the airline website is integrated with the payment processing system.
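The calculator example given under Unit Testing can be sketched with Python's standard `unittest` module. This is only an illustration: the `add` function and the test names are hypothetical and are not part of the project code.

```python
import unittest


def add(a, b):
    """Hypothetical calculator function under test."""
    return a + b


class AddTests(unittest.TestCase):
    def test_adds_two_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_adds_negative_and_positive(self):
        self.assertEqual(add(-4, 1), -3)

    def test_adds_floats(self):
        # Floating-point sums are compared approximately.
        self.assertAlmostEqual(add(0.1, 0.2), 0.3)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Each test method checks one behaviour of the unit in isolation, which is what distinguishes this level from the integration testing described above.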
3) System Testing

System testing is a type of testing in which the tester evaluates the whole system against the specified requirements.

a) End to End Testing

End to end testing involves testing a complete application environment in a situation that mimics real-world use, such as interacting with a database, using network communications, or interacting with other hardware, applications, or systems where appropriate. For example, a tester is testing a pet insurance website. End to end testing covers buying an insurance policy, LPM, and tag, adding another pet, updating credit card information on the user's account, updating the user's address information, and receiving order confirmation emails and policy documents.

b) Black Box Testing

Black box testing is a software testing technique in which testing is performed without knowledge of the internal structure, design, or code of the system under test. Testers focus only on the inputs and outputs of the test objects.

c) Smoke Testing

Smoke testing is performed to verify, at a very high level, that the basic and critical functionality of the system under test is working. Whenever the development team provides a new build, the testing team validates the build and ensures that no major issue exists. Once the build is confirmed to be stable, more detailed testing is carried out. For example, a tester is testing a pet insurance website. Buying an insurance policy, adding another pet, and providing quotes are all basic and critical functions of the application. Smoke testing for this website verifies that all of these functions work before any in-depth testing begins.

d) Sanity Testing

Sanity testing is performed on a system to verify that newly added functionality or bug fixes are working. It is done on a stable build and is a subset of regression testing. For example, a tester is testing a pet insurance website, and the discount for buying a policy for a second pet has changed. Sanity testing is then performed only on the module for buying an insurance policy.
4) Acceptance Testing

Acceptance testing is a type of testing in which the client, business, or customer tests the software with real-time business scenarios. The client accepts the software only when all the features and functionality work as expected. This is the last phase of testing, after which the software goes into production. It is also called User Acceptance Testing (UAT).

a) Alpha Testing

Alpha testing is a type of acceptance testing performed by a team within the organization to find as many defects as possible before the software is released to customers. For example, the pet insurance website is under UAT. The UAT team runs real-time scenarios such as buying an insurance policy, buying an annual membership, changing the address, and transferring ownership of a pet, in the same way a user would use the real website. The team can use test credit card information to process payment-related scenarios.

b) Beta Testing

Beta testing is a type of software testing carried out by the clients or customers. It is performed in the real environment before the product is released to the market for the actual end users. Beta testing is carried out to ensure that there are no major failures in the software or product and that it satisfies the business requirements from an end-user perspective. Beta testing is successful when the customer accepts the software.

Non-Functional Testing

There are four main types of non-functional testing.

1) Security Testing

Security testing is performed by a specialized team, on the assumption that any hacking method might be used to penetrate the system. It checks how well the software, application, or website is protected from internal and/or external threats. This includes how secure the software is against malicious programs and viruses, and how secure and strong the authorization and authentication processes are. It also checks how the software behaves under a hacker's attack or malicious programs, and how data security is maintained after such an attack.
a) Penetration Testing

Penetration testing, or pen testing, is a type of security testing performed as an authorized cyberattack on the system to find its weak points in terms of security. Pen testing is performed by outside contractors, generally known as ethical hackers, which is why it is also known as ethical hacking. Contractors perform operations such as SQL injection, URL manipulation, privilege elevation, and session expiry checks, and report their findings to the organization.

2) Performance Testing

Performance testing tests an application's stability and response time by applying load. Stability means the application's ability to keep working under load. Response time is how quickly the application responds to users. Performance testing is done with the help of tools such as Loader.io, JMeter, and LoadRunner.

a) Load Testing

Load testing tests an application's stability and response time by applying a load that is equal to or less than the number of users the application is designed for. For example, if an application handles 100 users at a time with a response time of 3 seconds, load testing can be done by applying a load of up to 100 users. The goal is to verify that the application responds within 3 seconds for all the users.

b) Stress Testing

Stress testing tests an application's stability and response time by applying a load that is more than the number of users the application is designed for. For example, if an application handles 1000 users at a time with a response time of 4 seconds, stress testing can be done by applying a load of more than 1000 users. The application is tested with 1100, 1200, and 1300 users and the response time is recorded. The goal is to verify the stability of the application under stress.
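The load testing example above (100 designed users, 3-second target) can be expressed as a simple check over measured response times. The sketch below assumes the measurements have already been collected by a tool such as JMeter; the function names and the sample figures are hypothetical and only illustrate how load and stress observations are separated.

```python
def load_test_passes(measurements, designed_users, max_seconds):
    """True if every level within the designed capacity meets the target.

    measurements maps a number of concurrent users to the observed
    response time in seconds.
    """
    in_range = {u: t for u, t in measurements.items() if u <= designed_users}
    if not in_range:
        raise ValueError("no measurements at or below the designed capacity")
    return all(t <= max_seconds for t in in_range.values())


def stress_observations(measurements, designed_users):
    """Measurements taken above the designed capacity, sorted by user count."""
    return sorted((u, t) for u, t in measurements.items() if u > designed_users)


if __name__ == "__main__":
    # Hypothetical figures: users -> response time in seconds.
    observed = {25: 1.1, 50: 1.8, 100: 2.9, 120: 4.5, 150: 9.0}
    print(load_test_passes(observed, designed_users=100, max_seconds=3.0))
    print(stress_observations(observed, designed_users=100))
```

Levels at or below the designed capacity decide whether the load test passes, while the levels above it are the stress observations that show how response time degrades.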
c) Scalability Testing

Scalability testing tests an application's stability and response time by gradually increasing the load beyond the number of users the application is designed for, in order to find the point at which the application fails. For example, if an application handles 1000 users at a time with a response time of 2 seconds, scalability testing can be done by applying a load of more than 1000 users and gradually increasing the number of users to find out exactly where the application crashes. Suppose the application gives the following response times:

• 1000 users: 2 sec
• 1400 users: 2 sec
• 4000 users: 3 sec
• 5000 users: 45 sec
• 5150 users: crash (this is the point that scalability testing needs to identify)

d) Volume Testing (Flood Testing)

Volume testing tests an application's stability and response time by transferring a large volume of data to the database. In essence, it tests the capacity of the database to handle the data.

e) Endurance Testing (Soak Testing)

Endurance testing tests an application's stability and response time by applying load continuously over a long period to verify that the application keeps working. For example, car companies perform soak testing to verify that users can drive cars continuously for hours without any problem.

3) Usability Testing

Usability testing tests an application from the user's perspective to check its look and feel and its user-friendliness. For example, a tester is performing usability testing on a mobile app for stock trading. The tester can check whether the app is easy to operate with one hand, whether the scroll bar is vertical, whether the background colour of the app is black, and whether stock prices are displayed in red or green. The main idea of usability testing for this kind of app is that the user should get a glance at the market as soon as the app opens.

a) Exploratory Testing

Exploratory testing is informal testing performed by the testing team. Its objective is to explore the application and look for defects that exist in it. Testers use their knowledge of the business domain to test the application. Test charters are used to guide exploratory testing.

b) Cross Browser Testing

Cross browser testing tests an application on different browsers, operating systems, and mobile devices to check its look and feel and its performance. Different users use different operating systems, browsers, and mobile devices, and the goal is to provide a good user experience regardless of the device. BrowserStack provides a wide range of browser versions and mobile devices for testing an application. For learning purposes, the free trial offered by BrowserStack for a few days is a good starting point.

c) Accessibility Testing

The aim of accessibility testing is to determine whether the software or application is accessible to people with disabilities. Here, disability covers deafness, colour blindness, cognitive disabilities, blindness, age-related impairments, and other groups. Various checks are performed, such as font size for users with visual impairments and colour and contrast for users with colour blindness.

4) Compatibility Testing

Compatibility testing validates how the software behaves and runs in different environments, including different web servers, hardware, and network environments.
8. SCREENSHOTS

1) HOME PAGE
This is the home page of the project.

2) CONTACT INFORMATION
At the bottom of the home page are the contact details and social links for customer support.

3) ADMIN LOGIN
This is the admin login page, which requests the admin's credentials.

4) ADMIN LOGIN SUCCESSFUL
This screenshot shows a successful admin login.

5) PENDING USERS TO BE AUTHORIZED BY ADMIN
This screenshot shows the pending users who have registered for the app. The admin has the privilege to accept or deny each user's registration request.

6) ALL THE AUTHORIZED USERS
This screen shows all the users who have registered with the app and been accepted by the admin.

7) USER REGISTRATION AND LOGIN
This screen displays the user registration form, where any user can register by providing their information and creating their credentials.

8) USER LOGIN SUCCESSFUL
This screen displays a successful user login.

9) ANALYSIS PAGE
This screen displays the user functions. Here we click on the "ANALYSIS" button.

10) SEARCH BAR FOR ANALYSING YOUTUBE COMMENTS
We paste any YouTube video link into the search bar, and the app searches for that video through the YouTube API.

11) ANALYSIS OF A CRYPTOCURRENCY VIDEO BASED ON YOUTUBE COMMENTS
After successfully finding the specified YouTube video, the program collects and categorizes all the comments on that video and prepares a detailed analysis based on the content of those comments.

12) CATEGORIZING YOUTUBE COMMENTS
On this screen the comments on the YouTube video are categorized as 'positive', 'very positive', 'neutral', 'negative', or 'very negative', and each category is also symbolized with emojis from the YouTube API.

13) USER PROFILE
Clicking on the "PROFILE" button displays the details provided by that user.

14) USER LOGOUT
Clicking on the "LOGOUT" button logs the user out.
9. CONCLUSIONS

The proposed YouTube comment sentiment analysis system for cryptocurrency offers a strong solution, using a stacked ensemble model to achieve 94.2% accuracy. The sentiment analysis model, designed specifically for cryptocurrency discussions on YouTube, is a noteworthy step toward understanding market sentiment. It is tuned to the subtleties of cryptocurrency terminology and trends, which mitigates the drawbacks of existing models. With its multimodal analysis and its capacity to adjust in real time to market dynamics, it offers a thorough view of sentiment, which is essential for making sound investment decisions. Its optimization for YouTube features also improves the accuracy of sentiment analysis while taking user privacy into account. Overall, this approach is a useful tool for navigating volatile cryptocurrency markets, helping both traders and investors make better-informed decisions.
10. REFERENCES

[1] P. D. Devries, “An Analysis of Cryptocurrency, Bitcoin, and the Future,” International Journal of Business Management and Commerce, vol. 1, no. 2, 2016, Accessed: Jan. 13, 2022. [Online]. Available: www.ijbmcnet.com
[2] Y. Liu and A. Tsyvinski, “Risks and Returns of Cryptocurrency,” The Review of Financial Studies, vol. 34, no. 6, pp. 2689–2727, May 2021, doi: 10.1093/RFS/HHAA113.
[3] A. Yadav and D. K. Vishwakarma, “Sentiment analysis using deep learning architectures: a review,” Artificial Intelligence Review, vol. 53, no. 6, pp. 4335–4385, Aug. 2020, doi: 10.1007/s10462-019-09794-5.
[4] A. Jain, S. Tripathi, H. Dhardwivedi, and P. Saxena, “Forecasting Price of Cryptocurrencies Using Tweets Sentiment Analysis,” 2018 11th International Conference on Contemporary Computing, IC3 2018, Nov. 2018, doi: 10.1109/IC3.2018.8530659.
[5] A. Inamdar, A. Bhagtani, S. Bhatt, and P. M. Shetty, “Predicting cryptocurrency value using sentiment analysis,” 2019 International Conference on Intelligent Computing and Control Systems, ICCS 2019, pp. 932–934, May 2019, doi: 10.1109/ICCS45141.2019.9065838.
[6] C. Lamon, E. Nielsen, and E. Redondo, “Cryptocurrency Price Prediction Using News and Social Media Sentiment,” 2017.
[7] “MoneyZG - YouTube.” https://www.youtube.com/c/MoneyZG (accessed Jan. 18, 2022).
[8] “Honestly by Tanmay Bhat - YouTube.” https://www.youtube.com/c/HonestlybyTanmayBhat (accessed Jan. 18, 2022).
[9] “Tech Burner - YouTube.” https://www.youtube.com/c/TechBurner (accessed Jan. 18, 2022).
[10] D. Varshney and D. K. Vishwakarma, “A unified approach for detection of Clickbait videos on YouTube using cognitive evidences,” doi: 10.1007/s10489-020-02057-9.
[11] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, vol. 1, pp. 4171–4186, Oct. 2018, Accessed: Jan. 13, 2022. [Online]. Available: https://arxiv.org/abs/1810.04805v2
[12] N. Bahrawi, “Sentiment Analysis Using Random Forest Algorithm-Online Social Media Based,” Journal of Information Technology and Its Utilization, vol. 2, no. 2, p. 29, Dec. 2019, doi: 10.30818/JITU.2.2.2695.
[13] J. L. Alzen, L. S. Langdon, and V. K. Otero, “A logistic regression investigation of the relationship between the Learning Assistant model and failure rates in introductory STEM courses,” International Journal of STEM Education, vol. 5, no. 1, pp. 1–12, Dec. 2018, doi: 10.1186/S40594-018-0152-1.
[14] H. H. Patel and P. Prajapati, “Study and Analysis of Decision Tree Based Classification Algorithms,” International Journal of Computer Sciences and Engineering, vol. 6, no. 10, pp. 74–78, Oct. 2018, doi: 10.26438/IJCSE/V6I10.7478.
[15] J. Cervantes, F. Garcia-Lamont, L. Rodríguez-Mazahua, and A. Lopez, “A comprehensive survey on support vector machine classification: Applications, challenges and trends,” Neurocomputing, vol. 408, pp. 189–215, Sep. 2020, doi: 10.1016/J.NEUCOM.2019.10.118.
[16] A. H. Jahromi and M. Taheri, “A non-parametric mixture of Gaussian naive Bayes classifiers based on local independent features,” 19th CSI International Symposium on Artificial Intelligence and Signal Processing, AISP 2017, vol. 2018-January, pp. 209–212, Mar. 2018, doi: 10.1109/AISP.2017.8324083.
[17] A. A. Abdullah, S. A. Hafidz, and W. Khairunizam, “Research and Implementation of Machine Learning Classifier Based on KNN,” doi: 10.1088/1757-899X/677/5/052038.
[18] B. Xu, X. Guo, Y. Ye, and J. Cheng, “An improved random forest classifier for text categorization,” Journal of Computers (Finland), vol. 7, no. 12, pp. 2913–2920, 2012, doi: 10.4304/JCP.7.12.2913-2920.
[19] A. Sharaff and H. Gupta, “Extra-Tree Classifier with Metaheuristics Approach for Email Classification,” vol. 924, pp. 189–197, 2019, doi: 10.1007/978-981-13-6861-5_17.
[20] C. Tu, H. Liu, and B. Xu, “AdaBoost typical Algorithm and its application research,” MATEC Web of Conferences, vol. 139, p. 00222, Dec. 2017, doi: 10.1051/MATECCONF/201713900222.