International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 03 | Mar-2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 3021
A Comprehensive way of finding Top-K Competitors using
C-Miner Algorithm
Sk. Wasim Akram1, G. Manoj Babu2, D. Pratap Roy3, G. Lakshmi Narayana Reddy4
1Asst. Professor, Dept. of Computer Science and Engineering, VVIT, AP, India
2,3,4Student, Dept. of Computer Science and Engineering, VVIT, AP, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – To succeed in any business environment, a firm
must attract customers more effectively than its
competitors. Several difficulties arise in this task: how to
formalize and compute the competitiveness relationship
between two items, how to find the true competitors of a
given item, and how to identify the features of an item that
most affect its competitiveness. Despite the impact and
relevance of this problem to many domains, only a limited
amount of work has been devoted toward an efficient
solution. In this paper, we present a formal definition of the
competitiveness between two items. We present an efficient
method for evaluating competitiveness between items in
large datasets and address the natural problem of finding
the top-k competitors of a given item. Our approach is
evaluated against strong baselines via a user study and
experiments on multiple datasets from various domains.
Key Words: Contenders, Pyramid Finder, C-Miner,
SkyLine, Cluster.
1. INTRODUCTION
Data mining is the process of sorting through large data
sets to discover patterns and establish relationships to
solve problems through data analytics [1]. Data mining
tools allow enterprises to predict future trends. A long
line of research has demonstrated the strategic importance
of identifying and monitoring a firm's competitors.
Motivated by this problem, the marketing and management
community has focused on empirical methods for
competitor identification as well as on methods for
analyzing known competitors. Existing research on the
former has focused on mining comparative expressions
(e.g. "Item A is better than Item B") from the Web or
other textual sources. Even though such expressions can
indeed be indicators of competitiveness, they are absent in
many domains. For example, consider the domain of
vacation packages. In this case, items have no assigned
name by which they can be queried or compared with each
other. Further, the frequency of textual comparative
evidence can vary greatly across domains. For example,
when comparing brand names at the firm level (e.g.
"Google vs. Yahoo" or "Sony vs. Panasonic"), it is indeed
likely that comparative patterns can be found by simply
querying the web. However, it is easy to identify
mainstream domains where such evidence is extremely
scarce, such as shoes, jewelry, hotels, restaurants, and
furniture. Motivated by these shortcomings, we propose a
new formalization of the competitiveness between two
items, based on the market segments that they can both
cover.
2. EXISTING SYSTEM
The management literature is rich with works that focus
on how managers can manually identify competitors. Some
of these works model competitor identification as a mental
categorization process in which managers develop mental
representations of competitors and use them to classify
candidate firms. Other manual categorization methods are
based on market- and resource-based similarities between
a firm and candidate competitors.
2.1 DISADVANTAGES OF EXISTING SYSTEM
The existing approach is not appropriate for evaluating the
competitiveness of any two items or firms in a given
market. Instead, the authors assume that the set of
competitors is given and, thus, their goal is to compute the
value of the chosen measures for each competitor. In
addition, the dependency on transactional data is a
limitation that our approach does not have.
3. PROPOSED SYSTEM
We propose a new formalization of the competitiveness
between two items, based on the market segments that
they can both cover. We describe a method for computing
all the segments in a given market based on mining large
review datasets. This method allows us to operationalize
our definition of competitiveness and address the problem
of finding the top-k competitors of an item in any given
market [2].
3.1 COMPETITIVENESS
Let U be the population of all possible customers in a
given market. We consider that an item i covers a customer
u ∈ U if it can cover all of the customer's requirements.
Then, the competitiveness between two items i, j is
proportional to the number of customers that they can both
cover [2].
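Stated compactly, and using notation that we introduce here only for illustration (the symbols cov and C are assumptions and do not appear in the source), the definition reads:

```latex
C(i, j) \;\propto\; \bigl|\{\, u \in U : \mathrm{cov}(i, u) \wedge \mathrm{cov}(j, u) \,\}\bigr|
```

Here cov(i, u) holds when item i covers customer u, so the right-hand side simply counts the customers that both items cover.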
Fig - 1: Example of competitiveness on items
The above figure illustrates the competitiveness between
three items i, j and k. Every item is mapped to the set of
features that it can offer to a customer. We consider three
features in this example: X, Y and Z. The left side of Fig. 1
shows three groups of customers: group1, group2 and
group3. Every group represents a different market
segment. Customers are grouped based on their
preferences with respect to the features. For example, the
customers in group2 are only interested in features X and
Y. We observe that items j and k are not competitive, since
they do not appeal to the same groups of customers
[2] [3].
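To make the group-based reading of competitiveness concrete, the following is a minimal Python sketch. The feature sets assigned to i, j and k, the requirements of each group, and the group sizes are all hypothetical values chosen for illustration; the text does not specify them, and they are not read off Fig. 1.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical feature sets for the three items (assumed, not from Fig. 1).
ITEMS: Dict[str, FrozenSet[str]] = {
    "i": frozenset("XYZ"),
    "j": frozenset("XY"),
    "k": frozenset("Z"),
}

# Hypothetical segments: (features the group requires, number of customers).
SEGMENTS: Dict[str, Tuple[FrozenSet[str], int]] = {
    "group1": (frozenset("X"), 4),
    "group2": (frozenset("XY"), 3),
    "group3": (frozenset("Z"), 3),
}


def covers(item_features: FrozenSet[str], needs: FrozenSet[str]) -> bool:
    """An item covers a group if it offers every feature the group requires."""
    return needs <= item_features


def competitiveness(a: str, b: str) -> float:
    """Fraction of all customers that items a and b can both cover."""
    total = sum(size for _, size in SEGMENTS.values())
    shared = sum(
        size
        for needs, size in SEGMENTS.values()
        if covers(ITEMS[a], needs) and covers(ITEMS[b], needs)
    )
    return shared / total


print(competitiveness("i", "j"))  # 0.7: both cover group1 and group2
print(competitiveness("j", "k"))  # 0.0: no group is covered by both
```

Under these assumed values, j and k share no group and therefore do not compete, which mirrors the observation made about the figure.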
This example illustrates the ideal scenario, in which we
have access to the complete set of customers in a given
market, as well as to specific market segments and their
requirements. In practice, however, such information is not
available. In order to overcome this, we describe a method
for computing all the segments in a given market based on
mining large review datasets. This method allows us to
operationalize our definition of competitiveness and
address the problem of finding the top-k competitors of an
item in any given market. As we show in our work, this
problem presents significant computational challenges,
especially in the presence of large datasets with hundreds
or thousands of items, such as those that are often found in
mainstream domains. We address these challenges via a
highly scalable framework for top-k computation,
including an efficient evaluation algorithm and an
appropriate index.
The typical user session on a review platform, such as
Yelp, Amazon or TripAdvisor, consists of the following
steps:
1) Specify all required features in a query.
2) Submit the query to the website's search engine and
retrieve the matching items.
3) Process the reviews of the returned items and make a
purchase decision.
In this setting, items that cover the user's requirements
will be included in the search engine's response and will
compete for her attention. On the other hand, non-covering
items will not be considered by the user and, thus, will not
have a chance to compete. Next, we show an example that
extends this decision-making process to a multi-user
setting [4].
3.2 PYRAMID FINDER
Given the skyline Sky(I) of a set of items I and an item
i ∈ I, let Y contain the k items from Sky(I) that are most
competitive with i. Then, an item j ∈ I can only be in the
top-k competitors of i if j ∈ Y or if j is dominated by one
of the items in Y. Hence, we do not need to consider the
entire set of candidates in order to find the top-k
competitors. This motivates us to construct the skyline
pyramid, a structure that greatly reduces the number of
items that need to be considered. We refer to the algorithm
used to construct the skyline pyramid as Pyramid Finder
[5][6].
Pseudo code:
Input: Set of items I
Output: Dominance Pyramid DI
DI[0] ← Sky(I)
Z ← I \ Sky(I)
level ← 1
while Z is not empty do
    DI[level] ← Sky(Z)
    for every item j ∈ DI[level] do
        for every item i ∈ DI[level - 1] do
            if i dominates j then
                Add a link i → j
                break
            end if
        end for
    end for
    Z ← Z \ Sky(Z)
    level ← level + 1
end while
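The pseudo code above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: we assume that each item is a tuple of numeric feature values where larger is better, and that an item dominates another if it is at least as good on every feature and strictly better on at least one. The names dominates, skyline and pyramid_finder, and the sample items, are ours.

```python
from typing import Dict, List, Tuple

Item = Tuple[float, ...]


def dominates(a: Item, b: Item) -> bool:
    """Assumed dominance test: a >= b everywhere and a > b somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(items: Dict[str, Item]) -> List[str]:
    """Names of the items that no other item in the set dominates."""
    return [
        name
        for name, vec in items.items()
        if not any(dominates(other, vec) for o, other in items.items() if o != name)
    ]


def pyramid_finder(items: Dict[str, Item]) -> Tuple[List[List[str]], Dict[str, str]]:
    """Peel off successive skylines; link each item to one dominator above it."""
    layers: List[List[str]] = []
    links: Dict[str, str] = {}  # dominated item -> its linked dominator
    remaining = dict(items)
    while remaining:
        layer = skyline(remaining)
        if layers:
            for j in layer:
                for i in layers[-1]:
                    if dominates(items[i], items[j]):
                        links[j] = i  # the link i -> j of the pseudo code
                        break
        layers.append(layer)
        for name in layer:
            del remaining[name]
    return layers, links


# Hypothetical two-feature items, chosen only for illustration.
SAMPLE = {"a": (5, 5), "b": (4, 6), "c": (3, 3), "d": (1, 1)}
LAYERS, LINKS = pyramid_finder(SAMPLE)
print(LAYERS)  # [['a', 'b'], ['c'], ['d']]
print(LINKS)   # {'c': 'a', 'd': 'c'}
```

The quadratic skyline routine keeps the sketch short; a progressive skyline algorithm could be substituted without changing the layering logic.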
3.3 THE C-MINER ALGORITHM
Next, we present C-Miner, an exact algorithm for finding
the top-k competitors of a given item. Our algorithm makes
use of the skyline pyramid in order to reduce the number
of items that need to be considered. Given that we only
care about the top-k competitors, we can incrementally
compute the score of each candidate and stop when it is
guaranteed that the top-k have emerged [7].
The input includes the set of items I, the set of features F,
the item of interest i, the number k of top competitors to
retrieve, the set Q of queries and their probabilities, and
the skyline pyramid DI. The algorithm first retrieves the
items that dominate i. These items have the maximum
possible competitiveness with i. If at least k such items
exist, we report those and conclude. Otherwise, we add
them to Top-k and decrement our budget of k accordingly.
The variable LB maintains the lowest lower bound from
the current top-k set and is used to prune candidates. We
initialize the set of candidates X as the union of the items
in the first layer of the pyramid and the set of items
dominated by those already in Top-k [8].
This is achieved by calling the GETSLAVES() routine with
the parameters (Top-k, DI). In each iteration, C-Miner
feeds the set of candidates X to the UPDATETOPK()
routine, which prunes items based on the LB threshold. It
then updates the Top-k set via the MERGE() routine,
which identifies the items with the highest competitiveness
from Top-k ∪ X. This can be achieved in linear time, since
both X and Top-k are sorted. The pruning threshold LB is
set to the worst (lowest) score among the new Top-k.
Finally, the GETSLAVES() routine is used to expand the
set of candidates by including items that are dominated by
those in X [9].
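The linear-time merge of two sorted lists described above can be sketched in Python as follows. The source does not state how MERGE() is implemented, so the use of heapq.merge, the (score, id) pair layout, and the function name merge_top_k are our assumptions for illustration only.

```python
import heapq
from itertools import islice
from typing import List, Tuple

Scored = Tuple[float, str]  # (competitiveness score, item id)


def merge_top_k(top_k: List[Scored], candidates: List[Scored], k: int) -> List[Scored]:
    """Merge two lists already sorted by descending score and keep the best k.

    heapq.merge consumes both inputs lazily, so with two inputs the work is
    linear in the number of elements actually read.
    """
    merged = heapq.merge(top_k, candidates, key=lambda pair: pair[0], reverse=True)
    return list(islice(merged, k))


# Hypothetical scores, chosen only for illustration.
CURRENT = [(0.9, "a"), (0.5, "b")]
NEW = [(0.7, "c"), (0.4, "d")]
BEST = merge_top_k(CURRENT, NEW, k=3)
LB = BEST[-1][0]  # pruning threshold: the worst score in the new Top-k
print(BEST)  # [(0.9, 'a'), (0.7, 'c'), (0.5, 'b')]
print(LB)    # 0.5
```

After the merge, the threshold LB is read from the last retained pair, matching the rule that LB equals the lowest score in the new Top-k.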
The UPDATETOPK() routine processes the candidates in X
and finds at most k candidates with the highest
competitiveness with i. The routine uses a data structure,
the local Top-k, implemented as an associative array: the
score of each candidate serves as the key, while its id
serves as the value. The array is key-sorted, to facilitate the
computation of the k best items. The structure is
automatically truncated so that it always contains at most
k items. We initialize the lower and upper bounds. For each
item j ∈ X, low(j) maintains the current competitiveness
score of j as new queries are considered and serves as a
lower bound to the candidate's actual score. Each lower
bound low(j) starts from 0, and after the completion of
UPDATETOPK(), it includes the true competitiveness score
CF(i, j) of candidate j with the focal item i. On the other
hand, up(j) is an optimistic upper bound on j's
competitiveness score. Initially, up(j) is set to the
maximum possible score. For each query q ∈ Q, maxV
holds the maximum possible competitiveness between item
i and any other item for that query, which is in fact the
coverage of i with respect to q. Then, for each candidate
j ∈ X, we subtract maxV from up(j) and then add to it the
actual competitiveness between i and j for query q. If the
upper bound up(j) of a candidate j becomes lower than the
pruning threshold LB, then j can be safely disqualified.
Otherwise, low(j) is updated and j remains in
consideration. After each update, the value of LB is set to
the worst score in the local Top-k, to employ stricter
pruning in future iterations.
If the number of candidates |X| becomes less than or equal
to k, the loop over the queries stops. This is an
early-stopping criterion: since our goal is to retrieve the
best k candidates in X, having |X| <= k means that all
remaining candidates should be returned. We complete the
competitiveness computation of the remaining candidates
and update the local Top-k accordingly. This takes place
after the completion of the first loop, in order to avoid
unnecessary bound-checking and improve performance
[10].
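The bound maintenance and early stopping of UPDATETOPK() can be sketched in Python as follows. This is a simplified sketch under several assumptions of ours, none of which come from the source: items and queries are plain feature sets, a query contributes its probability p(q) to a pair's score only when both items contain every queried feature, LB is refreshed once per query rather than after every single update, and plain dictionaries stand in for the key-sorted associative array. The function name, parameter names and sample data are hypothetical, and k is assumed to be at least 1.

```python
from typing import Dict, FrozenSet, List, Tuple

Features = FrozenSet[str]
Query = Tuple[Features, float]  # (required features, probability p(q))


def update_top_k(focal: Features, candidates: Dict[str, Features],
                 queries: List[Query], k: int,
                 lb: float = 0.0) -> List[Tuple[str, float]]:
    # Highest score any candidate can reach: total coverage of the focal item.
    max_score = sum(p for q, p in queries if q <= focal)
    low = {j: 0.0 for j in candidates}       # lower bounds, start from 0
    up = {j: max_score for j in candidates}  # optimistic upper bounds
    alive = set(candidates)
    pos = 0
    # Main loop: stops early once at most k candidates remain.
    while pos < len(queries) and len(alive) > k:
        q, p = queries[pos]
        pos += 1
        max_v = p if q <= focal else 0.0     # coverage of the focal item for q
        for j in list(alive):
            gain = max_v if q <= candidates[j] else 0.0
            up[j] += gain - max_v            # subtract maxV, add the actual score
            if up[j] < lb:
                alive.discard(j)             # j cannot reach the top-k any more
            else:
                low[j] += gain
        if len(alive) >= k:                  # tighten the pruning threshold
            kth = sorted((low[j] for j in alive), reverse=True)[k - 1]
            lb = max(lb, kth)
    # After the loop: finish the survivors' scores without bound checks.
    for q, p in queries[pos:]:
        if q <= focal:
            for j in alive:
                if q <= candidates[j]:
                    low[j] += p
    ranked = sorted(alive, key=lambda j: (-low[j], j))[:k]
    return [(j, low[j]) for j in ranked]


# Hypothetical hotel-style data, chosen only for illustration.
FOCAL = frozenset({"wifi", "pool", "bar"})
CANDIDATES = {
    "a": frozenset({"wifi", "pool", "bar"}),
    "b": frozenset({"wifi", "pool"}),
    "c": frozenset({"wifi"}),
    "d": frozenset({"bar"}),
}
QUERIES = [
    (frozenset({"wifi"}), 0.5),
    (frozenset({"wifi", "pool"}), 0.25),
    (frozenset({"bar"}), 0.25),
]
print(update_top_k(FOCAL, CANDIDATES, QUERIES, k=2))  # [('a', 1.0), ('b', 0.75)]
```

With k = 2, candidate d is pruned after the second query and c after the third, because their upper bounds fall below LB. With k = 4 the main loop never runs, since |X| <= k already holds, and all four scores are completed in the final loop.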
3.4 ADVANTAGES OF PROPOSED SYSTEM
1. A formal definition of the competitiveness between two
items, based on their appeal to the various customer
segments in their market. Our approach overcomes the
reliance of previous work on scarce comparative evidence
mined from text.
2. A formal methodology for the identification of the
different types of customers in a given market, as well as
for the estimation of the percentage of customers that
belong to each type.
3. A highly scalable framework for finding the top-k
competitors of a given item in very large datasets.
4. CONCLUSION
In this work, we provide a formal definition of the
competitiveness between items, based on the features that
users specify for different items. The C-Miner algorithm,
combined with Pyramid Finder, provides an efficient way of
identifying the top-k competitors by considering factors
such as the preferences and opinions of the users; finally, it
generates clusters by validating the scores of the candidates
to find the k best items. The proposed framework is
designed to handle large datasets drawn from various
domains, which serve as key datasets for finding top-k
competitors. Our methodology is efficient and applicable to
real datasets from a variety of domains.
REFERENCES
[1] http://searchsqlserver.techtarget.com/definition/data-mining.
[2] George Valkanas, Theodoros Lappas, and Dimitrios
Gunopulos, "Mining Competitors from Large Unstructured
Datasets", IEEE Transactions on Knowledge and Data
Engineering, DOI: 10.1109/TKDE.2017.2705101.
[3] Theodoros Lappas, George Valkanas, Dimitrios
Gunopulos, "Efficient and Domain-Invariant Competitor
Mining", 2012.
[4] Mark Bergen and Margaret A. Peteraf, "Competitor
Identification and Competitor Analysis: A Broad-Based
Managerial Approach", 2002.
[5] Sanket Shah, Amit Thakkar, Sonal Rami, "A Novel
Approach for Making Recommendation using Skyline
Query based on user Location and Preference", Indian
Journal of Science and Technology, Vol 9(30), DOI:
10.17485/ijst/2016/v9i30/99075, August 2016.
[6] Kian-Lee Tan, Pin-Kwang Eng, Beng Chin Ooi,
"Efficient Progressive Skyline Computation", 2001.
[7] Rui Li, Shenghua Bao, Jin Wang, Yong Yu, "CoMiner: An
Effective Algorithm for Mining Competitors from the Web",
2006.
[8] Qian Wan, Raymond Chi-Wing Wong, Yu Peng,
"Finding Top-k Profitable Products", 2012.
[9] Maksim Lapin, Matthias Hein, and Bernt Schiele,
"Analysis and Optimization of Loss Functions for
Multiclass, Top-k, and Multilabel Classification", 12 Dec
2016.
[10] Vincent S. Tseng, "Efficient Algorithms for Mining
Top-k High Utility Itemsets", IEEE, vol. 28, January 2016.

More Related Content

PDF
IRJET- Ad-Click Prediction using Prediction Algorithm: Machine Learning Approach
PDF
Residential-Forward-Vs-Spot-Market-Empirical-Analysis-of-Property-Prices-Inve...
PDF
Predicting Intraday Prices in the Frontier Stock Market of Romania Using Mach...
PDF
L3 1b
PDF
Modeling+pricing+strategies+using+game+theory+and+support+vector+machines
PDF
The disruptometer: an artificial intelligence algorithm for market insights
PDF
IRJET- Financial Analysis using Data Mining
PDF
Econometrics project
IRJET- Ad-Click Prediction using Prediction Algorithm: Machine Learning Approach
Residential-Forward-Vs-Spot-Market-Empirical-Analysis-of-Property-Prices-Inve...
Predicting Intraday Prices in the Frontier Stock Market of Romania Using Mach...
L3 1b
Modeling+pricing+strategies+using+game+theory+and+support+vector+machines
The disruptometer: an artificial intelligence algorithm for market insights
IRJET- Financial Analysis using Data Mining
Econometrics project

What's hot (14)

PDF
Lecture 2: NBERMetrics
PDF
Order Flow Imbalance Trading Algorithm
PPT
Demand Estimation
PDF
Deriving insights from data using "R"ight way
PDF
cCorrGAN: Conditional Correlation GAN for Learning Empirical Conditional Dist...
PPTX
Final presentation
PPT
Lecture 4
PDF
Predicting Bank Customer Churn Using Classification
PDF
50120140503005
PPTX
Final Presentation
PDF
Note about CFAR-m
PDF
A Tour through the Data Vizualization Zoo - Communications of the ACM
PDF
UNDERSTANDING CUSTOMERS' EVALUATIONS THROUGH MINING AIRLINE REVIEWS
PPTX
ForecastIT 2. Linear Regression & Model Statistics
Lecture 2: NBERMetrics
Order Flow Imbalance Trading Algorithm
Demand Estimation
Deriving insights from data using "R"ight way
cCorrGAN: Conditional Correlation GAN for Learning Empirical Conditional Dist...
Final presentation
Lecture 4
Predicting Bank Customer Churn Using Classification
50120140503005
Final Presentation
Note about CFAR-m
A Tour through the Data Vizualization Zoo - Communications of the ACM
UNDERSTANDING CUSTOMERS' EVALUATIONS THROUGH MINING AIRLINE REVIEWS
ForecastIT 2. Linear Regression & Model Statistics
Ad

Similar to IRJET- A Comprehensive way of finding Top-K Competitors using C-Miner Algorithm (20)

PDF
IRJET-Survey on Identification of Top-K Competitors using Data Mining
DOCX
MINING COMPETITORS FROM LARGE UNSTRUCTURED DATASETS
PDF
IRJET- Finding Optimal Skyline Product Combinations Under Price Promotion
PDF
Variance rover system web analytics tool using data
PDF
Variance rover system
PDF
An Improvised Fuzzy Preference Tree Of CRS For E-Services Using Incremental A...
PDF
A NEW HYBRID ALGORITHM FOR BUSINESS INTELLIGENCE RECOMMENDER SYSTEM
PDF
A new hybrid algorithm for business intelligence recommender system
PPT
Mining competitors from large unstructured data
PDF
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE CASE STUDY: AUTOMOBILE INSURAN...
PPTX
Data mining primitives
PDF
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...
PPT
1.11.association mining 3
PDF
Av24317320
PPTX
Fundamental of Data Science BCA 6th Sem Notes
PPTX
Fundamental of Data Science BCA 6th Sem ppt
PPTX
Fundamental of Data Science BCA 6th Sem Notes
PDF
IRJET- User Preferences and Similarity Estimation
PDF
Paper id 212014126
PDF
IRJET-Survey on Identification of Top-K Competitors using Data Mining
MINING COMPETITORS FROM LARGE UNSTRUCTURED DATASETS
IRJET- Finding Optimal Skyline Product Combinations Under Price Promotion
Variance rover system web analytics tool using data
Variance rover system
An Improvised Fuzzy Preference Tree Of CRS For E-Services Using Incremental A...
A NEW HYBRID ALGORITHM FOR BUSINESS INTELLIGENCE RECOMMENDER SYSTEM
A new hybrid algorithm for business intelligence recommender system
Mining competitors from large unstructured data
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE CASE STUDY: AUTOMOBILE INSURAN...
Data mining primitives
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...
1.11.association mining 3
Av24317320
Fundamental of Data Science BCA 6th Sem Notes
Fundamental of Data Science BCA 6th Sem ppt
Fundamental of Data Science BCA 6th Sem Notes
IRJET- User Preferences and Similarity Estimation
Paper id 212014126
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PPTX
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPTX
Sustainable Sites - Green Building Construction
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PPTX
CH1 Production IntroductoryConcepts.pptx
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PPTX
Lecture Notes Electrical Wiring System Components
PDF
composite construction of structures.pdf
PPTX
Internet of Things (IOT) - A guide to understanding
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PPTX
OOP with Java - Java Introduction (Basics)
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
additive manufacturing of ss316l using mig welding
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
CYBER-CRIMES AND SECURITY A guide to understanding
Sustainable Sites - Green Building Construction
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
CH1 Production IntroductoryConcepts.pptx
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
Embodied AI: Ushering in the Next Era of Intelligent Systems
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
Automation-in-Manufacturing-Chapter-Introduction.pdf
Lecture Notes Electrical Wiring System Components
composite construction of structures.pdf
Internet of Things (IOT) - A guide to understanding
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
OOP with Java - Java Introduction (Basics)
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
additive manufacturing of ss316l using mig welding
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026

IRJET- A Comprehensive way of finding Top-K Competitors using C-Miner Algorithm

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 03 | Mar-2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 3021 A Comprehensive way of finding Top-K Competitors using C-Miner Algorithm Sk. Wasim Akram1, G. Manoj Babu2, D. Pratap Roy3, G. Lakshmi Narayana Reddy4 1Asst. professor, Dept. of Computer Science and Engineering, VVIT, AP, India 2,3,4 Student , Dept. of Computer Science and Engineering, VVIT, AP, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract – In order to get success in any business environment it is important to attract the customers than the competitor. A number of difficulties arise in the perspective of this task that is to find a method to formalize and compute the competitiveness relationship between two items and to find the true competitors of a given item also to know the features of an item that most affects its competitiveness. Despite the impact and relevance of this problem to many domains, only a limited amount of work has been devoted toward an efficient solution. In this paper, we present a formal definition of the competitiveness between two items. An efficient method is presented for evaluating competitiveness between items in large datasets and address the natural problem of displaying the top-k competitors of a given item. Our approach is evaluated against strong baselines via a user study and experiments on multiple datasets from various domains. Key Words: Contenders, Pyramid Finder, C-Miner, SkyLine, Cluster. 1. INTRODUCTION Data mining is the process of sorting through large data sets to discover patterns and establish relationships to solve problems through data analytics [1]. Data mining tools allow enterprises to predict future trends. 
A Long line of research has exhibited the vital significance of recognizing and observing firms rivals. Propelled by this issue, the promoting and administration group have concentrated on exact strategies for competitors generation and in addition to techniques for breaking down known contenders. Surviving exploration of the previous has concentrated on mining near articulations (e.g. "Thing A is superior to Item B") from the Web or other literary sources. Despite the fact that such articulations can without a doubt be pointers of competitiveness, they are truant in numerous spaces. For example, think about the area of get-away bundles. For this situation, things have no doled out the name by which they can be questioned or contrasted and each other. Further, the recurrence of printed relative proof can differ enormously crosswise over areas. For instance, when looking at mark names at the firm level (e.g." Google versus Yahoo" or "Sony versus Panasonic"), it is to be sure likely that relative examples can be found by basically questioning the web. Notwithstanding, it is anything but difficult to recognize standard spaces where such proof is to a great degree rare, for example, shoes, gems, inns, eateries, and furniture. Roused by these weaknesses, we propose another formalization of the competitiveness between two things, in the market sections that they can both cover. 2. EXISTING SYSTEM The administration writing is rich with works that emphasis on how directors can physically recognize competitors. Some of these works demonstrate contender recognizable proof as a psychological classification process in which administrators create mental portrayals of contenders and utilize them to order competitor firms. Other manual classification techniques depend on market- and asset-based similarities between a firm and applicant contenders. 
2.1 DISADVANTAGES OF EXISTING SYSTEM The existing methodology isn't proper for assessing the intensity of any two things or firms in a given market. Rather, the creators accept that the arrangement of contenders is given and, in this way, they will likely register the estimation of the picked measures for every contender. What's more, the reliance on value-based information is a constraint we don't have. 3. PROPOSED SYSTEM We propose another formalization of the intensity between two things, in view of the market portions that they can both cover. We depict a strategy for processing every one of the sections in a given market in light of mining vast survey datasets. This strategy enables us to functionalize our meaning of competitiveness and address the issue of finding the best k contenders of a thing in any given market [2]. 3.1 COMPETITIVENESS Give U a chance to be the number of inhabitants in every single conceivable client in a given market. We look at that as a thing I covers a client u ∈ U in the event that it can cover the greater part of the client's necessities. At that point, the competitiveness between two things I, j is relative to the quantity of clients that they can both cover [2].
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 03 | Mar-2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 3022 Fig - 1: Example of competitiveness on items The above figure shows the competitiveness between 3 things i, j and k. Every item is mapped to the set of features that it can offer to a customer. We Consider 3 features in this example they are X, Y, Z. The left side of the Fig.1 shows three groups of customers named as group1, group2 and group3. Every group represents a different market segment. Customers are grouped based on their preferences with respect to the features. Let us consider, the customers in group2 are only interested in features X and Y. We observe that items j and k are not competitive, since they are not appeal to the same groups of customers [2] [3]. This case represents the perfect situation, in which we approach the total arrangement of clients in a given market, and in addition to particular market sections and their necessities. Practically speaking, be that as it may, such data isn't accessible. With a specific end goal to beat this, we portray a strategy for registering every one of the fragments in a given market in light of mining extensive audit datasets. This strategy enables us to functionalize our meaning of competitiveness and address the issue of finding the best k contenders of a thing in any given market. As we appear in our work, this issue presents critical computational difficulties, particularly within the sight of huge datasets with hundreds or thousands of things, for example, those that are regularly found in standard spaces. We address these difficulties through an exceptionally adaptable structure for top-k calculation, including a productive assessment calculation and a fitting record. 
A typical user session on a review platform, such as Yelp, Amazon or TripAdvisor, consists of the following steps:
1) Specify all required features in a query.
2) Submit the query to the website's search engine and retrieve the matching items.
3) Process the reviews of the returned items and make a purchase decision.
In this setting, items that cover the user's requirements will be included in the search engine's response and will compete for her attention. On the other hand, non-covering items will not be considered by the user and, thus, will not have a chance to compete. Next, we present an example that extends this decision-making process to a multi-user setting [4].

3.2 PYRAMID FINDER
Given the skyline Sky(I) of a set of items I and an item i ∈ I, let Y contain the k items from Sky(I) that are most competitive with i. Then, an item j ∈ I can only be in the top-k competitors of i if j ∈ Y or if j is dominated by one of the items in Y. Therefore, we do not need to consider the entire set of candidates in order to compute the top-k competitors. This motivates us to construct the skyline pyramid, a structure that greatly reduces the number of items that need to be considered. We refer to the algorithm used to construct the skyline pyramid as Pyramid Finder [5][6].

Pseudo code:
Input: Set of items I
Output: Dominance pyramid DI
DI[0] ← Sky(I)
Z ← I \ Sky(I)
level ← 1
while Z is not empty do
    DI[level] ← Sky(Z)
    for every item j ∈ DI[level] do
        for every item i ∈ DI[level − 1] do
            if i dominates j then
                add a link i → j
                break
            end if
        end for
    end for
    Z ← Z \ Sky(Z)
    level ← level + 1
end while

3.3 THE C-MINER ALGORITHM
Next, we present C-Miner, an exact algorithm for finding the top-k competitors of a given item.
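The Pyramid Finder procedure of Section 3.2 can be sketched in Python as follows. This is an illustrative sketch only: the text does not define dominance, so the code assumes the standard skyline notion (an item dominates another if it is at least as good on every feature and strictly better on at least one). The skyline is computed with a naive quadratic scan, and the names dominates, skyline and pyramid_finder, as well as the sample items, are hypothetical.

```python
from typing import Dict, List, Tuple

Item = Tuple[float, ...]


def dominates(a: Item, b: Item) -> bool:
    """Assumed skyline dominance: a >= b on every feature and a > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(items: Dict[str, Item]) -> List[str]:
    """Ids of the items that no other item dominates (naive quadratic scan)."""
    return [
        a for a, va in items.items()
        if not any(dominates(vb, va) for b, vb in items.items() if b != a)
    ]


def pyramid_finder(items: Dict[str, Item]) -> Tuple[List[List[str]], Dict[str, str]]:
    """Return the skyline layers and, for each lower-layer item, one dominator link."""
    layers: List[List[str]] = [skyline(items)]
    links: Dict[str, str] = {}  # child id -> id of one dominating item in the layer above
    remaining = {k: v for k, v in items.items() if k not in layers[0]}
    while remaining:
        layer = skyline(remaining)
        for j in layer:
            for i in layers[-1]:
                if dominates(items[i], items[j]):
                    links[j] = i  # add a link i -> j and stop at the first dominator
                    break
        layers.append(layer)
        remaining = {k: v for k, v in remaining.items() if k not in layer}
    return layers, links


# Hypothetical items scored on two features (higher is better).
items = {"a": (5, 5), "b": (4, 6), "c": (3, 3), "d": (1, 1), "e": (4, 2)}
layers, links = pyramid_finder(items)
```

For the sample data, the first layer holds the two undominated items, and every item in a lower layer is linked to one item that dominates it in the layer directly above.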
Our algorithm makes use of the skyline pyramid in order to reduce the number of items that need to be considered. Given that we only care about the top-k competitors, we can incrementally compute the score of each candidate and stop when it is guaranteed that the top-k have emerged [7].

The input includes the set of items I, the set of features F, the item of interest i, the number k of top competitors to retrieve, the set Q of queries and their probabilities, and the skyline pyramid DI. The algorithm first retrieves the items that dominate i. These items have the maximum possible competitiveness with i. If at least k such items exist, we report those and conclude. Otherwise, we add them to Top-k and decrement our budget of k accordingly. The variable LB maintains the lowest lower bound from the current top-k set and is used to prune candidates. We initialize the set of candidates X as the union of the items in the first layer of the pyramid and the set of items dominated by those already in Top-k [8]. This is achieved by calling the GETSLAVES() routine with the parameters (Top-k, DI). In each iteration, C-Miner feeds the set of candidates X to the UPDATETOPK() routine, which prunes items based on the LB threshold. It then updates the Top-k set via the MERGE() routine, which identifies the items with the highest competitiveness from Top-k and X combined. This can be achieved in linear time, since both X and Top-k are sorted. The pruning threshold LB is set to the worst (lowest) score among the new Top-k. Finally, the GETSLAVES() routine is used to extend the set of candidates by including items that are dominated by those in X [9].
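The merge step described above can be sketched as follows, assuming that both Top-k and the scored candidates are lists of (score, id) pairs already sorted by descending score. The function name merge_top_k and the sample scores are hypothetical; the sketch relies on heapq.merge from the Python standard library, which lazily merges sorted inputs without re-sorting them.

```python
import heapq
from itertools import islice
from typing import List, Tuple

Scored = Tuple[float, str]  # (competitiveness score, item id)


def merge_top_k(top_k: List[Scored], candidates: List[Scored], k: int) -> List[Scored]:
    """Merge two lists sorted by descending score and keep the k best entries."""
    merged = heapq.merge(top_k, candidates, key=lambda pair: pair[0], reverse=True)
    return list(islice(merged, k))


# Hypothetical scores for the current Top-k and for the processed candidates.
current = [(0.9, "a"), (0.4, "b")]
scored_candidates = [(0.7, "c"), (0.5, "d"), (0.1, "e")]

new_top = merge_top_k(current, scored_candidates, k=3)
lb = new_top[-1][0]  # pruning threshold: worst score among the new Top-k
```

Because only the first k merged entries are consumed, the work done is bounded by the combined length of the two lists.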
The UPDATETOPK() routine processes the candidates in X and finds at most k candidates with the highest competitiveness with i. The routine uses a data structure, local Top-k, implemented as an associative array: the score of each candidate serves as the key, while its id serves as the value. The array is key-sorted, to facilitate the computation of the k best items. The structure is automatically truncated so that it always contains at most k items.

We initialize the lower and upper bounds. For each item j ∈ X, low(j) maintains the current competitiveness score of j as new queries are considered and serves as a lower bound on the candidate's actual score. Each lower bound low(j) starts from 0 and, after the completion of UPDATETOPK(), it contains the true competitiveness score CF(i, j) of candidate j with the focal item i. On the other hand, up(j) is an optimistic upper bound on j's competitiveness score. Initially, up(j) is set to the maximum possible score. For each query q ∈ Q, maxV holds the maximum possible competitiveness between item i and any other item for that query, which is in fact the coverage of i with respect to q. Then, for each candidate j ∈ X, we subtract maxV from up(j) and then add to it the actual competitiveness between i and j for query q. If the upper bound up(j) of a candidate j becomes lower than the pruning threshold LB, then j can be safely disqualified. Otherwise, low(j) is updated and j remains in consideration. After each update, the value of LB is set to the worst score in local Top-k, to employ stricter pruning in future iterations. If the number of candidates |X| becomes less than or equal to k, the loop over the queries stops. This is an early-stopping criterion: since our goal is to retrieve the best k candidates in X, having |X| <= k means that every remaining candidate should be returned.
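The bound maintenance described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the authors' implementation: max_v(q) stands for maxV, pair_v(q, j) stands for the competitiveness between the focal item and candidate j for query q (assumed never to exceed max_v(q)), and the key-sorted local Top-k structure is replaced by heapq.nlargest over the current lower bounds. All function names and sample values are hypothetical.

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple


def update_top_k(
    k: int,
    lb: float,
    candidates: Sequence[str],
    queries: Sequence[str],
    max_v: Callable[[str], float],
    pair_v: Callable[[str, str], float],
) -> List[Tuple[float, str]]:
    """Return at most k (score, id) pairs, best first, pruning by upper bounds."""
    total = sum(max_v(q) for q in queries)
    low: Dict[str, float] = {j: 0.0 for j in candidates}   # score accumulated so far
    up: Dict[str, float] = {j: total for j in candidates}  # optimistic upper bound
    alive = list(candidates)
    processed = 0
    for q in queries:
        if len(alive) <= k:  # early stop: every surviving candidate is returned
            break
        survivors = []
        for j in alive:
            gain = pair_v(q, j)
            up[j] += gain - max_v(q)  # replace the optimistic share by the real one
            if up[j] < lb:
                continue              # j can no longer reach the threshold
            low[j] += gain
            survivors.append(j)
        alive = survivors
        processed += 1
        if len(alive) >= k:           # tighten LB to the k-th best lower bound
            lb = max(lb, heapq.nlargest(k, (low[j] for j in alive))[-1])
    for q in queries[processed:]:     # finish the scores without bound checks
        for j in alive:
            low[j] += pair_v(q, j)
    return sorted(((low[j], j) for j in alive), reverse=True)[:k]


# Hypothetical per-query values (exact binary fractions keep the arithmetic exact).
MAX_V = {"q1": 0.5, "q2": 0.25, "q3": 0.125, "q4": 0.125}
PAIR_V = {
    ("q1", "a"): 0.5, ("q2", "a"): 0.25, ("q4", "a"): 0.125,
    ("q3", "b"): 0.125,
    ("q1", "c"): 0.25, ("q2", "c"): 0.25, ("q3", "c"): 0.125,
}
result = update_top_k(
    k=2, lb=0.0, candidates=["a", "b", "c", "d"], queries=["q1", "q2", "q3", "q4"],
    max_v=MAX_V.__getitem__, pair_v=lambda q, j: PAIR_V.get((q, j), 0.0),
)
```

In this toy run, candidates b and d are pruned while the third query is processed, the loop stops before the fourth query because only k candidates remain, and the scores of the two survivors are then completed without further bound checks.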
We complete the competitiveness computation of the remaining candidates and update local Top-k accordingly. This takes place after the completion of the first loop, in order to avoid unnecessary bound-checking and improve performance [10].

3.4 ADVANTAGES OF PROPOSED SYSTEM
1. A formal definition of the competitiveness between two items, based on their appeal to the various customer segments in their market. Our approach overcomes the reliance of previous work on scarce comparative evidence mined from text.
2. A formal methodology for the identification of the different types of customers in a given market, as well as for the estimation of the percentage of customers that belong to each type.
3. A highly scalable framework for finding the top-k competitors of a given item in large datasets.

4. CONCLUSION
In this work, we provide a formal definition of the competitiveness between items, based on the features that users specify for different items. The C-Miner algorithm, combined with Pyramid Finder, provides an efficient way of identifying the top-k competitors by considering factors such as the preferences and opinions of users; finally, it generates clusters by validating the scores of the candidates to find the k best items. The proposed framework is designed to handle large datasets chosen from various domains, which serve as the key datasets for finding top-k competitors. Our methodology is efficient and can be adapted to evaluate real datasets from a variety of domains.

REFERENCES
[1] http://searchsqlserver.techtarget.com/definition/data-mining
[2] George Valkanas, Theodoros Lappas, and Dimitrios Gunopulos, "Mining Competitors from Large Unstructured Datasets", DOI: 10.1109/TKDE.2017.2705101, IEEE Transactions on Knowledge and Data Engineering.
[3] Theodoros Lappas, George Valkanas, Dimitrios Gunopulos, "Efficient and Domain-Invariant Competitor Mining", 2012.
[4] Mark Bergen and Margaret A. Peteraf, "Competitor Identification and Competitor Analysis: A Broad-Based Managerial Approach", 2002.
[5] Sanket Shah, Amit Thakkar, Sonal Rami, "A Novel Approach for Making Recommendation using Skyline Query based on user Location and Preference", Indian Journal of Science and Technology, Vol 9(30), DOI: 10.17485/ijst/2016/v9i30/99075, August 2016.
[6] Kian-Lee Tan, Pin-Kwang Eng, Beng Chin Ooi, "Efficient Progressive Skyline Computation", 2001.
[7] Rui Li, Shenghua Bao, Jin Wang, Yong Yu, "CoMiner: An Effective Algorithm for Mining Competitors from the Web", 2006.
[8] Qian Wan, Raymond Chi-Wing Wong, Yu Peng, "Finding Top-k Profitable Products", 2012.
[9] Maksim Lapin, Matthias Hein, and Bernt Schiele, "Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification", 12 Dec 2016.
[10] Vincent S. Tseng, "Efficient Algorithms for Mining Top-K High Utility Itemsets", IEEE, vol. 28, January 2016.