CONTENT & LOCATION AWARE
RESTAURANT RECOMMENDATIONS
USING URBAN REVIEW NETWORKS
PROJECT REPORT
Submitted By
Jayant Jaiswal, Roll No-12600112104, Regn No-121260110042
Shoaib Khan, Roll No-12600112163, Regn No-121260110101
Rohan Agarwal, Roll No-12600112143, Regn No-121260110081
Under the Supervision of
Asst. Prof. Partha Basuchowdhuri
Computer Science & Engineering
in partial fulfillment for the award of the degree
of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE & ENGINEERING
HERITAGE INSTITUTE OF TECHNOLOGY, KOLKATA
MAULANA ABUL KALAM AZAD
UNIVERSITY OF TECHNOLOGY
Acknowledgements
We would like to take this opportunity to thank Dr. P. Chaudhuri, Principal, Heritage
Institute of Technology, for giving us the opportunity to work on this project and for
providing all the facilities and resources needed to complete it.
We are thankful to Asst. Prof. Partha Basuchowdhuri, our advisor and guide, for his
continuous support, advice and words of encouragement, without which we could not have
seen this project through to completion. He is not just an advisor but a patient teacher
who has always been there to resolve our doubts, no matter how trivial, and to provide
valuable insights that helped us in every way possible. We also owe our sincere gratitude
to Dr. Subhashis Majumder, Head of the Department, for his enriching discussions, novel
ideas and valuable feedback.
We would also like to thank our teachers, faculty members and laboratory assistants at
the Heritage Institute of Technology for playing a pivotal role during the development
of the project. Last but not least, we thank all our friends for their cooperation and
encouragement.
Jayant Jaiswal
Shoaib Khan
Rohan Agarwal
HERITAGE INSTITUTE OF TECHNOLOGY
MAULANA ABUL KALAM AZAD UNIVERSITY OF TECHNOLOGY
BONAFIDE CERTIFICATE
Certified that this Project Report, "CONTENT & LOCATION AWARE RESTAURANT
RECOMMENDATIONS USING URBAN REVIEW NETWORKS", is the bonafide work of Jayant Jaiswal,
Shoaib Khan and Rohan Agarwal, who carried out this project work under my supervision.
SIGNATURE SIGNATURE
Dr. Subhashis Majumder Asst. Prof. Partha Basuchowdhuri
Head of the Department Project Guide
Computer Science & Engineering Computer Science & Engineering
East Kolkata Township, East Kolkata Township,
Chowbaga Road,Anandapur, Chowbaga Road,Anandapur,
West Bengal - 700107. West Bengal - 700107.
SIGNATURE
EXAMINER
Abstract
Restaurant recommendation is a popular service whose sophistication keeps increasing.
In this report we present a personalised restaurant recommendation system with two
parts. The first part recommends restaurants to users based on their restaurant review
history. The second part recommends to business owners the locations best suited to
opening a restaurant of a particular cuisine, where the restaurant is likely to attract
the most traffic. We built the system for both individuals and business owners using
Zomato data. For each user in our data we determine cuisine preferences and other
constraints such as services offered, ambience and average rating, and recommend
restaurants accordingly. We propose a metric that combines the popularity and the
sentiment of opinions about food items in user-generated reviews, as opposed to other
systems, which consider only the features mentioned above when recommending restaurants.
Contents
1 Introduction
1.1 Road Map
1.2 What are Recommendation Systems?
1.2.1 Collaborative Filtering
1.2.2 Content Based Filtering
1.2.3 Hybrid Approach
1.2.4 Applications
1.3 Motivation for Restaurant Recommendations
2 Literature Review
3 Problem Definition
4 Data Analysis
4.1 Data Collection
4.2 Data Handling
5 Methodology
5.1 Location Aware
5.2 Content Based
6 Conclusion
7 Future Works
8 References
List of Figures
5.1 Live Map of Kolkata sorted on the basis of ratings
5.2 The road network stored in PostgreSQL
5.3 Map of Kolkata showing the important intersections to setup a new restaurant based upon a cuisine (North Indian)
5.4 The system taking user id as input to generate recommendations for that user.
5.5 Top 5 restaurants recommended by the system to the user for each food item
Chapter 1
Introduction
1.1 Road Map
In Chapter 1 we give a broad description of the types of recommendation systems and their
applications in today's customer-centric e-commerce market, along with basic background
on recommendation systems. In Chapter 2 we give a brief overview of prior work in the
field of restaurant recommendation. Chapter 3 states the problem definition and the
related terminology, such as content-based and location-based recommendation. In Chapter 4
we discuss how the data was fetched and the preprocessing done to suit the system and
produce good recommendations. Chapter 5 discusses the methodology and gives a detailed
study of our system. The results of our system on content-specific and location-specific
recommendation are summarised in Chapter 6. Scope for improvement and future ideas are
presented in Chapter 7 as future works.
1.2 What are Recommendation Systems?
Recommender systems have changed the way people find products, information, and even
other people. The goal of a recommender system is to generate meaningful recommendations
to a collection of users for items or products that might interest them. These systems have
changed the way websites communicate with their users. Rather than providing a static
experience in which users search for and potentially buy products, recommender systems
increase interaction to provide a richer experience. The systems identify recommendations
autonomously for individual users based on past purchases and searches, and on other users'
behavior. They study patterns of behavior to predict what someone will prefer from among a
collection of things they have never experienced. The technology behind recommender systems
has evolved over the past 20 years into a rich collection of tools that enable the practitioner
or researcher to develop effective recommenders.
1.2.1 Collaborative Filtering
Collaborative filtering methods collect and analyze a large amount of information on
users' behaviors, activities or preferences, and predict what users will like based on
their similarity to other users. A key advantage of the collaborative filtering approach
is that it does not rely on machine-analyzable content, and it is therefore capable of
accurately recommending complex items such as movies without requiring an understanding
of the item itself. Many algorithms have been used to measure user similarity or item
similarity in recommender systems, for example the k-nearest neighbor (k-NN) approach
and the Pearson correlation.
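As a minimal sketch of how these two measures can work together, the Python example below computes the Pearson correlation between two users over the items both have rated and then predicts a rating as the similarity-weighted average over the k most similar users. It is an illustration only and is not part of our system; the function names, the dictionary layout and the choice to ignore negatively correlated users are assumptions made for the example.

```python
from math import sqrt


def pearson(a: dict, b: dict) -> float:
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0  # not enough overlap to measure similarity
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a user with constant ratings carries no signal
    return cov / (sx * sy)


def predict(ratings: dict, user: str, item: str, k: int = 2):
    """Predict user's rating of item from the k most similar users who rated it."""
    neighbours = []
    for other, other_ratings in ratings.items():
        if other != user and item in other_ratings:
            sim = pearson(ratings[user], other_ratings)
            if sim > 0:  # assumption: only positively correlated users count
                neighbours.append((sim, other_ratings[item]))
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    if not top:
        return None  # no similar user has rated this item
    return sum(s * r for s, r in top) / sum(s for s, _ in top)


# Hypothetical ratings: user -> {item: rating}
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 4, "B": 2, "C": 3, "D": 5},
    "u3": {"A": 1, "B": 5, "C": 2, "D": 1},
}
print(predict(ratings, "u1", "D"))
```

Here u2 rates items in the same pattern as u1 while u3 does not, so the prediction for item D follows u2's rating.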
1.2.2 Content Based Filtering
Content-based filtering methods are based on a description of the item and a profile of
the user's preferences. In a content-based recommendation system, keywords are used to
describe the items; in addition, a user profile is built to indicate the type of item this user likes.
In other words, these algorithms try to recommend items that are similar to those that
a user liked in the past (or is examining in the present). In particular, various candidate
items are compared with items previously rated by the user and the best-matching items
are recommended. This approach has its roots in information retrieval and information
filtering research.
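The idea can be sketched in a few lines of Python. In this illustrative example each item is described by a list of keywords, the user profile is the bag of keywords of the items the user liked, and candidates are ranked by cosine similarity to that profile. The item ids, keywords and function names are hypothetical, and cosine similarity is just one common choice of matching score.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_profile(liked_items, keywords) -> Counter:
    """User profile: keyword counts summed over the items the user liked."""
    profile = Counter()
    for item in liked_items:
        profile.update(keywords[item])
    return profile


def recommend(liked_items, keywords, top_n=3):
    """Rank items the user has not seen by similarity to the user profile."""
    profile = build_profile(liked_items, keywords)
    candidates = [i for i in keywords if i not in liked_items]
    scored = [(cosine(profile, Counter(keywords[i])), i) for i in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for score, item in scored[:top_n] if score > 0]


# Hypothetical item descriptions: item id -> keywords
keywords = {
    "r1": ["biryani", "kebab", "ac"],
    "r2": ["biryani", "kebab", "bar"],
    "r3": ["pizza", "pasta"],
    "r4": ["kebab", "tikka"],
}
print(recommend(["r1"], keywords))
```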
1.2.3 Hybrid Approach
Recent research has demonstrated that a hybrid approach, combining collaborative
filtering and content-based filtering, could be more effective in some cases. Hybrid approaches
can be implemented in several ways: by making content-based and collaborative-based
predictions separately and then combining them, by adding content-based capabilities to a
collaborative-based approach (and vice versa), or by unifying the approaches into one model.
Several studies empirically compare the performance of the hybrid with the pure collabo-
rative and content-based methods and demonstrate that the hybrid methods can provide
more accurate recommendations than pure approaches. These methods can also be used to
overcome some of the common problems in recommendation systems such as cold start and
the sparsity problem. Netflix is a good example of a hybrid system. They make recommen-
dations by comparing the watching and searching habits of similar users (i.e. collaborative
filtering) as well as by offering movies that share characteristics with films that a user has
rated highly (content-based filtering).
1.2.4 Applications
1) Facebook uses a recommender system to suggest Facebook users you may know offline.
The system is trained on personal data such as mutual friends, where you went to school,
places of work and mutual networks (pages, groups, etc.) to learn who might be in your
online and offline network.
2) When you fill out your Taste Preferences or rate movies and TV shows, you're helping
Netflix to filter through the thousands of selections to get a better idea of what you might
like to watch. Factors that Netflix's algorithm uses to make such recommendations include:
a) The genre of movies and TV shows available.
b) Your streaming history and the ratings you've previously given.
c) The combined ratings of all Netflix members who have similar tastes in titles to you.
3) The "Jobs You May Be Interested In" feature shows jobs posted on LinkedIn that match
your profile in some way. These recommendations are shown based on the titles and descriptions
in your previous experience, and the skills other users have endorsed.
4) Amazon's algorithm crunches data on all of its millions of customer baskets to figure out
which items are frequently bought together. This can lead to huge returns: for example,
if you're buying an electrical item and see a recommendation beneath it for the cables or
batteries it requires, you're very likely to purchase both the core product and the
accessories from Amazon.
1.3 Motivation for Restaurant Recommendations
Obtaining recommendations from trusted sources is a critical component of the natural
process of human decision making. With burgeoning consumerism buoyed by the emergence
of the web, buyers are being presented with an increasing range of choices while sellers are
being faced with the challenge of personalizing their advertising efforts. In parallel, it has
become common for enterprises to collect large volumes of transactional data that allows
for deeper analysis of how a customer base interacts with the space of product offerings.
Recommender Systems have evolved to fulfill the natural dual need of buyers and sellers by
automating the generation of recommendations based on data analysis.
There are many recommendation systems available for domains such as shopping, online
video entertainment and games. Restaurants and dining is one area where there is a big
opportunity to recommend dining options to users based on their preferences as well as
historical data. Zomato is a very good source of such data, with not only restaurant reviews
but also user-level information on preferred restaurants. This report describes our work
on recommending restaurants to a given Zomato user based on their history or their cuisine
preferences. It also covers recommending suitable, cuisine-specific locations to
newcomers in the restaurant business.
Chapter 2
Literature Review
In this chapter we highlight some previous work in the field of restaurant
recommendation. Recommender systems seek to predict the "rating" or "preference" that a
user would give to an item. They typically produce a list of recommendations in one of
two ways: through collaborative or content-based filtering. Collaborative filtering
approaches build a model from a user's past behavior (items previously purchased or
selected and/or numerical ratings given to those items) as well as similar decisions made
by other users. This model is then used to predict items (or ratings for items) that the
user may have an interest in. Content-based filtering approaches utilize a series of
discrete characteristics of an item in order to recommend additional items with similar
properties. These approaches are often combined to form hybrid recommender systems.
Traditional recommendation systems use the user profile to analyse and find similar
users, and recommend restaurants based on the result of that analysis. However, these
systems do not consider user mobility and environment. Other systems provide their
service by finding restaurants and presenting information about them on a web site;
these are closer to search systems than to recommendation systems. More recent research
on context information uses the user's location to serve advertisements, sales and event
information. Such a system analyses user preference through the user profile and finds
restaurants that satisfy that preference and are close to the user's location. This work
consists of two parts, one that operates online and one that processes data offline.
When the user is in motion, i.e., their geo-position changes notably, the system goes online
and the recommendation module becomes active, retrieving nearby restaurants and ranking
them, based on their properties, according to the scores generated offline. The offline part
generally remains in a non-functional mode when the user is stationary. The job of the
offline part is to generate a user interest profile using a machine learning algorithm. The
drawback of the offline part is that the interest profile is generated from the user's
check-ins at restaurants. It does not take into account the user's taste, habits and
favored cuisines. The offline recommendation can thus be considered a shallow approach
that lacks the user's detailed interaction with each restaurant, which can be obtained in
the form of reviews.
Chapter 3
Problem Definition
Creating an innovative recommendation system to provide content based
recommendations to restaurant goers and owners and provide location
based recommendations to restaurant owners using Zomato Restaurant
Review Network.
Suggest best-suited places to new entrants in the restaurant business
for setting up a new restaurant to fill in the cuisine void and garner
high traffic.
Suggest restaurants to users based on their previous review activity on
Zomato by creating a recommendation system using all reviews from
all restaurants in a city.
Chapter 4
Data Analysis
4.1 Data Collection
Zomato and Yelp are two popular restaurant search, discovery and review services. While
both are popular globally, Zomato has an edge over Yelp in India. Since we are based in India,
we chose Zomato as our "Restaurant Review Network". Zomato also provides carefully curated
content that remains useful even outside its native land. Users can find restaurants,
leave reviews, rate a restaurant, and keep their own restaurant diary to share with
friends. Zomato has built a highly coherent and focused experience that puts the emphasis
on being a comprehensive network for food lovers. Very little on the site is superfluous.
In a survey, users were impressed with the amount of attention to detail evident in the
site's content and, unlike on Yelp, several testers actually used Zomato's curated lists
and suggestions to find restaurants. Hence, Zomato was chosen as our "Restaurant Review
Network" over other popular services for its detailed yet simple data.
4.2 Data Handling
We first crawled the restaurant data of Delhi from Zomato. The crawling was done using
crawlers written in Python that specifically targeted restaurant data. The data
comprised all the features listed on Zomato, such as "Dine-in or Takeaway" and "AC or
Non-AC". We then switched to Kolkata as our sample city for data analysis for two
reasons. Firstly, Kolkata had around 2000 restaurants compared to Delhi's 10000+
restaurants, a more manageable number. Secondly, we were based in Kolkata and knew the
city well, so we could analyze the results better.
After the first crawl of restaurant data, we crawled the reviews for these 2000
restaurants. This crawl generated over two hundred thousand reviews. Each review record
comprised the restaurant name, the reviewer id and name, and the details of the review.
All the data was stored in MongoDB, a NoSQL database, for easy fetching and
manipulation. This is the data used by our system.
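The shape of a stored review record can be sketched as follows. This is an illustrative example rather than our actual crawler code: the field names are assumptions, and `collection` stands for any object exposing an `insert_many` method, as a PyMongo collection does.

```python
def build_review_docs(raw_reviews):
    """Turn crawled review dicts into documents ready for storage."""
    docs = []
    for raw in raw_reviews:
        text = raw.get("review", "").strip()
        if not text:
            continue  # skip reviews with no text
        docs.append({
            "restaurant_name": raw["restaurant_name"],
            "reviewer_id": raw["reviewer_id"],
            "reviewer_name": raw["reviewer_name"],
            "review": text,
        })
    return docs


def store_reviews(collection, raw_reviews):
    """Insert the cleaned documents and return how many were stored."""
    docs = build_review_docs(raw_reviews)
    if docs:  # PyMongo's insert_many rejects an empty list
        collection.insert_many(docs)
    return len(docs)
```

With PyMongo, `collection` would typically be obtained from a `MongoClient`, for example `MongoClient()["zomato"]["reviews"]`, where the database and collection names are again hypothetical.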
Chapter 5
Methodology
5.1 Location Aware
The location-aware part of our project is aimed at people who want to set up a new
restaurant business, as explained in our problem statement.
We assess the road map to identify concentrations of restaurants in a city. These
clusters can be regarded as restaurant hotspots. We have generated a live map of our sample
city (Kolkata) with all the restaurants marked on it. Each node is coloured according to
the rating range in which it falls, and clicking on a node shows the details of the
restaurant that the node represents. This helps provide real-time recommendations
to users based on ratings and locations. A snapshot of the map is shown in Figure 5.1.
Figure 5.1: Live Map of Kolkata sorted on the basis of ratings
The road network of our sample city was generated using OpenStreetMap and PostgreSQL.
It is a graph with road intersections as nodes and roads as edges. We could not use
Google Maps for this because it is a paid service. We had the coordinates of all restaurants
in the city. We added each restaurant as a pendant node attached to its nearest
intersection, found using the k-nearest neighbor (k-NN) algorithm. Thus, we get a complete
road network with all restaurants and road intersections as nodes and the roads as edges.
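The attachment step can be sketched as below. This is a simplified illustration, not our PostgreSQL implementation: it keeps the graph in a plain dictionary, finds the single nearest intersection by a brute-force scan, and measures distance with the haversine formula. The node names and coordinates are made up, and restaurant names are assumed not to clash with intersection ids.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def attach_restaurants(intersections, roads, restaurants):
    """Build an undirected graph and hang each restaurant off its nearest intersection.

    intersections: id -> (lat, lon); roads: list of (id, id);
    restaurants: name -> (lat, lon). Returns node -> set of neighbours.
    """
    graph = {node: set() for node in intersections}
    for a, b in roads:
        graph[a].add(b)
        graph[b].add(a)
    for name, pos in restaurants.items():
        nearest = min(intersections, key=lambda n: haversine_km(intersections[n], pos))
        graph[name] = {nearest}   # pendant node: exactly one edge
        graph[nearest].add(name)
    return graph


# Hypothetical coordinates
intersections = {"i1": (22.57, 88.36), "i2": (22.52, 88.40)}
roads = [("i1", "i2")]
restaurants = {"r1": (22.571, 88.361), "r2": (22.519, 88.399)}
graph = attach_restaurants(intersections, roads, restaurants)
print(graph["r1"], graph["r2"])
```

A linear scan is adequate for a few thousand restaurants; a spatial index would be the natural replacement for larger cities.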
Figure 5.2: The road network stored in PostgreSQL
For every intersection node in the road network, we store as node attributes the distance
to the nearest restaurant of each cuisine. We then create a vector for each road
intersection that stores the ratio X/Y for each of the top 10 cuisines, where
X = average rating of the nearest restaurant of that cuisine
Y = distance to the nearest restaurant of that cuisine
The lower this ratio (X/Y), the better the intersection is suited to opening a new
restaurant of that cuisine, since a low ratio means the nearest competitor is poorly
rated, far away, or both. Running the PageRank algorithm on the graph also gives us the
intersections with the most importance and traffic. Combining the PageRank probabilities
with our ratio, we define a new ratio R as follows:
\[
R = \frac{\text{PageRank probability}}{X/Y}
\]
The intersection with the maximum R is the best suited to setting up a new
restaurant of that cuisine.
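The computation of R can be sketched as follows for a single cuisine. The PageRank routine is a plain power iteration with the commonly used damping factor of 0.85, which is an assumption since we do not state the parameters used; the graph, ratings and distances are made-up values, and ratings and distances are assumed to be strictly positive.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an adjacency dict: node -> list of neighbours."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = graph[v]
            if not out:
                # Dangling node: spread its rank evenly over all nodes.
                for u in nodes:
                    new[u] += damping * rank[v] / n
            else:
                share = damping * rank[v] / len(out)
                for u in out:
                    new[u] += share
        rank = new
    return rank


def location_score(page_rank, avg_rating, distance):
    """R = PageRank probability / (X / Y), with X = rating and Y = distance."""
    return page_rank / (avg_rating / distance)


def best_intersection(graph, nearest):
    """nearest: intersection -> (avg rating, distance) of the nearest restaurant
    of one cuisine. Returns the intersection with maximum R and all scores."""
    ranks = pagerank(graph)
    scores = {v: location_score(ranks[v], *nearest[v]) for v in nearest}
    return max(scores, key=scores.get), scores


# Hypothetical star-shaped road network: "c" is the central intersection.
graph = {"c": ["a", "b", "d"], "a": ["c"], "b": ["c"], "d": ["c"]}
nearest = {"c": (4.0, 0.5), "a": (4.0, 0.5)}
best, scores = best_intersection(graph, nearest)
print(best, scores)
```

With identical X/Y at both intersections, the better-connected intersection wins because its PageRank probability is higher.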
Figure 5.3: Map of Kolkata showing the important intersections to setup a new restaurant
based upon a cuisine (North Indian)
The map shows, in red, the most important intersections favourable to setting up a new
restaurant offering North Indian cuisine.
5.2 Content Based
We crawled the restaurants of our sample city (Kolkata) for their features (i.e. ratings,
cuisines, bar, AC/non-AC, veg/non-veg, etc.) and for the reviews given by visiting
users. These reviews give us insight into how much a user likes a particular
restaurant. The review data for each restaurant in Kolkata consisted of user id, user
name, restaurant id, restaurant name and the review. The reviews for each restaurant, taken
individually, were passed to the "Intellexer Sentiment Analyzer" module. Applying Intellexer
to the restaurant reviews, we get
1) Opinion holders, which are food items.
2) Multiple opinions for each opinion holder (food item).
3) A sentiment value, which can be either positive or negative, for each opinion.
Intellexer Sentiment Analyzer automatically extracts sentiments (positivity/negativity),
opinion objects and emotions (liking, anger, disgust, etc.) from unstructured text.
From these sentiment values we found the best food items available in each restaurant by
sorting them in descending order of sentiment. The sentiment values are calculated using
the metric in equation 5.1. The best food item is determined as follows. For the opinion
holders we had a list of n 3-tuples. The values in each tuple were
1) Food tag
2) Average sentiment
3) Opinion count
The food tags are the canonical spellings of particular food items of North Indian
cuisine. These tags are sufficient to identify North Indian dishes in the user reviews.
Examples of these North Indian food tags are biryani, kebab, qorma and tikka. Since users
spell each food item in multiple ways, we replace all variants with a single best-suited
name for the holder. For example, biryani appears as biriyani, beryani, beeryani, etc. We
merged the sentiments of food items that differ only in spelling using fuzzywuzzy. Fuzzy
string matching, also called approximate string matching, is the process of finding
strings that approximately match a given pattern. The closeness of a match is often
measured in terms of edit distance, which is the number of primitive operations necessary
to convert the string into an exact match.
We took each word in our repository and computed its partial ratio, using
fuzz.partial_ratio(), against the holders extracted by Intellexer. We used a threshold
of 67: if the ratio was greater than or equal to 67, we considered the items to be
similar. In this way we clustered all similar items and computed, for each tag in our
repository, the average sentiment of its opinions from Intellexer. Once we had the
average sentiment and opinion count of every tag, we used a metric to rank the tags in
that restaurant. The metric uses the following terms:
max_cnt = Maximum of the opinion count value of all food tags
min_cnt = Minimum of the opinion count value of all food tags
The metric is:
\[
\text{Sentiment of tag} = \text{Avg. sentiment of tag} \times \frac{\text{opinion count of tag} - \text{min\_cnt} + 1}{\text{max\_cnt} - \text{min\_cnt} + 1} \tag{5.1}
\]
This metric is applied to the tags obtained for each restaurant. It gives a normalized
sentiment for each tag, and sorting the tags on this value in descending order ranks them
from highest to lowest. The normalization is done within a single restaurant and not
across all restaurants.
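The clustering and ranking steps for one restaurant can be sketched as follows. Our system uses fuzz.partial_ratio() from fuzzywuzzy; to keep the example free of external dependencies, it substitutes a simplified stand-in built on Python's difflib, so the scores will not be identical to fuzzywuzzy's. The opinions, sentiment values and helper names are hypothetical.

```python
from difflib import SequenceMatcher

THRESHOLD = 67  # similarity threshold used in our system


def partial_ratio(a, b):
    """Stand-in for fuzz.partial_ratio: best 0-100 similarity between the
    shorter string and any equally long window of the longer string."""
    short, long_ = sorted((a.lower(), b.lower()), key=len)
    if not short:
        return 0
    best = 0.0
    for start in range(len(long_) - len(short) + 1):
        window = long_[start:start + len(short)]
        best = max(best, SequenceMatcher(None, short, window).ratio())
    return round(100 * best)


def aggregate(opinions, food_tags):
    """Map each (holder, sentiment) to its best-matching tag.
    Returns tag -> (average sentiment, opinion count)."""
    totals = {}
    for holder, sentiment in opinions:
        score, tag = max((partial_ratio(t, holder), t) for t in food_tags)
        if score >= THRESHOLD:  # holders matching no tag are dropped
            total, count = totals.get(tag, (0.0, 0))
            totals[tag] = (total + sentiment, count + 1)
    return {t: (total / count, count) for t, (total, count) in totals.items()}


def rank_tags(stats):
    """Apply equation 5.1 and sort tags by normalized sentiment, best first."""
    counts = [count for _, count in stats.values()]
    min_cnt, max_cnt = min(counts), max(counts)
    scores = {
        tag: avg * (count - min_cnt + 1) / (max_cnt - min_cnt + 1)
        for tag, (avg, count) in stats.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical opinions extracted from one restaurant's reviews
food_tags = ["biryani", "kebab"]
opinions = [("biriyani", 0.8), ("beryani", 0.6), ("kebab", 0.9), ("service", -0.5)]
stats = aggregate(opinions, food_tags)
ranked = rank_tags(stats)
print(ranked)
```

In this example the two spellings of biryani are merged into one tag with two opinions, so biryani outranks kebab even though the single kebab opinion is more positive; this is the popularity weighting that equation 5.1 is meant to provide.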
The data for all restaurants was inserted into MongoDB along with the opinion counts.
For each food tag in our repository we then found the restaurants that serve that food.
The same count-normalized metric (equation 5.1) is applied again, this time to each food
tag across all restaurants. Sorting the result gives the top restaurants for a particular
food tag. This result is stored in the database for each tag. For any user, their reviews
are fed into Intellexer to generate their opinion holders (food items), which indicate
their food preferences. The number of opinion holders should be greater than one.
Recommendation then amounts to looking up the already stored top 5 restaurants for each
of the user's tags in the database. Thus, the user receives recommendations based on
their own reviews. The results are shown in Figures 5.4 and 5.5.
Figure 5.4: The system taking user id as input to generate recommendations for that user.
Figure 5.5: Top 5 restaurants recommended by the system to the user for each food item
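The lookup that produces these recommendations can be sketched as follows. The example applies equation 5.1 per food tag across restaurants, keeps the top 5 for each tag, and then returns the stored lists for the tags found in a user's reviews. Plain dictionaries stand in for the MongoDB collections, and all restaurant names and numbers are hypothetical.

```python
def top_restaurants(tag_stats, limit=5):
    """tag_stats: tag -> {restaurant: (avg sentiment, opinion count)}.
    Returns tag -> restaurants sorted by the metric of equation 5.1."""
    result = {}
    for tag, per_restaurant in tag_stats.items():
        counts = [count for _, count in per_restaurant.values()]
        min_cnt, max_cnt = min(counts), max(counts)

        def score(name):
            avg, count = per_restaurant[name]
            return avg * (count - min_cnt + 1) / (max_cnt - min_cnt + 1)

        result[tag] = sorted(per_restaurant, key=score, reverse=True)[:limit]
    return result


def recommend_for_user(user_holders, top_by_tag):
    """Return the stored top restaurants for each of the user's food tags."""
    if len(user_holders) <= 1:
        return {}  # we require more than one opinion holder per user
    return {tag: top_by_tag[tag] for tag in user_holders if tag in top_by_tag}


# Hypothetical per-tag statistics
tag_stats = {
    "biryani": {"R1": (0.9, 40), "R2": (0.8, 10), "R3": (0.5, 40)},
    "kebab": {"R2": (0.7, 5)},
}
top_by_tag = top_restaurants(tag_stats)
print(recommend_for_user(["biryani", "kebab"], top_by_tag))
```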
Chapter 6
Conclusion
Our results identify the busiest roads in the city, which are ideal for setting up new
restaurants and which food lovers are likely to enjoy visiting. This can also have a
positive influence on existing businesses: our results can reveal cuisines lacking in a
particular area, a gap that existing businesses can exploit. Previous work on
location-based recommendation focused on providing results to users only, whereas our
system also serves restaurant owners.
Our system implements content-based filtering to provide restaurant recommendations to
users based on their previous reviews. The recommendations were largely accurate in our
tests. Our system can be easily extended to other cities and cuisines. It is
multipurpose, as it can come in handy for businesses as well as the average user.
Restaurant recommendation remains a largely unexplored field, and our system is a small
step in that direction.
Chapter 7
Future Works
We plan to build our own sentiment analyzer specific to restaurants rather than
relying on the Intellexer Sentiment Analyzer module. This will help in obtaining more
accurate sentiments for tags such as service, food and other features of the restaurants,
and resolve the sentiment ambiguity we encountered. Sentiments are not purely positive or
negative; in fact, there are various levels of sentiment, each with a different effect on
the statement. These can be incorporated into the system in detail for more accurate results.
We also plan to add collaborative filtering to our system. Such systems do not use any
information regarding the actual content of the items (as opposed to content-based
filtering). They are based on the usage or preference patterns of other users. Selection
(or filtering) of items is done in a manner similar to individuals collaborating to make
recommendations for each other, i.e. if some tags are shared between users, then those
users are treated as similar and are given similar recommendations.
Chapter 8
References
1. Anant Gupta and Kuldeep Singh. Location Based Personalized Restaurant Recommendation System for Mobile Environments.
2. Sumit Negi. Single Document Keyphrase Extraction Using Label Information.
3. Liu, J., Shang, J., Wang, C., Ren, X., Han, J., 2015. Mining Quality Phrases from Massive Text Corpora. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, ACM, pp. 1729-1744.
4. Mariana Romanyshyn. Rule-Based Sentiment Analysis of Ukrainian Reviews. International Journal of Artificial Intelligence & Applications (IJAIA), Vol. 4, No. 4, July 2013.
5. El-Kishky, A., Song, Y., Wang, C., Voss, C.R., Han, J., 2014. Scalable topical phrase mining from text corpora. Proceedings of the VLDB Endowment 8, 305-316.
6. Burusothman Ahiladas, Paraneetharan Saravanaperumal, Sanjith Balachandran, Thamayanthy Sripalan and Surangika Ranathunga. Ruchi: Rating Individual Food Items in Restaurant Reviews.
PDF
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
PPTX
IBA_Chapter_11_Slides_Final_Accessible.pptx
PPTX
Global journeys: estimating international migration
PDF
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
PDF
Foundation of Data Science unit number two notes
PPTX
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
PDF
Galatica Smart Energy Infrastructure Startup Pitch Deck
PPT
Chapter 3 METAL JOINING.pptnnnnnnnnnnnnn
PDF
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
PPTX
05. PRACTICAL GUIDE TO MICROSOFT EXCEL.pptx
PPTX
Database Infoormation System (DBIS).pptx
PDF
Fluorescence-microscope_Botany_detailed content
PPTX
Data_Analytics_and_PowerBI_Presentation.pptx
PPTX
Business Acumen Training GuidePresentation.pptx
MODULE 8 - DISASTER risk PREPAREDNESS.pptx
Business Ppt On Nestle.pptx huunnnhhgfvu
Introduction to Firewall Analytics - Interfirewall and Transfirewall.pptx
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
Reliability_Chapter_ presentation 1221.5784
Computer network topology notes for revision
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
IBA_Chapter_11_Slides_Final_Accessible.pptx
Global journeys: estimating international migration
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
Foundation of Data Science unit number two notes
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
Galatica Smart Energy Infrastructure Startup Pitch Deck
Chapter 3 METAL JOINING.pptnnnnnnnnnnnnn
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
05. PRACTICAL GUIDE TO MICROSOFT EXCEL.pptx
Database Infoormation System (DBIS).pptx
Fluorescence-microscope_Botany_detailed content
Data_Analytics_and_PowerBI_Presentation.pptx
Business Acumen Training GuidePresentation.pptx

Zomato Crawler & Recommender

  • 1. CONTENT & LOCATION AWARE RESTAURANT RECOMMENDATIONS USING URBAN REVIEW NETWORKS PROJECT REPORT Submitted By Jayant Jaiswal, Roll No-12600112104, Regn No-121260110042 Shoaib Khan, Roll No-12600112163, Regn No-121260110101 Rohan Agarwal, Roll No-12600112143, Regn No-121260110081 Under the Supervision of Asst. Prof. Partha Basuchowdhuri Computer Science & Engineering in partial fulfillment for the award of the degree of BACHELOR OF TECHNOLOGY In COMPUTER SCIENCE & ENGINEERING HERITAGE INSTITUTE OF TECHNOLOGY, KOLKATA MAULANA ABUL KALAM AZAD UNIVERSITY OF TECHNOLOGY
  • 2. Acknowledgements We would like to take this opportunity to thank Dr. P. Chaudhuri, Principal, Heritage Institute of Technology, for giving us the golden opportunity of working on this project and for providing us with all the facilities and resources needed to complete it. We are thankful to Asst. Prof. Partha Basuchowdhuri, our advisor and guide, for his continuous support, advice and words of encouragement, without which we could not have seen this project through to completion. He is not just an advisor but a patient teacher who has always been there to resolve our doubts, no matter how trivial, and to provide valuable insights that helped us in every way possible. We also owe our sincere gratitude to Dr. Subhashis Majumder, the Head of the Department, for his enriching discussions, novel ideas and valuable feedback. We would also like to thank our teachers, faculty members and laboratory assistants at the Heritage Institute of Technology for playing a pivotal and decisive role during the development of the project. Last but not least, we thank all our friends for their cooperation and encouragement. Jayant Jaiswal, Shoaib Khan, Rohan Agarwal
  • 3. HERITAGE INSTITUTE OF TECHNOLOGY MAULANA ABUL KALAM AZAD UNIVERSITY OF TECHNOLOGY BONAFIDE CERTIFICATE Certified that this Project Report: "CONTENT & LOCATION AWARE RESTAURANT RECOMMENDATIONS USING URBAN REVIEW NETWORKS" is the bonafide work of "Jayant Jaiswal, Shoaib Khan and Rohan Agarwal" who carried out this project work under my supervision.
SIGNATURE: Dr. Subhashis Majumder, Head of the Department, Computer Science & Engineering, East Kolkata Township, Chowbaga Road, Anandapur, West Bengal - 700107.
SIGNATURE: Asst. Prof. Partha Basuchowdhuri, Project Guide, Computer Science & Engineering, East Kolkata Township, Chowbaga Road, Anandapur, West Bengal - 700107.
SIGNATURE: EXAMINER
  • 4. Abstract Restaurant recommendation is a very popular service whose sophistication keeps increasing every day. In this paper we present a personalised restaurant recommendation system that has two parts. The first part recommends restaurants to users based on their restaurant review history. The second part recommends to business owners the places best suited to opening a restaurant with a particular cuisine, where the owner would get the best traffic for the restaurant. Using Zomato data, we built a restaurant recommendation system for individuals and business owners. For each user in our data we find the cuisine preferences and other restrictions such as services offered, ambience, average rating, etc., and recommend restaurants accordingly. We propose a metric that takes into account the popularity as well as the sentiment of opinions about food items in user-generated reviews, as opposed to other systems, which consider only the features mentioned above when recommending restaurants.
  • 5. Contents
1 Introduction
1.1 Road Map
1.2 What are Recommendation Systems?
1.2.1 Collaborative Filtering
1.2.2 Content Based Filtering
1.2.3 Hybrid Approach
1.2.4 Applications
1.3 Motivation for Restaurant Recommendations
2 Literature Review
3 Problem Definition
4 Data Analysis
4.1 Data Collection
4.2 Data Handling
5 Methodology
5.1 Location Aware
5.2 Content Based
6 Conclusion
7 Future Works
8 References
  • 6. List of Figures
5.1 Live Map of Kolkata sorted on the basis of ratings
5.2 The road network stored in PostgreSQL
5.3 Map of Kolkata showing the important intersections to set up a new restaurant based upon a cuisine (North Indian)
5.4 The system taking user id as input to generate recommendations for that user
5.5 Top 5 restaurants recommended by the system to the user for each food item
  • 7. Chapter 1 Introduction
1.1 Road Map In Chapter 1, we give a broad description of the types of recommendation systems and their applications in today's customer-centric e-commerce market, together with basic background on recommendation systems. In Chapter 2, we give a brief overview of prior work in the field of restaurant recommendation. Chapter 3 presents the problem definition and the related terminology, such as content-based and location-based recommendation. In Chapter 4, we discuss how the data was fetched and the preprocessing done to suit the system and produce good recommendations. Chapter 5 discusses the methodology and gives a detailed study of our system. The results of our system on content-specific and location-specific recommendation are summarised in Chapter 6. Scope for improvement and future ideas are covered in Chapter 7 as future works.
1.2 What are Recommendation Systems? Recommender systems have changed the way people find products, information, and even other people. The goal of a recommender system is to generate meaningful recommendations to a collection of users for items or products that might interest them. Such systems have changed the way otherwise static websites communicate with their users. Rather than providing a static experience in which users search for and potentially buy products, recommender systems increase interaction to provide a richer experience. They identify recommendations autonomously for individual users based on past purchases and searches, and on other users' behavior. They study patterns of behavior to predict what someone will prefer from among a collection of things they have never experienced. The technology behind recommender systems has evolved over the past 20 years into a rich collection of tools that enable the practitioner or researcher to develop effective recommenders.
1.2.1 Collaborative Filtering Collaborative filtering methods are based on collecting and analyzing a large amount of information on users' behaviors, activities or preferences, and predicting what users will like based on their similarity to other users. A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content, and therefore it is capable of accurately recommending complex items such as movies without requiring an understanding of the item itself. Many algorithms have been used to measure user similarity or item similarity in recommender systems. Examples include the k-nearest neighbor (k-NN) approach and the Pearson correlation.
  • 8. 1.2.2 Content Based Filtering Content-based filtering methods are based on a description of the item and a profile of the user's preferences. In a content-based recommendation system, keywords are used to describe the items; in addition, a user profile is built to indicate the type of item this user likes. In other words, these algorithms try to recommend items that are similar to those that a user liked in the past (or is examining in the present). In particular, various candidate items are compared with items previously rated by the user and the best-matching items are recommended. This approach has its roots in information retrieval and information filtering research.
1.2.3 Hybrid Approach Recent research has demonstrated that a hybrid approach, combining collaborative filtering and content-based filtering, could be more effective in some cases. Hybrid approaches can be implemented in several ways: by making content-based and collaborative predictions separately and then combining them, by adding content-based capabilities to a collaborative approach (and vice versa), or by unifying the approaches into one model. Several studies empirically compare the performance of hybrid methods with pure collaborative and content-based methods and demonstrate that hybrid methods can provide more accurate recommendations than pure approaches. These methods can also be used to overcome some of the common problems in recommendation systems, such as cold start and sparsity. Netflix is a good example of a hybrid system.
Netflix makes recommendations by comparing the watching and searching habits of similar users (i.e. collaborative filtering) as well as by offering movies that share characteristics with films that a user has rated highly (content-based filtering).
1.2.4 Applications
1) Facebook uses a recommender system to suggest Facebook users you may know offline. The system is trained on personal data such as mutual friends, where you went to school, places of work and mutual networks (pages, groups, etc.) to learn who might be in your online and offline network.
2) When you fill out your Taste Preferences or rate movies and TV shows, you're helping Netflix to filter through the thousands of selections to get a better idea of what you might like to watch. Factors that the Netflix algorithm uses to make such recommendations include: a) The genre of movies and TV shows available. b) Your streaming history, and previous ratings you've made. c) The combined ratings of all Netflix members who have similar tastes in titles to you.
  • 9. 3) The "Jobs You May Be Interested In" feature shows jobs posted on LinkedIn that match your profile in some way. These recommendations are shown based on the titles and descriptions in your previous experience, and the skills other users have endorsed.
4) Amazon's algorithm crunches data on all of its millions of customer baskets to figure out which items are frequently bought together. This can lead to huge returns. For example, if you're buying an electrical item and see a recommendation for the cables or batteries it requires beneath it, you're very likely to purchase both the core product and the accessories from Amazon.
1.3 Motivation for Restaurant Recommendations Obtaining recommendations from trusted sources is a critical component of the natural process of human decision making. With burgeoning consumerism buoyed by the emergence of the web, buyers are being presented with an increasing range of choices while sellers are being faced with the challenge of personalizing their advertising efforts. In parallel, it has become common for enterprises to collect large volumes of transactional data that allow for deeper analysis of how a customer base interacts with the space of product offerings. Recommender systems have evolved to fulfill the natural dual need of buyers and sellers by automating the generation of recommendations based on data analysis. There are many recommendation systems available for areas like shopping, online video entertainment, games, etc. Restaurants and dining is one area where there is a big opportunity to recommend dining options to users based on their preferences as well as historical data. Zomato is a very good source of such data, with not only restaurant reviews but also user-level information on preferred restaurants.
This report describes the work done to recommend restaurants to a given Zomato user based on their history or their cuisine preferences. It also recommends suitable, cuisine-specific locations to newcomers in the restaurant business.
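To make the user-similarity idea from Section 1.2.1 concrete, the following is a minimal sketch of Pearson correlation between two users' ratings, followed by a k-nearest-neighbor lookup. It is an illustration only and not part of the system described in this report; the rating dictionary, the user and restaurant ids, and the function names `pearson` and `nearest_users` are all assumptions made for the example.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over items rated by both users; 0.0 if undefined."""
    common = [item for item in a if item in b]
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def nearest_users(target, ratings, k=2):
    """Return the k users most similar to `target` by Pearson correlation."""
    scores = [(pearson(ratings[target], r), u)
              for u, r in ratings.items() if u != target]
    scores.sort(reverse=True)
    return [u for _, u in scores[:k]]

# Hypothetical ratings: user id -> {restaurant id: rating}
ratings = {
    "u1": {"r1": 5, "r2": 3, "r3": 4},
    "u2": {"r1": 4, "r2": 2, "r3": 3},
    "u3": {"r1": 1, "r2": 5, "r3": 2},
}
neighbours = nearest_users("u1", ratings, k=1)
```

A collaborative recommender would then suggest to `u1` the restaurants that the returned neighbours rated highly and that `u1` has not yet visited.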
  • 10. Chapter 2 Literature Review In this section we highlight a few previous works in the field of restaurant recommendation. Recommender systems seek to predict the 'rating' or 'preference' that a user would give to an item. They typically produce a list of recommendations in one of two ways: through collaborative or content-based filtering. Collaborative filtering approaches build a model from a user's past behavior (items previously purchased or selected and/or numerical ratings given to those items) as well as similar decisions made by other users. This model is then used to predict items (or ratings for items) that the user may have an interest in. Content-based filtering approaches utilize a series of discrete characteristics of an item in order to recommend additional items with similar properties. These approaches are often combined to form hybrid recommender systems. Traditional recommendation systems have used user profiles to analyse users and find similar ones, and they recommend restaurants based on the result of that analysis. However, these systems do not consider user mobility and environment. Other systems provide their service by finding restaurants and presenting restaurant information through a web site. Such a system is closer to a search system than to a recommendation system. Recent research on context information uses the user's location to serve advertisements, sales and event information. Such a system analyses user preference through the user profile and finds restaurants that satisfy the user's preference and are close to the user's location. This research consists of two sections, one that is active online and one that processes data offline. When the user is in motion, i.e., his geo-position changes notably, the system goes online and the recommendation module becomes active, retrieving nearby restaurants and ranking them, based on their properties, according to the scores generated offline.
The offline part generally remains inactive when the user is stationary. The work of the offline system is to generate a user interest profile using a machine learning algorithm. The drawback of the offline part is that the interest profile is generated from the user's check-ins to restaurants. It doesn't take into account the user's taste, habits and favored cuisines. Thus the offline recommendation can be considered a shallow approach, lacking the user's detailed interaction with each restaurant, which can be obtained in the form of reviews.
  • 11. Chapter 3 Problem Definition The goal is to create an innovative recommendation system that provides content-based recommendations to restaurant-goers and owners, and location-based recommendations to restaurant owners, using the Zomato restaurant review network. The system should suggest best-suited places to new entrants in the restaurant business for setting up a new restaurant that fills a cuisine void and garners high traffic. It should also suggest restaurants to users based on their previous review activity on Zomato, by building a recommendation system from all reviews of all restaurants in a city.
  • 12. Chapter 4 Data Analysis
4.1 Data Collection Zomato and Yelp are two popular restaurant search, discovery and review services. While both are popular globally, Zomato has an edge over Yelp in India. Since we are based in India, we decided to choose Zomato as our "Restaurant Review Network". Zomato also provides carefully curated content that remains useful even outside its native land. Users can find restaurants, leave reviews, rate a restaurant, and keep their own restaurant diary to share with friends. Zomato has built a highly coherent and focused experience that puts the emphasis on being a comprehensive network for food-lovers. Very little on the site is superfluous. In a survey, users were impressed with the amount of attention to detail evident in the site's content, and unlike on Yelp, several testers actually used Zomato's curated lists and suggestions to find restaurants. Hence, Zomato was chosen as our "Restaurant Review Network" over other popular services due to its detailed yet simple data.
4.2 Data Handling We first crawled the restaurant data of Delhi from Zomato. The crawling was done using data crawlers built in Python, written specifically to crawl restaurant data. The data comprised all the features listed on Zomato, such as "Dine-in or Takeaway", "AC or Non-AC", etc. We then switched to Kolkata as our sample city for data analysis for a number of reasons. Firstly, Kolkata had around 2000 restaurants compared to Delhi's 10000+ restaurants. Also, we were based in Kolkata and knew the city inside out, so we could analyze the results better. After the first crawl of restaurant data, we crawled the restaurant reviews for these 2000 restaurants. This crawl generated over two hundred thousand reviews. Each review comprised the restaurant name, the reviewer id and name, and the details of the review. All the data was later stored in MongoDB, a NoSQL database, for easy fetching and manipulation.
This is the data that will be used by our system.
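As an illustration of the stored data, the sketch below shows what review records with the fields listed above might look like as MongoDB-style documents, and how they could be grouped per reviewer or per restaurant before analysis. The field names, the sample values and the helper `group_by` are assumptions for this example; the report does not specify the actual schema.

```python
from collections import defaultdict

# Hypothetical review documents; field names are illustrative only.
reviews = [
    {"restaurant_name": "Restaurant A", "reviewer_id": 101,
     "reviewer_name": "User One", "review": "The biryani was excellent."},
    {"restaurant_name": "Restaurant B", "reviewer_id": 101,
     "reviewer_name": "User One", "review": "Kebabs were dry."},
    {"restaurant_name": "Restaurant A", "reviewer_id": 202,
     "reviewer_name": "User Two", "review": "Loved the tikka."},
]

def group_by(records, key):
    """Group a list of dict records by the value stored under `key`."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[key]].append(rec)
    return dict(grouped)

by_reviewer = group_by(reviews, "reviewer_id")
by_restaurant = group_by(reviews, "restaurant_name")
```

Grouping by restaurant gives the per-restaurant review sets used for sentiment analysis in Chapter 5, and grouping by reviewer gives the review history of a single user.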
  • 13. Chapter 5 Methodology
5.1 Location Aware The primary aim of the location-aware part of our project is to recommend locations to people who want to set up a new restaurant business, as explained in our problem definition. We assess the road map to identify concentrations of restaurants in a city. These clusters can be defined as restaurant hotspots. We have generated a live map of our sample city (Kolkata) with all the restaurants marked on it. Each node is given a color according to the rating range it falls into, and clicking on a node gives the details of the restaurant it represents. This helps provide real-time recommendations to users based on ratings and locations. A snapshot of the map is shown in Figure 5.1.
Figure 5.1: Live Map of Kolkata sorted on the basis of ratings
The road network of our sample city was generated using OpenStreetMap and PostgreSQL. It is a graph with road intersections as nodes and roads as edges. We could not use Google Maps for this because it is a paid service. We had the coordinates of all restaurants in the city. We added the restaurants as pendant nodes to their nearest intersection by using the k-NN (k-nearest neighbor) algorithm.
  • 14. Thus, we get a complete road network with all restaurants and road intersections as nodes and the roads as edges.
Figure 5.2: The road network stored in PostgreSQL
For every node in the road network that is an intersection, we store the distance to the nearest restaurant of each cuisine as node attributes. We then create a vector for each road intersection that stores the X/Y ratios for the top 10 cuisines, where X = average rating of the nearest restaurant and Y = distance to the nearest restaurant. The smaller this ratio (X/Y), the better the intersection is suited to opening a new restaurant for that cuisine. Running the PageRank algorithm on the graph also gives us the intersections with more importance and traffic. Combining the PageRank probabilities with our ratio, we define a new ratio R as follows: $R = \dfrac{\text{PageRank probability}}{X/Y}$. The intersection for which R is maximum is the one best suited to setting up a new restaurant for that cuisine.
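The scoring described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the project: distances are straight-line (haversine) rather than measured along the road network, PageRank is a plain power iteration over the intersection graph only, and the coordinates, ratings and names (`haversine_km`, `nearest_intersection`, `pagerank`, `score_intersections`) are hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def nearest_intersection(pos, intersections):
    """Intersection to which a restaurant would be attached as a pendant node."""
    return min(intersections, key=lambda n: haversine_km(pos, intersections[n]))

def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an adjacency dict {node: [neighbours]}."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            targets = nbrs if nbrs else list(adj)  # dangling node: spread evenly
            share = damping * rank[v] / len(targets)
            for w in targets:
                new[w] += share
        rank = new
    return rank

def score_intersections(intersections, adj, restaurants, cuisine):
    """R = PageRank probability / (X / Y) for every intersection."""
    rank = pagerank(adj)
    candidates = [r for r in restaurants if cuisine in r["cuisines"]]
    scores = {}
    for node, pos in intersections.items():
        if not candidates:
            break
        nearest = min(candidates, key=lambda r: haversine_km(pos, r["pos"]))
        x = nearest["rating"]                   # X: rating of nearest restaurant
        y = haversine_km(pos, nearest["pos"])   # Y: distance to it
        if x > 0 and y > 0:                     # skip undefined ratios
            scores[node] = rank[node] / (x / y)
    return scores

# Hypothetical data: three intersections on one road, one restaurant near A.
intersections = {"A": (22.50, 88.35), "B": (22.51, 88.35), "C": (22.52, 88.35)}
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
restaurants = [{"pos": (22.5001, 88.35), "rating": 4.0, "cuisines": {"North Indian"}}]
scores = score_intersections(intersections, adj, restaurants, "North Indian")
best = max(scores, key=scores.get)
```

In this toy network the middle intersection has the highest PageRank, but the far end of the road scores highest overall because the only competing restaurant is farthest from it, which is the trade-off the ratio R is meant to capture.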
  • 15. Figure 5.3: Map of Kolkata showing the important intersections to set up a new restaurant based upon a cuisine (North Indian)
The map shows, in red, the most important intersections favourable to setting up a new restaurant offering North Indian cuisine.
5.2 Content Based We crawled the restaurants of our sample city (Kolkata) for their features (i.e. ratings, cuisines, bar, AC/non-AC, veg/non-veg, etc.) and for the reviews given by visiting users. These reviews give us insight into how much a user likes a particular restaurant. The review data for each restaurant in Kolkata consisted of user id, user name, restaurant id, restaurant name and the review. The reviews for each restaurant, taken individually, were passed to the "Intellexer Sentiment Analyzer" module. Applying Intellexer to the restaurant reviews, we get: 1) Opinion holders, which are food items. 2) Multiple opinions for each opinion holder (food item). 3) A sentiment value, either positive or negative, for each opinion. Intellexer Sentiment Analyzer is a powerful and efficient solution that automatically extracts sentiments (positivity/negativity), opinion objects and emotions (liking, anger, disgust, etc.) from unstructured text. From these sentiment values we found the best food items available in each restaurant by sorting the items in descending order of sentiment. The sentiment values are calculated using the metric in equation 5.1. The calculation for finding the best food item is done in the following way. For each opinion holder we had a list of n 3-tuples. The values in each tuple were: 1) Food tag 2) Average sentiment 3) Opinion count
  • 16. A food tag is the best-suited spelling for a particular food item of North Indian cuisine. These tags are sufficient to identify North Indian dishes and the reviews that mention them among all user reviews. Examples of these North Indian food tags are biryani, kebab, qorma, tikka, etc. Since users write each food item with multiple possible spellings, we attempt to replace all of them by one best-suited name. For example, biryani has multiple forms such as biriyani, beryani, beeryani, etc. We clubbed together the sentiment of similar food items with different spellings using fuzzywuzzy. Fuzzy string matching, also called approximate string matching, is the process of finding strings that approximately match a given pattern. The closeness of a match is often measured in terms of edit distance, which is the number of primitive operations necessary to convert the string into an exact match. We took each word in our repository and computed its partial ratio with the holders extracted from Intellexer using fuzz.partial_ratio(). We took a threshold value of 67, i.e. if the ratio is greater than or equal to 67, we considered the items to be similar. Using this, we clustered all the similar items and computed the average sentiment of each tag in our repository from the sentiment of the opinions returned by Intellexer. Once we had the average sentiment and opinion count of all the tags, we used a metric to rank the tags in that restaurant. The terms used are: Max cnt = maximum opinion count over all food tags, and Min cnt = minimum opinion count over all food tags. The metric is:
$\text{Sentiment of tag} = \text{Avg. sentiment of tag} \times \dfrac{\text{opinion count of tag} - \text{Min cnt} + 1}{\text{Max cnt} - \text{Min cnt} + 1}$ (5.1)
This metric is applied to each restaurant and the tags obtained for it. It gives a normalized sentiment for each tag, and sorting the tags on this value in descending order ranks them from highest to lowest.
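The spelling clustering and equation 5.1 can be sketched together as follows. To keep the example free of third-party dependencies, the standard-library `difflib.SequenceMatcher` ratio (scaled to 0-100) stands in for `fuzz.partial_ratio()`; the two scores are not identical, so the threshold of 67 behaves only approximately as in the report. The tag repository, the sample opinions with sentiments of +1 and -1, and all function names are assumptions for illustration.

```python
from difflib import SequenceMatcher

THRESHOLD = 67  # similarity threshold from the report

def similarity(a, b):
    """0-100 similarity score; a stand-in for fuzz.partial_ratio()."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))

def canonical_tag(holder, tags):
    """Best-matching repository tag for a holder, or None if below threshold."""
    best = max(tags, key=lambda t: similarity(holder, t))
    return best if similarity(holder, best) >= THRESHOLD else None

def rank_tags(opinions, tags):
    """Cluster opinions by tag, then rank tags with equation 5.1."""
    clustered = {}
    for holder, sentiment in opinions:
        tag = canonical_tag(holder, tags)
        if tag is not None:
            clustered.setdefault(tag, []).append(sentiment)
    counts = [len(v) for v in clustered.values()]
    min_cnt, max_cnt = min(counts), max(counts)
    scored = []
    for tag, values in clustered.items():
        avg = sum(values) / len(values)
        # Equation 5.1: average sentiment weighted by normalized opinion count
        score = avg * (len(values) - min_cnt + 1) / (max_cnt - min_cnt + 1)
        scored.append((tag, score))
    return sorted(scored, key=lambda s: s[1], reverse=True)

# Hypothetical repository and (holder, sentiment) pairs for one restaurant.
tags = ["biryani", "kebab"]
opinions = [("biriyani", 1), ("beryani", 1), ("Biryani", 1),
            ("beeryani", -1), ("kabab", 1), ("pasta", 1)]
ranked = rank_tags(opinions, tags)
```

Here the four biryani spellings collapse into one tag with four opinions, while "pasta" matches no repository tag and is dropped. Although kebab has the higher average sentiment, biryani ranks first because equation 5.1 also rewards the number of opinions behind a tag.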
The normalization is done within a particular restaurant and is not related to other restaurants. The data for all restaurants was inserted in MongoDB along with the opinion counts. Thus, for each food tag in our repository, we found the restaurants that serve that food. The same Max cnt and Min cnt metric is then applied again to each individual food tag in our repository, this time across all restaurants. Sorting the result gives the top restaurants for a particular food tag. This result is stored in the database for each individual tag. Now, for any user, his reviews are fed into Intellexer and his opinion holders (food items) are generated. These opinion holders indicate his food preferences. The number of opinion holders should be greater than one. What remains is the easy part: recommending to the user the already stored top 5 restaurants for each of his tags. Thus, the user gets recommendations based on his own reviews. The results are shown in Figures 5.4 and 5.5.
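A minimal sketch of this second stage is given below, assuming the per-restaurant statistics are available as a dictionary of (average sentiment, opinion count) pairs per tag. The restaurant names, numbers and function names are hypothetical, and the check on the number of opinion holders is one possible reading of the requirement stated above.

```python
def top_restaurants_per_tag(stats, n=5):
    """For each tag, rank restaurants with the count-normalized sentiment metric."""
    per_tag = {}
    for restaurant, tag_stats in stats.items():
        for tag, (avg, cnt) in tag_stats.items():
            per_tag.setdefault(tag, []).append((restaurant, avg, cnt))
    top = {}
    for tag, rows in per_tag.items():
        counts = [cnt for _, _, cnt in rows]
        lo, hi = min(counts), max(counts)
        scored = [(avg * (cnt - lo + 1) / (hi - lo + 1), name)
                  for name, avg, cnt in rows]
        scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, name breaks ties
        top[tag] = [name for _, name in scored[:n]]
    return top

def recommend(user_tags, top):
    """Look up the stored top restaurants for each of the user's food tags."""
    if len(user_tags) <= 1:  # the report requires more than one opinion holder
        return {}
    return {tag: top[tag] for tag in user_tags if tag in top}

# Hypothetical data: restaurant -> {tag: (average sentiment, opinion count)}
stats = {
    "R1": {"biryani": (0.9, 10), "kebab": (0.5, 2)},
    "R2": {"biryani": (0.8, 40)},
    "R3": {"biryani": (1.0, 1), "kebab": (0.7, 6)},
}
top = top_restaurants_per_tag(stats)
result = recommend(["biryani", "kebab", "momo"], top)
```

Because the per-tag rankings are computed once and stored, serving a user only requires extracting his tags and performing a lookup, which is why the report calls this the easy part.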
  • 17. Figure 5.4: The system taking user id as input to generate recommendations for that user.
Figure 5.5: Top 5 restaurants recommended by the system to the user for each food item
  • 18. Chapter 6 Conclusion Our results show the busiest roads in the city, which are ideal for setting up new restaurants and which are a delight in themselves for foodies to explore. This can have a positive influence on current businesses as well. Our results can identify cuisines that are lacking in a particular area, which current businesses can exploit. Previous work on location-based recommendation focused on providing results to users only, whereas our system also focuses on restaurant owners. Our system implements content-based filtering to provide restaurant recommendations to users based on their previous reviews. The recommendations were largely accurate in our tests. Our system can easily be extended to other cities and cuisines. It has immense potential and is multipurpose, as it can come in handy for businesses as well as the average user. The field of restaurant recommendation is still largely uncharted territory, and our system is a small step into it.
  • 19. Chapter 7 Future Works We plan to build our own sentiment analyzer tailored to restaurants rather than relying on the Intellexer Sentiment Analyzer module. This will help in getting more accurate sentiments for tags such as service, food and other features of the restaurants, and will resolve our sentiment ambiguity. Sentiments are not purely positive or negative; in fact, there are various levels at which sentiments and their effect in a statement can be identified. These can be incorporated into the system in detail for more accurate results. We also plan to add collaborative filtering to our system. Collaborative filtering systems do not use any information about the actual content of the items (as opposed to content-based filtering). They are based on the usage or preference patterns of other users. Selection (or filtering) of items is done in a manner similar to individuals collaborating to make recommendations for each other, i.e. if some tags are shared between users, the users are deemed similar and are given similar recommendations.
  • 20. Chapter 8 References
1. Anant Gupta and Kuldeep Singh. Location Based Personalized Restaurant Recommendation System for Mobile Environments.
2. Sumit Negi. Single Document Keyphrase Extraction Using Label Information.
3. Liu, J., Shang, J., Wang, C., Ren, X., Han, J., 2015. Mining Quality Phrases from Massive Text Corpora. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, ACM, pp. 1729-1744.
4. Mariana Romanyshyn. Rule-Based Sentiment Analysis of Ukrainian Reviews. International Journal of Artificial Intelligence & Applications (IJAIA), Vol. 4, No. 4, July 2013.
5. El-Kishky, A., Song, Y., Wang, C., Voss, C.R., Han, J., 2014. Scalable topical phrase mining from text corpora. Proceedings of the VLDB Endowment 8, 305-316.
6. Burusothman Ahiladas, Paraneetharan Saravanaperumal, Sanjith Balachandran, Thamayanthy Sripalan and Surangika Ranathunga. Ruchi: Rating Individual Food Items in Restaurant Reviews.