International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 6961
Prediction of cab demand using machine learning
Abdul Rahuman Aslam M A1, Gobinathan V2, Krishnan K3, Rajasekaran G4
1Student, Easwari Engineering College, Bhrathi Salai, Ramapuram, Chennai – 600089.
2Student, Easwari Engineering College, Bhrathi Salai, Ramapuram, Chennai – 600089.
3Student, Easwari Engineering College, Bhrathi Salai, Ramapuram, Chennai – 600089.
4Assistant Professor, Easwari Engineering College, Bhrathi Salai, Ramapuram, Chennai – 600089.
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - The cab service industry has been booming for the
last couple of years and is expected to keep growing in the
near future. Taxi drivers need to decide where to wait
for passengers so that they can pick someone up as soon as
possible. Passengers likewise prefer a quick taxi service
whenever they need one. The control centre of the taxi service
decides which busy areas to concentrate on. In the existing
system, taxis are sometimes scattered across a
large area and miss the time-dependent busy areas such as
airports, business areas, school areas and train stations.
Effective taxi allocation helps both drivers and
passengers minimize the time they wait to find each other.
In the proposed system, future demand is
predicted using a Recurrent Neural Network based
model trained on historical data. By organizing
the availability of taxis, the service can serve more
customers in a short time. The data set includes GPS
location and other properties of each trip, such as drop point
and pickup point. The model is used to predict the
demand at a particular time in different areas of the
city.
Key Words: Taxi demand prediction, Recurrent neural
network.
1. INTRODUCTION
Taxi drivers need to decide where to wait for
passengers so that they can pick someone up quickly.
Similarly, passengers need to find a cab
quickly. Dispatching taxis efficiently helps both
customers and drivers by reducing the waiting time
on both sides.
A driver usually does not have enough information about
where to wait in order to get passengers quickly. A
taxi centre can organize and send the required
number of taxis to an area based on historical
data. The historical data is collected using the Global
Positioning System (GPS) and is used to predict future demand.
In Tokyo, such a system is already reducing the waiting time for
customers, responding quickly to sudden changes in
demand and bridging the gap between
experienced drivers and novice drivers. These benefits
allow the cab service to achieve maximum benefit.
A real-time taxi demand prediction system is proposed here.
In this system, historical data is used to predict the
future demand for taxis in a particular place at a
particular time. Its objectives
include directing the fleet of taxis to crowded areas,
using resources effectively to reduce waiting
time, and serving more customers in a short time by
organizing the available taxis. Our system uses GPS location
and other properties of the trip, such as pickup point and drop
point, to predict taxi demand. The model is trained
using a recurrent neural network. Recurrent
neural networks are designed for sequential data and
produce predictions for such data. They are used in speech
recognition software, including Google's voice
search and Apple's Siri personal assistant, and are counted
among the achievements of deep learning in recent
years.
Taxi demand prediction is a time series analysis
problem. In a feedforward neural network, the
information moves in one direction only, from the input
layer through the hidden layer to the output layer.
The difference between such a
normal neural network and a recurrent neural
network is that in the recurrent neural network the
information cycles through a loop.
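The loop described above can be illustrated with a minimal sketch of a single recurrent unit. This is not the authors' model: the function names, the weights `w_x`, `w_h`, `b` and the input values are hypothetical, chosen only to show how the hidden state carries information from one time step to the next.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent update for a single unit: h_t = tanh(w_x*x_t + w_h*h_prev + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the unit, carrying the hidden state forward."""
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)  # the loop: h depends on the previous h
        states.append(h)
    return states

# Hypothetical, already-scaled demand values for three consecutive time slots.
states = run_rnn([0.2, 0.0, 0.0])
no_memory = run_rnn([0.2, 0.0, 0.0], w_h=0.0)
```

With the recurrent weight `w_h` set to zero the unit behaves like a feedforward one and forgets the first input immediately; with a non-zero `w_h` the first input still influences the later states.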
In this system, a recurrent neural network is used,
and the Python language is preferred because it has a vast
collection of machine learning libraries. The data set
might contain empty values, negative values or errors,
so it is cleaned during preprocessing. The
preprocessing involves removing records
which are not complete. Once the cleaned data set is
available, it is prepared to be fed to the machine
learning algorithm. Recurrent Neural Networks take
the previous node output or hidden state as an input.
RNNs are useful because their intermediate values (state)
can store information about previous inputs for a time
interval. The main feature of a Recurrent Neural
Network (RNN) is that the network possesses at least
one feedback connection, so the activations can flow
round in a loop.
That enables the network to learn sequences and to
do temporal processing, e.g., perform sequence
recognition/reproduction or temporal
association/prediction. Recurrent neural network
architectures can take many different forms. One
common type consists of a standard Multi-Layer
Perceptron (MLP) plus added loops. These can
exploit the powerful non-linear mapping
capabilities of the MLP and also have some form of
memory. Since one can think about recurrent
networks in terms of their properties as dynamical
systems, it is natural to ask about their stability,
controllability and observability:
Stability concerns the boundedness over time of the
network outputs and the response of the network
outputs to small changes in the weights or network
inputs. Controllability is concerned with whether it is
possible to control the dynamic behaviour.
An RNN is said to be controllable if an initial state is
steerable to a desired state within a finite number of
time steps. Observability is concerned with
whether it is possible to observe the results of the
control that is applied. A recurrent network is
observable if the state of the network can be determined
from a finite set of input and output measurements.
The network input is the current taxi demand, while
the output is the demand in the next time-step. A
recurrent neural network is used because it can
be trained to store the relevant information in a
sequence in order to predict particular outcomes in the future.
Predicting future demand is a time series forecasting
problem; hence, a sequential algorithm is used. It is
desirable to predict taxi demand in small areas so that
the drivers know exactly where to go. The system is
trained with the data set to create the model for
future prediction.
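One common way to turn a demand series into training examples of this kind (past demand in, next time-step out) is a sliding window. The sketch below is an assumption about how such pairs could be built, not the paper's code; `make_windows`, the window length and the pickup counts are hypothetical.

```python
def make_windows(demand, window=3):
    """Split a demand series into (past window, next value) training pairs."""
    pairs = []
    for i in range(len(demand) - window):
        past = demand[i:i + window]   # inputs: the last `window` time slots
        target = demand[i + window]   # output: demand in the next time slot
        pairs.append((past, target))
    return pairs

# Hypothetical hourly pickup counts for one area.
hourly_pickups = [120, 135, 160, 210, 190]
pairs = make_windows(hourly_pickups, window=3)
# pairs == [([120, 135, 160], 210), ([135, 160, 210], 190)]
```

A series of n values yields n - window pairs, so a longer window gives the model more context at the cost of fewer training examples.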
A graph is plotted of the predicted demand for the next
time slot and the area expected to be crowded. This machine
learning model predicts the future demand areas in a
city using a neural network, and the drivers are
directed to wait in the areas that the system identifies
as demand areas.
2. RELATED WORKS
Fei Miao et al. [2] proposed a robust
transportation system that uses sensing data collected from
transportation systems to help analyze
passenger demand. Methods for predicting taxi-passenger
demand, travel time and travelling
speed from traffic monitoring data have been
developed. Related proposals include robotic
mobility-on-demand systems that minimize the number of
rebalancing trips and parking systems that
allocate resources based on a driver's payments.
Dispatch algorithms that aim to reduce idle mileage or
to minimize customers' waiting time have also been
developed. Although the robust optimization approach
aims to minimize the worst-case cost under all
possible random parameters, it can sacrifice average
system performance. For a taxi dispatch system, it is
essential to address the trade-off between the
worst-case and the average dispatch costs under
uncertain demand. Evaluations show that under the
robust dispatch framework they designed, the average
demand-supply ratio imbalance is reduced by 31.7%,
and the average total idle driving distance is reduced
by 10.13%, or about 20 million miles in total in one
year.
Mohammad Saiedur Rahaman et al. [3] defined the
neighbourhood identification problem in the presence
of a large number of heterogeneous contextual
factors. They frame the research as a problem of wait
time prediction for taxi drivers at airports and
investigate heterogeneous factors related to time,
weather, flight arrivals and taxi trips. Taxis are
regarded as the easiest mode of transport for
transfer between the airport and the city. The queue
managers continuously monitor the concurrent
queues of taxis and passengers and instruct
taxi drivers to join the passengers at the terminal
when there is higher demand. To ensure the seamless
operation of this process, the queue manager
estimates the future demand for taxis. Airport
satisfaction ratings depend on the proper
management of both the passenger and the taxi queues.
To maintain the demand-supply balance of taxis,
the airport transport managers employ an approach
that requires extensive human intervention. The authors
examine the corresponding p-values, which show the
statistical significance of this difference. The D-value
is around 0.45 and the p-values are less than 0.001.
This means that the neighbourhoods are statistically
different for different k-values. If median and mean
denote the improvement shown by the DIbiased
weights method over the baseline method in terms of
the median and mean of the prediction errors, respectively, for
different k-values, the correlation between the
corresponding D-values and the prediction errors,
d_error(Median) and d_error(Mean), can be measured
to show the relationship between the improvement
in neighbourhood quality and the improvement in prediction
accuracy. The paper reports Pearson's
correlation scores of 0.393 (with mean) and 0.484
(with median). They argue that the quality of the
neighbourhood they identify is significantly
improved by considering relevant
heterogeneous contextual factors, and that the
performance is boosted accordingly (i.e., the mean prediction
error is less than 0.09 and the median prediction error is
less than 0.06).
Desheng Zhang et al. [4] state that existing
data collection is offline and carried out through manual
investigation, which may result in inaccurate data for
real-time analysis. To address this, they propose a
model called Dmodel, which employs roving taxicabs
as real-time mobile sensors. With
this approach, they can infer passenger arrival
moments by investigating the logical information.
They used a 450 GB dataset of 14,000 taxicabs collected over
half a year; Dmodel achieves 83% accuracy and outperforms
the statistical model by 42%. Passenger demand
prediction may be disrupted by bad weather, special
events or accidents. Their solution to this has two
parts. First, they mine a large dataset of
historical data consisting of taxi passengers and their
related pickups. Second, to address the real-time
dynamics, they treat thousands of taxicabs
as real-time mobile sensors. This is possible
because the taxis report GPS data in dense urban areas. The
front-end system and the back-end dispatching centre
form a network called a roving sensor network.
Dmodel observes hidden contexts to infer
demand, combining offline analysis of historical data
with real-time data collected from the roving taxis, i.e.
the sensor network formed by them. They introduce a novel
parameter, namely the pickup pattern. Dmodel utilizes
the real-time pickup pattern so that the model can
select customized and compact training data to
increase the accuracy of inference. They used a Hidden
Markov Chain for the
implementation. Dmodel-based dispatching
outperforms basic and SDD-based dispatching by 11% on
average, owing to the accurate inference by
Dmodel.
Luis Moreira-Matias et al. [5] proposed a novel methodology for
predicting the spatial distribution of taxi–passengers
over a short time horizon using streaming input data.
First, the information was aggregated into a
histogram time series. Then, three time-series
forecasting techniques were combined to produce a
prediction. Experimental tests were conducted using
the online data transmitted by 441 vehicles
of a fleet running in the city of Porto, Portugal. The
results showed that the proposed framework can
provide effective insight into the spatio-temporal
distribution of taxi–passenger demand for a horizon of
30 minutes. The paper focuses on the real-time choice
problem of which taxi stand is the best to go to
after a passenger drop-off (i.e., the stand where
another passenger can be picked up within a short
span of time). An intelligent approach to this
problem will improve network reliability for both
companies and customers: an intelligent distribution
of vehicles across stands will minimize the
average waiting time to pick up a passenger, while the
distance travelled will be more profitable.
Furthermore, passengers
will experience a shorter wait for a
vacant taxi whenever they need one. The major contribution
of the paper is to build predictions of the spatio-temporal
distribution of taxi–passenger demand using
only streaming data. The model
presented is able to predict the taxi–passenger
demand at each of the 64 taxi stands in 30-minute
intervals. The model showed a
more than satisfactory performance, correctly
predicting the 506 874 tested services with an
aggregated error measurement lower than 26.11%.
Biao Leng et al. [6] analysed the battle between two
taxi apps in China, namely Didi and Kuaidi, that
took place in 2014. The two companies are backed
by internet giants such as Tencent and Alipay. These
companies attracted taxi drivers by giving them
incentives for each ride and attracted users
to their applications with frequent
discounts and offers, while also promoting payment
through the mobile phone. In this paper, they collected
37 days of trip data and used 9000 entries in Beijing.
For the first 18 days there was no battle, and for the
next 19 days the battle was intense. The
spatial-temporal data are studied and, based on the
comprehensive analysis, benefits and drawbacks are
discussed. Taxi drivers welcomed this battle and
started to accept every passenger they could so as to
gain the maximum benefit from the incentives.
Customers started to complain that they could not find
any taxis without using these apps, so the paper points
out that offline taxi services may be affected. The
companies wanted to increase the usage of mobile
payments and hence used monetary promotion, which
indirectly increased cab service usage.
Our work helps both online and offline cab
services, as it depends purely on historical data
based on the Global Positioning System.
Nicholas Jing Yuan et al. [7] proposed a recommender
system for both cab drivers and people expecting to
take a cab, using knowledge of 1) passengers'
mobility patterns and 2) taxi drivers'
picking-up/dropping-off behavioural patterns learned from
the GPS trajectories of taxicabs. In their method, they
learn the above-mentioned knowledge (as a probability
representation) from the GPS trajectories of taxis. Then,
they feed this knowledge into a probabilistic model that
calculates the profit of candidate locations for a
particular cab driver based on where and when the
driver requests the recommendation. They proposed an
approach to detect parking places based on a
large number of GPS trajectories generated by taxis,
where the parking places stand for the locations
where cab drivers usually wait for passengers with
their cars parked. They devised a probabilistic model
to formulate the time-dependent taxi behaviors
(picking-up/dropping-off/cruising/parking) and
enable a citywide recommendation system for both
passengers and taxi drivers. They also improved the
taxi recommender by considering the time-varying
queue length at the parking places. For both the taxi
recommender and the passenger recommender, they
constructed a model that relates the day of the week
and historical weather conditions to the varying
pick-up/drop-off behaviors. They built the
recommender system with a data set generated by
12,000 taxicabs over a period of 110 days and evaluated
it through large-scale experiments, including a
series of in-the-field user studies. As a result, the taxi
recommender accurately predicts the time-varying
queue at parking places, and the passenger recommender
successfully suggests the road segments where
users can easily find vacant taxis. In the future, they
plan to deploy their recommender in the real world
so as to further validate and improve the effectiveness
and robustness of the system.
Jun Xu et al. [1] proposed a system that uses
recurrent neural networks to predict future cab demand
from historical GPS data. They used Long Short-Term
Memory (LSTM) networks, which retain information from the
sequential data, to predict the future
demand. LSTMs are also used in handwriting
recognition, natural language processing etc. The LSTM
in this model uses a gating mechanism to
store previous values. In this model, they combined
a Mixture Density Network (MDN) with the
LSTM. The paper reports
83% accuracy in predicting the future cab demand,
i.e., a 17% prediction error.
Fei Miao et al. [8] proposed a receding horizon control
(RHC) framework to dispatch cabs, which
incorporates highly spatiotemporally correlated
demand/supply models together with real-time Global
Positioning System (GPS) location and occupancy information. The
objectives include matching the spatiotemporal ratio
between demand and supply for service quality while
reducing the current and expected future taxi idle
driving distance. They designed the RHC framework for
large-scale taxi dispatching. It considers both current
and future demand, saving costs under constraints by
accounting for the expected future idle driving distance when
rebalancing supply. The framework handles large-scale
data in real-time control. The sensing data is
used to build predictive passenger demand and taxi
mobility models and serves as real-time feedback for
RHC. Their future plan is to develop a
privacy-preserving control framework for the case when data of
some cabs are not shared with the dispatch centre. Extensive
trace-driven analysis with a data set containing taxi
operational records in San Francisco, CA, USA, shows
that their solution reduces the average total idle
distance by 52.01%, and reduces the supply-demand
ratio error across the city during one experimental
time slot by 45.02%.
3. SYSTEM ARCHITECTURE
The real-world taxi-trip data set is collected. The
collected data is then pre-processed. Data
preprocessing is a data mining technique that
transforms raw data into an understandable format.
Real-world data is often incomplete, inconsistent,
and/or lacking in certain behaviors or trends, and is
likely to contain many errors. The major tasks of
data preprocessing include data cleaning, data
integration, data transformation and data reduction.
Missing values are identified and replaced with
mean values. The cleaned data set is fed to the recurrent
neural network model, which is trained on historical
data consisting of date, time, pickup location,
weather etc., and the future demand for taxis is
predicted. Recurrent Neural Networks (RNNs) are a
powerful and robust type of neural network and
are among the most promising algorithms for sequential
data because they maintain an
internal memory. The predicted demand for cabs is
visualized using a graph.
Fig. 3.1 System Architecture
4. SYSTEM IMPLEMENTATION
The dataset consists of three columns and
30,426 entries. The minimum and
maximum index values are found and
used to format the dataset. The
output includes a graph of the missing values
in the dataset, with
time in one-hour intervals on the x-axis and the number of
pickups on the y-axis. It also includes a graph
of the bookings, with time in one-hour intervals
on the x-axis and the number of pickups on the y-axis. Once
the missing values are found, the mean value of each
column is computed and substituted in place
of the missing entries. There are many options for
replacing missing values, but the simplest and most
common one is to replace them with the mean of all the
entries in the respective column. At the end of
data preprocessing, a cleaned data set with no missing
values is obtained. The cleaned data set is important because
irrelevant or erroneous data in the data set
may cause prediction errors and inaccurate
results. The Comma Separated Values (CSV) file is collected
from user usage records and is cleaned and
processed to obtain the cleaned data set.
The cleaned data set is used in the next stage, in
which the recurrent neural
network is applied to predict the future cab demand.
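The mean-replacement step described above can be sketched as follows. This is an illustrative version using only the Python standard library; the function name, the column names (`pickups`, `passengers`) and the sample rows are hypothetical and are not taken from the authors' data set.

```python
from statistics import mean

def fill_missing_with_mean(rows, columns):
    """Return copies of rows with None entries replaced by the column mean."""
    col_means = {}
    for col in columns:
        present = [row[col] for row in rows if row[col] is not None]
        # If a column has no values at all, there is no mean to fill with.
        col_means[col] = mean(present) if present else None
    return [
        {**row, **{col: col_means[col] for col in columns if row[col] is None}}
        for row in rows
    ]

# Hypothetical hourly records with two missing entries.
rows = [
    {"hour": "2018-01-01 00:00", "pickups": 40, "passengers": 55},
    {"hour": "2018-01-01 01:00", "pickups": None, "passengers": 35},
    {"hour": "2018-01-01 02:00", "pickups": 20, "passengers": None},
]
cleaned = fill_missing_with_mean(rows, ["pickups", "passengers"])
```

Here the missing pickup count becomes 30 (the mean of 40 and 20) and the missing passenger count becomes 45. The input rows are left unchanged, which makes it easy to compare the data before and after cleaning.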
Fig. 4.1 Bookings graph
Figure 4.1 shows the graph plotted using
matplotlib.pyplot, with time in hours on the x-axis
and the number of pickups on the y-axis. The graph shows the
number of pickups on a particular date.
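Before such a graph can be drawn, individual pickup records have to be counted per one-hour slot. A minimal sketch of that aggregation is given below; the timestamp format and the sample records are assumptions made for illustration, and `pickups_per_hour` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime

def pickups_per_hour(timestamps):
    """Count pickups in each one-hour slot, keyed by the start of the hour."""
    counts = Counter()
    for ts in timestamps:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        counts[t.replace(minute=0, second=0)] += 1  # truncate to the hour
    return sorted(counts.items())

# Hypothetical raw pickup times.
raw = ["2018-01-01 08:05:10", "2018-01-01 08:47:00", "2018-01-01 09:15:30"]
series = pickups_per_hour(raw)
hours = [slot for slot, _ in series]
counts = [n for _, n in series]
```

The `hours` and `counts` lists can then be passed as the x and y values to `matplotlib.pyplot.plot` to obtain a bookings graph of this kind.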
Vanilla (conventional)
neural networks accept only fixed-size inputs and
produce fixed-size outputs. They cannot work with sequences
of inputs and sequences of outputs. The Recurrent Neural
Network is a class of Artificial Neural Network (ANN).
The network input is the current taxi demand and
other relevant information, while the output is the
demand in the next time-step. A recurrent
neural network is used because it can be trained to store
the relevant information in a sequence in order to predict
particular outcomes in the future. In addition, taxi
demand prediction is a time series forecasting problem
in which an intelligent sequence analysis model is
required. It is desirable to predict taxi demand in small
areas so that the drivers know exactly where to go. The
system is trained with the dataset and the model is
used for future demand prediction.
Fig. 4.2 Artificial Neural Networks
A Recurrent Neural Network considers the previous inputs
and the current input to compute the future output. For
taxi demand, we use the RNN model because taxi data
is sequential data and the task is a time series forecasting
problem. Unlike a traditional neural network, an RNN has
memory for storing relevant parts of the input and uses
it for prediction of the future demand. Recurrent
Neural Networks take the previous node output or
hidden state as an input. RNNs are useful because their
intermediate values (state) can store information
about previous inputs for a time interval. The main
feature of a Recurrent Neural Network (RNN) is that
the network possesses at least one feedback
connection, so the activations can flow round in a
loop. That enables the network to learn
sequences and to do temporal processing, e.g., perform
sequence recognition/reproduction or temporal
association/prediction. Recurrent neural network
architectures can take many different forms. One
common type consists of a standard Multi-Layer
Perceptron (MLP) plus added loops. These can
exploit the powerful non-linear mapping capabilities
of the MLP and also have some form of memory. Since
one can think about recurrent networks in terms of
their properties as dynamical systems, it is natural to
ask about their stability, controllability and
observability. Stability concerns the boundedness
over time of the network outputs and the response of
the network outputs to small changes in the weights
or network inputs. Controllability is concerned with
whether it is possible to control the dynamic
behaviour. An RNN is said to be controllable if an initial
state is steerable to a desired state within a finite
number of time steps. Observability is
concerned with whether it is possible to observe the
results of the control that is applied. A recurrent
network is observable if the state of the network can be
determined from a finite set of input and output measurements.
For predicting the future taxi demand, a dataset is used which
consists of date and time, number of pickups, number
of passengers, maximum temperature, minimum
temperature, humidity and wind speed. An RNN is
called recurrent because it performs the same
computation for every input element, and each output
is conditioned on the previous input elements.
Fig. 4.3 Recurrent Neural Network
Long Short-Term Memory (LSTM) is an RNN architecture. It
can be used for handwriting recognition, speech
recognition etc. An LSTM unit consists of an input gate, an
output gate and a forget gate. LSTMs are explicitly designed
to avoid the long-term dependency problem.
A traditional recurrent neural network has a chain of
repeating modules. In an LSTM, the structure of the
repeating module is different.
Fig. 4.4 Repeating Modules in standard RNN
Fig. 4.5 Repeating Modules in LSTM
LSTMs also have this chain-like structure, but the
repeating module has a different structure. Instead of
having a single neural network layer, it has four layers
that interact with one another.
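The four interacting layers can be made concrete with a sketch of one LSTM update for a single unit, written in the commonly used formulation (forget gate, input gate, candidate layer and output gate). It is not the authors' implementation; the parameter dictionary `p` and all weight values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM update for a single unit; p maps layer name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    g = math.tanh(pre("candidate"))   # candidate cell content
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    c = f * c_prev + i * g            # new cell state (long-term memory)
    h = o * math.tanh(c)              # new hidden state (output of the unit)
    return h, c

# Hypothetical weights: the same (w_x, w_h, bias) for every layer.
params = {name: (0.5, 0.5, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(1.0, 0.0, 0.0, params)

# With the forget gate held open and the input gate held shut,
# the cell state is carried through unchanged.
keep = {"forget": (0.0, 0.0, 50.0), "input": (0.0, 0.0, -50.0),
        "candidate": (0.0, 0.0, 0.0), "output": (0.0, 0.0, 0.0)}
_, c_kept = lstm_step(1.0, 0.0, 0.7, keep)
```

The second call shows why the gating helps with long-term dependencies: when the forget gate stays near one and the input gate near zero, the cell state passes from step to step without being overwritten.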
Fig. 4.6 LSTM Model loss
Figure 4.6 shows the Long Short-Term
Memory (LSTM) model loss graph. The accuracy of the
model can be increased with each training iteration: each
time the data passes from the input
layer through the hidden layer to the output layer and
the weights are updated, the loss decreases and the
accuracy of the model increases.
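The way a loss curve falls over training iterations can be demonstrated with a deliberately simple stand-in. The sketch below fits a one-weight linear model by gradient descent, not an LSTM, and the data, learning rate and epoch count are hypothetical; it only illustrates how a per-iteration loss history such as the one in the figure is produced.

```python
def train(xs, ys, epochs=20, lr=0.1):
    """Fit y ~ w*x by gradient descent and record the mean squared error per epoch."""
    w = 0.0
    losses = []
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)             # loss before the update
        grad = 2 * sum(e * x for e, x in zip(errors, xs)) / n     # d(loss)/dw
        w -= lr * grad                                            # weight update
    return w, losses

# Hypothetical scaled inputs and targets (the true relation is y = 2x).
w, losses = train([0.0, 0.5, 1.0], [0.0, 1.0, 2.0])
```

Plotting `losses` against the epoch index gives a falling curve. For a real LSTM the curve is usually less smooth, and a validation loss is normally tracked alongside it to detect overfitting.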
Figure 4.7 shows the number of days with more than
800 pickups, which indicate peak hours. Peak hours
are the times at which the demand for taxis is
high.
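Identifying peak periods with a fixed threshold, as described above, amounts to a simple filter over the demand series. In this sketch the threshold of 800 comes from the text, while the function name, slot labels and counts are hypothetical.

```python
def peak_slots(series, threshold=800):
    """Return the (slot, pickups) pairs whose pickup count exceeds the threshold."""
    return [(slot, n) for slot, n in series if n > threshold]

# Hypothetical demand values for four time slots.
demand = [("Mon 08:00", 950), ("Mon 13:00", 420), ("Mon 18:00", 801), ("Tue 03:00", 800)]
peaks = peak_slots(demand)
# peaks == [("Mon 08:00", 950), ("Mon 18:00", 801)]
```

A slot with exactly 800 pickups is not counted as a peak here, because the comparison is strictly greater than the threshold.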
Fig. 4.7 Predicted Number of taxi pickups vs. Actual
Number of taxi pickups
Fig. 4.8 Predicted Number of taxi pickups vs. Actual
Number of taxi pickups
Figure 4.8 shows the graph plotted between predicted
number of taxi pickups and actual number of taxi pickups
from the historical data.
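A predicted-versus-actual comparison of this kind is usually summarized with error metrics as well as a plot. The paper does not state which metric it used, so the sketch below simply computes two common ones, the mean absolute error (MAE) and the root mean squared error (RMSE), on hypothetical pickup counts.

```python
import math

def forecast_errors(actual, predicted):
    """Return (MAE, RMSE) for two equally long sequences of pickup counts."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

# Hypothetical actual and predicted pickups for three hours.
mae, rmse = forecast_errors([100, 200, 300], [110, 190, 320])
```

RMSE penalizes large misses more heavily than MAE, so reporting both gives a sense of whether the error comes from many small deviations or a few large ones.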
Fig 4.9 Estimated Number of Taxi Pickups
Fig 4.9 shows the estimated number of taxi pickups
for the next six hours.
5. CONCLUSION
The proposed system is a sequential learning model
based on a recurrent neural network for predicting taxi
demand in different areas of the city. Learning from
historical data, the model predicts the demand for
a location. Three years of data from the airport are used to
train our model. The model predicts taxi
demand on an hourly basis for a particular time.
This work can be extended in the future by adding
more inputs such as holidays, festivals etc. Taxis can be
organized and dispatched based on the predictions of the
model. In addition, this can save much of the fuel that is
currently spent by taxis searching for passengers.
REFERENCES
[1] Jun Xu, Rouhollah Rahmatizadeh, Ladislau Bölöni,
and Damla Turgut, "Real-Time Prediction of Taxi
Demand Using Recurrent Neural Networks", IEEE
Transactions on Intelligent Transportation Systems,
vol. 19, no. 8, pp. 2572 - 2581, Aug. 2018.
[2] Fei Miao, Shuo Han, Shan Lin, Qian Wang, John A.
Stankovic, Abdeltawab Hendawi, Desheng Zhang, Tian
He and George J. Pappas, "Data-Driven Robust Taxi
Dispatch Under Demand Uncertainties", IEEE
Transactions on Control Systems Technology, vol. 27,
no. 1, pp. 175 - 191, Jan. 2019.
[3] Mohammad Saiedur Rahaman, Yongli Ren, Margaret
Hamilton and Flora D. Salim, "Wait Time Prediction
for Airport Taxis Using Weighted Nearest Neighbor
Regression", IEEE Access, vol. 6, pp. 74660 - 74672,
Nov. 2018.
[4] Desheng Zhang, Tian He, Shan Lin, Sirajum Munir and
John A. Stankovic, "Taxi-Passenger-Demand Modeling
Based on Big Data from a Roving Sensor Network",
IEEE Transactions on Big Data, vol. 3, no. 3, pp. 362 -
374, Sept. 2017.
[5] Luis Moreira-Matias, João Gama, Michel Ferreira, João
Mendes-Moreira, and Luis Damas, "Predicting Taxi–
Passenger Demand Using Streaming Data", IEEE
Transactions on Intelligent Transportation Systems,
vol. 14, no. 3, pp. 1393 - 1402, Sept. 2013.
[6] Biao Leng, Heng Du, Jianyuan Wang, Li Li and Zhang
Xiong, "Analysis of Taxi Drivers' Behaviors Within a
Battle Between Two Taxi Apps", IEEE Transactions on
Intelligent Transportation Systems, vol. 17, no. 1, pp.
296 - 300, Jan. 2016.
[7] Nicholas Jing Yuan, Yu Zheng, Liuhang Zhang and Xing
Xie, "T-Finder: A Recommender System for Finding
Passengers and Vacant Taxis", IEEE Transactions on
Knowledge and Data Engineering, vol. 25, no. 10, pp.
2390 - 2403, Oct. 2013.
[8] Fei Miao, Shuo Han, Shan
Lin, John A. Stankovic, Desheng Zhang, Sirajum Munir,
Hua Huang, Tian He, and George J. Pappas, "Taxi
Dispatch With Real-Time Sensing Data in
Metropolitan Areas: A Receding Horizon Control
Approach", IEEE Transactions on Automation Science
and Engineering, vol. 13, no. 2, pp. 463 - 478,
Mar. 2016.
[9] P. Santi, G. Resta, M. Szell, S. Sobolevsky, S. H. Strogatz,
and C. Ratti, "Quantifying the benefits of vehicle
pooling with shareability networks", Proc. Nat. Acad.
Sci. USA, vol. 111, no. 37, pp. 13290 - 13294, 2014.

IRJET- Prediction of Cab Demand using Machine Learning

1. INTRODUCTION

Taxi drivers need to decide where to wait for passengers so that they can pick someone up quickly. Similarly, passengers need to find their cabs quickly. Dispatching taxis efficiently helps both customers and drivers by reducing the waiting time for each. A driver alone does not have enough information about where to wait in order to get passengers quickly. A taxi centre can organize and send the required number of taxis to an area based on historical data. The historical data is collected using the Global Positioning System (GPS) and is used to predict future demand. In Tokyo, such a system is today reducing the waiting time for customers, responding quickly to sudden changes in demand, and bridging the gap between experienced drivers and novice drivers. These benefits allow the cab service to achieve maximum benefit.

A real-time taxi demand prediction system is proposed here. In this system, historical data is used to predict the future demand for taxis in a particular place at a particular time. Some of the real-time objectives include directing the fleet of taxis to crowded areas, using resources effectively to reduce waiting time, and serving more customers in a short time by organizing the available taxis. Our system uses GPS location and other properties of the taxi, such as pickup point and drop point, to predict taxi demand.

A model is trained using a recurrent neural network. Recurrent neural networks are used for sequential data, for example in speech recognition software such as Google's voice search and Apple's Siri personal assistant. This algorithm is one of the achievements of deep learning in recent years. Recurrent neural networks produce predictions for sequential data.

Taxi demand prediction is a time series analysis problem. In a feed-forward neural network, the information moves in one direction only, from the input layer through the hidden layer to the output layer.
The difference between a normal neural network and a recurrent neural network is that in the recurrent neural network the information cycles through a loop. In this system, a recurrent neural network is used, and the Python language is preferred because it has a vast collection of machine learning libraries.

The data set might contain empty values, negative values or errors, so it is cleaned during preprocessing. The preprocessing methods involve removing records which are not complete. Once the cleaned data set is available, it is prepared to be fed to the machine learning algorithm.

Recurrent Neural Networks take the previous node output or hidden states as inputs. RNNs are useful because their intermediate values (state) can store information about previous inputs for a time interval. The main feature of a Recurrent Neural Network (RNN) is that the network possesses at least one feedback connection, so the activations can flow round in a loop. This enables the network to learn sequences and to do temporal processing, e.g., perform sequence recognition/reproduction or temporal association/prediction. Recurrent neural network architectures can take many forms. One common type consists of a standard Multi-Layer Perceptron (MLP) plus added loops. These can exploit the powerful non-linear mapping capabilities of the MLP, and also have some form of memory.

Since one can think about recurrent networks in terms of their properties as dynamical systems, it is natural to ask about their stability, controllability and observability. Stability concerns the boundedness over time of the network outputs and the response of the network outputs to small changes to the weights or network inputs. Controllability is concerned with whether it is possible to control the dynamic behaviour. An RNN is said to be controllable if an initial state is steerable to a desired state within a finite number of time steps. Observability is concerned with whether it is possible to observe the results of the control that is applied. A recurrent network is observable if the state of the network can be determined from a finite set of input and output measurements.

The network input is the current taxi demand, while the output is the demand in the next time-step.
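The input/output framing described above (current demand in, next time-step demand out) can be illustrated with a small sketch. This is not the authors' code: the function name `make_supervised_pairs`, the `window` parameter and the hourly counts are all hypothetical, chosen only to show how a demand series might be turned into training pairs.

```python
def make_supervised_pairs(demand, window=1):
    """Turn a demand series into (recent demand window, next-step demand) pairs.

    Hypothetical helper for illustration; the paper does not specify
    how its training pairs are built.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    pairs = []
    for t in range(window, len(demand)):
        # Inputs are the `window` values before time t; the target is demand[t].
        pairs.append((demand[t - window:t], demand[t]))
    return pairs


# Made-up hourly pickup counts for one area.
hourly_pickups = [12, 15, 9, 20, 18]
pairs = make_supervised_pairs(hourly_pickups, window=1)
for inputs, target in pairs:
    print(inputs, "->", target)
```

With `window=1` each pair matches the description in the text exactly; a larger window would simply give the network more past values at each step.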
The reason a recurrent neural network is used is that it can be trained to store all the relevant information in a sequence to predict particular outcomes in the future. Predicting future demand is a time series forecasting problem; hence, a sequential algorithm is used. It is desired to predict taxi demand in small areas so that the drivers know exactly where to go. The system is trained with the data set and creates the model for future prediction. A graph is plotted of the predicted demand for the next time slot and the area expected to be crowded. This machine learning model predicts the future demand areas in a city based on a neural network, and drivers are directed to wait in the areas that the system identifies as demand areas.

2. RELATED WORKS

Fei Miao et al [1] proposed a robust transportation system that uses sensing data collected from transportation systems to help analyze passenger demand. Prediction methods for taxi-passenger demand, travel time and travelling speed based on traffic monitoring data have been developed. Related work includes robotic mobility-on-demand systems that minimize the number of rebalancing trips and smart parking systems that allocate resources based on a driver's payments. Algorithms of this kind, which aim to reduce mileage or to minimize customers' waiting time, have been developed. Although robust optimization aims to minimize the worst-case cost under all possible random parameters, this can come at the expense of average system performance. For a taxi dispatch system, it is essential to address the trade-off between the worst-case and the average dispatch costs under uncertain demand. Evaluations show that under the robust dispatch framework they design, the average demand-supply ratio imbalance is reduced by 31.7%, and the average total idle driving distance is reduced by 10.13%, or about 20 million miles in total in one year.
Mohammad Saiedur Rahaman et al [2] defined the neighbourhood identification problem in the presence of a large number of heterogeneous contextual features. They frame their research as a problem of wait time prediction for taxi drivers at airports and investigate heterogeneous factors related to time, weather, flight arrivals and taxi trips. Taxis are regarded as the easiest mode of transport for transfer between the airport and the city. The queue managers continuously monitor the concurrent queues of taxis and passengers and instruct taxi drivers to join the passengers at the terminal when there is higher demand. To ensure the seamless operation of this process, the queue manager estimates the future demand for taxis. Airport satisfaction ratings depend on the proper management of both passenger and taxi queues. Aiming to maintain the demand-supply balance of taxis, the airport transport managers employ an approach that requires extensive human intervention.

They examine the corresponding p-values, which show the statistical significance of this difference. Here the D-value is around 0.45 and the p-values are less than 0.001, which means that the neighbourhoods are statistically different for different k-values. If d_error(Median) and d_error(Mean) denote the improvement shown by the DIbiased weights method over the baseline method in terms of the median and mean of the prediction errors, respectively, for different k-values, then the correlation between the corresponding D-values and these improvements can be measured to show the relationship between the improvement in neighbourhood quality and the improvement in prediction accuracy. This paper shows Pearson's correlation scores of 0.393 (with mean) and 0.484 (with median). They argue that the quality of the neighbourhood they identified is significantly improved by the consideration of relevant heterogeneous contextual factors, and thus the performance is boosted (i.e. the mean prediction error is less than 0.09 and the median prediction error is less than 0.06).

Desheng Zhang et al [3] state that the existing system of data collection is offline and relies on manual investigations, which may result in inaccurate data for real-time analysis. To address this, they use a model called Dmodel, employing roving taxicabs as real-time mobile sensors. With this, they can infer passenger arrival moments by investigating the logical information. They used a 450 GB dataset of 14,000 taxicabs collected over half a year; Dmodel achieves 83% accuracy and outperforms the statistical model by 42%.
Passenger demand prediction may be disrupted by bad weather, special events or accidents. They provide a two-part solution for this. First, they mine a large dataset of historical data, consisting of taxi passengers and their related pickups. Second, to address the real-time dynamics, they use thousands of taxicabs as real-time mobile sensors. This is possible because they use GPS data in dense urban areas. The front-end system and the back-end dispatching centre form a network called a roving sensor network. Dmodel observes hidden contexts to infer demand based on historical and real-time taxicab data. It combines offline analysis of historical data with real-time data collected from the roving taxis, i.e. the sensor network formed by them. They use a novel parameter, namely the pickup pattern. Dmodel utilizes the real-time pickup pattern so that the model can select customized and compact training data to increase the accuracy of inference. They used a Hidden Markov Chain for the implementation. Dmodel-based dispatching outperforms basic and SDD-based dispatching by 11% on average, which is due to the accurate inference by Dmodel.

Luis Damas et al [4] proposed a novel methodology for predicting the spatial distribution of taxi-passengers for a short time horizon using streaming input data. First, the information was aggregated into a histogram time series. Then, three time-series forecasting techniques were combined to produce a prediction. Experimental tests were conducted using the online data transmitted by 441 vehicles of a fleet running in the city of Porto, Portugal.
The results show that the proposed framework can provide effective insight into the spatio-temporal distribution of taxi-passenger demand for a horizon of 30 minutes. This paper focuses on the real-time choice problem of which taxi stand is the best to go to after a passenger drop-off (i.e., the stand where another passenger can be picked up within a short span of time). An intelligent approach to this problem will improve network reliability for both companies and customers: an intelligent distribution of vehicles throughout the stands will minimize the average waiting time to pick up a passenger, while the distance travelled will be more profitable. Furthermore, passengers will also experience a shorter waiting time to get a vacant taxi whenever they need one. The major contribution of this paper is to build predictions of the spatio-temporal distribution of taxi-passenger demand using only streaming data. As a result, the presented model was able to predict the taxi-passenger demand at each of the 64 taxi stands for 30-minute intervals. The model showed a more than satisfactory performance, correctly predicting the 506,874 tested services with an aggregated error measurement lower than 26.11%.

Biao Leng et al [5] examined the battle between two taxi-app companies in China, namely Didi and Kuaidadi, that occurred in 2014. The two companies are backed by internet giants such as Tencent and Alipay. These companies attracted taxi drivers by giving them incentives for each ride, attracted users to their applications with frequent discounts and offers, and promoted payment through the mobile phone. In this paper, they collected 37 days of trip data and use 9000 entries in Beijing. For the first 18 days there was no battle, and for the next 19 days the battle was intense. The spatio-temporal data are studied and, based on a comprehensive analysis, benefits and drawbacks are discussed. Taxi drivers welcomed this battle and started to accept every passenger they could so as to gain the maximum benefit from the incentives. Customers started to complain that they could not find any taxis without using these apps, so the paper notes that offline taxi services may be affected. These companies wanted to increase the usage of mobile payments and hence used monetary promotion, which indirectly increased cab service. Our paper helps both online and offline cab services, as it depends purely on historical data based on the Global Positioning System.

Nicholas Jing Yuan et al [6] proposed a recommender system for both cab drivers and people expecting to take a cab, using the knowledge of 1) passengers' mobility patterns and 2) taxi drivers' picking-up/dropping-off behavioural patterns learned from the GPS trajectories of taxicabs. In their method, they learn the above-mentioned knowledge (in a probabilistic representation) from the GPS trajectories of taxis.
Then, they feed this knowledge into a probabilistic model that estimates the profit of the candidate locations for a particular cab driver based on where and when the driver requests the recommendation. They proposed an approach to detect parking places based on a large number of GPS trajectories generated by taxis, where the parking places stand for the locations where cab drivers usually wait for passengers with their cars parked. They devised a probabilistic model to formulate the time-dependent taxi behaviors (picking-up/dropping-off/cruising/parking) and enable a city-wide recommendation system for both passengers and taxi drivers. They also improve the taxi recommender by considering the time-varying queue length at the parking places. For both the taxi recommender and the passenger recommender, they constructed a model that incorporates the day of the week and historical weather conditions to account for the varying pick-up/drop-off behaviors. They built the recommender system with a data set generated by 12,000 taxicabs over a period of 110 days and evaluated it through large-scale experiments, including a series of in-the-field user studies. As a result, the taxi recommender accurately predicts the time-varying queue length at parking places, and the passenger recommender successfully suggests the road segments where users can easily find vacant taxis. In the future, they also plan to deploy the recommender in the real world so as to further validate and increase the effectiveness and robustness of the system.

Jun Xu et al [7] proposed a system in which recurrent neural networks are used to predict future cab demand based on historical GPS data. They used Long Short-Term Memory (LSTM) networks in order to store the sequential data and predict future cab demand. LSTMs are also used in handwriting recognition, natural language processing, etc.
The LSTM used in this model uses a gating mechanism to store previous values. In this model, they used a Mixture Density Network (MDN) along with the LSTM. This paper achieves 83% accuracy in predicting future cab demand, i.e. a 17% prediction error.

Tian He et al [8] proposed a receding horizon control (RHC) framework to dispatch cabs, which incorporates highly spatiotemporally correlated demand/supply models and real-time Global Positioning System (GPS) location and occupancy information. The objectives include matching the spatiotemporal ratio between demand and supply for service quality while minimizing the current and anticipated future taxi idle driving distance. They designed an RHC framework for large-scale taxi dispatching. They consider both current and future demand, saving costs under constraints by taking into account the expected future idle driving distance for rebalancing supply. The framework handles large-scale data in real-time control. The sensing data is used to build predictive passenger demand and taxi mobility models, and serves as real-time feedback for RHC. Their future plan is to develop a privacy-preserving control framework for when the data of some cabs is not shared with the dispatch centre. Extensive trace-driven analysis with a data set containing taxi operational records in San Francisco, CA, USA, shows that their solution reduces the average total idle distance by 52.01%, and reduces the supply-demand ratio error across the city during one experimental time slot by 45.02%.

3. SYSTEM ARCHITECTURE

A real-world taxi-trip dataset is collected, and the collected data is pre-processed. Data preprocessing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviors or trends, and is likely to contain many errors. The major tasks of data preprocessing are data cleaning, data integration, data transformation, data reduction, etc. The missing values are identified and replaced with mean values. The cleaned dataset is fed to the recurrent neural network model, which is trained using historical data consisting of date, time, pickup location, weather, etc., and the future demand for taxis is predicted. Recurrent Neural Networks (RNN) are a powerful and robust type of neural network and are among the most promising algorithms at the moment because they have an internal memory. The predicted demand for cabs is visualized using a graph.

Fig. 3.1 System Architecture

4. SYSTEM IMPLEMENTATION

The dataset consists of three columns and 30,426 entries. The minimum and maximum index values are found and used to format the dataset. The output includes a graph of the missing values in the dataset, with time in one-hour intervals on the x-axis and the number of pickups on the y-axis. It also includes a graph of the bookings, with time in one-hour intervals on the x-axis and the number of pickups on the y-axis.
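The formatting and cleaning steps described here (spanning the data from its minimum to its maximum index in one-hour steps, then replacing missing entries with the mean) might look roughly like the following sketch. It is an assumption-laden illustration rather than the authors' implementation: the function name `fill_hourly_gaps`, the dictionary layout and the sample timestamps are hypothetical, and only the standard library is used.

```python
from datetime import datetime, timedelta


def fill_hourly_gaps(counts):
    """Return an hourly series from the earliest to the latest timestamp.

    `counts` maps hour-aligned datetimes to pickup counts, with None for a
    missing entry. Missing or absent hours receive the mean of the observed
    values (simple mean imputation).
    """
    start, end = min(counts), max(counts)
    observed = [value for value in counts.values() if value is not None]
    if not observed:
        raise ValueError("no observed values to average")
    mean_value = sum(observed) / len(observed)

    filled = {}
    hour = start
    while hour <= end:
        value = counts.get(hour)  # None if the hour is absent or empty
        filled[hour] = mean_value if value is None else value
        hour += timedelta(hours=1)
    return filled


# Made-up sample: 01:00 is empty and 02:00 is absent altogether.
raw = {
    datetime(2019, 3, 1, 0): 10,
    datetime(2019, 3, 1, 1): None,
    datetime(2019, 3, 1, 3): 20,
}
cleaned = fill_hourly_gaps(raw)
for hour, pickups in cleaned.items():
    print(hour.isoformat(), pickups)
```

Mean imputation is the simplest choice, but it flattens daily peaks in a demand series, so results from it should be read with that limitation in mind.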
Once the missing values are found, the mean of each column is computed and substituted for the missing entries in that column. There are many options for replacing missing values, but the simplest and most common one is to replace them with the mean of all the entries in the respective column. At the end of data preprocessing, a cleaned dataset with no missing values is obtained. Cleaning matters because irrelevant or missing data in the dataset can cause prediction errors and inaccurate results. The comma-separated values (CSV) file collected from user usage is cleaned and processed, yielding the cleaned dataset. The cleaned dataset is used in the next stage, where the recurrent neural network is applied to predict future cab demand.
Fig. 4.1 Bookings graph
Figure 4.1 shows the graph plotted using matplotlib.pyplot, with the number of pickups plotted against time in hours. The graph shows the number of pickups on a particular date.
Vanilla, or conventional, neural networks accept only fixed-size inputs and produce fixed-size outputs; they cannot work with sequences of inputs and sequences of outputs. A Recurrent Neural Network is a class of Artificial Neural Network (ANN). The network input is the current taxi demand and other relevant information, while the output is the demand in the next time-step. A recurrent neural network is used because it can be trained to retain the relevant information in a sequence in order to predict particular outcomes in the future. In addition, taxi demand prediction is a time series forecasting problem in which an intelligent sequence analysis model is required. It is desirable to predict taxi demand in small areas so that drivers know exactly where to go. The system is trained with the dataset and the model is used for future demand prediction.
Fig. 4.2 Artificial Neural Networks
A Recurrent Neural Network considers the previous inputs and the current input to compute the future output. For taxi demand, we use the RNN model because taxi data is sequential and the task is a time series forecasting problem. Unlike a traditional neural network, an RNN has memory for storing relevant parts of the input and uses it to predict future demand. Recurrent Neural Networks take the previous node outputs, or hidden states, as inputs. RNNs are useful because their intermediate values (state) can store information about previous inputs for a time interval. The main feature of an RNN is that the network possesses at least one feedback connection, so the activations can flow round in a loop. That enables the network to learn sequences and to do temporal processing, e.g., perform sequence recognition/reproduction or temporal association/prediction. Recurrent neural network architectures can take many different forms.
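The feedback connection can be made concrete with a small sketch. The update rule used below, h_t = tanh(W_x x_t + W_h h_(t-1) + b), is the textbook form of a simple recurrent layer and is assumed here for illustration only; the paper does not state which variant was used, and the weights, sizes and function names are hypothetical.

```python
import math


def rnn_step(x, h_prev, W_x, W_h, b):
    """One recurrent step: h_t = tanh(W_x x_t + W_h h_(t-1) + b)."""
    h = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_x[i][j] * x[j] for j in range(len(x)))
        # Feedback connection: the previous hidden state re-enters here.
        s += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(s))
    return h


def run_rnn(sequence, W_x, W_h, b):
    """Apply the same step, with the same weights, to every input."""
    h = [0.0] * len(b)  # initial hidden state
    for x in sequence:
        h = rnn_step(x, h, W_x, W_h, b)
    return h


# Hypothetical weights: one input feature, two hidden units.
W_x = [[0.5], [-0.3]]
W_h = [[0.1, 0.2], [0.0, 0.4]]
b = [0.0, 0.0]

# The same two inputs presented in opposite orders.
h_a = run_rnn([[1.0], [0.0]], W_x, W_h, b)
h_b = run_rnn([[0.0], [1.0]], W_x, W_h, b)
```

Feeding the same two values in a different order gives a different final state, which is the sense in which the hidden state carries information about earlier inputs.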
One common type consists of a standard Multi-Layer Perceptron (MLP) plus added loops. These can exploit the powerful non-linear mapping capabilities of the MLP, and also have some form of memory. Since recurrent networks can be viewed as dynamical systems, it is natural to ask about their stability, controllability and observability. Stability concerns the boundedness over time of the network outputs, and the response of the network outputs to small changes in the weights or network inputs. Controllability is concerned with whether it is possible to control the dynamic behaviour. An RNN is said to be controllable if an initial state is steerable to a desired state within a finite number of time steps. Observability is concerned with whether it is possible to observe the results of the control that is applied. A recurrent network is observable if the state of the network can be determined from a finite set of input/output measurements.
For predicting future taxi demand, a dataset consisting of date and time, number of pickups, number of passengers, maximum temperature, minimum temperature, humidity and wind speed is used. An RNN is called recurrent because it performs the same computation for every element of the input sequence, and each output depends on the previous inputs.
Fig. 4.3 Recurrent Neural Network
Long Short-Term Memory (LSTM) is an RNN architecture. It can be used for handwriting recognition, speech recognition etc. An LSTM unit consists of an input gate, an output gate and a forget gate. LSTMs are explicitly designed to avoid the long-term dependency problem. A traditional recurrent neural network has a chain of repeating modules; in an LSTM, the structure of the repeating module is different.
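The three gates can be illustrated with one step of a standard LSTM cell. The sketch below assumes a single hidden unit and a scalar input to keep the arithmetic visible; the parameter layout, names and values are hypothetical and are not taken from the paper's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step for a single hidden unit and a scalar input.

    p maps each name to a (w_x, w_h, bias) tuple (hypothetical layout).
    Returns the new hidden state h and cell state c.
    """
    def pre(name):
        w_x, w_h, bias = p[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to admit
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value for the cell state

    c = f * c_prev + i * g            # updated cell state (long-term memory)
    h = o * math.tanh(c)              # updated hidden state (step output)
    return h, c


# Hypothetical parameters and a toy sequence of scaled demand values.
params = {
    "forget": (0.1, 0.2, 1.0),
    "input": (0.5, -0.1, 0.0),
    "output": (0.3, 0.1, 0.0),
    "candidate": (0.8, 0.2, 0.0),
}
h_t, c_t = 0.0, 0.0
for x_t in [0.2, 0.5, 0.9]:
    h_t, c_t = lstm_step(x_t, h_t, c_t, params)
```

The cell state `c` is updated additively, scaled by the forget gate, which is what lets information persist across many time steps.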
Fig. 4.4 Repeating Modules in standard RNN
Fig. 4.5 Repeating Modules in LSTM
LSTMs also have this chain-like structure, but the repeating module has a different structure. Instead of having a single neural network layer, there are four, interacting in a very special way.
Fig. 4.6 LSTM Model loss
Figure 4.6 shows the loss graph of the Long Short-Term Memory (LSTM) model, whose accuracy can increase with each training iteration. Each time the training data passes through the network, from the input layer through the hidden layer to the output layer, the weights are updated and the accuracy of the model tends to increase. Figure 4.7 shows the days with more than 800 pickups, which indicate peak hours, i.e., the times at which the demand for taxis is high.
Fig. 4.7 Predicted Number of taxi pickups vs. Actual Number of taxi pickups
Fig. 4.8 Predicted Number of taxi pickups vs. Actual Number of taxi pickups
Figure 4.8 shows the graph plotted between the predicted number of taxi pickups and the actual number of taxi pickups from the historical data.
Fig. 4.9 Estimated Number of Taxi Pickups
Fig. 4.9 shows the estimated number of taxi pickups for the next six hours.
5. CONCLUSION
The proposed system is a sequential learning model based on a recurrent neural network for predicting taxi demand in different areas of the city. Learning from past historical data, the demand prediction is done for the location. Three years of data from the airport is used to train our model. The model predicts taxi demand on an hourly basis and for a particular time. This work can be extended in the future by adding more inputs such as holidays, festivals etc. Taxis can be organized and dispatched based on the predictions of the model. In addition, this can save much of the fuel that is currently spent by taxis searching for passengers.
REFERENCES
[1] Jun Xu, Rouhollah Rahmatizadeh, Ladislau Bölöni and Damla Turgut, "Real-Time Prediction of Taxi Demand Using Recurrent Neural Networks", IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 8, pp. 2572-2581, Aug. 2018.
[2] Fei Miao, Shuo Han, Shan Lin, Qian Wang, John A. Stankovic, Abdeltawab Hendawi, Desheng Zhang, Tian He and George J. Pappas, "Data-Driven Robust Taxi Dispatch Under Demand Uncertainties", IEEE Transactions on Control Systems Technology, vol. 27, no. 1, pp. 175-191, Jan. 2019.
[3] Mohammad Saiedur Rahaman, Yongli Ren, Margaret Hamilton and Flora D. Salim, "Wait Time Prediction for Airport Taxis Using Weighted Nearest Neighbor Regression", IEEE Access, vol. 6, pp. 74660-74672, Nov. 2018.
[4] Desheng Zhang, Tian He, Shan Lin, Sirajum Munir and John A. Stankovic, "Taxi-Passenger-Demand Modeling Based on Big Data from a Roving Sensor Network", IEEE Transactions on Big Data, vol. 3, no. 3, pp. 362-374, Sept. 2017.
[5] Luis Moreira-Matias, João Gama, Michel Ferreira, João Mendes-Moreira and Luis Damas, "Predicting Taxi–Passenger Demand Using Streaming Data", IEEE Transactions on Intelligent Transportation Systems, vol. 14, no. 3, pp. 1393-1402, Sept. 2013.
[6] Biao Leng, Heng Du, Jianyuan Wang, Li Li and Zhang Xiong, "Analysis of Taxi Drivers' Behaviors Within a Battle Between Two Taxi Apps", IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 1, pp. 296-300, Jan. 2016.
[7] Nicholas Jing Yuan, Yu Zheng, Liuhang Zhang and Xing Xie, "T-Finder: A Recommender System for Finding Passengers and Vacant Taxis", IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 10, pp. 2390-2403, Oct. 2013.
[8] Fei Miao, Shuo Han, Shan Lin, John A. Stankovic, Desheng Zhang, Sirajum Munir, Hua Huang, Tian He and George J. Pappas, "Taxi Dispatch With Real-Time Sensing Data in Metropolitan Areas: A Receding Horizon Control Approach", IEEE Transactions on Automation Science and Engineering, vol. 13, no. 2, pp. 463-478, Mar. 2016.
[9] P. Santi, G. Resta, M. Szell, S. Sobolevsky, S. H. Strogatz and C. Ratti, "Quantifying the benefits of vehicle pooling with shareability networks", Proc. Nat. Acad. Sci. USA, vol. 111, no. 37, pp. 13290-13294, 2014.