International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
Predicting Movie Success Using Neural Network
Arundeep Kaur, AP Nidhi
Department of Computer Science, Swami Vivekanand Institute of Engineering & Technology,
Punjab Technical University, Jalandhar, India
Abstract: In this research work we have developed a mathematical model for predicting the success class [flop, hit, super hit] of
Indian movies. To do this, we developed a methodology in which the historical data of each component [e.g. actor, actress,
director, music] that influences the success or failure of a movie is given its due weightage. Then, based on multiple thresholds
calculated from the descriptive statistics of the dataset of each component, a class label [flop, hit, super hit] is assigned. This dataset is
then subjected to a neural network learning algorithm based on Levenberg-Marquardt [LM] to automate the process, and the results are
evaluated in terms of the match between actual and predicted class labels. The results show that our strategy for identifying the success
class is highly effective and accurate, which is also apparent from the classification matrix.
Keywords: Movie prediction, neural network, weights of variables
1. Introduction
Today, the trouble is that the more things change, the
more they stay the same. However, this may not
hold for the movie industry, which can break completely
free of the cycles that have marked its history. The
difficulty is not only that predicting the future success of a
movie is problematic; it is the realization that one has to
relive the past again and again and still make a highly
intelligent guess about the success or failure of the movie.
An attempt is made to predict the future of a movie from
its past for the purpose of business certainty: a theoretical
condition in which decision making [the success of the
movie] is without risk, because the decision maker [movie
makers and stakeholders] has all the information about the
exact outcome of the decision before he or she makes the
decision [release of the movie].
With over two million spectators a day and films exported
to over 100 countries, the impact of the Bollywood film
industry is formidable. From the first Indian film, “Raja
Harishchandra” by Dhundhiraj Govind (Dadasaheb)
Phalke in 1913, to 1981, India produced over 15,000
feature films. Since then it has produced at least another
15,000, at a rate of more than 1,000 films a year (1,091 in
2006, 1,146 in 2007 and 1,325 in 2008) in 26 languages
[1]. The industry is the world’s largest in terms of the number
of movies produced and also in terms of the number of cinema
goers. Bollywood produces as many films as the next
three largest producers (the US, Japan and China) combined.
In terms of money it is second only to Hollywood [2]. Film
making in India is now a multimillion dollar industry
employing over 6 million workers and reaching millions
of people worldwide. In 2008 the industry was valued at 107.1
billion rupees. PricewaterhouseCoopers [3] predicted that the
industry would be worth 184.3 billion in 2013. With such a fortune
and the employment of so many people at stake every Friday, it
is of immense interest to producers to know the
probability of success or failure of a movie. However, because
movies are experience goods with short product
life cycles, it is difficult to forecast the demand for
motion pictures. Nevertheless, producers and distributors of
new movies need to forecast box-office results in an
attempt to reduce the uncertainty in the motion picture
business. A stakeholder in the movie industry
needs to know the minimum sum of money he or she
can accept to forgo the opportunity to participate in an
event [making or distributing a movie] for which the
outcome [success or failure of the movie], and therefore his
or her receipt of a reward, is uncertain.
2. Research Gap
A literature survey revealed only two studies that have
attempted to predict the success of movies. One study
uses a Bayesian belief network to predict success, while the
other uses a neural network. Lee and Chang
[2], in their study using a Bayesian belief network for
predicting box office performance, concluded that Bayesian
belief networks were better at predicting success than
neural networks. However, Zhang et al. [1]
concluded that their MLBP prediction model
achieves more satisfactory results than the MLP
method, and that it is more reliable and effective for solving the
problem. Since not much work has been carried out in this
area, we intend to develop a model which can predict the
financial success of a movie.
2.1 Rationale of the Study
As movies are defined as experience goods with short
product lifetime cycles, it is difficult to forecast the demand
for motion pictures. Nevertheless, producers and distributors
of new movies need to forecast box-office results in an
attempt to reduce the uncertainty in the motion picture
business. The study intends to develop a model to predict the
financial success of a movie.
3. Proposed Work
To develop a model that can help predict whether a
movie will be a flop, a hit, or a super hit, we propose to
create a historical dataset of the parameters that
influence movie success, develop an algorithm to
assign weights, develop a mathematical model to
automate the prediction of movie success, and finally evaluate the
performance of the algorithm to determine how good or bad our
movie prediction system is.
Paper ID: 12013159 69
4. Implementation Steps
The entire study was conducted in the manner
listed below [7]:
1) Collection of data pertaining to the parameters under study
(Actor, Actress, Producer, Director, Writer, Music
Director, Time of Release and Marketing Budget)
2) Data processing for assigning weights and calculating
thresholds
3) Input & Target pattern formation
4) Architecture design and Neural Network Learning
5) Performance Analysis
Figure 4.1: Basic Flow of the thesis
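The threshold calculation in step 2 of the list above can be sketched as follows. The paper does not state its exact thresholds, so the rule used here (mean plus or minus a multiple `k` of the standard deviation of a per-movie score) and the sample scores are illustrative assumptions only.

```python
import statistics

def label_by_thresholds(scores, k=0.5):
    """Label scores as flop / hit / super hit using mean +/- k * std thresholds.

    Both the rule and the value of k are assumptions made for illustration;
    they are not the thresholds used in the paper.
    """
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    low, high = mean - k * std, mean + k * std
    labels = []
    for score in scores:
        if score < low:
            labels.append("flop")
        elif score > high:
            labels.append("super hit")
        else:
            labels.append("hit")
    return labels, (low, high)

# Hypothetical aggregated scores for five movies.
scores = [0.2, 0.5, 0.55, 0.6, 0.9]
labels, (low, high) = label_by_thresholds(scores)
print(labels, round(low, 3), round(high, 3))
```

Any other descriptive statistic (for example quartiles) could be substituted for the mean and standard deviation without changing the structure of the sketch.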
Step 1: Develop/collect a dataset based on the following
attributes [7]
Table 4.1: Dataset characteristics of Neural Network input
S. No   Dataset Characteristics                 Multivariate
a       Attribute Characteristic                Real Valued
b       Missing Values                          None
c       Number of Instances of Observations     111
d       Number of Attributes                    07
e       Parameter that influence characters
Step 2: Based on the above characteristics, assign weights to
each feature row consisting of the following parameters [7]
Table 4.2: Attributes/Parameters under Study of Texture
Based Observations
S. No   Attribute          Description                                             Mathematical Expression
1       Actor              Leading Actor and status of his last 10 movies          (∑ Ah) / 10 = Aw
2       Actress            Leading Actress and status of her last 10 movies        (∑ Ash) / 10 = ASw
3       Director           Director and status of his last 10 movies               (∑ Ad) / 10 = Ad
4       Producer           Producer and status of his last 10 movies               (∑ Ap) / 10 = Ap
5       Music Director     Music Director and status of his last 10 movies         (∑ Am) / 10 = Ap
6       Writer             Writer and status of his last 10 movies                 (∑ Aw) / 10 = Aw
7       Marketing Budget   Base value of Rs. 10.00 crores                          (∑ MB) / 10 = MBw
8       Time of Release    Release during holiday season = 0.9
                           Release during other time = 0.7
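As a minimal sketch of the weight calculation in Table 4.2, the snippet below averages the status of a person's last 10 movies and applies the time-of-release values. The numeric coding of a movie's status (`STATUS_SCORE`) is a hypothetical choice, since the paper does not state how flop, hit and super hit are scored. The marketing budget expression is left out because the table does not make it explicit.

```python
# Hypothetical numeric coding of a past movie's status (not given in the paper).
STATUS_SCORE = {"flop": 0.0, "hit": 0.5, "super hit": 1.0}

def component_weight(last_ten_statuses):
    """Return (sum of status scores over the last 10 movies) / 10."""
    if len(last_ten_statuses) != 10:
        raise ValueError("expected exactly 10 past movies")
    return sum(STATUS_SCORE[status] for status in last_ten_statuses) / 10

def release_weight(holiday_release):
    """Time-of-release weight as listed in Table 4.2."""
    return 0.9 if holiday_release else 0.7

# Hypothetical history: 4 hits, 3 super hits and 3 flops.
actor_history = ["hit"] * 4 + ["super hit"] * 3 + ["flop"] * 3
actor_w = component_weight(actor_history)  # (4 * 0.5 + 3 * 1.0 + 3 * 0.0) / 10
print(actor_w, release_weight(True))
```

The same `component_weight` function would apply unchanged to the actress, director, producer, music director and writer rows.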
Step 3: Design a neural network classifier with Table 4.2 as the
input dataset for the learning, validation and testing
phases
Step 4: Design of output/target layer
Target Classes        Target Pattern
Class A  Flop         1 0 0 0 0 0
Class B  Hit          0 1 0 0 0 0
Class C  Superhit     0 0 1 0 0 0
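The target patterns of Step 4 can be generated and decoded as sketched below. The pattern width of 6 is taken from the table as printed. The helper names and the decoding rule (largest of the first three outputs) are assumptions for illustration.

```python
CLASSES = ["flop", "hit", "super hit"]
PATTERN_WIDTH = 6  # width of the target pattern as printed in the Step 4 table

def target_pattern(label):
    """Return the target pattern for a class label, e.g. 'hit' -> [0, 1, 0, 0, 0, 0]."""
    pattern = [0] * PATTERN_WIDTH
    pattern[CLASSES.index(label)] = 1
    return pattern

def decode(output):
    """Map a network output vector back to a class label (assumed rule: argmax)."""
    first_three = list(output[:len(CLASSES)])
    return CLASSES[first_three.index(max(first_three))]

targets = [target_pattern(label) for label in ["hit", "flop", "super hit"]]
print(targets)
print(decode([0.1, 0.2, 0.9, 0.0, 0.0, 0.0]))
```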
Step 5: Run the neural network based on the LM algorithm [3]
with different configurations of hidden layers, to
find the I-H-O (input-hidden-output) architecture combination
that produces the best results in terms of true positive rate, as
visualized in the confusion matrix.
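A search over hidden-layer sizes with Levenberg-Marquardt style training might look like the sketch below. The paper does not describe its implementation, so everything here is an assumption: the network has a single sigmoid hidden layer, the LM loop is a simplified hand-written version in NumPy with a finite-difference Jacobian, the data are synthetic, and training accuracy stands in for the true positive rate used in the paper.

```python
import numpy as np

N_IN, N_OUT = 8, 3  # 8 input attributes, 3 success classes

def forward(params, x, n_hidden):
    """Single-hidden-layer sigmoid network; params is a flat vector."""
    a = N_IN * n_hidden
    b = a + n_hidden
    c = b + n_hidden * N_OUT
    w1, b1 = params[:a].reshape(N_IN, n_hidden), params[a:b]
    w2, b2 = params[b:c].reshape(n_hidden, N_OUT), params[c:]
    hidden = 1.0 / (1.0 + np.exp(-(x @ w1 + b1)))
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

def residuals(params, x, targets, n_hidden):
    return (forward(params, x, n_hidden) - targets).ravel()

def jacobian(params, x, targets, n_hidden, eps=1e-6):
    """Forward-difference Jacobian of the residual vector."""
    base = residuals(params, x, targets, n_hidden)
    jac = np.empty((base.size, params.size))
    for j in range(params.size):
        shifted = params.copy()
        shifted[j] += eps
        jac[:, j] = (residuals(shifted, x, targets, n_hidden) - base) / eps
    return jac

def train_lm(x, targets, n_hidden, iterations=30, seed=0):
    """Simplified LM: solve (J^T J + mu I) step = -J^T r, adapting the damping mu."""
    rng = np.random.default_rng(seed)
    n_params = N_IN * n_hidden + n_hidden + n_hidden * N_OUT + N_OUT
    params = rng.normal(scale=0.5, size=n_params)
    mu = 1e-2
    res = residuals(params, x, targets, n_hidden)
    history = [0.5 * float(res @ res)]
    for _ in range(iterations):
        jac = jacobian(params, x, targets, n_hidden)
        step = np.linalg.solve(jac.T @ jac + mu * np.eye(n_params), -jac.T @ res)
        trial = residuals(params + step, x, targets, n_hidden)
        if trial @ trial < res @ res:
            # Accept the step and reduce damping (closer to Gauss-Newton).
            params, res, mu = params + step, trial, max(mu / 10.0, 1e-6)
        else:
            # Reject the step and increase damping (closer to gradient descent).
            mu *= 10.0
        history.append(0.5 * float(res @ res))
    return params, history

# Synthetic stand-in data: 60 movies, class decided by the mean attribute value.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 1.0, size=(60, N_IN))
labels = np.digitize(x.mean(axis=1), [0.45, 0.55])  # 0 = flop, 1 = hit, 2 = super hit
targets = np.eye(N_OUT)[labels]

best = None
for n_hidden in (2, 4):  # candidate I-H-O architectures: 8-2-3 and 8-4-3
    params, history = train_lm(x, targets, n_hidden)
    predicted = forward(params, x, n_hidden).argmax(axis=1)
    accuracy = float((predicted == labels).mean())
    if best is None or accuracy > best[1]:
        best = (n_hidden, accuracy)
print("best hidden size and training accuracy:", best)
```

In practice a held-out test set, rather than the training set, would be used to compare architectures.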
5. Results
Figure 5.1: Confusion Matrix for predicting movie success
Figure 5.2: Actual Class matrix (Actual movie success)
Figure 5.3: Predicted Class Matrix (Movie success as
predicted by the designed algorithm)
5.1 Interpretation of Results
The strength of our algorithm is that it identifies predictors
of movie success as well as their quantities, so that the ground
truth is properly matched with the predicted results once
this prediction framework is put into practice.
We therefore designed multiple classifiers with various
possible parameters for the input observations and hidden layers
and a fixed number of output classes. We have tried to build a
framework for predicting movie success that is light on
computational resources and time, and it is
apparent from the confusion matrix that the accuracy is quite
high (93.3%).
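Accuracy and the per-class true positive rate are read off a confusion matrix as sketched below. The counts in this matrix are hypothetical and chosen only to show the arithmetic; they are not the values behind Figure 5.1.

```python
# Hypothetical confusion matrix: rows = actual class, columns = predicted class.
# These counts are illustrative only and are not the paper's results.
CLASSES = ["flop", "hit", "super hit"]
confusion = [
    [10, 1, 0],
    [1, 12, 1],
    [0, 0, 5],
]

def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def true_positive_rates(matrix):
    """Per-class true positive rate: diagonal entry divided by its row total."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

overall = accuracy(confusion)  # 27 correct out of 30 samples
per_class = dict(zip(CLASSES, true_positive_rates(confusion)))
print(overall, per_class)
```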
The parameters for the design of the classifier were
selected meticulously and empirically after many
experiments. The appropriate initial weights for
the learning function were found by analysis of historical
data. If the initial weights are too small, the net input to a hidden
or output unit approaches 0, which leads to slow
learning; if the weights are too large, the initial input signal
to each hidden or output unit falls in the saturation
region, where the derivative of the (sigmoid) activation
function has a very small value, close to 0.
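The saturation effect described above can be checked numerically. This small sketch evaluates the sigmoid derivative s(z)(1 - s(z)) at a few net input values; the chosen values of z are arbitrary illustrations.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

# The derivative peaks at z = 0 and shrinks quickly as |z| grows (saturation).
for net_input in (0.0, 2.0, 10.0):
    print(f"net input {net_input:5.1f} -> derivative {sigmoid_derivative(net_input):.6f}")
```

Large initial weights push the net input toward large magnitudes, where the gradient used for weight updates is almost zero.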
The learning rate was also selected keeping in mind that
changes in the weight factor must be small in order to reduce
oscillations or any deviation. For deciding the training and
testing patterns, we developed disjoint training and
testing datasets and validated these using the K4 cross
validation method. In all the various designs of classifiers, a
major focus was also to identify the number of hidden units; care
was taken that no unnecessary additional computational
resource usage comes into play for each additional hidden
layer. Finally, we can see from Figures 5.2 and 5.3 that
the size, length and color of the actual and predicted class
matrices are similar, due to the high accuracy.
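Disjoint training and testing sets for cross-validation can be produced as in the sketch below. It assumes that "K4" means k-fold cross-validation with k = 4, which the paper does not spell out, and it uses the 111 instances reported in Table 4.1. The function name and the seed are illustrative.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists so that the k test folds are disjoint."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 111 instances (Table 4.1), assumed k = 4.
splits = list(k_fold_indices(111, 4))
for train, test in splits:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so every movie is used for testing once and for training in the other three rounds.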
6. Conclusion
The data pertaining to the parameters under study were collected
from the leading Bollywood websites.
• Data normalization must be used to reduce the number
of samples, the complexity of the neural network and
its computation time.
• For the classification schemes, it was found that training
the model with a large amount of data and with a fast
training algorithm would greatly enhance the accuracy
and hence the reliability of the system.
• The classifier was designed by running the
neural network with different numbers of hidden layers,
and it was apparent from the graphs that this affected the
accuracy.
• It was found that as we increase the number of hidden
layers, computation time also increases, but a
high order of accuracy is achieved until the
maximum number of hidden layers is reached. Therefore, we
need an optimal combination of parameters to achieve
93.3% accuracy.
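The data normalization mentioned in the list above is not specified in the paper. As one common possibility, a min-max rescaling of an input column to the range [0, 1] can be sketched as follows. Both the choice of min-max scaling and the sample budgets are assumptions.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]

# Hypothetical marketing budgets in crores of rupees.
budgets = [5.0, 10.0, 20.0, 25.0]
scaled = min_max_normalize(budgets)
print(scaled)
```

Bringing all inputs to a comparable range keeps the net input to the sigmoid units away from the saturation region at the start of training.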
7. Future Scope
• We can explore more unsupervised machine learning
algorithms, which would offer a more versatile method of
predicting movie success. These methods may be based on
some computational clustering technique and can be
evaluated on the basis of recall and precision values.
• We can explore more algorithms and techniques for the
feature extraction and classification of parameters
influencing movie success to further improve the accuracy
of the prediction system.
• We can further improve the system by reducing its
complexity. The main objective could be to find the best
algorithms that optimize performance and
complexity; this can be done by changing the normalization of
input data or by changing sampling methods, along with other
possible learning rate parameters, etc.
• The accuracy of the classifier can also be enhanced by using
a larger and equal number of training patterns.
8. Acknowledgement
I am thankful to A P Nidhi, Assistant Professor, Swami
Vivekanand Institute of Engineering and Technology,
Banur, for providing constant guidance and encouragement
for this research work.
References
[1] L. Zhang, J. Luo, S. Yang. “Forecasting Box Office
Revenues of Movies with BP Neural Networks”. Expert
Systems with Applications, 2009, vol. 36 (3), part 2, pages
6580-6587.
[2] K.J. Lee, W. Chang. “Bayesian Belief Network for Box
Office Performance: A Case Study of Korean Movies”.
Expert Systems with Applications, 2009, vol. 36 (1),
pages 280-291.
[3] Ananth Ranganathan. “The Levenberg-Marquardt
Algorithm”. 8th June 2004.
[4] T. Efendigil, S. Onut, C. Kahraman. “A Decision
Support System for Demand Forecasting with Artificial
Neural Network and Neuro Fuzzy Models: A
Comparative Analysis”. Expert Systems with
Applications, 2009, vol. 36 (3), part 2, pages 5697-5707.
[5] K.Y. Chan, T.S. Dhillon, J. Singh, E. Chang. “Traffic
Flow Forecasting Neural Network Based on
Exponential Smoothing Method”. 6th IEEE Conference
on Industrial Electronics and Application, 21-23 June
2011, pages 376-381.
[6] U. Reuter, B. Moller. “Artificial Neural Network for
Forecasting of Fuzzy Time Series”. Computer Aided
Civil and Infrastructure Engineering, 2010, vol. 25 (5),
pages 363-374.
[7] Arundeep Kaur, AP Gurpinder Kaur. “Predicting
Movie Success: Review of Existing Literature”.
International Journal of Advanced Research in
Computer Science and Software Engineering, Volume
3, Issue 6, June 2013.