DEEP CODERS
ECKOVATION MACHINE LEARNING
Members
Nitin Khatkar :01711503116
Sourav Tiwari :03011503116
Gulshan :01211503116
Shrey Achreja :41311503116
What cuisine is this recipe?
Picture yourself strolling through
your local, open-air market... What
do you see? What do you smell?
What will you make for dinner
tonight?
We want to thank Yummly for providing this unique dataset.
Data Description
▫ In the dataset, we include the
recipe id, the type of cuisine,
and the list of ingredients of
each recipe (of variable length).
The data is stored in JSON
format.
▫ An example of a recipe node in
train.json is shown alongside:
“We will predict the cuisine for
each recipe in the test set.”
STEPS FOLLOWED TO SOLVE THE GIVEN PROBLEM
STEP 1
First, we will perform EDA and
remove all redundant data from
the given dataset.
STEP 2
Then we will form our feature
matrix as well as the target
matrix.
STEP 3
Finally, we will apply the
candidate algorithms and identify
the one that performs best.
PRE-PROCESSING
TOP 10 INGREDIENTS ACCORDING TO THE CUISINE
ALGORITHM FOR FINDING THE TOP 10 INGREDIENTS GIVEN ON NEXT SLIDE ->
ALGORITHM FOR FINDING THE TOP 10 INGREDIENTS
▫ First, make a dictionary (dic) with the different cuisines
as keys and the ingredients present in each as values.
▫ Then, with the help of the above dictionary, make a new
dictionary (count_dictionary) containing the counts of the
ingredients present in each cuisine.
▫ At last, make a pie chart of the top 10 ingredients with
the help of the above two dictionaries (code given on the
next slide).
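The code slide that follows was an image in the original deck; as a stand-in, here is a minimal sketch of the three steps above, using a few toy recipes in place of the real train.json (the names dic and count_dictionary match the ones used above):

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Toy recipes standing in for the Yummly train.json file.
recipes = [
    {"id": 1, "cuisine": "italian", "ingredients": ["salt", "olive oil", "basil"]},
    {"id": 2, "cuisine": "italian", "ingredients": ["salt", "olive oil", "tomato"]},
    {"id": 3, "cuisine": "mexican", "ingredients": ["salt", "corn", "chili"]},
]

# Step 1: cuisine -> list of every ingredient seen in that cuisine (dic).
dic = {}
for r in recipes:
    dic.setdefault(r["cuisine"], []).extend(r["ingredients"])

# Step 2: cuisine -> ingredient counts (count_dictionary).
count_dictionary = {c: Counter(ings) for c, ings in dic.items()}

# Step 3: pie chart of the 10 most common ingredients for one cuisine.
top10 = count_dictionary["italian"].most_common(10)
labels, sizes = zip(*top10)
plt.pie(sizes, labels=labels, autopct="%1.0f%%")
plt.title("Top ingredients: italian")
plt.savefig("top10_italian.png")
```

With the real dataset, the same three steps run unchanged after loading the recipes with `json.load`.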
CODE FOR PLOTTING TOP 10 INGREDIENTS
APPLYING
ML TO IT
FIRST, GENERATING X AND Y BEFORE
APPLYING ANY ALGORITHM TO IT.
GENERATING X AND Y
▫ Create empty lists y and
total_ingredients.
▫ Append every unique ingredient to
the list total_ingredients.
▫ Create a zero matrix with numpy and
name it x (number of rows equal to
the length of y, number of columns
equal to the length of
total_ingredients).
▫ For every ingredient a recipe
contains, set the matching cell to 1.
▫ Our feature matrix x and target y
are ready.
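The steps above can be sketched as follows, with a couple of toy recipes standing in for the real train.json:

```python
import numpy as np

# Toy recipes standing in for train.json.
recipes = [
    {"cuisine": "italian", "ingredients": ["salt", "basil"]},
    {"cuisine": "mexican", "ingredients": ["salt", "chili"]},
]

# y: one cuisine label per recipe; total_ingredients: the unique vocabulary.
y = [r["cuisine"] for r in recipes]
total_ingredients = sorted({i for r in recipes for i in r["ingredients"]})
index = {ing: j for j, ing in enumerate(total_ingredients)}

# x: zero matrix, one row per recipe, one column per unique ingredient;
# set a cell to 1 when that recipe contains that ingredient.
x = np.zeros((len(y), len(total_ingredients)))
for i, r in enumerate(recipes):
    for ing in r["ingredients"]:
        x[i, index[ing]] = 1
```

Each row of x is then a binary "bag of ingredients" vector that any scikit-learn classifier can consume directly.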
DIFFERENT ALGORITHMS USED
FIRST
We started with a decision tree. The
outcome was not at all fruitful, as it
achieved a score of only 0.30.
SECOND
Then we tested a random forest and got
a satisfactory score of 0.72.
THIRD
Out of curiosity, we also applied Naïve
Bayes, but again got a very low score
of 0.36.
FOURTH
Finally, we went for a final test with
deep learning, but could not run it on
the full data due to low-end hardware
specifications.
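The three scikit-learn runs above can be sketched in one loop. A synthetic binary ingredient matrix stands in for the real one here, so the accuracies will not match the deck's scores:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB

# Synthetic stand-in: 300 recipes, 50 binary ingredient columns,
# 3 cuisine labels.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(300, 50)).astype(float)
y = rng.integers(0, 3, size=300)

x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

scores = {}
for name, clf in [
    ("decision tree", DecisionTreeClassifier(random_state=0)),
    ("random forest", RandomForestClassifier(random_state=0)),
    ("naive bayes", MultinomialNB()),
]:
    clf.fit(x_tr, y_tr)
    scores[name] = clf.score(x_te, y_te)  # mean accuracy on held-out split
```

`score` here is plain held-out accuracy; cross-validation would give a more stable comparison, at the cost of extra training runs.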
ALGORITHM 1
▫ APPLYING A DECISION
TREE TO IT.
▫ GOT A SCORE OF 0.30
ALGORITHM 2
▫ APPLYING RANDOM
FOREST TO IT.
▫ GOT A SCORE OF 0.72
ALGORITHM 3
▫ APPLYING NAÏVE BAYES
TO IT.
▫ GOT A SCORE OF 0.36
ALGORITHM 4
▫ APPLYING DEEP
LEARNING TO IT.
▫ GOT A SCORE OF 0.64 ON
ONLY 10,000 SAMPLES
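The deck does not show the network itself; as an illustrative stand-in, scikit-learn's MLPClassifier trained on a slice of the data mimics the "subset only" run forced by the hardware limits (sizes here are toy values, not the real 10,000):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the ingredient matrix.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(500, 50)).astype(float)
y = rng.integers(0, 3, size=500)

# Train on a subset only, as the deck did with its first 10,000 recipes.
x_small, y_small = x[:200], y[:200]
x_tr, x_te, y_tr, y_te = train_test_split(x_small, y_small, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200, random_state=0)
net.fit(x_tr, y_tr)
score = net.score(x_te, y_te)  # held-out accuracy on the subset
```

Training on a subset trades accuracy for memory; a model fit this way usually underperforms the same model fit on the full data, which is consistent with the 0.64 reported.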
SHORTCOMINGS
Deep learning could not be
applied to the whole dataset due to
low-end hardware specifications, and
SVM could not be applied at all due
to a memory error.
RESULT
Random Forest is the best algorithm,
but if deep learning were run on the
full dataset the conclusion might
differ.
DATA COMPARISON
ALGORITHM       SCORE
DECISION TREE   0.30
NAÏVE BAYES     0.36
RANDOM FOREST   0.72
DEEP LEARNING   0.64
0.72
Final score achieved
(the highest achieved was 0.82)
Forest Cover Type Prediction
▫ The study area includes four wilderness areas located in the
Roosevelt National Forest of northern Colorado. Each observation
is a 30m x 30m patch. We are asked to predict an integer
classification for the forest cover type.
Data Description
▫ The seven types are:
▫ 1 - Spruce/Fir
2 - Lodgepole Pine
3 - Ponderosa Pine
4 - Cottonwood/Willow
5 - Aspen
6 - Douglas-fir
7 - Krummholz
▫ The training set (15120 observations)
contains both features and the Cover_Type.
The test set contains only the features. You
must predict the Cover_Type for every
row in the test set (565892 observations).
“We will predict the forest-cover
type from the given parameter
values.”
STEPS FOLLOWED TO SOLVE THE GIVEN PROBLEM
STEP 1
First, we will perform EDA and
remove all redundant data from
the given dataset.
STEP 2
Then we will form our feature
matrix as well as the target
matrix.
STEP 3
Finally, we will apply the
candidate algorithms and identify
the one that performs best.
PRE-PROCESSING
PLOTTING PARAMETERS WITH RESPECT TO FOREST-COVER TYPE
SAMPLE CODE FOR PLOTTING GRAPH
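The plotting code on the original slide was an image and did not survive extraction. A plausible sketch of one such plot, a boxplot of Elevation (a real column in the Kaggle data) against Cover_Type, using synthetic values:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy frame standing in for train.csv; Elevation and Cover_Type are real
# column names in the Kaggle data, but these values are synthetic.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Elevation": rng.normal(2800, 300, size=210),
    "Cover_Type": rng.integers(1, 8, size=210),
})

# One box per cover type shows how the parameter varies across classes.
groups = [g.values for _, g in df.groupby("Cover_Type")["Elevation"]]
plt.boxplot(groups)
plt.xticks(range(1, len(groups) + 1), sorted(df["Cover_Type"].unique()))
plt.xlabel("Cover_Type")
plt.ylabel("Elevation")
plt.savefig("elevation_by_cover_type.png")
```

Repeating this for each numeric column gives a quick visual check of which parameters actually separate the seven cover types.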
APPLYING
ML TO IT
FIRST, GENERATING X AND Y BEFORE
APPLYING ANY ALGORITHM TO IT.
GENERATING X AND Y
▫ Assign the values in the
Cover_Type column to the target
vector (y).
▫ Then, after removing Cover_Type
from the data frame, assign the
remaining values to x.
▫ Remove the redundant columns.
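These three steps can be sketched in pandas, on a few toy rows standing in for train.csv (the real file has 56 columns; treating Id as the redundant column here is an assumption):

```python
import pandas as pd

# Toy rows standing in for train.csv.
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "Elevation": [2596, 2590, 2804],
    "Slope": [3, 2, 9],
    "Cover_Type": [5, 5, 2],
})

y = df["Cover_Type"]                  # target vector
x = df.drop(columns=["Cover_Type"])   # features: everything else
x = x.drop(columns=["Id"])            # drop the redundant identifier column
```

Keeping the drop of Cover_Type and the drop of redundant columns as separate steps mirrors the order described above, though both could be done in one `drop` call.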
DIFFERENT ALGORITHMS USED
FIRST
We started with a decision tree. The
outcome was not at all fruitful, as it
achieved a score of only 0.60.
SECOND
Then we tested a random forest and got
a satisfactory score of 0.84.
THIRD
Out of curiosity, we also applied Naïve
Bayes and SVM, but got low scores of
0.58 and 0.14 respectively.
FOURTH
Finally, we went ahead with deep
learning and got a score of 0.80, close
to the random forest.
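A hedged sketch of this five-way comparison, on synthetic continuous features. One addition beyond the deck: the SVM is wrapped with StandardScaler, since SVMs are sensitive to unscaled features, which may partly explain the low 0.14 score reported:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the cover-type features: continuous columns,
# labels 1..7 like the real Cover_Type.
rng = np.random.default_rng(0)
x = rng.normal(size=(300, 10))
y = rng.integers(1, 8, size=300)

x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

scores = {}
for name, clf in [
    ("decision tree", DecisionTreeClassifier(random_state=0)),
    ("random forest", RandomForestClassifier(random_state=0)),
    ("naive bayes", GaussianNB()),
    ("svm", make_pipeline(StandardScaler(), SVC())),  # scale before SVM
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                          random_state=0)),
]:
    clf.fit(x_tr, y_tr)
    scores[name] = clf.score(x_te, y_te)  # mean accuracy on held-out split
```

GaussianNB replaces the MultinomialNB used for the recipe problem because these features are continuous rather than binary counts.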
ALGORITHM 1
▫ Applying a decision tree to
our problem.
▫ We got a score of 0.60
ALGORITHM 2
▫ Applying a random forest to
our problem.
▫ We got a score of 0.84
ALGORITHM 3
▫ Applied Naïve Bayes and
SVM to it.
▫ But did not get fruitful
results, as the problem is not
well suited to a probabilistic
model.
ALGORITHM 4
▫ At last, applying deep
learning to our problem.
▫ We got a score of 0.80
DATA COMPARISON
ALGORITHM       SCORE
DECISION TREE   0.60
NAÏVE BAYES     0.58
RANDOM FOREST   0.84
SVM             0.14
DEEP LEARNING   0.80
0.84
Final score achieved
THANKS!
