Module 3

Data Wrangling
- It is the process of converting and mapping raw data and getting it ready for analysis.
- It involves moving or combining complex data sets to make them accessible and easier to analyse.
- Also known as data munging: cleaning, organizing and transforming raw data into the desired format for analysis, to be used for decision making.

Goals
1. Make raw data usable.
2. Combine data from various sources into a central location.
3. Put data into the required format and understand the business context of the data.
4. Automated integration tools are used to prepare the data set for analysis.
5. Clean data from noise and missing elements.
6. Help business users make timely decisions.
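The wrangling goals above (combine sources, clean missing values, convert to a required format) can be sketched in a few lines of plain Python. This is only an illustration: the two record lists, their field names and the `wrangle` helper are hypothetical and do not come from the notes.

```python
# Hypothetical records from two sources; the field names are made up for illustration.
sales = [
    {"id": 1, "amount": "120.5"},
    {"id": 2, "amount": ""},        # missing value
    {"id": 3, "amount": "80"},
]
customers = [
    {"id": 1, "city": " Pune "},
    {"id": 2, "city": "Delhi"},
    {"id": 3, "city": "MUMBAI"},
]

def wrangle(sales, customers):
    """Combine both sources on 'id', drop rows with a missing amount,
    and convert every field to one consistent format."""
    city_by_id = {c["id"]: c["city"].strip().title() for c in customers}
    cleaned = []
    for row in sales:
        if not row["amount"].strip():
            continue  # skip records whose amount is missing
        cleaned.append({
            "id": row["id"],
            "amount": float(row["amount"]),                 # text -> number
            "city": city_by_id.get(row["id"], "Unknown"),   # joined from second source
        })
    return cleaned

table = wrangle(sales, customers)
for record in table:
    print(record)
```

Dropping incomplete rows is only one possible cleaning policy; filling in a default value would be an equally valid choice depending on the business context.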
Signs of Spam
- Any mail containing Viagra references (including spelling changes).
- Length of the subject; a lot of exclamation marks or other punctuation.

Suggestions for spam identification
- Try a probabilistic model.
- k-NN, linear regression.
Why won't linear regression and k-NN work for filtering spam?

Write about linear regression for spam filtering:
1. Consider the dataset as a matrix, where each row corresponds to a different email.
2. Create a column for each word; here 'Viagra' is a column.
3. If an email contains the word Viagra, that column is filled with the value 1, else assign 0. Alternatively, one can put the number of times the word appears.
4. For linear regression we need a training set of emails where the emails have been labeled with an outcome variable, i.e. spam or not.
5. A human can rate emails as spam; this labeling task can be done by hand.
6. Then the regression model is built.
7. An email without a label is given to the model to predict the label.
8. The task is binary (0 for not spam, 1 for spam).
9. In linear regression the outcome is a number and a continuous value.
10. Choose a critical value: if the predicted value is above it, the output is '1'; if it is below, the output is '0'.
11. It does not work because there are too many variables, e.g. 10,000 emails with over 100,000 words.
12. Thus the model cannot be fitted, because the matrix is not invertible.
13. We could cut down the number of words, but linear regression is still not appropriate for a binary outcome.
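The word matrix and the critical-value rule described above can be sketched as follows. The example emails, the three-word vocabulary and the threshold of 0.5 are assumptions chosen only for illustration; no regression is actually fitted here.

```python
def build_matrix(emails, vocabulary):
    """One row per email, one column per word: 1 if the word is present, else 0."""
    rows = []
    for text in emails:
        words = set(text.lower().split())
        rows.append([1 if w in words else 0 for w in vocabulary])
    return rows

def to_label(predicted_value, threshold=0.5):
    """Turn a continuous regression output into a binary label (1 = spam)."""
    return 1 if predicted_value > threshold else 0

# Hypothetical data, not taken from the notes.
emails = ["cheap viagra now", "meeting at noon", "viagra viagra offer"]
vocabulary = ["viagra", "meeting", "offer"]

X = build_matrix(emails, vocabulary)
print(X)
print(to_label(0.8), to_label(0.2))
```

With a realistic vocabulary the matrix has far more columns than rows, which is exactly the situation in which the least-squares solution breaks down.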
Why does k-NN not work for spam filtering?

Write about k-NN:
1. Emails are represented as a matrix X, with one row per email and one column per word.
2. Matrix entries are either 0 or 1, depending on the presence of the word.
3. For k-NN, two emails are said to be near neighbours based on the words they both contain.
4. There are too many dimensions: with 100,000 words, computing distances in a 100,000-dimensional space takes a lot of computational work.
5. It suffers from the curse of dimensionality, which makes k-NN a poor algorithm here.
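A minimal k-NN sketch on binary word vectors makes the cost visible: every prediction compares the query against every training row, coordinate by coordinate. Hamming distance is used here as an assumed distance measure (the notes only say that neighbours are judged by the words two emails share), and the tiny data set is hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Number of word columns in which two binary email vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training emails closest to the query."""
    order = sorted(range(len(train_rows)),
                   key=lambda i: hamming(train_rows[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 3-word vocabulary; label 1 = spam, 0 = not spam.
train_rows = [[1, 0, 1], [1, 1, 1], [0, 1, 0], [0, 0, 0], [1, 0, 0]]
train_labels = [1, 1, 0, 0, 1]

print(knn_predict(train_rows, train_labels, [1, 0, 1]))
print(knn_predict(train_rows, train_labels, [0, 1, 0]))
```

Each call to `hamming` touches every column, so with 100,000 word columns and thousands of emails the work per prediction grows very quickly.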
Digit Recognition
- Represent each digit in a 16x16 pixel grid.
- Unwrap the 16x16 grid into a 256-dimensional space.
- Vectorize, apply k-NN, and tune k.
- Evaluate with accuracy and a confusion matrix.
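The "unwrap" step can be sketched as a simple row-by-row flattening. The blank grid with a single lit pixel is a made-up input used only to show the index mapping.

```python
def unwrap(grid):
    """Flatten a 16x16 pixel grid, row by row, into one 256-element vector."""
    return [pixel for row in grid for pixel in row]

# Hypothetical image: all background except one pixel at row 2, column 5.
grid = [[0] * 16 for _ in range(16)]
grid[2][5] = 1

vector = unwrap(grid)
print(len(vector), vector.index(1))
```

After this step each digit is a point in 256-dimensional space, so a distance-based method such as k-NN can be applied directly.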
Naive Bayes
- A classification method based on Bayes' law.

Example
- A rare disease, where 1% of the population is infected.
- 99% of sick patients test positive.
- 99% of healthy patients test negative.
- Given that a patient tests positive, what is the probability that the patient is actually sick?

Population: 10,000 people
- 1% actually sick: 100 people
    - 99% test +: 99 people
    - 1% test -: 1 person
- 99% healthy: 9,900 people
    - 1% test +: 99 people
    - 99% test -: 9,801 people

Of the 198 people who test positive, 99 are actually sick, so the answer here is 50%.
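Let x and y be events with probabilities $p(x)$ and $p(y)$.
$p(x, y)$ is the joint probability that both happen.
$p(x|y)$ is the conditional probability that x happens given that y has happened.

$p(x|y)\,p(y) = p(x, y) = p(y|x)\,p(x)$

Solve for $p(y|x)$, assuming $p(x) \neq 0$:

$p(y|x) = \frac{p(x|y)\,p(y)}{p(x)}$

- Let y refer to the event "I am sick", or "sick".
- Let x refer to the event "the test is positive", or "+".

$p(\text{sick}|+) = \frac{p(+|\text{sick})\,p(\text{sick})}{p(+)} = \frac{0.99 \times 0.01}{(0.99 \times 0.01) + (0.01 \times 0.99)} = 0.50 = 50\%$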
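The disease example can be checked numerically with Bayes' law. The function name `posterior` and its parameter names are illustrative choices; the numbers are the 1% infection rate and the 99% test accuracy from the example.

```python
def posterior(prior, p_pos_given_sick, p_neg_given_healthy):
    """p(sick | +) computed with Bayes' law."""
    true_pos = p_pos_given_sick * prior                       # p(+|sick) p(sick)
    false_pos = (1 - p_neg_given_healthy) * (1 - prior)       # p(+|healthy) p(healthy)
    return true_pos / (true_pos + false_pos)

p_sick_given_pos = posterior(prior=0.01,
                             p_pos_given_sick=0.99,
                             p_neg_given_healthy=0.99)
print(round(p_sick_given_pos, 2))
```

Even with a test that is right 99% of the time, a positive result only gives even odds of being sick, because healthy people vastly outnumber sick ones.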
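Naive Bayes

A spam filter for individual words using Naive Bayes
- The occurrence of a word adds to the probability that the email is spam.
- Consider only one word at a time.
- "Ham" indicates non-spam.

$P(\text{spam})$: probability of spam
$P(\text{ham})$: probability of non-spam, $P(\text{ham}) = 1 - P(\text{spam})$
$P(\text{word}|\text{spam})$: probability of the word in a spam email
$P(\text{word}|\text{ham})$: probability of the word in a ham email

Apply Bayes' law:

$P(\text{spam}|\text{word}) = \frac{P(\text{word}|\text{spam})\,P(\text{spam})}{P(\text{word})}$

$P(\text{word}) = P(\text{word}|\text{spam})\,P(\text{spam}) + P(\text{word}|\text{ham})\,P(\text{ham})$

$P(\text{spam}) = \frac{\text{No. of spam emails}}{\text{Total no. of emails}}$

$P(\text{ham}) = \frac{\text{No. of non-spam emails}}{\text{Total no. of emails}}$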
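Example: employee emails with 1500 spam and 3672 ham.
The word "meeting" appears 16 times in spam and 153 times in ham.

$P(\text{spam}) = \frac{1500}{1500 + 3672} = 0.29$

$P(\text{ham}) = 1 - P(\text{spam}) = 1 - 0.29 = 0.71$

$P(\text{meeting}|\text{spam}) = \frac{16}{1500} = 0.0106$

$P(\text{meeting}|\text{ham}) = \frac{153}{3672} = 0.0416$

$P(\text{meeting}) = P(\text{meeting}|\text{spam})\,P(\text{spam}) + P(\text{meeting}|\text{ham})\,P(\text{ham})$

$P(\text{spam}|\text{meeting}) = \frac{P(\text{meeting}|\text{spam})\,P(\text{spam})}{P(\text{meeting})} = \frac{0.0106 \times 0.29}{(0.0106 \times 0.29) + (0.0416 \times 0.71)} = 0.09$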
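The single-word calculation can be reproduced directly from the counts. This sketch assumes the counts implied by the worked example (1500 spam emails, 3672 ham emails, and the word "meeting" occurring in 16 spam and 153 ham emails); the variable names are illustrative.

```python
n_spam, n_ham = 1500, 3672            # number of spam and ham emails
meeting_spam, meeting_ham = 16, 153   # emails containing the word "meeting"

p_spam = n_spam / (n_spam + n_ham)
p_ham = 1 - p_spam
p_meeting_given_spam = meeting_spam / n_spam
p_meeting_given_ham = meeting_ham / n_ham

# Total probability of seeing the word, then Bayes' law.
p_meeting = p_meeting_given_spam * p_spam + p_meeting_given_ham * p_ham
p_spam_given_meeting = p_meeting_given_spam * p_spam / p_meeting

print(round(p_spam, 2), round(p_spam_given_meeting, 2))
```

An email containing "meeting" is therefore much less likely to be spam (about 9%) than an arbitrary email (about 29%).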
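A spam filter for combining words
1. Each email is represented by a binary vector x, whose j-th entry $x_j$ is 1 or 0 depending on the appearance of the j-th word.
2. Let x be the email vector and j the index of the j-th word.
3. Let c denote spam. $p(x|c)$ is the probability of seeing the email vector x given that the email is spam:

$p(x|c) = \prod_j \theta_{jc}^{x_j} (1 - \theta_{jc})^{(1 - x_j)}$

where $\theta_{jc}$ is the probability that an individual word is present in spam, i.e. the probability of the j-th word in spam.

4. Take the log on both sides to convert the product into a sum:

$\log p(x|c) = \sum_j \left[ x_j \log \theta_{jc} + (1 - x_j) \log(1 - \theta_{jc}) \right]$

$= \sum_j \left[ x_j \log \theta_{jc} - x_j \log(1 - \theta_{jc}) + \log(1 - \theta_{jc}) \right]$

$= \sum_j x_j \log\left(\frac{\theta_{jc}}{1 - \theta_{jc}}\right) + \sum_j \log(1 - \theta_{jc})$

$\log p(x|c) = \sum_j x_j w_j + w_0$

where $w_j = \log\left(\frac{\theta_{jc}}{1 - \theta_{jc}}\right)$ and $w_0 = \sum_j \log(1 - \theta_{jc})$.

- The weights $x_j$ vary for each mail and must be computed for each one.
- Compute $p(x|c)$, then estimate $p(c|x)$.
- It works well and is cheap to train with a pre-labeled dataset.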
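The log-linear form $\log p(x|c) = \sum_j x_j w_j + w_0$ can be checked against the direct product formula. The per-word probabilities in `theta` and the email vector `x` below are made-up values used only for the check.

```python
import math

def weights(theta):
    """Per-word weights w_j and the constant term w_0 for one class."""
    w = [math.log(t / (1 - t)) for t in theta]
    w0 = sum(math.log(1 - t) for t in theta)
    return w, w0

def log_likelihood(x, theta):
    """log p(x|c) written as a weighted sum of the binary word indicators."""
    w, w0 = weights(theta)
    return sum(xj * wj for xj, wj in zip(x, w)) + w0

def log_likelihood_direct(x, theta):
    """The same quantity computed straight from the product formula."""
    return sum(math.log(t if xj else 1 - t) for xj, t in zip(x, theta))

theta = [0.8, 0.1, 0.3]   # hypothetical probabilities of three words in spam
x = [1, 0, 1]             # email contains word 1 and word 3

print(log_likelihood(x, theta), log_likelihood_direct(x, theta))
```

The weights and the constant depend only on the words, so they can be computed once; only the indicators in `x` change from one email to the next.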
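Laplace Smoothing:
- $\theta_j$ is the probability of a given word in a spam email.
- Define $\theta_j$ as the ratio of $n_{jc}$ to $n_c$:

$\theta_j = \frac{n_{jc}}{n_c}$

where $n_{jc}$ = no. of times the word appears in spam emails and $n_c$ = no. of times the word appears in any email.

- Laplace smoothing refers to the idea of replacing $\theta_j$ with

$\theta_j = \frac{n_{jc} + \alpha}{n_c + \beta}$

with, for example, $\alpha = 1$, $\beta = 10$, to prevent getting a probability of 0 or 1.

Let D be the data set.

$\theta_{ML} = \arg\max_\theta\, p(D|\theta)$ (maximum likelihood estimate)

The Naive Bayes way of choosing $\theta_j$ for each j is to maximize

$\log\left(\theta_j^{n_{jc}} (1 - \theta_j)^{n_c - n_{jc}}\right)$

Take the derivative and set it to 0; then $\theta_j = \frac{n_{jc}}{n_c}$.

$\theta_{MAP} = \arg\max_\theta\, p(\theta|D)$ (maximum a posteriori estimate)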
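A short sketch of the smoothed estimate $(n_{jc} + \alpha)/(n_c + \beta)$ next to the raw ratio shows why smoothing matters. The counts are hypothetical, and the defaults $\alpha = 1$, $\beta = 10$ follow the example values in the notes.

```python
def raw_theta(n_jc, n_c):
    """Unsmoothed estimate: can be exactly 0 or 1."""
    return n_jc / n_c

def smoothed_theta(n_jc, n_c, alpha=1.0, beta=10.0):
    """Laplace-smoothed estimate: pulled away from the extremes 0 and 1."""
    return (n_jc + alpha) / (n_c + beta)

# Hypothetical word seen 20 times overall: never in spam, or always in spam.
never = (raw_theta(0, 20), smoothed_theta(0, 20))
always = (raw_theta(20, 20), smoothed_theta(20, 20))
print(never, always)
```

Without smoothing, a probability of exactly 0 or 1 makes $\log \theta_j$ or $\log(1 - \theta_j)$ undefined, so a single unseen word would break the whole log-likelihood sum.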
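Comparing Naive Bayes to k-NN
1. Hyperparameters: Naive Bayes has two hyperparameters; k-NN has only one hyperparameter, i.e. k, the number of nearest neighbours.
2. Linearity: Naive Bayes is a linear classifier; k-NN is a non-linear classifier.
3. Dimensionality: high dimensionality and large feature sets are a problem for k-NN; they are not a problem for Naive Bayes.
4. Training: Naive Bayes requires training; k-NN requires no training.
5. Both use labeled data, i.e. both are supervised learning methods.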
Scraping the Web: APIs and Other Tools
- Data scientists need data to ask questions, solve problems and do research.
- Scraping deals with extracting the data from websites.

Different ways to get data from the web:
1. Scraping the web with other parsing tools
2. API key
3. Extension

Other parsing tools
1. lynx and lynx --dump
2. Beautiful Soup (robust but slow)
3. Mechanize (does not parse JavaScript)
4. PostScript (image classification)

API key
- Provided to developers to download data in a specified format.
- Developers register and receive a key (like a password).
- APIs may have limits on access and download size.
- Access may be paid or free.
- Data can be in JSON or other standard formats.
- Yahoo's YQL can be used:

    select * from flickr.photos.search
    where text = "cat"
    and api_key = '<api key>'
    limit 10
Extension
- When APIs are not available, use an extension of Firefox, i.e. Firebug.
- Use "Inspect the element" on any webpage.
- HTML fields can be accessed and edited.
- After locating the stuff we need inside the HTML, use curl, wget, grep, awk, perl, etc. to write a script to fetch the data.
- The same can be done using Python or R.
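As a hedged illustration of doing the extraction step in Python, the sketch below pulls every link out of an HTML string with the standard library's `html.parser`. The `LinkCollector` class and the sample page are hypothetical; in practice the HTML would first be downloaded from the site, and a library such as Beautiful Soup could be used instead.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical page content standing in for a downloaded web page.
page = ('<html><body><a href="/a.html">A</a><p>text</p>'
        '<a href="https://example.com/b">B</a></body></html>')

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

The same pattern extends to other elements: once the tag and attribute of interest have been found with the browser's element inspector, the handler only needs to match them.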
Image Recognition
- Check whether an image is a landscape or a headshot.
- Collect data; ask someone to label it, or use Flickr.
- Represent each image as RGB numbers between 0 and 255.
- Draw 3 histograms, one for each colour.
- Decide how much blue there is to model landscape vs. headshot.
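The histogram idea can be sketched without any imaging library by treating an image as a list of (R, G, B) pixels. The four-bin histogram, the `blueness` ratio and the three sample pixels are assumptions for illustration; the notes only say to draw one histogram per colour and look at how much blue there is.

```python
def channel_histograms(pixels, bins=4):
    """Count pixels per intensity bin, separately for R, G and B (values 0 to 255)."""
    width = 256 // bins
    hist = {"R": [0] * bins, "G": [0] * bins, "B": [0] * bins}
    for r, g, b in pixels:
        hist["R"][min(r // width, bins - 1)] += 1
        hist["G"][min(g // width, bins - 1)] += 1
        hist["B"][min(b // width, bins - 1)] += 1
    return hist

def blueness(pixels):
    """Share of total intensity that comes from the blue channel."""
    total = sum(r + g + b for r, g, b in pixels)
    return sum(b for _, _, b in pixels) / total if total else 0.0

# Hypothetical image of three pixels, mostly sky-blue.
pixels = [(10, 20, 250), (30, 40, 200), (200, 50, 60)]

hist = channel_histograms(pixels)
print(hist)
print(round(blueness(pixels), 3))
```

The binned counts (or a summary such as the blue share) can then serve as features for a classifier that separates landscapes from headshots.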
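Naive Bayes for Article Classification
- Multiclass text classification: assign a section such as Arts, Business, Political, and so on to each article.
- Use the New York Times developer API to get the articles.
- Apply the Bernoulli model for word presence to classify.

Steps
1. Register and request an API key.
2. Download 2000 recent articles.
3. Save the articles of each section to a separate file in tab-delimited format: article title, article URL, and the body returned by the API.
4. Notation:
   - C: set of categories for classification, $c \in \{0, 1, 2, \ldots\}$; c is the category of an article.
   - X: sparse binary matrix, with $X_{ij} = 1$ indicating that article i has word j.
5. Train by counting words and documents within classes to estimate $\theta_{jc}$ and $\theta_c$ for each class c, where
   - $n_c$ = no. of documents of class c
   - $n_{jc}$ = no. of documents of class c having the j-th word
   - $\alpha$, $\beta$ = hyperparameters that smooth the estimates
6. Calculate the log odds for each class relative to the base class 0:

$\log \frac{p(y = c|x)}{p(y = 0|x)} = \sum_j w_{jc} x_j + w_{0c}$

where

$w_{jc} = \log \frac{\theta_{jc}(1 - \theta_{j0})}{\theta_{j0}(1 - \theta_{jc})}$

$w_{0c} = \sum_j \log \frac{1 - \theta_{jc}}{1 - \theta_{j0}} + \log \frac{\theta_c}{\theta_0}$

7. Process and evaluate:
   - Read the title and body of each article.
   - Remove unwanted punctuation characters.
   - Tokenize into words.
   - Filter stop words.
   - Estimate the model; pass in the inputs and output the posterior probabilities for each category.
   - Divide the data into a 50% train/test split.
   - Produce a confusion matrix.
   - Report the top 10 articles that are most difficult to classify.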
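The training and prediction steps can be sketched with a small multiclass Bernoulli Naive Bayes classifier in plain Python. Several things here are assumptions rather than part of the notes: the tiny stop-word list, the four example articles, and the smoothing form $(n_{jc} + \alpha)/(n_c + \beta)$ with $\alpha = 1$, $\beta = 2$. Instead of log odds against a base class, the sketch picks the class with the highest log posterior, which ranks the classes in the same way.

```python
import math
import re

STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is"}  # tiny illustrative list

def tokenize(text):
    """Lowercase, strip punctuation, split into words and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def train(docs, labels, alpha=1.0, beta=2.0):
    """Estimate the class prior and the smoothed per-word probabilities for each class."""
    vocab = sorted(set().union(*(tokenize(d) for d in docs)))
    model = {}
    for c in sorted(set(labels)):
        class_docs = [tokenize(d) for d, y in zip(docs, labels) if y == c]
        n_c = len(class_docs)
        theta = {w: (sum(w in d for d in class_docs) + alpha) / (n_c + beta)
                 for w in vocab}
        model[c] = (n_c / len(docs), theta)
    return vocab, model

def predict(text, vocab, model):
    """Return the class with the highest log posterior under the Bernoulli model."""
    words = tokenize(text)
    scores = {}
    for c, (prior, theta) in model.items():
        score = math.log(prior)
        for w in vocab:
            score += math.log(theta[w]) if w in words else math.log(1 - theta[w])
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical articles standing in for downloaded news text.
docs = ["The painter opened a gallery exhibition",
        "Stocks and markets rallied",
        "Gallery shows a new sculpture",
        "Markets fell as stocks dropped"]
labels = ["Arts", "Business", "Arts", "Business"]

vocab, model = train(docs, labels)
print(predict("New gallery exhibition!", vocab, model))
print(predict("Stocks dropped", vocab, model))
```

On real data the documents would be split into training and test halves before calling `train`, and the predictions on the test half would be tabulated into a confusion matrix.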


Data Science module 1 statistics of data science
