Data Analytics
(KCS-05)
Unit 1
Introduction to Data Analytics
Syllabus
Introduction to data analytics: sources and nature of data, classification of data (structured, semi-structured, unstructured), characteristics of data, introduction to big data platform, need of data analytics, evolution of analytic scalability, analytic process and tools, analysis vs reporting, modern data analytic tools, applications of data analytics.
Data analytics lifecycle: need, key roles for successful analytic projects, various phases of the data analytics lifecycle: discovery, data preparation, model planning, model building, communicating results, operationalization.
Introduction to Data Analytics
Sources and nature of data
Data
Data is a set of facts or figures obtained from experiments or surveys, used as a basis for making calculations or drawing conclusions. Examples are text, images and sound in a form that is suitable for storage in or processing by a computer.
Sources of data
There are two types of sources of data available:
1. Primary source of data
2. Secondary source of data
Primary source of data
Data which is raw, original and extracted directly from the official sources is known as primary data.
This type of data is collected directly by performing techniques such as questionnaires, interviews and surveys.
The data collected must be according to the demand and requirements of the target audience on which the analysis is performed; otherwise it becomes a burden in data processing.
Methods for collecting primary data
1. Interview method
2. Survey method
3. Observation method
4. Experimental method
Experimental method: The experimental method is the process of collecting data through performing experiments, research and investigation.
The most frequently used experimental methods are CRD, RBD, LSD and FD.
CRD: Completely Randomized Design
CRD is a simple experimental design used in data analytics which is based on randomization and replication. It is mostly used for comparing experiments.
RBD: Randomized Block Design
RBD is an experimental design in which the experiment is divided into small units called blocks. Random experiments are performed on each of the blocks and results are drawn using a technique known as Analysis of Variance (ANOVA). RBD originated in the agriculture sector.
LSD: Latin Square Design
LSD is an experimental design that is similar to CRD and RBD but contains rows and columns. It is an arrangement of N x N squares with an equal number of rows and columns, which contain letters that occur only once in a row. Hence the differences can be found easily, with fewer errors in the experiments. A Sudoku puzzle is an example of a Latin square design.
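The Latin square idea can be illustrated with a short sketch. It uses a simple cyclic construction (each row is the previous one rotated by one position), which is only one of many ways to build such a square; the function names and treatment labels are hypothetical and not part of the notes. The checker also verifies columns, as in the usual definition of a Latin square.

```python
def latin_square(symbols):
    """Build a cyclic Latin square: row i is the symbol list rotated left by i."""
    n = len(symbols)
    return [[symbols[(i + j) % n] for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """Check that every symbol appears exactly once in each row and each column."""
    n = len(square)
    expected = set(square[0])
    if len(expected) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == expected for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == expected for j in range(n))
    return rows_ok and cols_ok


# Hypothetical treatments A-D laid out in a 4 x 4 design.
square = latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```

Each printed row and column contains every treatment exactly once, which is the property the design relies on.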
FD: Factorial Design
FD is an experimental design where each experiment has two factors, each with possible values, and on performing trials the other combinational factors are derived.
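As a minimal sketch of the factorial idea, the runs of a full factorial design can be enumerated as every combination of factor levels. The factor names and levels below are hypothetical examples, and `full_factorial` is an illustrative helper, not a function from any statistics library.

```python
from itertools import product


def full_factorial(factors):
    """Return one dict per experimental run, covering every combination of levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# Hypothetical two-factor experiment: 2 levels x 3 levels = 6 runs.
runs = full_factorial({
    "temperature": ["low", "high"],
    "fertilizer": ["A", "B", "C"],
})
for run in runs:
    print(run)
```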
Secondary source of data
Secondary data is data which has already been collected and is reused for some valid purpose.
This type of data is previously recorded from primary data, and it has two types of sources, named internal source and external source.
Internal source
These types of data can easily be found within the organization, such as market records, sales records, transactions, customer data, accounting resources, etc. The cost and time consumption is less in obtaining internal sources.
External source
Data which cannot be found within the organization and can be gained through external third-party resources is external source data.
The cost and time consumption is more because this contains a huge amount of data.
Examples: government publications, news publications, and other non-governmental publications.
Other sources of data
Sensor data: With the advancement of IoT devices, the sensors of these devices collect data which can be used for sensor data analytics to track the performance and usage of products.
Satellite data: Satellites collect a lot of images and data, in terabytes, on a daily basis through surveillance cameras, which can be used to collect useful information.
Web traffic: Due to fast and cheap internet facilities, many formats of data uploaded by users on different platforms can be predicted and collected, with their permission, for data analysis. Search engines also provide their data through the keywords and queries searched most.
[Figure: Classification of data collection sources. Primary data: survey, interview. Secondary data: internal data (organization), external data (government).]
Nature of data
The nature of data is classified into four categories:
1. Nominal data
2. Ordinal data
3. Interval data
4. Ratio data
Nominal data
A nominal scale is used for assigning numbers as the identification of individual units. For example, a classification of journals according to the discipline they belong to may be considered nominal data. If numbers are assigned to describe the categories, the numbers represent the name of the category.
Ordinal data
It indicates the ordered or graded relationship among the numbers assigned to the observations made. These numbers connote ranks of different categories having a relationship in a definite order.
For example, to study the responsiveness of library staff, a researcher may assign '1' to indicate poor, '2' to indicate average, '3' to indicate good and '4' to indicate excellent. The numbers 1, 2, 3, 4 in this case are a set of ordinal data.
Ordinal data show the direction of the difference and not the exact amount of the difference.
Interval data
Interval data are ordered categories of data in which the differences between the various categories are of equal measurement.
For example, we can measure the IQ of 70 children. After assigning a numerical value to the IQ of each child, the data can be grouped with the intervals 0 to 10, 10 to 20 and so on.
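Grouping values into equal-width intervals, as in the example above, can be sketched as follows. The scores are made-up sample values, and the convention that a boundary value such as 10 falls into the upper interval ("10-20") is an assumption of this sketch, since the notes do not say how boundaries are handled.

```python
from collections import Counter


def interval_label(value, width=10):
    """Return the label of the equal-width interval that contains value."""
    lower = (value // width) * width
    return f"{lower}-{lower + width}"


def group_by_interval(values, width=10):
    """Count how many values fall into each interval."""
    return Counter(interval_label(v, width) for v in values)


scores = [4, 9, 12, 18, 15, 27]  # hypothetical measurements
counts = group_by_interval(scores)
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    print(label, counts[label])
```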
Ratio data
Ratio data are the quantitative measurement of a variable in terms of magnitude. In ratio data, we can say that one thing is twice or thrice another.
For example, measurements involving weight, distance, price, etc.
Classification of data
Data is the information stored in a particular form. It is classified into three forms:
1. Structured form
2. Unstructured form
3. Semi-structured form
Structured form: Any form of relational database structure where a relation between attributes is possible, that is, there exists a relation between rows and columns in the database with a table structure. E.g., databases accessed using database programming languages (SQL, Oracle, MySQL).
Unstructured form: Any form of data that does not have a predefined structure is represented as the unstructured form of data. E.g., video, images, comments, posts, and a few websites such as blogs and Wikipedia.
Semi-structured data: Data that does not have a formal tabular form like an RDBMS, but for which predefined organized formats are available. E.g., CSV, XML, JSON, text files with a tab separator, etc.
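To make the difference between semi-structured and structured data concrete, here is a small sketch that turns JSON records with differing fields into a table with a fixed set of columns. The sample records and the helper name `to_rows` are hypothetical; only the standard-library `json` module is used.

```python
import json

# Hypothetical semi-structured input: the records do not share the same fields.
raw = (
    '[{"id": 1, "name": "Asha", "email": "asha@example.com"},'
    ' {"id": 2, "name": "Ravi", "tags": ["new"]}]'
)


def to_rows(json_text):
    """Flatten JSON records into (columns, rows); missing fields become None."""
    records = json.loads(json_text)
    columns = sorted({key for record in records for key in record})
    rows = [[record.get(col) for col in columns] for record in records]
    return columns, rows


columns, rows = to_rows(raw)
print(columns)
for row in rows:
    print(row)
```

The gaps filled with None show why such data does not fit a relational table directly.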
Characteristics of data
- Accuracy
- Completeness
- Reliability
- Relevance
- Timeliness
Accuracy: Data accuracy refers to error-free records that can be used as a reliable source of information.
Completeness: Data completeness refers to the comprehensiveness or wholeness of the data. There should be no gaps or missing information for data to be truly complete.
Reliability: Data reliability means that data is complete and accurate, and it is a crucial foundation for building data trust across the organization.
Relevance: Data relevance assesses whether the information can serve its purpose in a particular context.
Timeliness: Data timeliness refers to how up to date the information is.
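Completeness is the easiest of these characteristics to quantify. The sketch below computes the share of required fields that are actually filled in; treating None and the empty string as "missing" is an assumption of this example, and the records are invented.

```python
def completeness(records, fields):
    """Fraction of required field values that are present (not None or empty)."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0
    filled = sum(
        1
        for record in records
        for field in fields
        if record.get(field) not in (None, "")
    )
    return filled / total


# Hypothetical customer records with gaps.
customers = [
    {"name": "A", "age": 30},
    {"name": "", "age": None},
    {"name": "C"},
]
print(completeness(customers, ["name", "age"]))
```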
Introduction to Big Data Platform
Big data
Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software.
Types of big data
- Structured
- Unstructured
- Semi-structured
Characteristics of big data
The 5 characteristics of big data are as follows:
1. Volume
2. Variety
3. Veracity
4. Value
5. Velocity
Volume
- The name 'big data' itself is related to a size which is enormous.
- Volume is a huge amount of data.
- To determine the value of data, the size of the data plays a very crucial role. If the volume of data is very large, then it is actually considered 'big data'. This means that whether particular data can actually be considered big data or not depends upon the volume of data.
- Example: In the year 2016, the estimated global mobile traffic was 6.2 exabytes (6.2 billion GB) per month. It was also projected that by 2020 there would be almost 40,000 exabytes of data.
Velocity
- Velocity refers to the high speed of accumulation of data.
- In big data, data flows in at high velocity from sources like machines, networks, social media, mobile phones, etc.
- There is a massive and continuous flow of data. This determines the potential of data: how fast the data is generated and processed to meet the demands.
- Sampling data can help in dealing with the issue of velocity.
- Example: More than 3.5 billion searches per day are made on Google. Also, Facebook users are increasing by approximately 22% year by year.
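The notes mention sampling as a way to cope with velocity but do not name a technique. One common choice for streams of unknown length is reservoir sampling (Algorithm R), sketched here as an illustration; the function name, the `seed` parameter and the simulated event stream are assumptions of this example.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / (index + 1).
            j = rng.randint(0, index)
            if j < k:
                sample[j] = item
    return sample


# Simulated stream of 10,000 events; only 5 are kept in memory.
events = (f"event-{i}" for i in range(10_000))
print(reservoir_sample(events, k=5, seed=42))
```

The stream is read once and memory use stays fixed at k items, which is what makes the approach suitable for fast, continuous data.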
Variety
- It refers to the nature of data, which may be structured, semi-structured or unstructured.
- It also refers to heterogeneous sources.
- Variety is basically the arrival of data from new sources that are both inside and outside of an enterprise. It can be structured, semi-structured or unstructured.
Veracity (truthfulness)
- It refers to inconsistencies and uncertainty in data: the data which is available can sometimes get messy, and quality and accuracy are difficult to control.
- Big data is also variable because of the multitude of data dimensions resulting from multiple disparate data types and sources.
- Example: Data in bulk could create confusion, whereas a smaller amount of data could convey half or incomplete information.
Value
- A bulk of data having no value is of no good to the company unless you turn it into something useful.
- Data in itself is of no use or importance; it needs to be converted into something valuable to extract information.
Advantages of big data
- Opportunities to make better decisions
- Increasing productivity and efficiency
- Reducing costs
- Improving customer service and customer experience
- Fraud and anomaly detection
- Greater agility and speed to market
Disadvantages of big data
- Questionable data quality
- Heightened security risks
- Compliance headaches
- Cost and infrastructure issues
- Big data skills shortage
Data Analytics
Data analytics is the science of analyzing raw data in order to make conclusions about that information. This information can be used to optimize processes to increase the overall efficiency of a business or system.
[Figure: Data analytics cycle, including the steps collect data, analyse data and report.]
Types of analytics
1. Descriptive analytics
2. Predictive analytics
3. Prescriptive analytics
4. Diagnostic analytics
Descriptive analytics: In descriptive analytics, the result is always associated with the probability among 'n' number of options, where each option has an equal chance of probability.
E.g., observation, case study, survey.
Predictive analytics: This type of analytics deals with using past data to make predictions and decisions based on certain algorithms. In the case of a doctor, the doctor questions the patient about the past in order to treat the illness through already existing procedures.
E.g., healthcare, sports, weather, insurance, social media analysis.
Prescriptive analytics: Prescriptive analytics works with predictive analytics, which uses data to determine near-term outcomes. Prescriptive analytics makes use of machine learning to help businesses decide a course of action based on a computer program's predictions.
E.g., healthcare, banking.
Diagnostic analytics: This focuses more on why something happened. It involves more diverse data inputs and a bit of hypothesizing.
Need of data analytics
- Gather hidden insights
- Generate reports
- Perform market analysis
- Improve business requirements
Gather hidden insights: Hidden insights from data are gathered and then analyzed with respect to business requirements.
Generate reports: Reports are generated from the data and are passed on to the respective teams and individuals to deal with further actions for a high rise in business.
Perform market analysis: Market analysis can be performed to understand the strengths and the weaknesses of competitors.
Improve business requirements: Analysis of data allows a business to improve with respect to customer requirements and experience.
Evolution of Analytic Scalability
Scalability: The ability of a system to handle an increasing amount of work required to perform its task.
The increase in data storage ability has grown in recent years to meet the need of big data.
Traditional Analytic Architecture
We had to pull all data together into a separate analytics environment to do analysis.
[Figure: Data from databases 1 to 4 is pulled into an analytic server. The heavy processing occurs in the analytic environment.]
Modern In-Database Architecture
The processing stays in the database where the data has been consolidated.
[Figure: Databases 1 to 4 are consolidated into an enterprise data warehouse; the analytic server only submits the request.]
Massively Parallel Processing (MPP)
An MPP database breaks the data into independent chunks, each with independent disk and CPU.
[Figure: A single overloaded server compared with multiple lightly loaded servers in a shared-nothing arrangement. A one-terabyte table is split into ten 100-gigabyte chunks. A traditional database queries the one-terabyte table one row at a time, whereas an MPP system runs 10 simultaneous 100-gigabyte queries.]
- MPP systems allow the different sets of CPU and disk to run the process concurrently.
[Figure: An MPP system breaks the job into pieces, in contrast to a single-threaded process.]
- MPP systems build in redundancy to make recovery easy.
- MPP systems have resource management tools that manage the CPU and disk space.
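The chunking idea behind MPP can be sketched in a few lines: split a table into independent chunks, aggregate each chunk separately, then combine the partial results. This is only an illustration of the divide-and-combine structure; it uses threads from the standard library, which in CPython do not give true CPU parallelism, whereas a real MPP database runs each chunk on its own disk and CPU. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n_chunks):
    """Split rows into at most n_chunks contiguous, independent chunks."""
    if not rows:
        return []
    size = -(-len(rows) // n_chunks)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parallel_sum(rows, n_chunks=10):
    """Sum each chunk on its own worker, then combine the partial sums."""
    chunks = split_into_chunks(rows, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


table = list(range(1_000))  # stand-in for a large table column
print(parallel_sum(table, n_chunks=10))
```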
Cloud Computing
McKinsey and Company paper from 2009:
- Mask the underlying infrastructure from the user
- Be elastic to scale on demand
- On a pay-per-use basis
National Institute of Standards and Technology (NIST):
- On-demand self-service
- Resource pooling
- Rapid elasticity
Two types of cloud environment
1. Public cloud
- The services and infrastructure are provided off-site over the internet.
- Greatest level of efficiency in shared resources.
- Less secure and more vulnerable than private clouds.
2. Private cloud
- Infrastructure operated solely for a single organization.
- The same features as a public cloud.
- Offers the greatest level of security and control.
- Necessary to purchase and own the entire cloud infrastructure.
Grid Computing
The federation of computer resources to reach a common goal.
MapReduce
A parallel programming framework that provides:
- Parallelization
- Fault tolerance
- Data distribution
- Load balancing
Map function
Processes a key/value pair to generate a set of intermediate key/value pairs.
Reduce function
Merges all intermediate values associated with the same intermediate key.
How MapReduce works
Let's assume there are 20 terabytes of data and 20 MapReduce server nodes for a project.
1. Distribute a terabyte to each of the 20 nodes using a simple file copy process.
2. Submit two programs (Map, Reduce) to the scheduler.
3. The map program finds the data on disk and executes the logic it contains.
4. The results of the map step are then passed to the reduce process to summarize and aggregate the final answers.
[Figure: MapReduce flow showing the map function, the scheduler and the results.]
Data Analybi precess is a
þreCa Cellseting ransorming ,
cleaning qnd
modelling data uwithhe goal dicoveing he Aemuire irrmalien
The Lsuutu so
tbtained axe tommuniccatd,suggesüng conusiens and
duppe tiy cde cisien ma
king.
ata Analyis precsss Consub f the follousint Þhases that are i rati
Dath Reguiumunt sheaicalien
Data coUadien
Sata broci
ato clari
ata Analyiu
ommunication
2ata Reqüge me r
Spedtcation
Data Colleetüon
Data prcusing
Data elaaning
atAnalysiu
Commundcalion
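The phases above can be pictured as a chain of small functions, each feeding the next. The sketch below is a toy example under stated assumptions: the survey responses are invented, and the rule that only purely numeric strings count as valid is an arbitrary cleaning rule chosen for illustration.

```python
from statistics import mean


def collect():
    """Data collection: hypothetical raw survey responses (ages as text)."""
    return ["23", " 31", "n/a", "27", "", "31"]


def process(raw):
    """Data processing: normalize the raw text values."""
    return [value.strip() for value in raw]


def clean(values):
    """Data cleaning: drop entries that are not valid whole numbers."""
    return [int(v) for v in values if v.isdigit()]


def analyze(numbers):
    """Data analysis: compute simple summary statistics."""
    return {"count": len(numbers), "mean": mean(numbers), "max": max(numbers)}


def communicate(summary):
    """Communication: turn the summary into a readable message."""
    return (
        f"{summary['count']} valid responses, "
        f"mean {summary['mean']:.1f}, max {summary['max']}"
    )


report = communicate(analyze(clean(process(collect()))))
print(report)
```

In practice the phases are iterative, so a finding in the analysis step often sends the analyst back to collection or cleaning.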
Modern data analytic tools
1. Spreadsheets
- Microsoft Excel
- Spreadsheets
2. Databases
- Relational
- Columnar
- Document
- Graph
3. Programming languages
- R and Python
4. Self-service data visualization
- Tableau
- Power BI
- AWS QuickSight
5. Big data tools
- Hadoop
- Data lakes
- Apache Spark
6. Cloud
- AWS
- Google Cloud
- Microsoft Azure
Analysis vs Reporting
Parameter   Reporting                               Analytics
Purpose     Focuses on what is happening            Explains why it is happening
Tasks       Organizing, formatting, summarizing     Questioning, interpreting, exploring
Results     Results are pushed to users for review  Results are pulled to answer questions
Value       Translates data into information        Gives recommendations to drive actions
Applications of data analytics
1. City planning
2. Transportation
3. Fraud and risk detection
4. Manage risks
5. Proper spending
6. Customer interactions
7. Security
8. Healthcare
9. Travel
10. Energy management
11. Internet/web search
12. Digital advertisement
Data Analytics Lifecycle
Phases of the data analytics lifecycle
The data analytics lifecycle is designed for big data problems and data science projects. The cycle is iterative to represent real projects.
The phases involved in the data analytics lifecycle are:
1. Discovery
2. Data preparation
3. Model planning
4. Model building
5. Communicate results
6. Operationalize
Discovery
- The data science team learns about and investigates the problem.
- Develops context and understanding.
- Comes to know about the data sources needed and available for the project.
- The team formulates initial hypotheses that can later be tested with data.
Data preparation
- Steps to explore, preprocess and condition data prior to modeling and analysis.
- It requires the presence of an analytic sandbox; the team executes extract, load and transform to get data into the sandbox.
- Data preparation tasks are likely to be performed multiple times and not in a predefined order.
- Several tools commonly used for this phase are Hadoop, Alpine Miner, OpenRefine, etc.
Model planning
- The team explores data to learn about the relationships between variables and subsequently selects key variables and the most suitable models.
- Several tools commonly used for this phase are Matlab and STATISTICA.
Model building
- The team develops datasets for testing, training and production purposes.
- The team builds and executes models based on the work done in the model planning phase.
- The team also considers whether its existing tools will suffice for running the models or whether it needs a more robust environment for executing models.
- Free or open-source tools: R and PL/R, Octave, WEKA.
- Commercial tools: Matlab, STATISTICA.
Communicate results
- After executing a model, the team needs to compare the outcomes of modeling to the criteria established for success and failure.
- The team considers how best to articulate findings and outcomes to the team members and stakeholders, taking into account caveats and assumptions.
- The team should identify key findings, quantify business value, and develop a narrative to summarize and convey findings to stakeholders.
Operationalize
- The team communicates the benefits of the project more broadly and sets up a pilot project to deploy the work in a controlled way before broadening the work to a full enterprise of users.
- This approach enables the team to learn about the performance and related constraints of the model in a production environment on a small scale, and to make adjustments before full deployment.
- The team delivers final reports, briefings and code.
- Free or open-source tools: Octave, WEKA, SQL, MADlib.
[Figure: Data analytics lifecycle shown as a cycle: discovery, data preparation, model planning, model building, communicate results, operationalize.]
Key roles for a successful analytic project
1. Business user
The business user is the one who understands the main area of the project and also basically benefits from the results.
This user gives advice to and consults the team working on the project about the value of the results obtained and how the operations on the outputs are done.
The business manager, line manager, or deep subject matter expert in the project domain fulfills this role.
2. Project sponsor
The project sponsor is the one who is responsible for initiating the project. The project sponsor provides the actual requirements for the project and presents the basic business issue.
This person generally provides the funds and measures the degree of value from the final output of the team working on the project.
This person introduces the prime concern and shapes the desired output.
3. Project manager
This person ensures that the key milestones and purpose of the project are met on time and at the expected quality.
4. Business intelligence analyst
The business intelligence analyst provides business domain expertise based on a detailed and deep understanding of the data, key performance indicators (KPIs), key metrics, and business intelligence from a reporting point of view.
This person generally creates dashboards and reports and knows about the data feeds and sources.
5. Database Administrator (DBA)
The DBA facilitates and arranges the database environment to support the analytics needs of the team working on a project.
The responsibilities may include providing permission to key databases or tables and making sure that the appropriate security levels are in place for the data repositories.
6. Data engineer
The data engineer has deep technical skills to assist with tuning SQL queries for data management and data extraction, and provides support for data intake into the analytic sandbox.
The data engineer works jointly with the data scientist to help build data in the correct ways for analysis.
7. Data scientist
The data scientist provides subject matter expertise for analytical techniques, data modelling, and applying the correct analytical techniques to a given business issue.
This person ensures that the overall analytical objectives are met.
Data scientists outline and apply analytical methods and approaches to the data available for the concerned project.

DA unit1.pdf notes provide from my side.

  • 1. Dota snalytes (Kes-05) Unit Sntroduction te ata smalytes Sylia bus Srlbooduction to data analyies; ouwcs and maun qdata,classiticalien dala (stutuned, semisuetued, unsouctuund), chaacinislie deta gntrcductionto ba data platjorm, Nerd d daita analjtic Evolition. oanalytiz scalabilty, cralytie pretas and torls, analysis uporing, merdexndata analytie toes, applucalien dala amalytics. 0 Data Analylis iecyeh: Nead, key volas for ucal analyt project,varios phau efdath analytes like cyes- discovt data prepaxation, model þlanning, model buibng, Commuritaio iault, ohexalionalizalion . Introduc tiont dato Analylics. OwNCEA and rnatiu_ data Data a t a isa sact o fiqun ehtaind em eapeimana Suw a basus ormaking calculatiens drauimg s8ked enetsions Sor examplu,ttxt, imaguu and ssund in a seim that is sitnb er sloraga in er brocusing by a tombuu
  • 2. SowceA o dato The au two types of douN CA of dato avaulakol Jimau sowu o dala 2 etondauy soce dato Timay sounce dato he data which i raw, oriainal and exlra ttl diretly em he dl o ces is known as prim aruy dlala. This type data is diuctly cetlhehd by þerformingtichniquas inh unslion arus, intxviees and suveys *ata Collectd must be a ccerdingto the dema nd and AequvamonEs th taxget audisnts on which analysis i befemed otheruist it Lsul be a bwden in he dlato protsstng Metheols far collecing primany dala Tntevies method Swvey me-thod ObseH valion method Ecbexi mental method Er peimenta) method: 7he expeximental methood hu pro qcolluetig data through pexerming e«pestment, ressavch andinvestgation. The most qunty uasd expeimed metheols ara (CRD, RBD ,LSD, FD. CRD Cempltly Rondemized lesgn CRD Js a simpl epeximental design used in data analytiet which u besed on randemizolion andvepbcalion.4l is mestly tucd or Cembau he expbeimeris.
  • 3. ReD-Randlomized Block Besign RBDi an excpedmetal deslgm in sohich the expeximehb is diidd inte émall unit calles blocka. Kandom expeiment as pexfermd m each ohu blotks and nsuls are dauen usimg a tichnique kmBn as fralysis o Variante (ANoVA). RBD is oviginatad m the agriutu LSD- Latin Square Vesign LSD s an ecperd mental dastgnhat s similax te CRD and RBD blat but lenlains cws and coumns. 9t is an auxongement b XN OT wh an equal amount Tus and celumns tokich Cortains utte ha OccLus ony onuîin aTow, Hene+Re diwuncu Con br easi ouna Lsthewer evu in he expeimeñts. Sud o buzzk is an ercoml atm sguaxe design . Factuial Besign-£D rOs On ecpeximental design uwhou eoch expeimend has tüe dacters each with possibl values and en pexfozming traw othex Combinatisna acts aa duived. e Condauy sSowee odala bece nday data is he daia hhich has already besn telluatd and ud again er some valid buxpeas Tho type s data is previously seCordas om primary data and il as tüve typer sourcu mamed inttnal eune and exunalescuwru. Thuna soun te These ybes daa can éauiy beJeund within 09onizolon zth as maxkd u toyd, sales netord,Torsacton Customu dat, acteum sCw. ie.Tre etest and tme consumpi o i s obtaining nto
  • 4. Eblivnal ahowte Thi data Lokich Con ' be found at inloral ovgonizalion. and tan ba oough eztövina) +yd baity JesOLUKCLA is esdinal s6eunu o daf huge TK cost and time Censumbtton iv mose becauashis Contauns q huge ambumd vt data . Ecambles- Goveunmenl þublicotiont,NeuspubUcations , yrolicatt k and other Meqevemmental pubicaluons othe sorces data ensor data : with th advonoment otOT devicu,R åsnko1A 0 hese dvtes Collct datä which can hr uard dur srsor data aalyiui track th þexfermantr and usage o product. satoll data: Satllites erllret a lol images and dalä tmtonb n odaily basis th7oush swveillan ta Comea ohich ton be usd fo callet usnul indormaticn eb tsaic&u to fast and cheah interned ai Gti a many omau dato okich Lan be uploado.d by uauu on dunt plotmCgn b bvedictad and Collect with hi þexmiasion for data analysis."TK w6each engines alse brevid +hee data thrbugh keyuoorda.Qndquit Seathed mosty. analyais, Ths ata to llo ction `ourtes imov dota Se tondaTy dala Suwey Intunaldoto Cxlorna data i nNeuvieu OrQanization CnovernmU clossieation dala ceWelien Aouma
  • 5. Noture odlalo The natuns of data is clasitis int feur Catigerias. Nomi nal dala Drdinal dato Snterval data Ratio data Noninal data n a l edcale is ustdorauigning mumbvs as the identialien of ndividuual unt. fer exampu, A clasikatione iownals atwrdny to*ha dusibne hy balong may he considaadas mOminal olata. 3 rumbucs a assignad to dlsibe +Ba cailagoris,h numbtrs ipreatnd he name h e o Calegory. Orvdinal data t indicatis He ordered or 9sade rrlaionshp among he num berA @Asigna to tha cbservation made. THese rumbs eannot Tonks ant talegerus hawng a ulatonship in a diinithorde. 0 ker exambl. ë study the respensivenu ot a Ubrasy statt a esa chor may assign J'to ind'cat ÞooY, '2'tndicat.aHaqk,'3' toindiala Gcod ond 'y'to indicate enlent Te numbuus ,2 3, tn his Cau au át o Ordinal data The 0Td'nal data sshoo the divecton othe dun and netthe exa t anmeunt diount. Intval data ntva datà au ordoud Categeries o data and s duntts beuun vasLus Cattqris gre o eual measuTemenc foreampu,hse Can measure hu T9 70 ch ldren,tl4 Onna numeudtvau to he g oeatch chi ld,ha olata Can be 2T0up with tta inttval 0-to l6, lo-to 20 and As m.
  • 6. Ratie data Katto data an +e quantatie measwume nt o a vauuabe in y ms o tuie thrie. magnitude .Tn ratie data, we Can bay hat oneHing istuies 6 th. ma anethe Foreamblu, meastoents measiment tnvolving ueighd, di bnca,biu tk, Clateatien data t tn inoTmatien stored _in a batieulas fth uhressnTad a m data. The data ae clasitid inte thret germs dota ötwrt om Unstructind orm demi tuctunad gorm IC d koTm : Any term erelationa! database abuctiun Lohe ulatin bitinn altibuhu is bossibla. That h r ewisb a xulation atwean eUN and column im the dalabase oth atable teucturr Eg.ivg database progiammang arg uage ql Oracle, mysqe Unsbuclaed form: Any ferm t dlata that des mot have þredetned stuclss is epresenta as unstueliura form ot olata. g vfdee,imag, Cormmentu pest few wrbsites such as blegs ond wiki pedia Seminstwotned data : ta eu mel hae erm tabelar dab imilas KDBMS. sedefnedogonize8rmaisavaiabe. Eg: c xmk jstn, t«t filh tth tab keberator ete.
  • 7. Chaxactvisties a t >Accuracy Comble lines ReGabi ty Reavanu Time ine CCLUNacy > Data acwnacy e t eOT XtCovds that can uAeda a seuliabe dcueee inomalinn comptbneds Data ComblutincasTelers h the comprehs ns we nas ov Lahaisca oheleness e dala.Thau sheuldbiregapi 0v missing ingemati data t b truly complst heiabiGy Data uliabity means that data is compati and accuurato and it is a Czucial doundatüonr bilding data trusr a c e s s tUhe Opnization. Kelevance Data velevan ta assesies oheathen the ingormation can dátrve i purpese tnapavticulas Ccoteszts. Timeliness Rata timelUness xsles to ths hou u to daa tha incrmation is . Introduction te_Bi9_ate Platform. Bidata oig data is a iudthat aat ways Ta analyze , dystmatical exract fndevmaion em, 0r OHevusui daals uith se data adh tat ae tBe lasge Or Cormplae To be daal uwih y tadt linal data pvOCLABing -appication satiar
  • 8. es big det stuluud Unstuustiwad demi. suetud Chana ctivisties bi daa The a 5 thavac uisties of big data ae asfeleus Volume Vaxie ty Vena cty Value Ve toby Volume *The name 'bjq data' tses i relatadto a Aize Lohich ts eney meus *Yolume is a huge gmourd olota. To datmine th vau evalus sijx data playsa vey tualae ive luma o data is vey lage than it is actiallcensidesd as abi data' This means uoheather a þarücular data can be actitally bt consideud as a bigdata oT Tot, is dsberolun tsen volume o data *Examp Jn he yar 2016,+Ae eshmatnd alobal mebile t oas 62 enabye62 biUion GB) þev menth. Alse by 2020 we wiU have almost y0, oo Exabyts of data. Velocity Veloty re t ths high sstad acumlatiene data Tn big dato velotiby dato flouss im om sowCLs Uke machenes, iuork, soio/ madia, mobiofbhone"ele Th u e massive and cotinous ow dala .Ths debimimesthe botntas ot data hes fast +a data is genaratad and procea t meetho dmands. Sanpling dlata ean halp m daaling usi-th the issue U'velotry
  • 9. Eocampla. Thus au move than 3:5 biLuten seaehas pe day are mads en 9oga. Aue, Jasbook usus aru incaastg by22/ (appm Jeax by yeas Vate ty tu St eos th natvu o data +hat is stuucned, dsmirsu eua, unstructiuned data st asc e to haluegunteus öouH Cs. * Vasiety is basically Re avri val e data hom neLs Aou cas that as beth nside and outide t an entwbrse .t an he sucliurad, a$emstuetuud and unsbucure), Vewaety (Truthgul) *St es Toin Consistenies and untuta'niy in olo, Ha data a vaslaba Can semetime get messy and 9ua and acuracy e dct to contol. ajq dauta is also vaviab» bicause d ta muldude et datadimenälors Testin4 om ult plu disparate olato upes and souru, Exampl, datà in bulk would creat confusiors olerea A data could Convey ha orintombttt intormauon Value The bulk of data havin no valus is ne god to tomban unlass you +un it Trto semething uushul a t a in iaa is no uL l imbovtanu butt it naads to br Conve tad îto Some- ng vauoblu to extrat întermation. Aldvantages big data 0pbortunities tt make bett docistons Inctasing hroductivity avdF'ciency Keducing costs Tmrevî ng cuusomar suviCe QndCustom expeiena
  • 10. Fraud and anomaly detection; greater agility and speed to market. Disadvantages of big data: questionable data quality; heightened security risks; compliance headaches; cost and infrastructure issues; big data skills shortage. Data Analytics: Data analytics is the science of analyzing raw data in order to make conclusions about that information. This information can be used to optimize processes to increase the overall efficiency of a business or system. (Diagram: collect data, analyse data, report.)
  • 11. Types of data analytics: 1. Descriptive analytics 2. Predictive analytics 3. Prescriptive analytics 4. Diagnostic analytics. Descriptive analytics: In descriptive analytics, the result always goes with the probability among N options, where each option has an equal chance. E.g. observation, case study, survey. Predictive analytics: This type of analytics deals with predictions from past data to make decisions based on certain algorithms. In the case of a doctor, the doctor questions the patient about the past to treat the illness through already existing procedures. E.g. healthcare, sports, weather, insurance, social media analysis. Prescriptive analytics: Prescriptive analytics works with predictive analytics, which uses data to determine near-term outcomes. Prescriptive analytics makes use of machine learning to help businesses decide a course of action based on a computer program's predictions. E.g. healthcare, banking. Diagnostic analytics: This focuses more on why something happened. It involves more diverse data inputs and a bit of hypothesizing. Need of data analytics: gather hidden insights; generate reports; perform market analysis; improve business requirements.
  • 12. Gather hidden insights: Hidden insights from data are gathered and then analyzed with respect to business requirements. Generate reports: Reports are generated from the data and are passed on to the respective teams and individuals to deal with further actions for a high rise in business. Perform market analysis: Market analysis can be performed to understand the strengths and the weaknesses of competitors. Improve business requirements: Analysis of data allows a business to improve against customer requirements and experience. Evolution of Analytic Scalability. Scalability: The ability of a system to handle an increasing amount of work required to perform its task. The increase in data storage ability has grown in recent years to meet the need for big data. Traditional analytic architecture: We had to pull all data together into a separate analytics environment to do analysis. (Diagram: Database 1, Database 2, Database 3 and Database 4 feed an analytic server.) The heavy processing occurs in the analytic environment.
  • 13. Modern in-database architecture: The processing stays in the database where the data has been consolidated. (Diagram: Database 1, Database 2, Database 3 and Database 4 are consolidated into an enterprise data warehouse; the analytic server submits a request.) The user's machine just submits the request. Massively Parallel Processing (MPP): An MPP database breaks the data into independent chunks with independent disk and CPU. (Diagram: a single overloaded server versus multiple lightly loaded servers; shared nothing. A one-terabyte table is split into ten 100-gigabyte chunks. A traditional database will query a one-terabyte table one row at a time; an MPP system runs 10 simultaneous 100-gigabyte queries.)
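The chunking idea on this slide can be illustrated with a small sketch. This is not how any particular MPP database is implemented; it only assumes a hypothetical in-memory table (a list of dicts with an `amount` field) and uses Python threads as a stand-in for nodes that each own their disk and CPU. Each worker scans only its own chunk, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n_chunks):
    """Partition rows into roughly equal, independent chunks (assumes rows is non-empty)."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def scan_chunk(chunk):
    """The 'query' each worker runs on its own chunk: a partial sum and a row count."""
    amounts = [row["amount"] for row in chunk]
    return sum(amounts), len(amounts)


def mpp_average(rows, n_workers=10):
    """Average of 'amount', computed by combining per-chunk partial results."""
    chunks = split_into_chunks(rows, n_workers)
    # Threads here only simulate independent nodes; a real MPP system
    # would give each chunk its own disk and CPU.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(scan_chunk, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Hypothetical table: 1000 rows standing in for the one-terabyte table.
table = [{"id": i, "amount": i % 7} for i in range(1000)]
chunks = split_into_chunks(table, 10)
average = mpp_average(table)
```

Because no worker needs another worker's rows, the chunks can be scanned in any order or all at once, which is the "shared nothing" property the slide refers to.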
  • 14. Concurrent processing: MPP systems allow different sets of CPU and disk to run the process concurrently. An MPP system breaks the job into pieces. (Diagram: single-threaded process versus parallel process.) MPP systems build in redundancy to make recovery easy. MPP systems have resource management tools to manage the CPU and disk space. Cloud Computing: A McKinsey and Company paper from 2009 lists three criteria: mask the underlying infrastructure from the user; be elastic to scale on demand; work on a pay-per-use basis. The National Institute of Standards and Technology (NIST) lists: on-demand self-service; resource pooling; rapid elasticity.
  • 15. Two types of cloud environment: 1. Public cloud: The services and infrastructure are provided off-site over the internet. Greatest level of efficiency in shared resources. Less secure and more vulnerable than private clouds. 2. Private cloud: Infrastructure operated solely for a single organization. The same features as a public cloud. Offers the greatest level of security and control. Necessary to purchase and own the entire cloud infrastructure. Grid Computing: The federation of computer resources to reach a common goal. MapReduce: A parallel programming framework. It provides parallelization, fault tolerance, data distribution and load balancing. It has two parts: Map and Reduce. Map function: Processes a key-value pair to generate a set of intermediate key-value pairs. Reduce function: Merges all intermediate values associated with the same intermediate key. How MapReduce works: Let's assume there are 20 terabytes of data and 20 MapReduce server nodes for a project. 1. Distribute a terabyte to each of the 20 nodes using a simple file copy process. 2. Submit two programs (Map, Reduce) to the scheduler.
  • 16. 3. The map program finds the data on disk and executes the logic it contains. 4. The results of the map step are then passed to the reduce process to summarize and aggregate the final answers. (Diagram: map function, scheduler, results.) Analytic Processes and Analytic Tools: The data analysis process is a process of collecting, transforming, cleaning and modelling data with the goal of discovering the required information. The results so obtained are communicated, suggesting conclusions and supporting decision making. The data analysis process consists of the following phases, which are iterative in nature: data requirement specification; data collection; data processing; data cleaning; data analysis; communication.
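The map and reduce functions described in the MapReduce slides can be sketched in a few lines. This is a single-machine illustration with assumed names (`map_fn`, `reduce_fn`, `run_mapreduce`) and a word-count task chosen for the example; it does not reproduce a real framework's scheduler or its distribution of data across nodes.

```python
from collections import defaultdict


def map_fn(_key, line):
    """Map: turn one input (key, value) pair into intermediate (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same intermediate key."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Run the map step, group by intermediate key, then run the reduce step."""
    # Map step: every input record is processed independently.
    intermediate = []
    for key, value in records:
        intermediate.extend(mapper(key, value))

    # Grouping step: collect the values for each intermediate key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce step: summarize each group into a final answer.
    return dict(reducer(key, values) for key, values in groups.items())


# Hypothetical input: (line number, line of text) pairs.
records = [(0, "big data needs big tools"), (1, "data tools scale")]
result = run_mapreduce(records, map_fn, reduce_fn)
```

In a real cluster the map step would run on each node against its local share of the data, and only the grouped intermediate pairs would travel to the reduce step.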
  • 17. (Diagram: data requirement specification, data collection, data processing, data cleaning, data analysis, communication.) Analytic tools: 1. Spreadsheets: Microsoft Excel and other spreadsheets. 2. Databases: relational, column, document, graph. 3. Programming languages: R and Python. 4. Self-service data visualization: Tableau, Power BI, AWS QuickSight.
  • 18. 5. Big data tools: Hadoop, data lakes, Apache Spark. 6. Cloud: AWS, Google Cloud, Microsoft Azure. Analysis vs Reporting (by parameter): Purpose: reporting shows what is happening; analytics explains why it is happening. Tasks: reporting involves organizing, formatting and summarizing; analytics involves questioning, interpreting and exploring. Results: in reporting, results are pushed to users for review; in analytics, results are pulled to answer questions. Value: reporting transforms data into information; analytics gives recommendations to drive actions. Applications of data analytics: 1. Security 2. Transportation 3. Fraud and risk detection 4. Managing risks 5. Proper spending 6. Customer interactions 7. City planning 8. Healthcare 9. Travel 10. Energy management 11. Internet web search 12. Digital advertisement.
  • 19. Data Analytics Lifecycle: six phases of the data analytics lifecycle. The data analytics lifecycle is designed for big data problems and data science projects. The cycle is iterative to represent a real project. The phases involved in the data analytics lifecycle are: 1. Discovery 2. Data preparation 3. Model planning 4. Model building 5. Communicate results 6. Operationalize. Discovery: The data science team learns about and investigates the problem, develops context and understanding, and comes to know about the data sources needed and available for the project. The team formulates initial hypotheses that can later be tested with data. Data preparation: Steps to explore, preprocess and condition data prior to modeling and analysis. It requires the presence of an analytic sandbox; the team executes extract, load and transform steps to get data into the sandbox. Data preparation tasks are likely to be performed multiple times and not in a predefined order. Several tools commonly used for this phase are Hadoop, Alpine Miner, OpenRefine, etc.
  • 20. Model planning: The team explores data to learn about the relationships between variables and subsequently selects key variables and the most suitable models. In this phase, the data science team develops data sets for training, testing and production purposes. The team builds and executes models based on the work done in the model planning phase. Several tools commonly used for this phase are Matlab and STATISTICA. Model building: The team develops data sets for testing, training and production purposes. The team also considers whether its existing tools will suffice for running the models or whether it needs a more robust environment for executing models. Free or open-source tools: R and PL/R, Octave, WEKA. Commercial tools: Matlab, STATISTICA. Communicate results: After executing a model, the team needs to compare the outcomes of modeling to the criteria established for success and failure. The team considers how best to articulate findings and outcomes to team members and stakeholders, taking into account warnings and assumptions. The team should identify key findings, quantify business value, and develop a narrative to summarize and convey findings to stakeholders. Operationalize: The team communicates the benefits of the project more broadly and sets up a pilot project to deploy the work in a controlled way before broadening it to the full enterprise of users.
  • 21. This approach enables the team to learn about the performance and related constraints of the model in a production environment on a small scale, and to make adjustments before full deployment. The team delivers final reports, briefings and code. Free or open-source tools: Octave, WEKA, SQL, MADlib. (Diagram: discovery, data preparation, model planning, model building, communicate results, operationalize.) Key roles: 1. Business user: The business user is the one who understands the main area of the project and also basically benefits from the results. This user gives advice to and consults the team working on the project about the value of the results obtained and how the outputs will be operationalized. A business manager, line manager, or deep subject matter expert in the project domain fulfils this role. 2. Project sponsor: The project sponsor is the one who is responsible for initiating the project. The project sponsor provides the actual requirements for the project and presents the basic business issue. The sponsor generally provides the funds and measures the degree of value from the final output of the team working on the project.
  • 22. This person introduces the prime concern and frames the desired output. 3. Project manager: This person ensures that the key milestones and purpose of the project are met on time and at the expected quality. 4. Business intelligence analyst: The business intelligence analyst provides business domain expertise based on a detailed and deep understanding of the data, key performance indicators (KPIs), key metrics, and business intelligence from a reporting point of view. This person generally creates dashboards and reports and knows about the data feeds and sources. 5. Database administrator (DBA): The DBA facilitates and arranges the database environment to support the analytics needs of the team working on a project. The responsibilities may include providing permissions to key databases or tables and making sure that the appropriate security levels are in their correct places for the data repositories. 6. Data engineer: The data engineer has deep technical skills to assist with tuning SQL queries for data management and data extraction, and provides support for data intake into the analytic sandbox. The data engineer works jointly with the data scientist to help build data in correct ways for analysis. 7. Data scientist: The data scientist provides subject matter expertise for analytical techniques, data modelling, and applying correct analytical techniques to a given business issue. The data scientist ensures that overall analytical objectives are met.
  • 23. Data scientists build and apply analytical methods and approaches to the data available for the concerned project.