Bigdata -> Data Science -> AI,
and some $$$ in between
DNA’s journey in data science & big data
prologue of prologue
you have to have an idea
DNA - Einstein - Data science ja bigdata
THE IDEA
ALL OF THE DATA WE HAVE
PROFIT
[Diagram: seven "some data" boxes and seven "some report" boxes]
ONE SOURCE OF TRUTH + CUSTOMER FIRST + AUTOMATE ALL THE THINGS
WTF?
PROFIT
?activities? who cares
webdata? who cares
Agenda
Prologue:
The big thing(s)
The four things of analytics ~ the roadmap on how to do those things
Achievements
What's inside: AWS good stuff & hype & love
Culture stuff
Upcoming
prologue
The BIG THING(s)
1. Business: it was the omnichannel customer
a. the ever-more-demanding, influential and independent customer
b. rising need for analytical insight & data
c. demanding information management and analytics to be operational, not finance-driven
d. stop sub-optimizing the system (customer)
2. Tech: it was cloud, open-source, and data science
a. suddenly - endless scale & processing power
b. reduced time-to-environment from weeks to minutes
c. reduced cost
d. ability to create intelligent data products that reduce time-to-insight and time-to-action
hard for humans, hard for machines: data science, machine learning
hard for humans, easy for machines: data engineering, data pipelines
easy for humans, hard for machines: AI / NLP
easy for humans, easy for machines: reporting, basic calculus
System requirements
- Infinite scale
- Process 10’000++ messages per sec
- Automated deploy & tests
- Version control
- Pay-for-use, not for-licence
- Real-time pipeline, disaster recovery, exactly-once guarantees
- Real-time analytics, sub-second latency for everything
- Infinite processing power for data science stuff & large analytical deployments
- Array of libraries to make the data scientist’s life easier
- Modular, I can change any part of it, be it software or hardware
- Secure, EU referendums and Safe Harbour etc.
- Pipeline and persistent storage & data platform can be done from scratch to production in 6 months
- Can't really cost anything, since we had to scrape together a small budget. 3 developers max.
OKAY! SOUNDS FAIR.
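One common way to get close to the "exactly-once guarantees" item above, on top of a stream that may redeliver messages, is to make the consumer idempotent. The sketch below only illustrates that idea; it is not the pipeline from the talk. The message IDs, the `IdempotentCounter` class and the sample stream are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentCounter:
    """Counts events per customer, ignoring redelivered messages."""
    seen_ids: set = field(default_factory=set)
    counts: dict = field(default_factory=dict)

    def handle(self, message_id: str, customer: str) -> bool:
        """Apply a message once; return False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.counts[customer] = self.counts.get(customer, 0) + 1
        return True


# Simulated at-least-once delivery: message "m2" arrives twice.
stream = [("m1", "alice"), ("m2", "bob"), ("m2", "bob"), ("m3", "alice")]
counter = IdempotentCounter()
applied = [counter.handle(mid, cust) for mid, cust in stream]
print(counter.counts)
```

In a real deployment the set of seen IDs would have to live in durable storage and be updated atomically with the result, which this in-memory sketch does not attempt.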
Business requirements
- Understand the omnichannel customer
- Reduce churn
- Increase cross-sales
- Increase product usage & increase retention
- Increase marketing ROI
- Insight should be real-time
- Actions should be near-real-time and everyone can do them
- Know where to put infrastructure better than before
- Make sense of unstructured data & text & speech & so forth
- Automate 80% of insight / data that was previously done by hand
- Your system shall not cost anything
- But it should deliver competitive advantage
OKAY! SOUNDS FAIR.
WHAT WOULD MACGYVER DO?
WOULD HE:
a) go and buy a licence and servers and then wait around
b) build the damn thing from what he happens to find with zero cost
WHAT WOULD MACGYVER DO?
YES!
b) build the damn thing from what he happens to find with zero cost
Achievements & upcoming
Done (within a year):
Assisted investments & business (1) operations: xx-xxx mil. / year
Directly optimized / machine learning (2)-handled operations: x-xx mil. / year
Machine learning* & Data Science introduced
Marketing efforts from weeks to minutes
Automation from 10% to 80%
Conversion on direct channels up from 50 to 300 percent
Number of automated & personalized channels from 1 to 5 (all)
One source of truth & self-made -> we know how it works
Ability to handle all types of data
Upcoming 2017:
Artificial intelligence (AI)*
Chatbots (AI)
“Acquisition” of display advertising
Understanding speech (AI)
Moving from CPU to GPU
DNA.FI fully personalized (w/ new concept)
* Data Science -> Machine Learning -> Artificial Intelligence
what's inside
code! (surprise)
clojure
python
c++
tensorflow
syntaxnet
spark
scala
sql
postgres
redshift
ec2
R
random forest
s3
jenkins
ansible
cnn / rnn / lstm
jupyter
aerospike
kafka
snowplow
scikit-learn
matplotlib
als
k-means
mllib
numpy, pandas, scipy
… etc
COLLECT
real-time
batch
omnichannel
COMBINE
digital to brick n mortar
digital to everything
context to everything
customer to everything
COMPUTE
recommendations
analysis
reports
segments
predictions
descriptions
next best actions
customer journey
EXECUTE
churn prevention
cross-sales
targeted marketing
customer service efficiency
customer experience improvement
omnichannel optimization
react in real time
product development
CONTROL
continuous deployment
infrastructure as code
Customer interface layer
Channel layer
Delivery layer
Data / Machine learning layer
Collecting layer
realtime 1.3T, batch ~100 GB
-> to redshift, we load 5’511’649’731 rows
Why redshift?
reporting on top of raw data;
17’072’941 rows joined to 110’773’366 rows joined to 24’945’364 rows joined to 2’297’076 rows joined to 1’841’262 rows + some dimensions and result returned in < 10 sec
-> no db-admins, no indexes, no “tuning”
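The "reporting on top of raw data" pattern above is a multi-table join followed by an aggregate. Here is a toy sketch of that query shape, using Python's built-in `sqlite3` as a stand-in for Redshift. The table names, column names and rows are invented for illustration and say nothing about DNA's actual schema or about Redshift performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events    (customer_id INTEGER, product_id INTEGER, channel TEXT);
    CREATE TABLE customers (customer_id INTEGER, segment TEXT);
    CREATE TABLE products  (product_id INTEGER, category TEXT);
""")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)",
                 [(1, 10, "web"), (1, 11, "store"), (2, 10, "web"), (3, 11, "web")])
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "consumer"), (2, "consumer"), (3, "business")])
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(10, "mobile"), (11, "tv")])

# Report straight from the raw event rows: join to the dimensions, then aggregate.
report = conn.execute("""
    SELECT c.segment, p.category, COUNT(*) AS n_events
    FROM events e
    JOIN customers c ON c.customer_id = e.customer_id
    JOIN products  p ON p.product_id  = e.product_id
    GROUP BY c.segment, p.category
    ORDER BY c.segment, p.category
""").fetchall()
print(report)
```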
Class: TV, Liiga
Rank: 0.87, 0.90
What happens in social media? What is talked about?
What’s wrong?
from reporting sales to reporting potential (and the ways of going from potential to sales)
R is still goooood.
And jupyter.
ALS recommendations w/ 1.3T data = good
1 0 1 1 0 1 0 0 1 1 1 0 1
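For readers who have not met ALS (alternating least squares), here is a minimal pure-Python sketch on a tiny 0/1 interaction matrix like the one on the slide. It is a simplified dense variant that treats every cell as observed. It is not the confidence-weighted implicit-feedback version, and not necessarily what the team ran at 1.3T scale. The matrix, the rank `k` and the regularisation `lam` are illustrative assumptions.

```python
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def update(fixed, ratings, lam):
    """Recompute one side's factors while the other side is held fixed."""
    k = len(fixed[0])
    # Gram matrix fixed^T fixed + lam * I is shared by every row.
    gram = [[sum(f[a] * f[b] for f in fixed) + (lam if a == b else 0.0)
             for b in range(k)] for a in range(k)]
    return [solve(gram, [sum(f[a] * r for f, r in zip(fixed, row)) for a in range(k)])
            for row in ratings]


def loss(ratings, users, items, lam):
    """Regularised squared error that each ALS half-step minimises exactly."""
    err = sum((r - sum(u * v for u, v in zip(users[i], items[j]))) ** 2
              for i, row in enumerate(ratings) for j, r in enumerate(row))
    reg = sum(x * x for vec in users + items for x in vec)
    return err + lam * reg


def als(ratings, k=2, lam=0.1, iterations=10, seed=0):
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(k)] for _ in ratings]
    items = [[rng.random() for _ in range(k)] for _ in ratings[0]]
    transposed = [list(col) for col in zip(*ratings)]
    history = [loss(ratings, users, items, lam)]
    for _ in range(iterations):
        users = update(items, ratings, lam)      # fix items, solve for users
        items = update(users, transposed, lam)   # fix users, solve for items
        history.append(loss(ratings, users, items, lam))
    return users, items, history


# Rows are customers, columns are products; 1 = interacted, 0 = did not.
ratings = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
]
users, items, history = als(ratings)
scores = [[sum(u * v for u, v in zip(user, item)) for item in items] for user in users]
print(round(history[0], 3), "->", round(history[-1], 3))
```

The `scores` matrix holds the predicted affinity of each customer for each product. Recommending the highest-scoring products a customer has not yet interacted with is the usual next step.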
culture stuff
more important than you’d think
http://www.slideshare.net/reed2001/culture-1798664/
MacGyver (remember? what would MacGyver do) = The thinker-doer
- Usually development methods split thinkers (project managers, scrum managers, product owners and the lot) from doers (developers, analysts)
- This is (mostly) shit
- You’d need people leading who also know their stuff
- Saves money, time and nerves
- People communicate better
- Thinker-doers can communicate with business and translate to development actions, even develop the things themselves
Demos & openness = The secret sauce to success (and freedom to do more stuff)
- We sit on the “business floor”, right in between basically everyone
- And we almost always have something displayed on a screen
- We make it easy to come and talk to us
- We make demos available to everyone
- We connect
- This makes all the difference
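business - IT alignment
always connected, nothing changes (or we close our eyes that it does): kindergarten - no output but loads of fun
always connected, everything changes all-the-time: if done right, ultimate success
forced connection (procedures!), nothing changes: basic IT waterfall project
forced connection (procedures!), everything changes all-the-time: basic IT “agile” project
never connected, nothing changes: cave-people?
never connected, everything changes all-the-time: chaos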
Bigdata/AI
Business
Directors* are doing their own marketing automation activities without any help
*ping Solita, how many directors code...
And now, we have business even writing their own code! (no, really)
upcoming
1st try: word2vec + naive bayes
2nd try: convolutional neural net
3rd try: LSTM/RNN
4th try: syntaxnet
5th “try”: -> include speech recognition
6th try: spaCy
7th try, part I: latent dirichlet allocation
8th try: ?
Nth try: ?
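To make the naive Bayes half of the "1st try" concrete, here is a minimal bag-of-words classifier in plain Python that returns a score per class. It leaves out the word2vec features entirely, and the training sentences are invented. The labels simply echo the "TV" and "Liiga" classes shown earlier. Treat it as a sketch of the technique, not the team's model.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (text, label). Returns the counts needed for scoring."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return {label: posterior probability} using Laplace smoothing."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp.values())
    return {label: v / norm for label, v in exp.items()}


# Invented training data, two classes.
examples = [
    ("the tv box keeps freezing", "TV"),
    ("my tv channels are missing", "TV"),
    ("great hockey game in liiga tonight", "Liiga"),
    ("who won the liiga game", "Liiga"),
]
model = train(examples)
probs = predict(model, "liiga game on tv tonight")
best = max(probs, key=probs.get)
print(best, probs)
```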
Now?
in a good place. can’t fully disclose what we’re running though. :)
basically we can understand both speech and written natural language so that the language can “flow” and it can be in a chat context or in longer formats;
ex:
- hi do you happen to have iPhones in stock?
- yea!
- cool. what’s the price? <- have to link to previous parts of conversation
NB! this is quite simple in English but tear-your-eyes-off-to-scratch-your-brain*-hard in Finnish. we might be the first ones actually there.
*modified from: Friends, 1995, The One with the Baby on the Bus
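The "link to previous parts of conversation" step can be pictured as a bit of remembered state. The toy sketch below uses keyword matching as a stand-in for real language understanding, which is exactly the hard part the slide is about. The `Conversation` class, the catalogue and the price are all made up for the example.

```python
PRICES = {"iphone": 799}  # hypothetical catalogue; the price is invented


class Conversation:
    """Remembers the last product mentioned so follow-up questions can refer back to it."""

    def __init__(self):
        self.last_product = None

    def reply(self, utterance: str) -> str:
        text = utterance.lower()
        # Update the context whenever a known product is mentioned.
        for product in PRICES:
            if product in text:
                self.last_product = product
        if "price" in text:
            if self.last_product is None:
                return "Which product do you mean?"
            return f"The {self.last_product} costs {PRICES[self.last_product]} EUR."
        if "stock" in text and self.last_product:
            return "yea!"
        return "Sorry, I did not understand."


chat = Conversation()
first = chat.reply("hi do you happen to have iPhones in stock?")
second = chat.reply("cool. what's the price?")
print(first)
print(second)
```

The second question never names the product. It only gets an answer because the first turn left `last_product` set.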
Lessons learned
Understand the BIG THINGS (cloud, open source, omnichannel customer, data science, time-to-x)
Sit where business sits. And sit together. DO STUFF TOGETHER.
Don’t use project managers who can’t code (or who are not really good at the subject domain).
Apply advanced analytics to automate 80% of small decisions made all the time.
Continuous communication beats meetings. Don’t meet.
At least start with AI. Don’t just tweet about that shit.