Ingesting clicks data for analytics
Francesco Furiani, CTO @
$ whoami
Francesco Furiani (@ilfurio):
• Backend Engineer
• Roamed these halls not too long ago
Loves:
• Studying new CS stuff
• PlayStation / Bike / Traveling / Soccer
• O RLY? books
How I make a living:
• CTO @ ClickMeter
• Backend Engineer @ ClickMeter
• Enum.take_random(IT_ROLES,1) @ ClickMeter
ClickMeter
• 100k+ customers
• Receiving events for customers at anywhere from 10 to 3000 req/sec
ClickMeter
We receive data anytime someone:
• Clicks our links
• Views our pixels
• Calls our postbacks
Our customers use us:
• Inside a famous app the day of the big release ✔
• Advertising on an extremely big video portal ✔
• A tiny travel blog ✔
• A physical device for advertising ✔
Getting the data
We need to:
• Try not to lose the events we receive (duh)
• Show customers data that gives them better insight into their campaigns
• Scale up/down according to the incoming traffic
• Improve the product by using the data we get
• Do it as fast as possible (wasn’t this ready a week ago?)
• Do it as cheaply as possible
The challenge
Find the size of the problem you’re trying to solve
• How much data do you expect? Rate?
• What do you have to do with it?
• Do I have to do something with ALL of it?
• How fast do I have to do it?
Answers to these questions are a starting point.
Size
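A rough back-of-the-envelope sketch of the sizing questions. The 10 and 3000 req/sec figures are the ones quoted in the slides; the 500-byte event size, the 400 req/sec per-collector capacity and the 30% headroom are assumptions made up purely for illustration, not ClickMeter numbers.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(rate_per_sec: float) -> int:
    """Events received in one day at a constant request rate."""
    return int(rate_per_sec * SECONDS_PER_DAY)


def daily_volume_gb(rate_per_sec: float, event_bytes: int) -> float:
    """Raw daily volume in GB (10**9 bytes) at a constant rate."""
    return events_per_day(rate_per_sec) * event_bytes / 1e9


def instances_needed(rate_per_sec: float, per_instance_rps: float,
                     headroom: float = 0.3) -> int:
    """Collectors needed to serve the rate while keeping spare capacity."""
    return max(1, math.ceil(rate_per_sec * (1 + headroom) / per_instance_rps))


# Assumed values (hypothetical): 500 bytes per event, 400 req/sec per collector.
for rate in (10, 3000):
    print(rate,
          events_per_day(rate),
          daily_volume_gb(rate, 500),
          instances_needed(rate, 400))
```

Even with made-up inputs, the spread between the quiet and the busy case (one collector versus ten under these assumptions) is what makes elastic scaling attractive.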
Once we know how big and bad the beast is, we
need to design the ranch that will keep it in check.
It’s an iterative process, prone to a lot of failures, but
the world is out there to help us.
Think, write and draw a lot.
Design
… drawn too much ...
Design
Most of us will never have the joy (and the horror) of
creating a new stack that is novel in both theory and practice.
Still, we need to understand the theory behind every
brick.
Read the info, read the opinions, try small proofs of
concept of the moving parts. It helps a lot!
Which bricks should I use
A very important brick.
Elastic computing power, the many *aaS offerings and managed solutions are
a great help in terms of saved manpower and fast iterations.
It comes at a cost you need to consider:
• $$$ (ymmv)
• Possible lock-ins
The cloud is a brick too
… well it’s never definitive ...
Design with bricks
Obviously, we didn’t follow those guidelines.
One becomes savvy after crashing and burning
many times.
Still, thanks to those errors, we got there and
built a better infrastructure at every iteration.
How we did it
ClickMeter was already live and growing.
It needed an overhaul of its infrastructure/backend.
The growth fueled the need for more power to handle more data.
Obviously, this had to be a tablecloth-trick migration :)
How we did it
Already on the cloud (AWS), we considered a hybrid approach, but it
didn’t make sense.
We reviewed the old components already in production to decide what to kill,
keep or update.
We kept the good stuff and designed some new layers to make it work flawlessly in
the new infrastructure.
How we did it
Pretty important, they need to:
• Stay up
• Scale up/down depending on the incoming traffic
• Never lose anything
• Be as fast as possible in processing
They’re a custom web application that undergoes a lot of testing.
We used stuff like Elastic Beanstalk, Auto Scaling groups, load balancers and health-based routing
offered by our cloud provider to manage the web app’s scaling/availability.
Redirect engine
aka events collector
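A minimal sketch of the "never lose anything, but stay fast" requirement: the collector records the click in a queue and redirects the visitor either way, parking the event locally if the queue is unreachable. Everything here is hypothetical (the `InMemoryQueue` stand-in, the `handle_click` name, the event fields); it is not ClickMeter's actual code, and a real deployment would use a durable queue and durable fallback storage.

```python
import json
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryQueue:
    """Stand-in for a durable queue; a real one would persist messages."""
    messages: list = field(default_factory=list)
    fail: bool = False  # flip to True to simulate an outage

    def send(self, body: str) -> None:
        if self.fail:
            raise ConnectionError("queue unavailable")
        self.messages.append(body)


def handle_click(queue, fallback_log: list, link_id: str, destination: str):
    """Record the click, then answer with a redirect.

    If the queue rejects the event, keep it in a local fallback
    so it can be replayed later instead of being dropped.
    """
    event = json.dumps({
        "id": str(uuid.uuid4()),  # unique id, handy for de-duplication later
        "link_id": link_id,
        "ts": time.time(),
    })
    try:
        queue.send(event)
    except ConnectionError:
        fallback_log.append(event)
    # The visitor is redirected whether or not the queue was reachable.
    return 302, destination


queue, fallback = InMemoryQueue(), []
status, location = handle_click(queue, fallback, "abc", "https://example.com")
queue.fail = True  # simulate the queue going down
handle_click(queue, fallback, "abc", "https://example.com")
print(status, len(queue.messages), len(fallback))
```

The design choice to highlight is that the visitor's redirect never waits on anything beyond one enqueue attempt, while the event always ends up somewhere it can be recovered from.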
Pipeline
Most of this part uses our cloud provider’s
technology.
This simplifies maintenance and provisioning,
keeping the focus on the value of our product.
Some moving parts are custom-made by us to
interact with the cloud technology (which might be
proprietary or just a repackaged well-known one).
Tracking engine
and friends
[Diagram: SQS, Pipeline, Kinesis. Labels: Events, Preprocessing, Postprocessing, DynamoDB]
Tracking engine
and friends
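The slides do not show how these stages are wired together, so the following is only a sketch of one concern that typically comes with queue-based pipelines: SQS standard queues deliver messages at least once, so a consumer may see the same event twice and should be idempotent. The `process_batch` function and the event fields are hypothetical, and a plain dict stands in for a key-value table keyed by event id.

```python
def process_batch(messages, table):
    """Apply each event once, even if the queue delivers it more than once.

    `table` is a dict standing in for a key-value store; it maps
    event id -> event and doubles as the record of what was seen.
    Returns the number of events applied for the first time.
    """
    applied = 0
    for event in messages:
        if event["id"] in table:
            continue  # duplicate delivery: already stored, skip it
        table[event["id"]] = event
        applied += 1
    return applied


table = {}
batch = [
    {"id": "e1", "link_id": "abc"},
    {"id": "e2", "link_id": "abc"},
    {"id": "e1", "link_id": "abc"},  # redelivered duplicate
]
first = process_batch(batch, table)
second = process_batch(batch, table)  # whole batch replayed
print(first, second, len(table))
```

Replaying the batch changes nothing, which is what lets the upstream side retry freely without inflating click counts.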
Combination of real-time and batch
technologies.
One of the scaling parts that directly provides
value to customers.
Computes analyses on event data, from
simple counts to some predictions.
Check the data produced by your processing
system to improve the pipeline step by step!
Pipeline
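One way to "check the data produced by your processing system" when real-time and batch paths coexist is to compare their results for the same events. The sketch below assumes the simplest analysis mentioned in the slides, a per-link count; the class and function names and the `link_id` field are hypothetical.

```python
from collections import Counter


class RealTimeCounter:
    """Real-time path: update the count as each event arrives."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, event):
        self.counts[event["link_id"]] += 1


def batch_counts(events):
    """Batch path: recompute counts from the full stored event set."""
    return Counter(e["link_id"] for e in events)


def mismatches(realtime, batch):
    """Links whose real-time and batch counts disagree."""
    keys = set(realtime) | set(batch)
    return {k: (realtime[k], batch[k]) for k in keys if realtime[k] != batch[k]}


events = [{"link_id": "a"}, {"link_id": "b"}, {"link_id": "a"}]
rt = RealTimeCounter()
for e in events[:-1]:  # simulate the real-time path missing the last event
    rt.on_event(e)

print(mismatches(rt.counts, batch_counts(events)))
```

Here the batch recount acts as the reference: any link reported by `mismatches` points at events the real-time path dropped or double counted.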
We employ different storage based on speed of delivery and data type.
All the data is accessible via a REST API.
This makes it fairly easy to develop a frontend layer and lets
customers take control of the data and use it in ways we might not have
thought of.
Storage and data delivery
Managed services on the cloud help us a lot!
Most of the team can focus on improvements
and shipping (users are happy, so is the CEO).
Some of us (me) still have to be the
CloudOp/DevOp.
p.s.: always prepare a plan B for when you
break things!
Operations
The cloud is typically more expensive than your own metal.
The extra money you have to spend is still well spent:
• Flexibility
• Easier provisioning
• Easier management
• Easier operations
There are different types of clouds: choose wisely.
Cloud co$t$
Creating and managing a “big data” ready infrastructure is no easy task,
but it can be done step by step, even by startups.
The cloud is a cool starting ground, providing you with many of the toys
you need, so you can focus on which part of “big data” gives you value!
Use the wisdom shared by the big/medium players that have already
been there (and built most of the stuff you’re using).
Conclusions
Thank You
Any questions?
@il_furio
francesco@clickmeter.com
