Serverless Data
Streaming at Scale
Anahit Pogosova
Lead Cloud Software Engineer (Solita / Yle)
20.10.2020
o Who, What & Why?
o Under the Hood
o Gotchas and Lessons Learned
AWS Community Nordics Virtual Meetup
• Lead Cloud Software Engineer
• Part of the Data & AI team at
Finland’s national public
broadcasting company
Me
• AWS Community Builder
10+ years
“full stack”, all kinds of stuff
• Yle Areena, the biggest
streaming service in Finland
• Areena recommendations
• Areena image personalisation
• Automatic image extraction
• Article recommendations (yle.fi)
• Smart notifications (Yle Uutisvahti)
• .. and more
Yle
• Data!
• user interaction and content metadata
• Collecting the data
• Storing the data
• Visualizing the data
• Utilizing the data (ML & AI)
• To understand the customers
• To help provide better service for everyone
{
"adobe":true,
"is_heartbeat":true,
"collectorreceived":1555267233549,
...
"s:asset:name":"Yle TV1",
"s:event:type":"start",
"s:meta:category":"nettitv",
"s:meta:content_type":"livetv",
"s:meta:ns_st_st":"yle tv1",
...
"s:meta:title":"eduskuntavaalit 2019 - tulosilta",
...
"s:meta:yle.vrsContent":"video",
"s:meta:yle.vrsDevice":"android",
"s:meta:yle.vrsPlatform":"mobile",
"s:meta:yle.vrsProduct":"areena",
"s:meta:yle_client":"android.areena.481-b4ce224bf",
"s:meta:yle_language":"fi",
"s:sp:channel":"yleisradio",
"s:sp:hb_version":"android-2.2.1.214-d5c678",
"s:user:mid":"71057009616815049761612335654599557361"
}
Yle
• ~ 500 000 000 requests per day
• ~ 600 000 rpm during prime time
• > 0.5 TB event data per day
• Apache Parquet
• JSON
• Max so far: ~ 2.5 million rpm
• elections + hockey finals
Yle
Under the Hood
Agenda
Load the data to the datalake in a
columnar format
Enable content personalization
through near real-time analytics
No server is easier to manage than
“no server”.
Dr. Werner Vogels
(CTO, Amazon)
Kinesis Data Streams
• Fully managed and massively scalable service to stream data
• Data available in milliseconds and stored from 24 hours up to 7 days
• Custom stream processing with consumers
• Shard is the unit of parallelism
• In: 1 MB/sec or 1 000 records/sec
• Out: 2 MB/sec
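The per-shard write limits above translate into a minimum shard count for a given load. The following is a rough sketch of that arithmetic, not a sizing tool: the function name `minShards`, the 1 KiB average record size, and treating 1 MB as 1 MiB are all assumptions, and the result only holds for perfectly even traffic.

```javascript
// Per-shard write limits quoted on the slide (1 MB treated as 1 MiB here).
const SHARD_MAX_BYTES_PER_SEC = 1024 * 1024;
const SHARD_MAX_RECORDS_PER_SEC = 1000;

// Minimum number of shards for a steady, perfectly even write load.
function minShards(recordsPerMinute, avgRecordBytes) {
  const recordsPerSec = recordsPerMinute / 60;
  const bytesPerSec = recordsPerSec * avgRecordBytes;
  const byCount = Math.ceil(recordsPerSec / SHARD_MAX_RECORDS_PER_SEC);
  const byBytes = Math.ceil(bytesPerSec / SHARD_MAX_BYTES_PER_SEC);
  return Math.max(1, byCount, byBytes);
}

// Prime-time load from the talk (~600 000 rpm), with an assumed 1 KiB record.
console.log(minShards(600000, 1024)); // 10
```

Real traffic arrives in bursts, so a stream sized exactly to this minimum would likely still be throttled; some headroom on top is a reasonable starting point.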
Amazon Kinesis
Agent
• Stand-alone Java
application to
stream data from
files
Service
Integrations
• CloudWatch Logs
• CloudWatch Events
• AWS IoT
• DB Migration Service
• API Gateway
Amazon Kinesis
Producer Library
(KPL)
• Provides higher
level of abstraction
over API calls
Amazon Kinesis
API (AWS SDK)
• Most flexible
• Allows full control
over writing data
Kinesis Data Streams, Writing Data
• putRecord(params, callback)
• putRecords(params, callback)
• Up to 500 records
• Up to 5 MiB
Kinesis Data Streams, AWS SDK
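Because a single putRecords call accepts at most 500 records and 5 MiB, a producer usually has to split its outgoing records into batches first. A minimal sketch of such a splitter follows; `chunkForPutRecords` is a hypothetical helper, and counting the partition key towards the size is how the request limit is understood here.

```javascript
const MAX_RECORDS_PER_CALL = 500;
const MAX_BYTES_PER_CALL = 5 * 1024 * 1024; // 5 MiB

// Splits entries of the form { Data, PartitionKey } into batches that
// respect both the record-count and the total-size limit of one call.
function chunkForPutRecords(entries) {
  const batches = [];
  let current = [];
  let currentBytes = 0;
  for (const entry of entries) {
    const size =
      Buffer.byteLength(entry.Data) + Buffer.byteLength(entry.PartitionKey);
    const full =
      current.length === MAX_RECORDS_PER_CALL ||
      currentBytes + size > MAX_BYTES_PER_CALL;
    if (full && current.length > 0) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(entry);
    currentBytes += size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Each returned batch can then be passed as the `Records` array of one putRecords request.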
putRecords(params, callback)
• Request failure
• Retries by default up to 3 times
• Uses exponential backoff
• Base delay by default is 100 ms
Kinesis Data Streams, AWS SDK
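To get a feel for the retry settings above, the waits can be worked out by hand. This sketch assumes plain doubling from the 100 ms base delay; `backoffCeilingMs` is an illustrative name, and the SDK's actual delay may be randomized below these values.

```javascript
// Upper bound of the wait before retry number `attempt` (0-based),
// doubling from the base delay each time.
function backoffCeilingMs(attempt, baseMs = 100) {
  return baseMs * 2 ** attempt;
}

// Three retries with the default base delay from the slide.
console.log([0, 1, 2].map((n) => backoffCeilingMs(n))); // [ 100, 200, 400 ]
```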
Lambda
o One Lambda is invoked per shard by default
• NEW(ish)! Parallelization factor (max 10)
• Up to 10 times as many concurrent Lambdas as there are shards!
o Lambda is invoked once per second,
or:
• the number of records reaches the configured batch size
(max 10 000 records)
• the batch payload reaches Lambda’s synchronous invocation payload limit
(6 MB)
• NEW(ish)! the batch window reaches its maximum value
(max 5 min)
Lambda
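On the consuming side, each invocation receives a batch of records whose payloads are base64-encoded under `kinesis.data`. A minimal handler sketch is shown below, assuming the producers wrote JSON; `decodeBatch` and the processing step are hypothetical.

```javascript
// Decodes one Kinesis batch as Lambda delivers it: every record's payload
// arrives base64-encoded under `kinesis.data`.
function decodeBatch(event) {
  return event.Records.map((record) =>
    JSON.parse(Buffer.from(record.kinesis.data, "base64").toString("utf8"))
  );
}

// Handler sketch: an exception thrown here marks the whole batch as failed.
async function handler(event) {
  const items = decodeBatch(event);
  // ... process the items here (hypothetical business logic) ...
  return { processed: items.length };
}
```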
Before..
• Lambda retries the batch until
success or data expiration
• No other batches are
processed from the
shard (aka poison pill)!
Lambda, Error Handling
After!
• Maximum retry attempts
(max 10 000)
• Maximum record age
(1 min – 7 days)
• Bisect batch on function failure
• On-failure destination
(SQS or SNS)
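The idea behind bisecting a batch on failure can be illustrated with a small simulation. This is a simplified model of the concept, not how the service is implemented: `bisectOnFailure` and `processBatch` are hypothetical names, and the real feature also respects the retry and record-age settings listed above.

```javascript
// A failing batch is split in half and each half is retried on its own,
// until the record that causes the failure is left alone.
// `processBatch` is a hypothetical function that throws on a bad batch.
function bisectOnFailure(batch, processBatch) {
  try {
    processBatch(batch);
    return []; // the whole batch succeeded
  } catch (err) {
    if (batch.length === 1) return batch; // isolated poison pill
    const mid = Math.ceil(batch.length / 2);
    return [
      ...bisectOnFailure(batch.slice(0, mid), processBatch),
      ...bisectOnFailure(batch.slice(mid), processBatch),
    ];
  }
}
```

The records returned at the end are the ones that would end up in the on-failure destination, while the rest of the shard keeps flowing.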
• Fully managed service to load streaming
data into a data lake
• S3, Redshift, Amazon Elasticsearch Service
• HTTP endpoints (New!)
• Datadog, New Relic, MongoDB, and Splunk (Newish!)
• Lets you load streaming data with zero lines of code
• Scales automatically (no shards to manage)
• Can batch, compress, transform and convert
data before loading to the destination
Kinesis Firehose
• Data stored up to 24 hours
• Batches records to certain size or for certain
period of time
• 1 to 128 MB
• 60 to 900 seconds
• Uses Glue Data Catalog to convert JSON to
• Apache Parquet
• Apache ORC
Kinesis Firehose
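The two buffering limits interact: whichever threshold is reached first triggers the delivery. A small sketch of that arithmetic, with the hypothetical helper `secondsUntilFlush` and made-up throughput numbers:

```javascript
// Estimates how long Firehose buffers before delivering, given a buffer
// size (1 to 128 MB) and a buffer interval (60 to 900 seconds).
function secondsUntilFlush(incomingMBPerSec, bufferSizeMB, bufferIntervalSec) {
  if (bufferSizeMB < 1 || bufferSizeMB > 128) {
    throw new RangeError("buffer size must be 1 to 128 MB");
  }
  if (bufferIntervalSec < 60 || bufferIntervalSec > 900) {
    throw new RangeError("buffer interval must be 60 to 900 seconds");
  }
  // Time to fill the buffer by size, capped by the interval.
  return Math.min(bufferIntervalSec, bufferSizeMB / incomingMBPerSec);
}

console.log(secondsUntilFlush(2, 128, 300)); // 64: the size limit wins
console.log(secondsUntilFlush(0.1, 128, 300)); // 300: the interval wins
```

At high throughput the size limit dominates, so output files arrive more often than the configured interval suggests.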
Kinesis Streams vs. Firehose
• Fully managed service to
stream data
• Data available up to 7 days
• Scaling using shards
• Custom stream processing with
consumers
• Fully managed service to
load data into a data lake
• Data available for 24 hours
• Scales automatically
• Batching, compressing, converting
data out of the box
+ custom transformations with AWS Lambda
Amazon Kinesis
Agent
• Stand-alone Java
application to
stream data from
files
Service
Integrations
• Kinesis Streams
• CloudWatch Logs
• CloudWatch Events
• AWS IoT
Amazon Kinesis API
(AWS SDK)
• Most flexible
• Allows full control
over writing data
Kinesis Firehose, Writing Data
putRecordBatch(params, callback)
• Request failure
• Retries by default up to 3 times
• Uses exponential backoff
• Base delay by default is 100 ms
Kinesis Firehose, AWS SDK
• Fully managed service to run SQL queries
on the streaming data
• Join, filter and aggregate data over a time-based or a row-based window
Kinesis Data Analytics
Ingests data from:
Kinesis Data Stream
Kinesis Firehose
Sends results to:
Kinesis Data Stream
Kinesis Firehose
Lambda function
Gotchas and
Lessons Learned
putRecords(params, callback)
Partial failure:
• Exponential backoff + jitter
Gotcha!
Writing to Kinesis Streams
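A putRecords call can succeed as a request while some of its records are rejected, so the response has to be inspected and only the failed records sent again. The sketch below shows one way to do that with exponential backoff and full jitter. The `putRecords` argument is a hypothetical promise-returning wrapper around the SDK call, and `sleep` is injectable only to keep the example testable; `FailedRecordCount` and the per-record `ErrorCode` are fields of the PutRecords response.

```javascript
// Returns the request entries that the response marked as failed. The
// response's Records array lines up with the request by position; failed
// entries carry an ErrorCode instead of a SequenceNumber.
function failedEntries(requestEntries, response) {
  if (!response.FailedRecordCount) return [];
  return requestEntries.filter((_, i) => response.Records[i].ErrorCode);
}

// Full jitter: a random wait in [0, base * 2^attempt).
function jitteredDelayMs(attempt, baseMs = 100, random = Math.random) {
  return Math.floor(random() * baseMs * 2 ** attempt);
}

// Sends the entries and retries only those that failed.
async function putWithRetries(
  putRecords,
  entries,
  maxRetries = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let pending = entries;
  for (let attempt = 0; pending.length > 0 && attempt <= maxRetries; attempt++) {
    if (attempt > 0) await sleep(jitteredDelayMs(attempt - 1));
    const response = await putRecords(pending);
    pending = failedEntries(pending, response);
  }
  return pending; // entries that still failed after all retries
}
```

Whatever `putWithRetries` returns still needs a home, for example a dead-letter queue or a log, otherwise those records are silently lost.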
• Kinesis limits are per second, CloudWatch metrics are per minute
• 1 MB/sec or 1 000 records/sec
• Can 5 000 records/minute exceed the throughput?
• Beware of network latency!
• can be one reason for bursts in Kinesis
• avoid the external network by using a Kinesis VPC endpoint
Gotcha!
Writing to Kinesis Streams
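The question about 5 000 records per minute can be answered with a quick count per second. In this sketch the timestamps are made up and `peakRecordsPerSecond` is an illustrative helper: the minute average stays far below the shard limit, yet the busiest second does not.

```javascript
// Counts records per one-second bucket and returns the busiest second.
function peakRecordsPerSecond(timestampsMs) {
  const perSecond = new Map();
  for (const t of timestampsMs) {
    const second = Math.floor(t / 1000);
    perSecond.set(second, (perSecond.get(second) || 0) + 1);
  }
  return Math.max(0, ...perSecond.values());
}

// 5 000 records in one minute, all arriving within the first two seconds.
const burst = Array.from({ length: 5000 }, (_, i) => Math.floor((i * 2000) / 5000));
console.log(peakRecordsPerSecond(burst)); // 2500, above the 1 000 records/sec limit
```

A per-minute CloudWatch graph would show this minute as roughly 83 records per second on average, which is why throttling can appear on a stream that looks underused.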
• IncomingRecords = PutRecord + PutRecords
The number of records successfully put to the Kinesis Stream
• WriteProvisionedThroughputExceeded = PutRecord + PutRecords
The number of records rejected due to throttling
• IncomingRecords + WriteProvisionedThroughputExceeded = Total
number of incoming records
Gotcha!
Writing to Kinesis Streams
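Since the two metrics add up to the total number of records the producers tried to write, the share of throttled records follows directly. A tiny sketch, with `throttledShare` as a hypothetical helper and made-up numbers:

```javascript
// Share of attempted writes that were throttled, computed from the sums of
// the two CloudWatch metrics over the same period.
function throttledShare(incomingRecords, writeThroughputExceeded) {
  const total = incomingRecords + writeThroughputExceeded;
  return total === 0 ? 0 : writeThroughputExceeded / total;
}

console.log(throttledShare(950, 50)); // 0.05
```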
• IteratorAge
• latency between when a record is added, and when it is processed
• If it’s increasing, increase the number of shards, or
• increase the parallelization factor (NEWish)
• Two different iterator age metrics
• Kinesis stream iterator age is a combined metric across all consumers
• not too informative
• Lambda’s own iterator age should be used instead!
Gotcha!
Reading from Kinesis Streams
• Beware of timeouts!
• connectTimeout: timeout for establishing a new connection on a
socket
• if not explicitly set, this value will default to the value of timeout
• timeout: read timeout for an existing socket (2 min)
• time between when request ends and the response is
received, including service and network round-trips
Gotcha!
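The timeouts matter because they multiply with the retries. The sketch below is a plain options object of the kind accepted by an AWS SDK for JavaScript (v2) service client, for example `new AWS.Kinesis(clientOptions)`, together with a rough worst-case estimate. All values are illustrative, `worstCaseMs` is a hypothetical helper, and the estimate assumes every attempt runs into the read timeout.

```javascript
// Illustrative client options; tune the values for your own workload.
const clientOptions = {
  maxRetries: 3,
  retryDelayOptions: { base: 100 },
  httpOptions: {
    connectTimeout: 1000, // ms allowed for establishing the connection
    timeout: 1000, // ms read timeout (the slide lists 2 min as the default)
  },
};

// Rough upper bound on how long one call can block if every attempt times
// out: (retries + 1) read timeouts plus the backoff waits in between.
function worstCaseMs({ maxRetries, retryDelayOptions, httpOptions }) {
  let backoff = 0;
  for (let i = 0; i < maxRetries; i++) {
    backoff += retryDelayOptions.base * 2 ** i;
  }
  return (maxRetries + 1) * httpOptions.timeout + backoff;
}

console.log(worstCaseMs(clientOptions)); // 4700
```

With the 2 minute read timeout and three retries, the same estimate comes to about eight minutes, which is longer than many Lambda functions are allowed to run.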
• Firehose scales endlessly!
Or does it?
• “It is a fully managed service that automatically scales to match the
throughput of your data.”
• “When Direct PUT is configured as the data source, each Kinesis Data
Firehose delivery stream is subject to the following limits: […]
5,000 records/second, 2,000 transactions/second, and 5 MiB/second.”
• ThrottledRecords: the number of records that were throttled because data
ingestion exceeded one of the delivery stream limits.
Gotcha!
• Always learn about the service limits
• (there are always limits)
• hard and soft limits
• Keep a close eye on Lambda’s
concurrency limits
• Deep dive into the error handling
• Don’t just assume things ..
• If not sure, ask AWS Support
• Keep a close eye on service updates
• Everything fails all the time, especially at scale, so be prepared and
fail fast
“Everything fails,
all the time”
Dr. Werner Vogels
(CTO, Amazon)
Lessons Learned
Thank you!
ANAHIT POGOSOVA
@anahit_fi
Shameless Plug
Real World Serverless with Yan Cui, @theburningmonk
episode #14
Mastering AWS Kinesis Data Streams, Part 1 (2)
dev.solita.fi
@anahit_fi
Anahit Pogosova
