Serverless Compose vs Data Warehouse
Artur Wita
Agenda
The overall brief of the presentation: it’s not a golden hammer, only an attempt.
● Intro
● Problem context
● Solution
● Challenges
○ Architectural
○ Serverless
● Results
About me
● Node.js Developer with 2 years of commercial experience
(1 year in Serverless Framework)
● At The Software House, I am responsible for greeting new
employees (and occasionally blowing up production)
● Co-author of the data warehouse
● Privately: mountain lover, sports freak (hiking, ice skating,
swimming, ultimate frisbee), workation enthusiast,
and a friendly-neighborhood rapper
Intro
A data warehouse is a system responsible for gathering miscellaneous data. 🙃
The most important things to take care of in such systems are:
● ensuring that the data is up to date
● having data backup / the ability to recreate data
● and surely something else
With data warehouses, we are able to create complex queries useful for analyzing trends
in order to make decisions that will shape our future.
Sounds good?
Because it is good…
but…
How did we build our data warehouse?
Our internal data warehouse is based on AWS Step Functions.
● ensuring that the data is up to date
○ cron jobs responsible for updating the data every day
● having data backup / the ability to recreate data
○ manually-triggered recreate workflows
Almost every feature supports 2 workflows; sometimes we were
able to reuse the recreate flow for updating the data.
What might a workflow look like?
1. Extract
● Collect data, e.g. from an external API
2. Transform
● Transform and model raw data
3. Load
● Store the data in the database
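The three steps above can be sketched as three single-purpose functions. This is only an illustration: the record shapes, the function names, and the in-memory object standing in for the database are hypothetical, and the steps are synchronous for brevity, whereas in the warehouse each step would be a separate lambda chained by a state machine.

```typescript
// Hypothetical record shapes; the real warehouse models are not shown in the slides.
interface RawEmployee { id: number; first_name: string; last_name: string }
interface Employee { id: number; fullName: string }

// A plain object keyed by id stands in for the database in this sketch.
type Database = Record<number, Employee>;

// Step 1 (Extract): collect raw data, e.g. from an external API.
// The source is injected so the step can be exercised without network access.
function extract(source: () => RawEmployee[]): RawEmployee[] {
  return source();
}

// Step 2 (Transform): model the raw data.
function transform(raw: RawEmployee[]): Employee[] {
  return raw.map((r) => ({ id: r.id, fullName: `${r.first_name} ${r.last_name}` }));
}

// Step 3 (Load): store the data and report how many rows were written.
function load(rows: Employee[], db: Database): number {
  rows.forEach((row) => {
    db[row.id] = row;
  });
  return rows.length;
}

// Chains the steps directly to show the data flow between them.
function runWorkflow(source: () => RawEmployee[], db: Database): number {
  return load(transform(extract(source)), db);
}
```

Because every step has one responsibility and explicit inputs, each can be tested and tuned on its own.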
Granularity advantages
Building workflows from single-responsibility lambdas has many
benefits, such as:
● easier testing
● reusable steps
● improved scalability
● ability to adjust particular resources
(memory size, timeouts, etc.)
● easier debugging
(a graphical representation of steps)
Time for some maths
At the time we decided to split our warehouse,
we had about 16 features.
Each feature supported 2 workflows.
Each workflow consisted of 3-4 lambdas* on average.
16 features • 4 lambdas = 64 lambdas
The average deployment time of the whole application*
was between 18 and 20 minutes.
And we still had a small warehouse...
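The estimate above can be written out as a short calculation. The slide multiplies features by lambdas and leaves the second workflow out; the footnotes marked with * are not preserved, so the last figure below, where both workflows of every feature use their own lambdas, is an assumption rather than a number from the talk.

```typescript
// Figures taken from the slide.
const features = 16;
const workflowsPerFeature = 2;
const lambdasPerWorkflow = { min: 3, max: 4 };

// The slide's own estimate: 16 features x 4 lambdas.
const slideEstimate = features * lambdasPerWorkflow.max;

// Lower end of the same estimate, using 3 lambdas instead of 4.
const lowerEstimate = features * lambdasPerWorkflow.min;

// Assumption (not from the talk): no lambda is shared between the update
// and recreate workflows, so each workflow brings its own 4 lambdas.
const noReuseUpperBound = features * workflowsPerFeature * lambdasPerWorkflow.max;

console.log({ lowerEstimate, slideEstimate, noReuseUpperBound });
```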
One workflow does not make a difference, but how about seventeen?
Can we fix it?
Serverless Compose
Serverless Compose is an official package created to
“simplify deploying and orchestrating multiple services”.
The key features allow us to:
● Deploy multiple services in parallel
● Deploy services in a specific order
● Share outputs from one service to another
● Run commands across multiple services
How does it work?
The heart of each Serverless Framework application is
a serverless.yml file. It's used for configuring things such as:
● the framework itself
● cloud provider (e.g. AWS, Azure, Google)
● functions
● state machines
● resources
● etc.
In a certain sense, what Serverless Compose does
is handle multiple serverless.yml files and specify
dependencies between services described by them.
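As a rough sketch of that idea, a Compose configuration maps service names to the directories holding their serverless.yml files. The configuration is normally written in serverless-compose.yml; it is shown here as a TypeScript object only so it can be type-checked. The service names and paths are hypothetical, and serviceConfigFiles is an illustrative helper, not part of the Serverless Framework.

```typescript
// Only the fields used in this sketch; a real Compose file supports more.
interface ComposeServiceEntry {
  path: string; // directory that contains the service's own serverless.yml
}

interface ComposeFile {
  services: Record<string, ComposeServiceEntry>;
}

// Hypothetical services used purely for illustration.
const compose: ComposeFile = {
  services: {
    "service-a": { path: "services/service-a" },
    "service-b": { path: "services/service-b" },
  },
};

// Lists the serverless.yml files that are treated as separate services.
function serviceConfigFiles(config: ComposeFile): string[] {
  return Object.keys(config.services).map(
    (name) => `${config.services[name].path}/serverless.yml`
  );
}
```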
Dependent services
There are two ways of specifying dependencies
between services:
● implicit - by referencing a resource belonging to
another service
● explicit - by marking a service as dependent on another
The dependency is deployed first, then the service that depends on it.
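Both kinds of dependency can be sketched on a small configuration object. In a Compose file, an implicit dependency comes from a parameter that references another service's output (the ${service.output} form), and an explicit one from the dependsOn key. The services producer, consumer, and reporter and the output name queueUrl are hypothetical, and dependenciesOf is a simplified illustration of how such references could be collected, not the framework's own resolution logic.

```typescript
interface DependentService {
  path: string;
  params?: Record<string, string>;
  dependsOn?: string | string[];
}

const exampleServices: Record<string, DependentService> = {
  producer: { path: "producer" },
  // Implicit: referencing an output of `producer` makes `consumer` depend on it.
  consumer: { path: "consumer", params: { queueUrl: "${producer.queueUrl}" } },
  // Explicit: nothing is shared, only the deployment order is constrained.
  reporter: { path: "reporter", dependsOn: "producer" },
};

// Collects the dependencies of one service from both mechanisms.
function dependenciesOf(name: string, all: Record<string, DependentService>): string[] {
  const service = all[name];
  const deps: string[] = [];
  const add = (dep: string): void => {
    if (deps.indexOf(dep) === -1) deps.push(dep);
  };

  // Explicit dependencies: `dependsOn` may be a single name or a list.
  const explicit: string[] =
    service.dependsOn === undefined
      ? []
      : typeof service.dependsOn === "string"
        ? [service.dependsOn]
        : service.dependsOn;
  explicit.forEach(add);

  // Implicit dependencies: parameters of the form ${otherService.outputName}.
  const params: Record<string, string> = service.params === undefined ? {} : service.params;
  Object.keys(params).forEach((key) => {
    const match = /^\$\{([^.}]+)\.[^}]+\}$/.exec(params[key]);
    if (match !== null && match[1] in all) add(match[1]);
  });

  return deps.sort();
}
```

In this sketch, producer has no dependencies and would be deployed first; consumer and reporter both wait for it.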
Come on,
it’s easy!
It’s not a problem - it’s a challenge
Architectural
● How should we organize directories?
● Where should we put shared code?
● What about configuration files?
● Are we going to have a problem with the database?
Serverless
● At least we can share webpack config, can’t we?
● How many deployment buckets are we going to need?
The same for the data lake ones?
● Are we still going to be able to work locally?
● Will pipeline configuration be troublesome?
Goals
Our aim was to meet the following criteria:
● shorten deployment time
● make it possible to deploy a single service
as well as everything at once
● maintain the ability to work locally
● keep a single package.json*
Architectural challenges
Organising directories
We decided to split features by domains and created 4 services:
● api
● finance
● migrations
● people
In each service, we kept the previous directory structure
(functions + shared).
We also kept the original shared folder in the project’s root
directory.
[Figure: directory structure, before and after]
Configuration files
Since the very beginning, we opted for small, dedicated configs,
rather than one huge config. Thanks to that, we were able to:
● keep feature-specific configs in features’ directories
● keep service-specific configs in the service’s shared directory
● keep common configs in the project’s shared directory
To isolate environment variables, we split the original
.env file into a dedicated .env file for each service.
[Figure: configuration files, before and after]
Database configuration
Because of a few saboteurs, we decided to keep all models in the
project’s shared directory.
Even though we put all models in the same directory, we were able
to create separate, slightly different ORM configs for each service.
To keep our database in sync, we created a dedicated
service for migrations and marked the other services as dependent on it.
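The effect of making every other service depend on the migrations service can be sketched by grouping services into deployment waves. The service names come from the slides; the paths, the dependsOn lists, and the deploymentWaves helper are assumptions made for illustration and do not reproduce how Serverless Compose schedules deployments internally.

```typescript
interface WarehouseService {
  path: string;
  dependsOn?: string[];
}

// Service names are from the talk; paths and dependency lists are assumed.
const warehouse: Record<string, WarehouseService> = {
  migrations: { path: "services/migrations" },
  api: { path: "services/api", dependsOn: ["migrations"] },
  finance: { path: "services/finance", dependsOn: ["migrations"] },
  people: { path: "services/people", dependsOn: ["migrations"] },
};

// Groups services into waves: every service in a wave depends only on
// services from earlier waves, so one wave can be deployed in parallel.
function deploymentWaves(services: Record<string, WarehouseService>): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let remaining = Object.keys(services).sort();

  while (remaining.length > 0) {
    const ready = remaining.filter((name) =>
      (services[name].dependsOn || []).every((dep) => done.indexOf(dep) !== -1)
    );
    if (ready.length === 0) {
      throw new Error("Circular or unknown dependency between services");
    }
    waves.push(ready);
    ready.forEach((name) => done.push(name));
    remaining = remaining.filter((name) => ready.indexOf(name) === -1);
  }
  return waves;
}
```

With this configuration, the migrations service forms the first wave and the remaining three services form the second, which is where parallel deployment saves time.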
[Figure: database configuration for the migrations service and for the other services]
Serverless challenges
Webpack configuration
If it looks the same, it works the same, right?
Apparently not. 😕
What could be the reason?
A single static config was causing a race condition.
As soon as we created a dedicated config for every service,
our serverless roads were safe again.
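One way to give every service its own config is a small factory function. The slides do not say what exactly raced, so this sketch assumes that parallel builds shared one output location; the BundleConfig type is a local stand-in for the relevant part of a webpack configuration, and the directory layout and the .webpack folder name are assumptions.

```typescript
// Minimal subset of a webpack configuration, typed locally so the sketch
// stays self-contained.
interface BundleConfig {
  context: string; // directory used to resolve entry points
  output: { path: string }; // directory where bundles are written
}

// Assumption: the race came from parallel builds sharing one static config
// and therefore one output location. The factory gives each service its own.
// `rootDir` is expected to be an absolute path to the project root.
function createWebpackConfig(rootDir: string, serviceName: string): BundleConfig {
  const serviceDir = `${rootDir}/services/${serviceName}`;
  return {
    context: serviceDir,
    output: { path: `${serviceDir}/.webpack` },
  };
}
```

Each service's own webpack config file could then call the factory with its name, so builds running in parallel never write to the same directory.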
[Figure: webpack configuration, before and after]
S3 buckets
Although the Serverless Framework allows us to use an existing
bucket for uploading source code files, we still had to use
a separate deployment bucket for each service.
We also decided to split our data lake bucket, so that
every service has its own. That was not necessary,
but we thought that such isolation might be valuable.
We only had to migrate the data from the old data lake bucket to
the new ones. For that we used the AWS CLI aws s3 sync command.
[Figure: S3 buckets, before and after]
Local environment
The great thing about our warehouse is its ability to run 100%
locally, and we had to preserve that.
We achieved it by:
● using service-dedicated commands
● assigning unique http and lambda ports to each service
The only thing left was refactoring docker-compose.yml
by adding a local Step Functions instance for each service.
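Assigning unique ports can be sketched with a small helper. The slides do not name the local emulation tool, so this assumes the serverless-offline plugin, whose httpPort and lambdaPort options default to 3000 and 3002; the spacing of 10 between services and the assignPorts helper itself are hypothetical.

```typescript
interface OfflinePorts {
  httpPort: number;
  lambdaPort: number;
}

// Assumption: ports are derived from a service's position in a fixed list.
// The talk only says that each service received unique http and lambda ports.
function assignPorts(serviceNames: string[], basePort = 3000): Record<string, OfflinePorts> {
  const ports: Record<string, OfflinePorts> = {};
  serviceNames.forEach((name, index) => {
    const offset = index * 10; // leaves room for both ports of one service
    ports[name] = {
      httpPort: basePort + offset,
      lambdaPort: basePort + offset + 2,
    };
  });
  return ports;
}

const localPorts = assignPorts(["api", "finance", "migrations", "people"]);
```

Each service's local configuration can then read its own pair of ports, which lets all four services run at the same time without colliding.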
[Figure: local environment configuration, before and after]
After: We had to use different ports to be able to run multiple services concurrently.
Pipelines
We had to slightly refactor our pipelines too.
That consisted of:
● adding a few more pipelines to be able to deploy each
service independently to each environment
● calling the lambda responsible for running migrations in a
different way
Before: A single definition for the deploy step. Migrations run immediately after the deployment.
After: One definition per service. To run migrations, we need to call the migrations service.
Results
                                Before       After        Difference
Services count                  1 service    4 services   +3 services
Deployment time (all at once)   20 minutes   10 minutes   -10 minutes
Monthly cost                    X            Y            Δ ≈ 0
Capabilities                    100%         100%         none
Tips & tricks
Besides learning how to split an application using
Serverless Compose, we also made some observations:
● use short service names
● name your resources using a common prefix:
<service><stage><resource_name>
● don’t be afraid of the lack of serverless knowledge
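The naming tip above can be sketched as a small helper. The hyphen separator and the resourceName function are assumptions, since the slide shows only the order of the parts. The length check uses the 64-character limit on AWS Lambda function names as one example of why short service names help; other resource types have different limits.

```typescript
// AWS Lambda function names are limited to 64 characters.
const MAX_LAMBDA_NAME_LENGTH = 64;

// Builds a name in the <service><stage><resource_name> order from the slide.
// The separator is an assumption; the slide does not show one.
function resourceName(
  service: string,
  stage: string,
  resource: string,
  separator = "-"
): string {
  const name = [service, stage, resource].join(separator);
  if (name.length > MAX_LAMBDA_NAME_LENGTH) {
    throw new Error(`Resource name too long (${name.length} characters): ${name}`);
  }
  return name;
}

const exampleName = resourceName("finance", "dev", "invoices");
```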
Summary
We managed not only to shorten the deployment time of our
warehouse but also to make it scalable.
The application is still easy to maintain thanks to its modular
structure.
Serverless Compose is a truly powerful tool and you should
definitely give it a try!
Q&A
Join the community!
https://tsh.io/programowanko
https://github.com/arturwita/serverless-compose-boilerplate
tsh.io
Thank you for your attention