HOW TO REALIZE AN ADDITIONAL
270% ROI WITH SNOWFLAKE
Introduction
How to dramatically increase the ROI of your Snowflake investment by:
● Managing the size of your data warehouse
● Defining and setting limits on query times to prevent runaway queries
● Implementing visibility and telemetry to monitor usage
● Automating the creation, maintenance and management of data aggregates
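One way to act on the second bullet is Snowflake's STATEMENT_TIMEOUT_IN_SECONDS parameter, which cancels a statement once it has run for the configured time. The sketch below is illustrative only: the warehouse name analytics_adhoc_wh and the limits are assumptions, not values from the talk.

```sql
-- Hypothetical warehouse name and limits, chosen for illustration only.

-- Cancel any statement on this warehouse that runs longer than 15 minutes.
ALTER WAREHOUSE analytics_adhoc_wh SET STATEMENT_TIMEOUT_IN_SECONDS = 900;

-- Cancel statements that wait in the warehouse queue for more than 5 minutes.
ALTER WAREHOUSE analytics_adhoc_wh SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 300;

-- A tighter limit for the current session only.
ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = 300;

-- Check what is currently in effect for the warehouse.
SHOW PARAMETERS LIKE 'STATEMENT_TIMEOUT_IN_SECONDS' IN WAREHOUSE analytics_adhoc_wh;
```

Setting the limit on the warehouse rather than on each user keeps the policy in one place, which fits the per-team warehouse layout described later in the deck.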
Today’s Speakers
Mark Stange-Tregear
VP, Analytics, Rakuten Rewards
@rakutenrewards
Mark is VP, Analytics, at Rakuten Rewards, formerly Ebates. He’s been with the company since 2014, and was with the team that sold Ebates to Rakuten in 2015. Mark plays a double role, leading a center of excellence analytics group and product managing the enterprise business intelligence stack. Prior to joining Ebates, Mark worked in the residential real estate and online tournament spaces.
Dave Mariani
Chief Strategy Officer, AtScale
@dmariani
Dave is one of the co-founders of AtScale and is currently the Chief Strategy Officer. Prior to AtScale, Dave was VP of Engineering at Klout and at Yahoo!, where he built the world's largest multi-dimensional cube for BI on Hadoop. Dave is a Big Data visionary and serial entrepreneur.
In Pursuit of Processing Power - A Timeline
2014 2015 2016 2017 2018 2019
Stacks shown along the timeline, earliest to latest:
● SSRS, SQL Server, SSAS, M.Strategy
● Hadoop (Cloudera), AtScale, Tableau (shown three times)
● AtScale, Tableau, Snowflake
Why Snowflake?
PROBLEM: USERS “TRIP” OVER EACH OTHER
RESULT: FRUSTRATION AND MISSED GOALS
SEPARATE COMPUTE INTO DISCRETE CLUSTERS
• Dozens of discrete warehouses
• Budget cost to the business unit
• Separate ETL from ad hoc workload
• Horizontally scalable on demand
Managing Data Warehouses
MORE WAREHOUSES MORE VISIBILITY MORE CONTROL
HOW MANY?
● At least one warehouse per business team or engineering group (often several)
● Dedicated warehouses for ETL components
● Dedicated warehouses for 3rd party products
CONTROL OF WAREHOUSE SIZE?
● Constantly reviewed for potential down-sizing
● Resizing control centralized with cost management and oversight “team”
IS BIGGER BETTER?
● Not always
● IO intensive workloads can work on smaller clusters
● Aggregations and joins on bigger clusters
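A minimal sketch of what this layout could look like in Snowflake SQL: one small warehouse for a team's ad hoc queries, a separate one for ETL, and a later downsize after review. All warehouse names, sizes and suspend times here are assumptions for illustration, not Rakuten Rewards' actual configuration.

```sql
-- Hypothetical names and sizes; adjust to your own teams and workloads.

-- Dedicated warehouse for one business team's ad hoc queries.
CREATE WAREHOUSE IF NOT EXISTS analytics_adhoc_wh
  WAREHOUSE_SIZE = SMALL
  AUTO_SUSPEND = 60            -- seconds of inactivity before suspending
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE
  COMMENT = 'Ad hoc queries, analytics team';

-- Separate warehouse so ETL jobs do not compete with ad hoc users.
CREATE WAREHOUSE IF NOT EXISTS etl_load_wh
  WAREHOUSE_SIZE = MEDIUM
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE
  COMMENT = 'Scheduled ETL components';

-- After a usage review, move the ad hoc warehouse down one size.
ALTER WAREHOUSE analytics_adhoc_wh SET WAREHOUSE_SIZE = XSMALL;
```

Because each warehouse is metered separately, this split is also what makes it possible to budget cost back to the business unit.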
More Visibility … More Control
Get to know and love:
● "SNOWFLAKE"."ACCOUNT_USAGE"."WAREHOUSE_METERING_HISTORY"
● "SNOWFLAKE"."ACCOUNT_USAGE"."QUERY_HISTORY"
and these are well worth knowing as well…
● "SNOWFLAKE"."ACCOUNT_USAGE"."STORAGE_USAGE"
● "SNOWFLAKE"."ACCOUNT_USAGE"."METERING_DAILY_HISTORY"
● "SNOWFLAKE"."READER_ACCOUNT_USAGE"."WAREHOUSE_METERING_HISTORY"
MAKE MONITORING EASY… AUTOMATED DAILY REPORTS
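As a starting point for such a daily report, the two queries below are a sketch against the first two views listed above. The seven-day and one-day windows and the top-20 cutoff are arbitrary choices for illustration; the role running them needs access to the SNOWFLAKE database, and these views lag real-time activity.

```sql
-- Credits consumed per warehouse per day over the last 7 days.
SELECT
    warehouse_name,
    DATE_TRUNC('day', start_time) AS usage_day,
    SUM(credits_used)             AS credits_used
FROM "SNOWFLAKE"."ACCOUNT_USAGE"."WAREHOUSE_METERING_HISTORY"
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name, DATE_TRUNC('day', start_time)
ORDER BY usage_day, credits_used DESC;

-- The 20 longest-running queries of the last day, with who ran them and where.
SELECT
    query_id,
    user_name,
    warehouse_name,
    total_elapsed_time / 1000 AS elapsed_seconds   -- column is in milliseconds
FROM "SNOWFLAKE"."ACCOUNT_USAGE"."QUERY_HISTORY"
WHERE start_time >= DATEADD('day', -1, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time DESC
LIMIT 20;
```

The first query points at warehouses that are candidates for downsizing; the second points at individual queries worth rewriting or moving.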
Levers to Pull
1. Warehouse size… typically moving down, but sometimes up
2. Horizontal scaling
3. Move jobs between warehouses
4. Caching
5. Code rewrite
6. Clustering
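Several of these levers map onto single Snowflake statements. The sketch below is illustrative: the warehouse names are hypothetical, store_sales and ss_sold_date_sk are borrowed from the TPC-DS query shown later in the deck, and multi-cluster warehouses require an edition of Snowflake that supports them.

```sql
-- 1. Warehouse size: resize in place (hypothetical warehouse name).
ALTER WAREHOUSE bi_dashboards_wh SET WAREHOUSE_SIZE = MEDIUM;

-- 2. Horizontal scaling: let the warehouse add clusters under concurrent load.
ALTER WAREHOUSE bi_dashboards_wh SET
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY = STANDARD;

-- 3. Move jobs between warehouses: point a session or job at another warehouse.
USE WAREHOUSE etl_load_wh;

-- 4. Caching: result reuse is on by default; this only makes it explicit.
ALTER SESSION SET USE_CACHED_RESULT = TRUE;

-- 6. Clustering: define a clustering key on a large, frequently filtered table.
ALTER TABLE store_sales CLUSTER BY (ss_sold_date_sk);
```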
A note on code rewrite…
● Data modeling is still important
● Snowflake is very powerful, but joins and aggregations cost
● Repetitive joins, repetitive aggregation? Consider creating a “flat” warehouse table
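A sketch of such a flat table, assuming the TPC-DS tables and columns that appear in the raw query later in the deck. The table name store_sales_flat and the choice of grouping columns are hypothetical.

```sql
-- Pre-join and pre-aggregate once, so repeated queries skip the joins.
CREATE OR REPLACE TABLE store_sales_flat AS
SELECT
    d.d_year,
    d.d_moy,
    i.i_category,
    i.i_manufact_id,
    ca.ca_gmt_offset,
    SUM(ss.ss_ext_sales_price) AS total_sales
FROM store_sales ss
JOIN date_dim d          ON ss.ss_sold_date_sk = d.d_date_sk
JOIN item i              ON ss.ss_item_sk      = i.i_item_sk
JOIN customer_address ca ON ss.ss_addr_sk      = ca.ca_address_sk
GROUP BY d.d_year, d.d_moy, i.i_category, i.i_manufact_id, ca.ca_gmt_offset;

-- The repeated question then becomes a single-table scan.
SELECT i_manufact_id, SUM(total_sales) AS total_sales
FROM store_sales_flat
WHERE i_category = 'Electronics'
  AND d_year = 1999
  AND d_moy = 7
  AND ca_gmt_offset = -5
GROUP BY i_manufact_id
ORDER BY total_sales DESC
LIMIT 100;
```

The trade-off is that the flat table has to be rebuilt or refreshed when the underlying data changes, which is the maintenance work the introduction proposes to automate.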
Additional Considerations When Managing Snowflake
● Query Performance: How fast can the Cloud Data Warehouse answer a query for one user?
● User Concurrency: How do multiple users running queries affect performance & stability?
● Compute Costs: How do query workloads and configuration impact your monthly bill?
● Semantic Complexity: How difficult is it to write the query to answer the business question?
The Cloud Analytics Stack
LAYER (FUNCTION): COMPONENT
● CONSUMPTION (VISUALIZATION, ANALYSIS, REPORTING): BI Tools, AI/ML Tools, Applications
● SEMANTIC LAYER (QUERY ACCESS, METADATA, MASKING, AUDITING): UNIVERSAL SEMANTIC LAYER
● PREPARED DATA (DATA PROCESSING, MODELING): Data Warehouse, File Access Engine
● DATA TRANSFORMATION (ETL, MERGING, AGGREGATION): ETL Engine
● RAW DATA (DATA STORAGE, ENCRYPTION): File System (Data Lake)
Also shown: Data Catalog
The Cloud Analytics Stack
LAYER (FUNCTION): COMPONENT
● CONSUMPTION (VISUALIZATION, ANALYSIS, REPORTING): BI Tools, AI/ML Tools, Applications
● SEMANTIC LAYER (QUERY ACCESS, FILTERING, MASKING, AUDITING): Multi-dimensional Engine, Data Governance Engine, Virtualization Engine
● PREPARED DATA (DATA PROCESSING, MODELING): Data Warehouse, File Access Engine
● DATA TRANSFORMATION (ETL, MERGING, AGGREGATION): ETL Engine
● RAW DATA (DATA STORAGE, ENCRYPTION): File System (Data Lake)
Also shown: Data Catalog
A Semantic Layer is Critical to Success
1. Simplicity
2. Single source of truth
3. Governance for all
A Semantic Layer Simplifies & Normalizes Data Access
SELECT
`d_product_manufacturer_id` AS `d_product_manufacturer_id`,
SUM( `Total Ext Sales Price` ) AS `sum_total__ext_sales_price_ok`
FROM
`tpc-ds benchmark model` `TPC-DS Benchmark Model`
WHERE
`I Category` = 'Electronics'
AND `Sold Calendar Year-Week` = 1999
AND `Sold d_customer_gmt_offset` = -5.00
AND `Sold d_month_of_year` = 7
GROUP BY 1
ORDER BY 2 DESC
LIMIT 100;
with ss as (
select
i_manufact_id,sum(ss_ext_sales_price) total_sales
from
store_sales,
date_dim,
customer_address,
item
where
i_manufact_id in (select
i_manufact_id
from
item
where i_category in ('Electronics'))
and ss_item_sk = i_item_sk
and ss_sold_date_sk = d_date_sk
and d_year = 1999
and d_moy = 7
and ss_addr_sk = ca_address_sk
and ca_gmt_offset = -5
group by i_manufact_id),
cs as (
select
i_manufact_id,sum(cs_ext_sales_price) total_sales
from
catalog_sales,
date_dim,
customer_address,
item
where
...
TPC-DS Query #33: What is the monthly sales figure based on extended price for a specific month in a specific year, for manufacturers in a specific category in a given time zone? Group sales by manufacturer identifier and sort output by sales amount, by channel, and give Total sales.
Query size: AtScale SQL (first query above), 398 bytes; TPC-DS Raw (second query above, truncated here), 1,872 bytes.
AtScale’s TPC-DS 10TB Benchmark (10,000 Scale Factor)
THE TPC-DS 10TB DATASET HAS:
1. Multiple fact tables
2. Large fact tables
3. Large dimensions
Benchmark Results: Query Performance
14x Faster
Delivers orders of magnitude query improvements that are amplified with high user concurrency
Benchmark Results: Concurrency
Smooths out & mitigates user concurrency challenges without requiring additional compute resources
Note: Thread group 1 is the average of 5 runs for each of the 20 queries. The 5, 25 & 50 thread groups ran each of the 20 queries 1 time per thread.
Benchmark Results: Compute Cost
4x Less Cost
Allows for smaller compute resources & mitigates unpredictable & unbounded costs for on-demand pricing models
Same Workloads, Smaller Warehouses
● Snowflake - Raw: 1, 5, 25, 50 threads
● Snowflake + AtScale: 1, 5, 25, 50, 100 threads
Results of TPC-DS 10TB Benchmark Test
Test: Improvement Factor with AtScale (Snowflake)
● Query Performance [1]: 4x Faster
● User Concurrency [2]: 14x Faster
● Compute Cost [3]: 73% Cheaper
● Complexity [4]: 76% less complex SQL queries
Notes:
1. Elapsed time for executing 1 query five times
2. Elapsed time executing 1 (x5), 5, 25, 50 queries
3. Compute costs for cluster time (Redshift, Snowflake) or bytes read (BigQuery) for user concurrency test
4. Complexity score for SQL queries for number of: functions, operations, tables, objects & subqueries (AtScale = 258, TPC-DS = 1,057)
Configuration: Virtual Data Warehouse Used; Compute Cost per Hour
● Snowflake: 3X-Large (64 credits/hour); $128.00
● AtScale on Snowflake: 1X-Large (16 credits/hour); $32.00
AtScale customers realize an additional
270% ROI on Snowflake
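The arithmetic behind these figures can be checked with a short query. The credit and dollar values come from the configuration figures above; the per-credit price is derived from them, and the reading of the 270% headline is an assumption, since the deck does not state how that number was computed.

```sql
-- Values from the slide; implied_price_per_credit is derived, not stated.
SELECT
    config,
    credits_per_hour,
    cost_per_hour,
    cost_per_hour / credits_per_hour AS implied_price_per_credit
FROM (VALUES
        ('Snowflake, 3X-Large',            64, 128.00),
        ('AtScale on Snowflake, 1X-Large', 16,  32.00)
     ) AS t (config, credits_per_hour, cost_per_hour);

-- Assumed reading of the headline: if the same workload costs 73% less,
-- the same spend covers 1 / (1 - 0.73) times the work, roughly 270% more.
SELECT ROUND((1 / (1 - 0.73) - 1) * 100) AS additional_roi_pct;
```

Both configurations work out to the same price per credit, so the hourly saving comes entirely from running on a warehouse one quarter the size.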
DEMO
Summary: How to realize an additional 270% ROI on Snowflake
▵ Download the Snowflake benchmark report at: https://www.atscale.com/snowflakebenchmark
▵ Read the Rakuten Rewards case study at: https://www.atscale.com/rakutenrewards
▵ COMING SOON! Estimate your cost savings using the AtScale calculator
Q&A!
www.atscale.com

Editor's Notes

  • #3: Companies of all sizes have embraced the power, scale and ease of use of Snowflake along with the promise of cost savings. But as some have learned, cloud compute usage can sneak up on you if you aren’t careful. Today, our experts will discuss how to dramatically increase the ROI of your Snowflake investment by:
  • #12: AtScale is built to leverage the efficiencies and performance of the cloud for the data consumer, whether you’re on premises or in the cloud (or both). We connect people to data. We do that without moving data and without complexity, leveraging existing investments in big data platforms, applications and tools. We also do that consistently, securely and with one set of semantics, and without interrupting existing data usage, so that data workers no longer have to understand how or where data is stored. Performance: Optimizing performance is difficult and that’s where we focus our energies. AtScale’s data warehouse virtualization can reduce query times from 5 weeks to 5 seconds, automatically optimizing each time a user queries the database. Security: Because we haven’t copied the data and applied new code or embedded rules, we’ve reduced complexity and maintain consistent data lineage throughout the data lifecycle. AtScale not only leverages existing data security and governance but applies an additional layer so that data can be ported to new data tools, applications and platforms. Agility: What’s more powerful is that we create a simple interface for querying data and building models for data science and analytics data workers, with deep integrations with BI and AI/ML tools. For the first time, users (and IT) have visibility into how data is being queried and used throughout the organization (no more data silos).