javier ramirez
@supercoco9
How we are using
BigQuery and
Apps Scripts
at teowaki
Set a distance.
Set an expiration time.
Bye bye noise.
Analytics flow
Analytics flow, by segment
Automatic Alerts
javier ramirez @supercoco9 https://teowaki.com startup launch summit london 14
REST API (Ruby on Rails)
+
Web on top (AngularJS)
data that exceeds the
processing capacity of
conventional database
systems. The data is too big,
moves too fast, or doesn’t fit
the strictures of your
database architectures.
Edd Dumbill
program chair for the O’Reilly Strata Conference
1. non-intrusive metrics
2. keep the history
3. avoid vendor lock-in
4. interactive queries
5. cheap
6. extra ball: real time
Cloud Storage:
Cost-efficient storage of
files
tools we considered:
Hadoop
Cassandra
Amazon Redshift
...
Our choice:
Google BigQuery
Data analysis as a service
http://developers.google.com/bigquery
Based on “Dremel”
Specifically designed for
interactive queries over
petabytes of real-time
data
loading data
You just send the data in
text (or JSON) format
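As a rough illustration of the JSON option: BigQuery load jobs take newline-delimited JSON, one object per line. The sketch below only prepares such a payload; the row fields are hypothetical and the upload step itself is not shown.

```python
import gzip
import json


def to_ndjson(rows):
    """Serialize an iterable of dicts as newline-delimited JSON (one object per line)."""
    return "".join(json.dumps(row, sort_keys=True) + "\n" for row in rows)


# Hypothetical request-log rows; the field names are illustrative only.
rows = [
    {"path": "/api/links", "status": 200, "user_id": 42},
    {"path": "/api/teams", "status": 404, "user_id": 7},
]

payload = to_ndjson(rows)
# Compressing the payload before storing or uploading it keeps files small.
compressed = gzip.compress(payload.encode("utf-8"))
```

Each line of `payload` is an independent JSON object, so files can be appended to or split without re-parsing the whole document.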
SQL
select name from USERS order by date;
select count(*) from users;
select max(date) from USERS;
select sum(total) from ORDERS group by user;
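The four queries above are plain SQL. To try them locally, a minimal sketch using Python's built-in sqlite3 as a stand-in (it is not BigQuery, and the table layouts and sample rows are assumptions made up for the example):

```python
import sqlite3

# sqlite3 is only a local stand-in for illustration; it is not BigQuery.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (name TEXT, date TEXT);
    CREATE TABLE orders (user TEXT, total INTEGER);
    INSERT INTO users VALUES
        ('ana', '2014-01-03'), ('bob', '2014-01-01'), ('eve', '2014-01-02');
    INSERT INTO orders VALUES ('ana', 10), ('ana', 5), ('bob', 7);
""")

# The queries from the slide, run unchanged.
names_by_date = [r[0] for r in conn.execute("select name from USERS order by date;")]
user_count = conn.execute("select count(*) from users;").fetchone()[0]
latest_date = conn.execute("select max(date) from USERS;").fetchone()[0]
# GROUP BY gives no ordering guarantee, so sort the sums before using them.
totals = sorted(r[0] for r in conn.execute("select sum(total) from ORDERS group by user;"))
```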
specific extensions for
analytics
within
flatten
nest
stddev
top
first
last
nth
variance
var_pop
var_samp
covar_pop
covar_samp
quantiles
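For readers unfamiliar with the statistical functions in this list, the sketch below shows the quantities that the population and sample variants refer to, using Python's statistics module on made-up numbers. It illustrates the definitions only, not how BigQuery computes them.

```python
import statistics

# Hypothetical measurements, e.g. response times in milliseconds.
samples = [2, 4, 4, 4, 5, 5, 7, 9]

var_pop = statistics.pvariance(samples)    # population variance: divide by n
var_samp = statistics.variance(samples)    # sample variance: divide by n - 1
stddev_pop = statistics.pstdev(samples)    # square root of the population variance
stddev_samp = statistics.stdev(samples)    # square root of the sample variance
quartiles = statistics.quantiles(samples, n=4)  # three cut points splitting the data in four
```

The sample variants are always slightly larger than the population ones for the same data, which matters mostly on small result sets.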
web console screenshot
our most active user
country segmented traffic
10 requests we should be caching
5 most created resources
new users per month
SELECT repository_name, repository_language,
       repository_description, COUNT(repository_name) AS cnt,
       repository_url
FROM github.timeline
WHERE type = "WatchEvent"
  AND PARSE_UTC_USEC(created_at) >=
      PARSE_UTC_USEC("#{yesterday} 20:00:00")
  AND repository_url IN (
    SELECT repository_url
    FROM github.timeline
    WHERE type = "CreateEvent"
      AND PARSE_UTC_USEC(repository_created_at) >=
          PARSE_UTC_USEC('#{yesterday} 20:00:00')
      AND repository_fork = "false"
      AND payload_ref_type = "repository"
    GROUP BY repository_url
  )
GROUP BY repository_name, repository_language,
         repository_description, repository_url
HAVING cnt >= 5
ORDER BY cnt DESC
LIMIT 25
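The `#{yesterday}` placeholders in the query are string interpolation, and the slides do not show how the value is produced. Assuming it is a date in YYYY-MM-DD form, the cutoff timestamp could be built along these lines (the function name and the fixed 20:00 default are illustrative, taken from the query text):

```python
from datetime import date, timedelta


def cutoff_timestamp(today, hour=20):
    """Return 'YYYY-MM-DD HH:00:00' for the given hour on the day before `today`."""
    yesterday = today - timedelta(days=1)
    return f"{yesterday.isoformat()} {hour:02d}:00:00"


print(cutoff_timestamp(date(2014, 3, 1)))  # 2014-02-28 20:00:00
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the cutoff reproducible when a report has to be re-run for a past day.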
Automation with Apps Script
Read from BigQuery
Create a spreadsheet on Drive
E-mail it every day as a PDF
cloud storage pricing
$0.032 per GB
a gzipped 4.8 MB file stores 1MM
rows
$0.000092 / month per 1MM rows
bigquery pricing
$26 per stored TB
1000000 rows => $0.00416 / month
£0.00243 / month
$5 per processed TB
1 full scan = 160 MB
1 count = 0 MB
1 full scan over 1 column = 5.4 MB
100 GB => $0.05 / month
£0.03
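The BigQuery figures above can be re-derived from the two stated rates. The sketch below assumes decimal units (1 TB = 1,000,000 MB) and that the 1MM-row table is the 160 MB read by a full scan; the function names are illustrative. With these assumptions the storage figure matches the slide, while 100 GB of processing at $5 per TB comes to $0.50, so the $0.05 shown would correspond to 10 GB.

```python
USD_PER_TB_STORED = 26.0     # from the slide: $26 per stored TB, per month
USD_PER_TB_PROCESSED = 5.0   # from the slide: $5 per processed TB
MB_PER_TB = 1_000_000        # assumption: decimal units


def storage_cost_usd(size_mb):
    """Monthly storage cost in USD for a table of the given size."""
    return size_mb / MB_PER_TB * USD_PER_TB_STORED


def query_cost_usd(scanned_mb):
    """Cost in USD of a query that processes the given amount of data."""
    return scanned_mb / MB_PER_TB * USD_PER_TB_PROCESSED


TABLE_MB = 160  # assumption: 1MM rows occupy what one full scan reads

monthly_storage = storage_cost_usd(TABLE_MB)   # about $0.00416
full_scan = query_cost_usd(TABLE_MB)           # about $0.0008
one_column_scan = query_cost_usd(5.4)          # about $0.000027
hundred_gb = query_cost_usd(100_000)           # about $0.50
```

The gap between a full scan and a single-column scan is the practical takeaway: selecting only the columns a report needs cuts the processed volume by roughly a factor of thirty on this table.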
£0.054307 / month*
per 1MM rows
*the 1st 100 GB every month are free of charge
1. non-intrusive metrics
2. keep the history
3. avoid vendor lock-in
4. interactive queries
5. cheap
6. extra ball: real time
Find related links at
https://teowaki.com/teams/javier-community/link-categories/bigquery-talk
Thanks!
Javier Ramírez
@supercoco9
startup launch summit london 14
