Big Data Analytics 
with Google BigQuery 
javier ramirez 
@supercoco9
REST API 
+ 
AngularJS web as 
an API client 
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2013
“data that’s an order of magnitude greater than data you’re accustomed to”
Doug Laney
VP Research, Business Analytics and Performance Management at Gartner
“data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the structures of your database architectures.”
Edd Dumbill
program chair for the O’Reilly Strata Conference
“bigdata is doing a full scan over 330MM rows, matching them against a regexp, and getting the result (223MM rows) in just 5 seconds”
Javier Ramirez
impressionable teowaki founder
bigdata is cool but... 
expensive cluster 
hard to set up and monitor 
not interactive enough
Our choice: 
Google BigQuery 
Data analysis as a service 
http://developers.google.com/bigquery
Based on Dremel 
Specifically designed for 
interactive queries over 
petabytes of real-time 
data 
What Dremel is used for at Google
• Analysis of crawled web documents. 
• Tracking install data for applications on Android Market. 
• Crash reporting for Google products. 
• OCR results from Google Books. 
• Spam analysis. 
• Debugging of map tiles on Google Maps. 
• Tablet migrations in managed Bigtable instances. 
• Results of tests run on Google’s distributed build system. 
• Disk I/O statistics for hundreds of thousands of disks. 
• Resource monitoring for jobs run in Google’s data centers. 
• Symbols and dependencies in Google’s codebase.
Columnar 
storage 
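As a rough illustration of why a columnar layout suits analytical queries, the sketch below stores the same records row-wise and column-wise and counts how many values a single-column aggregate has to read. The records and field names are made up for the example, and this is only a toy model, not a description of BigQuery internals.

```python
# Hypothetical API log records (field names are illustrative only).
rows = [
    {"uri": "/users/javier/shouts", "method": "POST", "status": 201},
    {"uri": "/teams/javier-community/links", "method": "GET", "status": 200},
    {"uri": "/users/rgo/shouts", "method": "POST", "status": 201},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def count_posts_row_store(records):
    """Row store: every field of every row is read to answer the query."""
    values_read = sum(len(record) for record in records)
    matches = sum(1 for record in records if record["method"] == "POST")
    return matches, values_read


def count_posts_column_store(columns):
    """Column store: only the 'method' column is read."""
    method_column = columns["method"]
    matches = sum(1 for value in method_column if value == "POST")
    return matches, len(method_column)


columns = to_columns(rows)
print(count_posts_row_store(rows))        # (2, 9)
print(count_posts_column_store(columns))  # (2, 3)
```

Both layouts give the same answer, but the column layout touches a third of the values here, which is the effect behind queries being billed by the columns they read.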
highly distributed 
execution using a tree 
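The idea of tree execution can be sketched in a few lines: leaves scan their own shard of the data, and intermediate nodes combine partial results on the way up to the root. The shard contents, the fan-out and the function names below are assumptions made for the illustration, and the "tree" is simulated in a single process.

```python
def leaf_count(shard, predicate):
    """A leaf scans its own shard and returns a partial count."""
    return sum(1 for row in shard if predicate(row))


def tree_count(shards, predicate, fanout=2):
    """Combine leaf results level by level, as a serving tree would."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    partials = [leaf_count(shard, predicate) for shard in shards]
    while len(partials) > 1:
        # Each intermediate node sums the partial counts of up to `fanout` children.
        partials = [
            sum(partials[i:i + fanout])
            for i in range(0, len(partials), fanout)
        ]
    return partials[0] if partials else 0


# Five hypothetical shards holding HTTP methods.
shards = [["POST", "GET"], ["POST"], ["GET", "GET"], ["POST", "POST"], ["GET"]]
print(tree_count(shards, lambda method: method == "POST"))  # 4
```

Counting works this way because partial counts can simply be added; the same pattern applies to sums, minimums and maximums.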
loading data 
You can feed flat CSV-like 
files or nested JSON 
objects 
bq cli
bq load --nosynchronous_mode \
  --encoding UTF-8 \
  --field_delimiter 'tab' \
  --max_bad_records 100 \
  --source_format CSV \
  api.stats \
  20131014T11-42-05Z.gz
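The command above expects a gzip-compressed, tab-delimited, UTF-8 file. A minimal sketch of producing such a file follows; the column layout and sample rows are hypothetical, since the slides do not show the schema of api.stats.

```python
import csv
import gzip
import os
import tempfile

# Hypothetical rows: timestamp, HTTP method, URI, response status.
sample_rows = [
    ["2013-10-14 11:42:05", "POST", "/users/javier/shouts", "201"],
    ["2013-10-14 11:42:07", "GET", "/teams/javier-community/links", "200"],
]


def write_tsv_gz(path, rows):
    """Write rows as a gzip-compressed, tab-delimited UTF-8 file."""
    with gzip.open(path, "wt", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


def read_tsv_gz(path):
    """Read the file back, to check what a loader would see."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "20131014T11-42-05Z.gz")
    write_tsv_gz(path, sample_rows)
    round_trip = read_tsv_gz(path)

print(round_trip == sample_rows)  # True
```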
web console screenshot 
analytical SQL functions. 
correlations. 
window functions. 
views. 
JSON fields. 
timestamped tables. 
Things you always wanted to try but were too scared to
select count(*) from
publicdata:samples.wikipedia
where REGEXP_MATCH(title, "[0-9]*")
AND wp_namespace = 0;
Result: 223,163,387
Query complete (5.6s elapsed, 9.13 GB processed, Cost: 32¢)
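One detail worth noting about the pattern in this query: `[0-9]*` also matches the empty string, so it is found in every title, and the row count is effectively decided by `wp_namespace = 0`. The query still evaluates the regexp on every row, which is the point of the demonstration. The sketch below uses Python's `re.search` as a stand-in, assuming REGEXP_MATCH looks for the pattern anywhere in the string; the titles are made up.

```python
import re

# Made-up titles, including one with no digits and one empty string.
titles = ["Apollo 11", "Madrid", "1984", ""]


def matching(pattern, values):
    """Return the values in which the pattern is found anywhere."""
    return [value for value in values if re.search(pattern, value)]


print(matching(r"[0-9]*", titles))  # ['Apollo 11', 'Madrid', '1984', '']
print(matching(r"[0-9]+", titles))  # ['Apollo 11', '1984']
```

A pattern such as `[0-9]+`, which requires at least one digit, would actually filter the titles.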
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
SELECT repository_name, repository_language,
       repository_description, COUNT(repository_name) AS cnt,
       repository_url
FROM github.timeline
WHERE type = "WatchEvent"
  AND PARSE_UTC_USEC(created_at) >=
      PARSE_UTC_USEC("#{yesterday} 20:00:00")
  AND repository_url IN (
    SELECT repository_url
    FROM github.timeline
    WHERE type = "CreateEvent"
      AND PARSE_UTC_USEC(repository_created_at) >=
          PARSE_UTC_USEC("#{yesterday} 20:00:00")
      AND repository_fork = "false"
      AND payload_ref_type = "repository"
    GROUP BY repository_url
  )
GROUP BY repository_name, repository_language,
         repository_description, repository_url
HAVING cnt >= 5
ORDER BY cnt DESC
LIMIT 25
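The `#{yesterday}` placeholder in the query is string interpolation done by the calling program before the query is sent. A minimal sketch of computing that value follows, assuming a `YYYY-MM-DD` date is what the query expects; the function name is hypothetical.

```python
from datetime import date, timedelta


def cutoff_for(today):
    """Return the 'yesterday at 20:00:00' string used as the query cutoff."""
    yesterday = today - timedelta(days=1)
    return f"{yesterday.isoformat()} 20:00:00"


cutoff = cutoff_for(date(2014, 3, 1))
condition = f'PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC("{cutoff}")'

print(cutoff)     # 2014-02-28 20:00:00
print(condition)  # PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC("2014-02-28 20:00:00")
```

In a scheduled job one would call `cutoff_for(date.today())` instead of passing a fixed date.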
GDELT: Global Database of Events, Language and Tone
a quarter of a billion rows
30 years
updated daily
http://gdeltproject.org/data.html#googlebigquery
SELECT Year, Actor1Name, Actor2Name, Count FROM (
  SELECT Actor1Name, Actor2Name, Year,
         COUNT(*) Count,
         RANK() OVER(PARTITION BY YEAR ORDER BY Count DESC) rank
  FROM
    (SELECT Actor1Name, Actor2Name, Year
     FROM [gdelt-bq:full.events]
     WHERE Actor1Name < Actor2Name
       and Actor1CountryCode != '' and Actor2CountryCode != ''
       and Actor1CountryCode!=Actor2CountryCode),
    (SELECT Actor2Name Actor1Name, Actor1Name Actor2Name, Year
     FROM [gdelt-bq:full.events]
     WHERE Actor1Name > Actor2Name
       and Actor1CountryCode != '' and Actor2CountryCode != ''
       and Actor1CountryCode!=Actor2CountryCode),
  WHERE Actor1Name IS NOT null
    AND Actor2Name IS NOT null
  GROUP EACH BY 1, 2, 3
  HAVING Count > 100
)
WHERE rank=1
ORDER BY Year
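The two inner subselects in this query make (A, B) and (B, A) count as the same pair by always putting the alphabetically smaller name first; the outer query then counts pairs per year and keeps the top-ranked one. The same logic can be sketched in Python on made-up events (not real GDELT data); the country-code filters and the `Count > 100` threshold are left out for brevity.

```python
from collections import Counter

# Made-up (year, actor1, actor2) events; not real GDELT data.
events = [
    (1980, "USA", "RUSSIA"),
    (1980, "RUSSIA", "USA"),
    (1980, "CHINA", "USA"),
    (1981, "IRAN", "IRAQ"),
    (1981, "IRAQ", "IRAN"),
    (1981, "USA", "RUSSIA"),
]


def top_pair_per_year(rows):
    """Count unordered actor pairs per year and keep the most frequent one."""
    counts = Counter()
    for year, actor1, actor2 in rows:
        if actor1 == actor2:
            continue  # the query's < and > filters drop identical names too
        first, second = sorted((actor1, actor2))
        counts[(year, first, second)] += 1
    best = {}
    for (year, first, second), count in counts.items():
        if year not in best or count > best[year][2]:
            best[year] = (first, second, count)
    return best


print(top_pair_per_year(events))
# {1980: ('RUSSIA', 'USA', 2), 1981: ('IRAN', 'IRAQ', 2)}
```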
our most active user 
10 requests we should be caching
5 most created resources
select uri, count(*) total from
stats where method = 'POST'
group by uri
order by total desc limit 5;
...but the URIs differ for every user and team:
/users/javier/shouts
/users/rgo/shouts
/teams/javier-community/links
/teams/nosqlmatters-cgn/links
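Grouping by the raw URI counts each user's and each team's resources separately. One possible fix is to replace the variable path segment with a placeholder before counting. The sketch below is an assumption about how that could be done, not the approach shown in the talk; the `:id` placeholder and the rule for which segment is variable are hypothetical.

```python
import re
from collections import Counter

# Assumption: the segment right after /users/ or /teams/ is the variable part.
VARIABLE_SEGMENT = re.compile(r"^/(users|teams)/[^/]+")


def normalize(uri):
    """Replace the user or team identifier with a fixed placeholder."""
    return VARIABLE_SEGMENT.sub(r"/\1/:id", uri)


uris = [
    "/users/javier/shouts",
    "/users/rgo/shouts",
    "/teams/javier-community/links",
    "/teams/nosqlmatters-cgn/links",
]

counts = Counter(normalize(uri) for uri in uris)
print(dict(counts))
# {'/users/:id/shouts': 2, '/teams/:id/links': 2}
```

The same normalization could also be applied when the log is written, so that the stored column is already suitable for grouping.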
5 most created resources 
Automation with Apps Script
Read from BigQuery
Create a spreadsheet on Drive
E-mail it every day as a PDF
bigquery pricing
$80 per stored TB
  300,000 rows => $0.007629392 / month
$35 per processed TB
  1 full scan = 84 MB
  1 count = 0 MB
  1 full scan over 1 column = 5.4 MB
  10 GB => $0.35 / month
*the 1st TB every month is free of charge
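The arithmetic behind these figures can be sketched as follows. The two prices come from the slide; the stored size of about 100 MB for the 300,000 rows is back-computed from the slide's monthly figure rather than stated in it, and the unit conventions (binary for storage, decimal for queries) are assumptions chosen because they reproduce the slide's numbers. The free tier is ignored.

```python
STORAGE_USD_PER_TB_MONTH = 80.0  # from the slide
QUERY_USD_PER_TB = 35.0          # from the slide


def storage_cost(megabytes, mb_per_tb=1024 * 1024):
    """Monthly storage cost in USD (binary units assumed)."""
    return megabytes / mb_per_tb * STORAGE_USD_PER_TB_MONTH


def query_cost(gigabytes, gb_per_tb=1000):
    """Cost in USD of processing the given amount of data (decimal units assumed)."""
    return gigabytes / gb_per_tb * QUERY_USD_PER_TB


print(round(storage_cost(100), 6))  # 0.007629
print(round(query_cost(10), 2))     # 0.35
print(round(query_cost(9.13), 2))   # 0.32
```

The last line corresponds to the 9.13 GB Wikipedia query shown earlier, which the console priced at 32¢.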
Find related links at
https://teowaki.com/teams/javier-community/link-categories/bigquery-talk
Grazas (thank you)
Javier Ramírez
@supercoco9
