How To Build Data Products in BigQuery for PPC & SEO
Christopher Gutknecht | @chrisgutknecht | Bergzeit
Our Plan: Build Data Products & Activate Data
1. Intro
2. Product Principles
3. PPC & SEO Use Cases
About Chris: Acquisition & Analytics at Bergzeit
Digital Marketer
Data Nerd
Climber
1997 2008 2010 2022
Dad of 2
Online Store for Mountain Gear
145 M Revenue in FY 21/22
14 Countries, 5 Languages
World-class, data-driven team 🔥
Hiring a PPC!
I’d Like to Set Clear Expectations for This Session
What this session IS:
Data management talk
Google Cloud & dbt focus
Data Product mindset
75% PPC, 25% SEO
What it’s NOT:
No BigQuery intro
No BigQuery tactics
No ML focus
Large-Scale PPC is Becoming the Science of Managing Data Pipelines
Recap of My SMX 2019 Talk
Is Your Data Scattered Across Dozens of Islands?
If You’re Done With This… Get A Data Warehouse
Connectors
Data Warehouse = The New Center of Gravity?
Source: https://mikkeldengsoe.substack.com/p/future-of-the-data-warehouse
You Need Clean & Reliable Data for Data Products
Everybody has a Data Horror Story To Tell
Modern Data Stack: BigQuery & dbt at the core
Data Products
dbt is a Programming Environment for SQL
Are there Alternatives to dbt? Not really
Saved queries: no extra setup, no control
Google Dataprep: no SQL (no-code), transformation & scheduling only
Google Dataform: free for GCP, SQL framework, smaller ecosystem
The dbt community is rapidly growing
Wait: What about JavaScript and Python?
JavaScript: easy to get started, instantly ready; only for smaller data tasks
Python: powerful libraries, leading data tools; harder to centralize
SQL: super scalable, ideal for production, centralized code
Plan: Build Data Products & Activate Your Data
1. Intro
2. Product Principles
3. PPC & SEO Use Cases
Principles for Robust Data Products (with dbt)
#1: Structured Data Transformation
Break Your Transformations into Small Steps
Sources → Models (Staging → Intermediate → Marts) → Exposures (Data Products)
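The layering above can be sketched without dbt. The example below is only an illustration, assuming an in-memory SQLite database in place of BigQuery and hypothetical table, view and column names (`raw_orders`, `stg_orders`, and so on). Each layer is one small SQL step that reads only from the layer before it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Source: raw data as loaded (hypothetical table and columns)
    CREATE TABLE raw_orders (order_id TEXT, country TEXT, revenue_cents INTEGER);
    INSERT INTO raw_orders VALUES
        ('A1', 'de', 12000), ('A2', 'DE', 8000), ('A3', 'at', 5000);

    -- Staging: rename and clean only, no business logic
    CREATE VIEW stg_orders AS
    SELECT order_id,
           UPPER(country) AS country,
           revenue_cents / 100.0 AS revenue
    FROM raw_orders;

    -- Intermediate: business logic built on top of staging
    CREATE VIEW int_revenue_by_country AS
    SELECT country, COUNT(*) AS orders, SUM(revenue) AS revenue
    FROM stg_orders
    GROUP BY country;

    -- Mart: the final shape a data product (exposure) reads from
    CREATE VIEW mart_country_report AS
    SELECT country, orders, revenue, revenue / orders AS avg_order_value
    FROM int_revenue_by_country;
""")

report = conn.execute(
    "SELECT country, orders, revenue, avg_order_value "
    "FROM mart_country_report ORDER BY country"
).fetchall()
print(report)
```

Because every step is a named view, a wrong number in the mart can be traced back one layer at a time.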
Use Jinja Macros to Template Your SQL Code
Jinja-Macro
Compiled SQL
https://docs.getdbt.com/docs/building-a-dbt-project/jinja-macros
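To show the idea of "template in, compiled SQL out" without a dbt installation, here is a rough stand-in that uses plain Python string formatting rather than Jinja. The table `payments`, its columns and the function names are assumptions for illustration. In dbt the loop would live in a Jinja `for` block inside the model file.

```python
def pivot_columns(methods):
    """Return one SUM(CASE ...) column per payment method."""
    lines = [
        f"    sum(case when payment_method = '{m}' then amount end) as {m}_amount"
        for m in methods
    ]
    return ",\n".join(lines)


def compile_sql(methods):
    """Assemble the full query, similar to what a compiled template looks like."""
    return (
        "select\n"
        "    order_id,\n"
        f"{pivot_columns(methods)}\n"
        "from payments\n"
        "group by order_id"
    )


compiled = compile_sql(["bank_transfer", "credit_card"])
print(compiled)
```

Adding a third payment method means changing one list, not copying another `sum(case ...)` line by hand.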
#2: Version Control
Only Write Production Code with Version Control
Use Jira Tickets to Create Feature Branches
#3: Sufficient Data Test Coverage
Typical Characteristics of Data Quality
Define Data Quality Through Automated Tests
dbt core (5)
dbt utils (15)
dbt expectations (~60)
custom tests (*)
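A data test in dbt is essentially a query that selects the rows violating a rule, and the test passes when that query returns nothing. The sketch below mimics three common checks (not null, unique, accepted values) against an in-memory SQLite table. The table, columns and test names are hypothetical and the runner is a simplification, not dbt's own implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id TEXT, gclid TEXT, status TEXT);
    INSERT INTO orders VALUES
        ('A1', 'g-1', 'paid'),
        ('A2', NULL,  'paid'),
        ('A2', 'g-3', 'refunded');
""")

# Each test is a query that returns the offending rows.
TESTS = {
    "not_null_order_id":
        "SELECT * FROM orders WHERE order_id IS NULL",
    "unique_order_id":
        "SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1",
    "accepted_values_status":
        "SELECT * FROM orders WHERE status NOT IN ('paid', 'refunded')",
}


def run_tests(connection, tests):
    """Run every test query and count the failing rows per test."""
    results = {}
    for name, sql in tests.items():
        failing_rows = connection.execute(sql).fetchall()
        results[name] = len(failing_rows)
    return results


results = run_tests(conn, TESTS)
failed = sorted(name for name, count in results.items() if count > 0)
print(results)
print("failed:", failed)
```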
#4: Deployment & Environments
Separate Environments for Test and Production
Test = Paper cup
Production = Porcelain
#5: Proper Documentation
Keep Your Documentation Close To Your Code
Documentation in .yml and .md Files
Document all Sources, Models and Exposures
Get documentation as HTML
YAML File for Model Metadata
Markdown File for Description
SQL Files with Model code
Get Auto-Generated Project Documentation
Example Documentation for Source & Exposure
Exposure (Output)
Source (Input)
#6: Logs & Alerts
Production Updates and Tests are run as Jobs
Job Failures are logged as Alerts to Slack/Teams
Example Alert Failure
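As a rough illustration of the alerting step, the snippet below builds the JSON body for a Slack incoming webhook, which accepts a payload with a `text` field. The job name, the failure details and the helper `build_alert` are made up for the example, and the request itself is left out so the sketch runs offline.

```python
import json


def build_alert(job_name, failures):
    """Format failed checks as a Slack message payload."""
    lines = [f"dbt job '{job_name}' failed: {len(failures)} problem(s)"]
    lines += [f"- {name}: {detail}" for name, detail in failures]
    return {"text": "\n".join(lines)}


payload = build_alert("nightly_build", [("unique_order_id", "1 failing row")])
body = json.dumps(payload).encode("utf-8")

# To send the alert, POST `body` with the header
# "Content-Type: application/json" to your own webhook URL.
print(payload["text"])
```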
Result: Our BigQuery-focused GCP Architecture
[Architecture diagram. Stages: Ingestion, Storage, Transformation, Enrichment. Labels: CSV File, Web APIs, select *, ML Model, Dashboard]
Let’s Finally Dig Into Some Use Cases!
1. Intro
2. Product Principles
3. PPC & SEO Use Cases
PPC Use Case: No Consent, No Ads Conversions?
Our Simple Transaction Backend Tracking
gclid
Conversion value
Timestamp
orderIdentifier
Example Python Code (see SMX 2020): https://gist.github.com/ChrisGutknecht
Identify New Conversions Missing in GA
SQL transformation steps in dbt (the steps form a “DAG”, a directed acyclic graph)
Final output for Conversion Import
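The core of this step is an anti-join: keep backend orders that have a gclid but never showed up in GA. Below is a small sketch of that logic in plain Python. The sample rows and the output field names are assumptions, and the real column names and time format required for the Google Ads import are not taken from the slides.

```python
from datetime import datetime, timezone

# Backend transactions with the tracked fields (sample data).
backend_orders = [
    {"orderIdentifier": "A1", "gclid": "g-1", "value": 120.0,
     "timestamp": datetime(2022, 3, 1, 10, 0, tzinfo=timezone.utc)},
    {"orderIdentifier": "A2", "gclid": "g-2", "value": 80.0,
     "timestamp": datetime(2022, 3, 1, 11, 30, tzinfo=timezone.utc)},
    {"orderIdentifier": "A3", "gclid": None, "value": 50.0,
     "timestamp": datetime(2022, 3, 1, 12, 0, tzinfo=timezone.utc)},
]

# Order IDs that GA already recorded (sample data).
ga_order_ids = {"A1"}


def missing_conversions(orders, tracked_ids):
    """Return orders with a gclid that are absent from the tracked set."""
    return [
        {
            "gclid": order["gclid"],
            "conversion_value": order["value"],
            "conversion_time": order["timestamp"].isoformat(),
        }
        for order in orders
        if order["orderIdentifier"] not in tracked_ids and order["gclid"]
    ]


to_upload = missing_conversions(backend_orders, ga_order_ids)
print(to_upload)
```

Order A3 is dropped because an order without a gclid cannot be attributed to an ad click.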
Yes We can! Upload Missing Ads Conversions
Combining Tag and Import Conversions in Google Ads
Full conversion coverage, no modeling 👍
SEO Use Case: Sitemap Efficiency Report
Elias Dabbas’ advertools Package and sitemap_to_df()
Combine Sitemap with Crawl and GSC Data
SQL transformation
Result of combined data (30 days)
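One possible way to combine the three inputs is a left join from the sitemap onto crawl status and GSC clicks. The sketch below uses plain dictionaries instead of BigQuery tables. The URLs and numbers are invented, and "efficiency" here is defined as the share of sitemap URLs that return 200 and received at least one click, which is an assumption and not necessarily the definition used in the report.

```python
sitemap_urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
]
crawl_status = {
    "https://example.com/a": 200,
    "https://example.com/b": 200,
    "https://example.com/c": 404,
    "https://example.com/d": 200,
}
gsc_clicks_30d = {
    "https://example.com/a": 10,
    "https://example.com/d": 3,
}


def sitemap_report(urls, status_by_url, clicks_by_url):
    """Left-join crawl status and clicks onto every sitemap URL."""
    return [
        {
            "url": url,
            "status": status_by_url.get(url),
            "clicks": clicks_by_url.get(url, 0),
        }
        for url in urls
    ]


def efficiency(rows):
    """Share of sitemap URLs that are reachable and earned clicks."""
    useful = [r for r in rows if r["status"] == 200 and r["clicks"] > 0]
    return len(useful) / len(rows) if rows else 0.0


rows = sitemap_report(sitemap_urls, crawl_status, gsc_clicks_30d)
score = efficiency(rows)
print(score)
```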
Bonus: Spy on a Competitor Sitemap with 4 LOC
PPC Use Case: Sync Product Ratings to Shopping
How do I get these (for free)?
Data input: You Need A Simple Review Feed
review-level data
sku-level data: GTIN, sku, url
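To make the two inputs concrete, here is a minimal sketch that rolls review-level rows up to one row per SKU and attaches the SKU-level attributes. The sample values, the output field names and the rule that reviews for unknown SKUs are skipped are all assumptions; the actual feed format for the ratings sync is not shown in the slides.

```python
from collections import defaultdict

# Review-level data (sample).
reviews = [
    {"sku": "1001", "rating": 5},
    {"sku": "1001", "rating": 4},
    {"sku": "1002", "rating": 3},
    {"sku": "9999", "rating": 1},  # no matching product
]

# SKU-level data (sample placeholder GTINs and URLs).
products = {
    "1001": {"gtin": "0000000000001", "url": "https://example.com/p/1001"},
    "1002": {"gtin": "0000000000002", "url": "https://example.com/p/1002"},
}


def aggregate_ratings(review_rows, product_rows):
    """Aggregate ratings per SKU and join the product attributes."""
    totals = defaultdict(lambda: [0, 0])  # sku -> [rating sum, review count]
    for review in review_rows:
        entry = totals[review["sku"]]
        entry[0] += review["rating"]
        entry[1] += 1

    feed = []
    for sku, (rating_sum, count) in sorted(totals.items()):
        if sku not in product_rows:
            continue  # skip reviews without product data
        feed.append({
            "sku": sku,
            **product_rows[sku],
            "review_count": count,
            "average_rating": round(rating_sum / count, 2),
        })
    return feed


rating_feed = aggregate_ratings(reviews, products)
print(rating_feed)
```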
The Flow Diagram for the Shopping Ratings Sync
The CTR results are not conclusive (but, hey ⭐ )
Want to Sync Your Ratings? Test my Notebook!
Notebook on Github: https://gist.github.com/ChrisGutknecht
Use Case: Custom Large-Scale Link Checker
Didn’t work for us!
(Errors, no control, little customisation, Java-based)
Multi-Threaded URL-Crawling with Advertools
Elias Dabbas’ advertools and adv.crawl(url_list, …)
Fast crawling
Customised input
Filter Relevant Columns From Crawl Dataframe
Elias Dabbas’ advertools and adv.crawl(url_list, …)
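For readers without advertools at hand, the same pattern (check many URLs in parallel, then keep only the relevant results) can be sketched with the Python standard library. This is a simplified stand-in and not the notebook's code: `fetch_status` and `check_links` are hypothetical helpers, and a thread pool with HEAD requests replaces the crawler.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None on a network error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions
    except OSError:
        return None  # DNS failure, timeout, refused connection


def check_links(urls, fetch=fetch_status, workers=8):
    """Check URLs in parallel and return only the broken ones."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(fetch, urls))
    return [
        {"url": url, "status": status}
        for url, status in zip(urls, statuses)
        if status is None or status >= 400
    ]


# Offline demo with a fake fetcher instead of real requests.
fake_statuses = {"https://example.com/ok": 200,
                 "https://example.com/gone": 404,
                 "https://example.com/down": None}
broken = check_links(list(fake_statuses), fetch=fake_statuses.get)
print(broken)
```

Passing the fetch function as a parameter keeps the checker testable without network access.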
The Flow Diagram for Our Link Checker
Notebook on Github: https://gist.github.com/ChrisGutknecht
PPC Use Case: Inventory-Based Campaigns
Turn product feed data into campaign structures
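A toy version of this idea: group in-stock feed items by brand and category, and emit one ad group per combination. The feed fields, the naming scheme for campaigns, ad groups and keywords, and the `min_products` threshold are assumptions for illustration, not the structure used in production.

```python
from collections import defaultdict

feed = [
    {"brand": "BrandA", "category": "Jackets", "title": "Alpine Jacket", "in_stock": True},
    {"brand": "BrandA", "category": "Jackets", "title": "Rain Jacket", "in_stock": True},
    {"brand": "BrandA", "category": "Boots", "title": "Trek Boot", "in_stock": False},
    {"brand": "BrandB", "category": "Boots", "title": "Summit Boot", "in_stock": True},
]


def build_campaign_structure(items, min_products=1):
    """Create one ad group per brand and category with enough stock."""
    counts = defaultdict(int)
    for item in items:
        if item["in_stock"]:
            counts[(item["brand"], item["category"])] += 1

    rows = []
    for (brand, category), in_stock_count in sorted(counts.items()):
        if in_stock_count >= min_products:
            rows.append({
                "campaign": brand,
                "ad_group": f"{brand} {category}",
                "keyword": f"{brand} {category}".lower(),
                "products_in_stock": in_stock_count,
            })
    return rows


structure = build_campaign_structure(feed)
print(structure)
```

Re-running this on every feed update is what keeps the campaigns in line with inventory: combinations without stock simply disappear from the output.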
We went from This Complex Graph in Dataprep
To This SQL-based Transformation in dbt
better control for complex transformations
easier to extend and reuse parts of logic
lower cost in long run
automated testing and documentation
There Are More Data Products in the Lab…
(Join Our Team!)
In The End: Do You Want to be an Architect?
Your Takeaways from this Session
1. What the modern data stack is and why it’s exciting
2. Why dbt is the best tool for data warehouse transformation
3. Which principles to apply for data products in production
4. A few interesting PPC and SEO use cases for you to try
Thanks for Your Time.
Looking Forward To Questions!
Chris Gutknecht | Teamlead A&O | Hiring a PPC!

Building Data Products with BigQuery for PPC and SEO (SMX 2022)
