Analyze Data in MongoDB with AWS
1TB > 1b+ documents
wow
much documents
Amazon Athena
AWS Glue
WITH latest_summaries AS (
  SELECT user_id AS uid,
         max(base_datetime) AS mdt
  FROM user_asset_summaries
  GROUP BY user_id
)
SELECT s.user_id
FROM user_asset_summaries s
JOIN latest_summaries ls
  ON s.user_id = ls.uid
 AND s.base_datetime = ls.mdt
WHERE cardinality(s.cards) > 0
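The query above keeps, for each user, only the summary row with the latest base_datetime, and returns the user if that row's cards array is non-empty. A minimal sketch of the same logic on plain Scala collections may make the intent easier to follow. The Summary case class, the usersWithCards helper, and the sample rows are hypothetical stand-ins and not part of the original deck. The sketch assumes base_datetime is an ISO-8601 string (so string order matches time order), and it returns a Set, so duplicate rows that the SQL join could emit are collapsed.

```scala
// Hypothetical in-memory stand-in for one row of user_asset_summaries.
// baseDatetime is assumed to be an ISO-8601 string, which sorts chronologically.
final case class Summary(userId: String, baseDatetime: String, cards: Seq[String])

// Users whose most recent summary contains at least one card.
def usersWithCards(rows: Seq[Summary]): Set[String] =
  rows
    .groupBy(_.userId)              // GROUP BY user_id
    .values
    .map(_.maxBy(_.baseDatetime))   // keep the row at max(base_datetime)
    .filter(_.cards.nonEmpty)       // cardinality(cards) > 0
    .map(_.userId)
    .toSet

// Made-up sample data for illustration only.
val sample = Seq(
  Summary("u1", "2018-01-01T00:00:00Z", Seq("visa")),
  Summary("u1", "2018-02-01T00:00:00Z", Seq.empty),
  Summary("u2", "2018-01-15T00:00:00Z", Seq("master"))
)

val result = usersWithCards(sample)
```

Here u1 is excluded because its latest summary has no cards, even though an older one did. This is the case the join on max(base_datetime) is there to handle.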
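import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.{col, to_timestamp}

// glueContext and targetOutput are defined elsewhere in the job.

// Read the source table from the Glue Data Catalog.
val dynamicFrame = glueContext
  .getCatalogSource(
    database = "mart",
    tableName = "raw_user_asset_summaries",
    redshiftTmpDir = "",
    transformationContext = "ds"
  )
  .getDynamicFrame()

// Convert to a Spark DataFrame and parse base_datetime into a timestamp.
val dataFrame = dynamicFrame
  .toDF()
  .withColumn(
    "base_datetime",
    to_timestamp(
      col("base_datetime"),
      "yyyy-MM-d'T'HH:mm:ss'Z'"
    )
  )

// Append the result to targetOutput as Snappy-compressed Parquet.
dataFrame
  .write
  .mode(SaveMode.Append)
  .option("compression", "snappy")
  .parquet(targetOutput)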
[Life tip] When running a Glue ETL Job