1. Data Pipeline Matters
-- Taking the Tracking Pixel as an Example
Jazz Yao-Tsung Wang
Data Architect of TenMax.io
Initiator of Taiwan Data Engineering Association
Co-Founder of Taiwan Hadoop User Group
Shared at 2017-11-12, 2017 Taiwan Data Science Conference (台灣資料科學年會)
2. Hello!
I am Jazz Wang
Co-Founder of Hadoop.TW
Initiator of Taiwan Data Engineering Association (TDEA)
Hadoop Evangelist since 2008.
Open Source Promoter. System Admin (Ops).
- 11 years (2002/08 ~ 2014/02) Researcher in HPC field.
- 2 years (2014/03 ~ 2016/04) Assistant Vice President (AVP),
Product Management of ‘Big Data Platform Management Product’
- 1.5 years (2016/04 ~ Now) Data Architect of Real-Time Bidding
You can find me at @jazzwang_tw or
https://fb.com/groups/dataengineering.tw
https://slideshare.net/jazzwang
23. Serverless Tracking Pixel Data Pipeline
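[Diagram: a pipeline in seven numbered steps (① to ⑦), with stages labeled Serving, Collecting, Analysing, and BI Report / Dashboard, and annotations for cost, analysis, and code]
Pros: slightly lower technical barrier; no need to run your own web server; heavy traffic is not a concern.
Cons: only applicable to server-based tracking. The cloud service components are black boxes, so debugging is hard.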
http://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteHosting.html
Hosting static web pages on a cloud storage service
is a best practice for using cloud services!
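To make the serverless approach concrete, here is a minimal sketch of the two pieces a publisher side would need: the pixel file to upload to the storage bucket, and a helper that builds the tracking URL. The event data travels in the query string, so it ends up in the storage service's access log without any web server of your own. The hostname `tracking.example.com`, the helper name `build_pixel_url`, and the parameter names `campaign` and `event` are assumptions for illustration, not part of the original talk.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# A 1x1 transparent GIF (43 bytes), a common payload for tracking pixels.
# This is the file one would upload to the bucket as e.g. "pixel.gif".
PIXEL_GIF = bytes.fromhex(
    "474946383961"            # "GIF89a" signature
    "01000100800000"          # 1x1 logical screen, 2-colour global table
    "000000ffffff"            # colour table: black, white
    "21f9040100000000"        # graphic control extension, index 0 transparent
    "2c000000000100010000"    # image descriptor for a 1x1 image
    "0202440100"              # LZW-compressed pixel data
    "3b"                      # trailer
)


def build_pixel_url(base_url: str, params: dict) -> str:
    """Append tracking parameters to the pixel URL as a query string.

    Hypothetical helper: the storage service ignores the query string when
    serving a static object, but records the full request in its access log.
    """
    return f"{base_url}?{urlencode(params)}"


# Assumed hostname and parameter names, for illustration only.
url = build_pixel_url(
    "http://tracking.example.com/pixel.gif",
    {"campaign": "demo", "event": "view"},
)

# Round trip: the parameters can be recovered from the logged URL later.
recovered = parse_qs(urlsplit(url).query)
```

A page would then embed the result as `<img src="...">`; the Collecting stage reads the parameters back out of the access logs.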
24. Log Formats of Different Cloud Storage Services
▷ Azure Blob Storage
○ Storage Analytics Log Format
○ https://docs.microsoft.com/en-us/rest/api/storageservices/storage-analytics-log-format
▷ Google Cloud Storage
○ Access and storage log format
○ https://cloud.google.com/storage/docs/access-logs#format
▷ Amazon S3
○ Server Access Log Format
○ http://docs.aws.amazon.com/AmazonS3/latest/dev/LogFormat.html
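Each service writes its own log layout, so the Collecting stage needs a parser per provider. As a rough sketch for the Amazon S3 case, the snippet below splits one space-delimited access log line into the leading fields of the Server Access Log Format and pulls the tracking parameters out of the request URI. The sample line is made up for illustration, the function names are hypothetical, and the tokenizer assumes quoted fields contain no embedded double quotes, which real user-agent strings can violate.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Leading fields of an S3 server access log record, in order.
# Newer records carry additional trailing fields; this sketch ignores them.
FIELDS = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester",
    "request_id", "operation", "key", "request_uri", "http_status",
    "error_code", "bytes_sent", "object_size", "total_time",
    "turn_around_time", "referrer", "user_agent", "version_id",
]

# A token is a [bracketed] timestamp, a "quoted" string, or a bare word.
TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def parse_s3_log_line(line: str) -> dict:
    """Split one log line into named fields (hypothetical helper)."""
    tokens = [t.strip('[]"') for t in TOKEN.findall(line)]
    return dict(zip(FIELDS, tokens))


def tracking_params(record: dict) -> dict:
    """Extract query-string parameters from the logged request line."""
    parts = record["request_uri"].split()  # e.g. GET /path?query HTTP/1.1
    if len(parts) < 2:
        return {}
    query = urlsplit(parts[1]).query
    return {k: v[0] for k, v in parse_qs(query).items()}


# Made-up sample line, not taken from a real bucket.
sample = (
    'ownerid examplebucket [12/Nov/2017:08:00:00 +0000] 192.0.2.10 - '
    'REQID123 REST.GET.OBJECT pixel.gif '
    '"GET /pixel.gif?campaign=demo&event=view HTTP/1.1" 200 - 43 43 5 4 '
    '"http://example.com/landing" "Mozilla/5.0" -'
)

record = parse_s3_log_line(sample)
params = tracking_params(record)
```

The Azure and Google Cloud formats differ in delimiter and field order, so they would need their own field lists rather than reusing `FIELDS`.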