Taro L. Saito, Ph.D.
Arm Treasure Data
June 29, 2019
Scala Matsuri 2019 - Tokyo
How To Use Scala At Work
Airframe In Action At Arm Treasure Data
Let's Use Scala at Work: Airframe Use Cases at Arm Treasure Data

Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved.
About Me: Taro L. Saito (Leo)
● Principal Software Engineer at Arm Treasure Data
● Building a distributed query engine service
● Living in the US for 4 years
● DBMS & Data Science Background
● Ph.D. in Computer Science
● Database Systems and Genome Sciences Research
● Assistant Professor at the University of Tokyo
● OSS Projects Around Scala
● sbt-sonatype: used for releasing 3000+ Scala projects
● snappy-java: a compression library used in Spark, Parquet, etc.
Self-introduction

New Release from O’Reilly Japan
● Helped with the Japanese translation of Designing Data-Intensive Applications
● Techniques and concepts around distributed data processing systems
● Available at Amazon.co.jp and the O’Reilly Japan website
● Will be published on July 18, 2019
The Japanese translation of the definitive introduction to distributed data systems goes on sale next month

Arm Treasure Data
● 400+ customers
● Founded in 2011
● Raised $54M
● Security
● Acquired by Arm / Softbank in 2018
Overview of Arm Treasure Data

The Architecture of Arm Treasure Data
[Figure: architecture diagram. Logs, device data, and batch data flow through Data Collection (2 million records / sec.) into Cloud Storage (PlazmaDB with table schemas, 130 trillion records) and on to Distributed Data Processing (1 billion rows processed / sec.). Surrounding components: Jobs, Job Management, SQL Editor, Scheduler, Workflows, Machine Learning. Legend: Treasure Data OSS, Third Party OSS.]
The system architecture of Treasure Data. Where is Scala?

Airframe
[Figure: map of Airframe modules and the task each one is labeled with]
● airframe-launcher: command-line programs (> _)
● airframe-log: debug logs
● airframe-config: configuration (e.g., production: port: 10010, user: xxxx, ...)
● airframe-codec: schema-on-read mapping of data to Scala objects
● airframe-surface: generating mapping codecs
● airframe-json: JSON
● airframe-tablet: table data (CSV, TSV)
● airframe-jdbc: JDBC ResultSets
● airframe-fluentd: metrics & log data
● airframe-jmx: monitoring runtime states
● airframe-http, airframe-http-finagle: launching HTTP services, HTTP requests and responses
● airframe DI: module mix-in
● sbt-pack: packaging
The Airframe modules (written in Scala) used behind the service

Our OSS Strategy Around Scala
● Gather the best practices of Scala into Airframe OSS
● Gain real-world experience by operating 24/7 services
[Figure: Programming (Knowledge, Experiences, Design Decisions) -> OSS -> Outcome (Products, 24/7 Services, Business Values)]
Our OSS strategy around Scala, with Airframe at its core

3 Years Ago...
● Various internal and third-party Scala/Java libraries
● Managed in different repositories with different release cycles
● High learning cost
■ The knowledge was confined to engineers’ brains
[Figure: Programming (Knowledge, Experiences, Design Decisions) -> Various Libraries -> Outcome (Products, 24/7 Services, Business Values)]
Three years ago, Airframe did not exist and a variety of libraries were mixed together

[Figure labels for "3 Years Ago...", the various libraries: logger, launcher, object mapper, JDBC reader, json4s, jackson, ...]
5 Years Ago...
● No Scala engineers in the company
● Scala in 2014: Scala 2.9.x
● Was not good enough to use:
■ e.g., no string interpolation like s"... ${x} ..."
[Figure: Programming (Knowledge, Experiences, Design Decisions) -> Ruby, Java -> Outcome (Products, 24/7 Services, Business Values)]
Five years ago, there were no Scala engineers and no Scala code

Today’s Agenda
● How to introduce Scala to your company
● Learn the best practices of using Scala at work
● From 20 Airframe modules
What we will cover today

How Can We Introduce Scala?
● Saying “I want to use Scala”
● It will not work, especially if you or your team are not familiar with Scala
● Your managers need more information on whether it’s good enough or not
● Even if you are a tech lead:
● You need some confidence in using Scala in production
● How can we establish such confidence in using Scala?
How do we introduce Scala? How do we gain the confidence that it is OK to use Scala?

Start With A Small Investment in Scala
● Guidelines
● Think about how you can save your time with Scala
● If you can save 1 minute a day, you can spend 6 hours on this improvement
■ Save 1 minute / day = 365 minutes / year = 6-hour investment
■ Save 10 minutes / week = 520 minutes / year = 8.6-hour investment
■ Save 1 hour / week = 52 hours / year = 2.2-day investment
● Time is your most valuable asset
● Save your time by using Scala
Start making “small investments” to save time “by using Scala”
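The break-even arithmetic on this slide can be reproduced with a few lines of Scala. This is only an illustrative sketch: the object and method names (InvestmentBudget, minutesSavedPerYear, and so on) are made up for this example and do not come from the talk or from Airframe.

```scala
object InvestmentBudget {
  // Total minutes saved in one year when a saving of `minutesPerOccurrence`
  // recurs `timesPerYear` times (365 for daily, 52 for weekly).
  def minutesSavedPerYear(minutesPerOccurrence: Double, timesPerYear: Int): Double =
    minutesPerOccurrence * timesPerYear

  // The same budget expressed in hours and in 24-hour days.
  def hours(minutes: Double): Double = minutes / 60.0
  def days(minutes: Double): Double  = minutes / 60.0 / 24.0
}
```

With these helpers, 1 minute a day gives 365 minutes (a little over 6 hours), 10 minutes a week gives 520 minutes (about 8.67 hours, which the slide shows as 8.6), and 1 hour a week gives 52 hours (about 2.2 days).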

The First Scala Code in TD
● prestop (presto + top)
● Non-production service code
● A handy query monitoring tool for Presto, written in Scala
● Displays complex JSON data with fancy ANSI colors
The first Scala program at Treasure Data

airframe-log
● Scala 2.10: My small investment to test Scala macros and string interpolation
● A Modern Logging Library for Scala (at Medium)
● ANSI color and source code location display
● Just add the LogSupport trait to your class
Making program development more efficient with log messages

airframe-launcher
● Needed to handle complex command-line options and nested commands
● e.g., $ prestop -e production monitor (other options …)
● Enabled annotation-based command-line definitions
Making it easy to build complex command-line programs

airframe-config: Application Configuration Flow
● YAML config (embedded into Docker)
● Override credentials, then bind to config objects
YAML:
development:
  addr: api-dev.com
production:
  addr: api.com
Config Object:
case class ServerConfig(
  addr: String,
  port: Int = 8080,
  password: String
)
[Figure: the command-line option -e production selects the production section of the YAML (addr: api.com); credentials and local configurations are merged in; object mapping, together with default parameters (e.g., port = 8080), produces an immutable config object]
Turning the application configuration flow into a library
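The flow above (select an environment, merge in credentials, bind to an immutable object with defaults) can be sketched in plain Scala. This is not the airframe-config API: the ConfigFlow object, the load method, and the in-memory Map standing in for the parsed YAML file are all assumptions made for illustration. Only ServerConfig and its fields come from the slide.

```scala
object ConfigFlow {
  // Config object from the slide; port has a default value.
  final case class ServerConfig(addr: String, port: Int = 8080, password: String)

  // Stand-in for the parsed YAML file: environment name -> key/value pairs.
  val yaml: Map[String, Map[String, String]] = Map(
    "development" -> Map("addr" -> "api-dev.com"),
    "production"  -> Map("addr" -> "api.com")
  )

  // Select the environment, let credentials/local overrides win on conflicts,
  // then bind to the immutable case class. Missing "addr" or "password" keys
  // throw NoSuchElementException in this sketch.
  def load(env: String, overrides: Map[String, String]): ServerConfig = {
    val merged       = yaml.getOrElse(env, Map.empty[String, String]) ++ overrides
    val withDefaults = ServerConfig(addr = merged("addr"), password = merged("password"))
    merged.get("port").map(p => withDefaults.copy(port = p.toInt)).getOrElse(withDefaults)
  }
}
```

Calling ConfigFlow.load("production", Map("password" -> "secret")) keeps the default port because neither the YAML section nor the overrides set one. Keeping the credential out of the YAML and supplying it at load time mirrors the "override credentials" step of the slide.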

sbt-pack plugin
● An sbt plugin to create standalone Scala packages
● A single-folder package with bin and lib folders containing all dependent JARs
● Generates command-line launcher scripts
● My small investment in 2012 to save packaging time
Package programs with sbt-pack and easily create Docker images

[Figure for sbt-pack: airframe-launcher, airframe-config, and a YAML config file -> sbt-pack -> Standalone Scala Package -> Dockerfile]
Medium-Size Investment: Find A Common Pattern
● Extract a common problem pattern and create a solution
● Data -> Object Mapping
● How many data readers and object mappers do we need?
● How can we save time when handling such varied data types?
[Figure: three separate pipelines. YAML -> YAML Parser + Object Mapper -> Config Object; JDBC ResultSet -> Object-Relation Mapper -> Table Object; JSON -> JSON Parser + Object Mapper -> Object]
We often want to map input data to Scala objects. This calls for a medium-term investment

airframe-msgpack: MessagePack as Universal Data Format
● MessagePack (msgpack.org)
● Compact JSON-like binary format
● Describes data types and data values at the same time (self-describing)
[Figure: JDBC ResultSet, YAML, and JSON are packed into and unpacked from MessagePack, which in turn is unpacked into and packed from Objects]
With MessagePack as the intermediate format, only one object mapper implementation is needed

PlazmaDB: MessagePack DBMS
● Fluentd -> MessagePack -> Arm Treasure Data
● Automatically generates table schemas from MessagePack data
● Applies schema-on-read to provide table data for Presto/Hive/Spark, etc.
[Figure: Data Collection delivers schema-free data, which updates the Table Schema; the schema generates a reader set (Int Column Reader, String Column Reader) used by the Table Reader for Distributed Data Processing]
Arm Treasure Data is a MessagePack-based schema-on-read system

Schema-On-Read Data Processing with MessagePack
● Users can store arbitrarily typed data (no table design is required)
● Data can be read in the target type required by the application (e.g., SQL query)
[Figure: IntCodec. Pack: log values such as “100” (string) or 100 (int) are stored as MessagePack in one of the types Int, Float, Boolean, String, Array, Map, Binary. Unpack to SQL BigInt: Float via toInt, Boolean as 0 or 1, String via parseInt, so “100” (string) is read as 100 (int); values that cannot be converted give an error or null]
Data is converted to the type the application requires at read time (schema-on-read)
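A minimal sketch of the schema-on-read idea, assuming a tiny hand-written value model in place of real MessagePack values. The names Value, IntValue, readAsLong, etc. are hypothetical and are not the airframe-codec or msgpack API; the conversions follow the labels in the figure (parseInt, toInt, 0 or 1, error or null).

```scala
object SchemaOnRead {
  // A tiny stand-in for self-describing stored values.
  sealed trait Value
  final case class IntValue(v: Long)         extends Value
  final case class FloatValue(v: Double)     extends Value
  final case class BooleanValue(v: Boolean)  extends Value
  final case class StringValue(v: String)    extends Value
  final case class ArrayValue(v: Seq[Value]) extends Value

  // Read any stored value as the integer type the application asks for.
  // None plays the role of "error or null" in the figure.
  def readAsLong(value: Value): Option[Long] = value match {
    case IntValue(v)     => Some(v)
    case FloatValue(v)   => Some(v.toLong)                        // truncates the fraction
    case BooleanValue(b) => Some(if (b) 1L else 0L)               // 0 or 1
    case StringValue(s)  => scala.util.Try(s.trim.toLong).toOption // parse the text
    case ArrayValue(_)   => None                                   // no sensible integer form
  }
}
```

The writer never declares a column type; the reader decides the target type, so a string "100" and an integer 100 both satisfy a query that expects a BigInt column.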

[Figure labels for the schema-on-read diagram, other inputs: CSV, command-line arguments]
airframe-codec: Schema-On-Read Pack/Unpack Interface
● Apply schema-on-read to Scala objects
[Figure: Input -> Pack -> MessagePack -> Unpack -> Output, with pack and unpack available in both directions]
Applying the MessagePack-based schema-on-read conversion interface to Scala

Pre-defined Codecs in airframe-codec
● Primitive Codecs
● ByteCodec, CharCodec, ShortCodec, IntCodec, LongCodec
● FloatCodec, DoubleCodec
● StringCodec
● BooleanCodec
● TimeStampCodec
● Collection Codec
● ArrayCodec, SeqCodec, ListCodec, IndexSeqCodec, MapCodec, etc.
● OptionCodec
● JsonCodec (airframe-json)
● Java-specific Codec
● FileCodec, ZonedDateTimeCodec, JDBCResultSetCodec, etc.
● Adding Custom Codecs
● Implement MessageCodec[X] interface
Supports mapping to almost every data type needed in Scala

MessageCodec.of[A]: Combination of Codecs
[Figure: MessageCodec.of[A] combines IntCodec, StringCodec, and DoubleCodec to pack an object into MessagePack and to unpack it back]
Codecs are composed to match the type of the object
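The idea of composing a codec for a whole object from per-field codecs can be sketched as follows. Everything here is an assumption for illustration: the Codec trait, the tab-separated text used as the intermediate format instead of MessagePack bytes, and the Person class are hypothetical. The composition is also written by hand, whereas the slide's MessageCodec.of[A] builds it from the type.

```scala
object CodecComposition {
  // Hypothetical mini codec interface (not the airframe MessageCodec trait).
  trait Codec[A] {
    def pack(a: A): String
    def unpack(s: String): Option[A]
  }

  val IntCodec: Codec[Int] = new Codec[Int] {
    def pack(a: Int): String           = a.toString
    def unpack(s: String): Option[Int] = scala.util.Try(s.toInt).toOption
  }
  val StringCodec: Codec[String] = new Codec[String] {
    def pack(a: String): String           = a
    def unpack(s: String): Option[String] = Some(s)
  }
  val DoubleCodec: Codec[Double] = new Codec[Double] {
    def pack(a: Double): String           = a.toString
    def unpack(s: String): Option[Double] = scala.util.Try(s.toDouble).toOption
  }

  final case class Person(id: Int, name: String, score: Double)

  // Object codec assembled from the field codecs, one per constructor parameter.
  // Names containing a tab would break this toy format.
  val PersonCodec: Codec[Person] = new Codec[Person] {
    def pack(p: Person): String =
      Seq(IntCodec.pack(p.id), StringCodec.pack(p.name), DoubleCodec.pack(p.score)).mkString("\t")

    def unpack(s: String): Option[Person] = s.split("\t", -1) match {
      case Array(id, name, score) =>
        for {
          i <- IntCodec.unpack(id)
          n <- StringCodec.unpack(name)
          d <- DoubleCodec.unpack(score)
        } yield Person(i, n, d)
      case _ => None
    }
  }
}
```

Because each field is delegated to its own codec, supporting a new field type only requires one new primitive codec rather than a new object mapper.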

airframe-surface
● Reading Type Signatures From ScalaSig
● The Scala compiler embeds Scala Type Signatures (ScalaSig) into class files
● Surface.of[A]
■ returns A’s parameter names and types
[Figure: class A (data: List[B]) compiled by javac yields class A with data: List[java.lang.Object], because type erasure removes generic type information. Compiled by scalac, the class file holds the same erased field plus ScalaSig: data: List[B]. Surface.of[A] returns data: List[B]; scala.reflect.runtime.universe.TypeTag is shown alongside it]
Obtaining the type information of objects from ScalaSig
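The erasure problem that motivates airframe-surface can be observed with plain Java reflection. The sketch below does not use airframe-surface; ErasureDemo and its method names are made up for this example, and classes A and B mirror the ones in the figure.

```scala
object ErasureDemo {
  class B
  class A(val data: List[B])

  // Raw (erased) type of A's single constructor parameter as the JVM sees it:
  // the List class only, with no trace of the element type B.
  def erasedParamType: String =
    classOf[A].getConstructors.head.getParameterTypes.head.getName

  // At runtime a non-empty List[Int] and a non-empty List[String] share one class.
  def runtimeClassName(xs: List[Any]): String = xs.getClass.getName
}
```

Here erasedParamType yields the List class name without its type argument, which is why a mapper that needs to know "List of B" has to read the extra signature the Scala compiler stores.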

[WIP] Scala.js RPC
● Scala.js
● Compiles Scala code into JavaScript for web browsers
● airframe-codec: Passing model class data between Scala and Scala.js
[Figure: UserInfo on the Scala server side is packed into MessagePack and unpacked into UserInfo on the Scala.js client side, and vice versa; the link between the two sides is labeled XML RPC]
airframe-codec can also be used to exchange data with Scala.js (the browser side)

[WIP] airframe-sql
● Universal stream SQL engine
● Processing various types of data through MessagePack
[Figure: MessagePack -> Stream SQL query processing (Filter/Aggregation/Join, etc.) -> MessagePack]
Process data in any format with SQL via MessagePack

[Figure labels for airframe-sql, inputs packed into MessagePack: JDBC ResultSet, YAML, JSON]
Scala In Production
A Technical Debt In TD (2015-2016)
● Prestogres: PostgreSQL gateway to Presto
● Enabled using PostgreSQL JDBC/ODBC drivers to access Presto
● So-called Sada (founder)’s magic
● Was good for the first use cases
● Many Problems:
● Hacks around pgpool-II were hard to debug
● Hard to support customers upon errors
● SQL incompatible with Presto
● Nobody could fix these issues
■ including the creator!
The hack called Prestogres had become technical debt

Replacing Prestogres with Prestobase
A Prestobase prototype was built in Scala and released as a service three months later

● Prototyped in Scala within a week after a quick chat with Sada
● Utilizing Airframe assets
● Deployed as a production service in 3 months
airframe-di
● Created a dependency injection library for Scala
● For Prestobase development
● Scala-friendly Syntax
● Useful for combining hundreds of modules
● Based on airframe-surface, airframe-log
● See also:
● Airframe Meetup #1 Report (2018)
Airframe DI for Scala was born during the development of Prestobase

Airframe OSS
● Lightweight Building Blocks for Scala
● Collection of our investments in Scala
● Repackaged into wvlet.airframe in 2016
● airframe-log
● airframe-launcher
● airframe-config
● airframe-surface
● airframe-di
● airframe-codec
● ...
● As of 2019, Airframe has 20 modules
● 35+ releases in 2018
● Already had 17+ releases in 2019
● Contributing to the Scala Community Build
● To test the latest Scala versions
In 2016, our various tools were integrated as Airframe: 20 modules and a frequent release cycle

Monorepo
● Cross build
● For 3 + 1 Scala versions
■ 2.13, 2.12, 2.11, and Scala.js
● 20 modules
■ 4 x 20 = 80 artifacts!
● Challenge
● Publishing took 3 hours with sbt-release
● Bottleneck
● Sequential run of compile -> test -> publish for all artifacts
Airframe uses a single-repository layout to consolidate maintenance

Release Automation on Travis CI
● Single-Step Release
● Triggered by git tag
● Running Tasks In Parallel
● Run tests for each Scala version
● Update doc & release notes
■ Generate release notes from git logs
● Publish
■ sbt-pgp & sbt-sonatype
○ GPG signature
○ Copy to Maven Central
● Finishes in around 10 to 20 minutes
● Blog: 3 Tips For Maintaining Scala Projects
Fully automating releases on Travis CI makes frequent releases possible

sbt-sonatype plugin
● An sbt plugin for releasing projects to Maven Central
● open staging repository -> verify -> close -> promote -> drop
● A small investment
● Made during the 2015 New Year holiday => Paid off by saving Airframe release time
● 3000+ Scala projects are using sbt-sonatype
sbt-sonatype was created during the New Year holiday. It is used by many Scala libraries

airframe-http
● Created a simple HTTP framework
● Based on Airframe modules:
■ airframe-surface
■ airframe-codec
■ airframe-msgpack
■ etc.
● Blog
● Building Low-Friction Web Service Over Finagle
● Save the time spent choosing a web framework:
● Many frameworks exist:
● e.g., Finatra, Finch, akka-http, spring, RESTeasy, open-api, swagger, etc.
Web frameworks can also be built easily by leveraging Airframe assets

airframe-http-client
● Error handling of HTTP requests is difficult
● 4xx, 5xx status codes
● Should we retry the request?
■ IOException, EOFException
■ TimeoutException
■ InterruptedException
■ SSLException
■ InvocationTargetException
● HTTP client
● Request retries
● Response mapping
■ JSON, MessagePack format
● airframe-codec
Turning the error-prone error handling of HTTP requests into a library

airframe-control
● Everything can fail …
● Network disconnection
● Server crash
● ...
● Retry
● Exponential backoff
■ 2x, 4x, ...
● Jittering
■ 1 sec., 2 * rand, 4 * rand, …
● Customize error type classifiers
● Retryable failures
● Non-retryable failures
Turning retry handling into a reusable pattern
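The retry pattern listed above (exponential backoff, jitter, and an error classifier) can be sketched with the standard library alone. This is not the airframe-control API: RetrySketch, backoffMillis, isRetryable, and retry are hypothetical names, the choice of retryable exceptions is an assumption, and unlike the slide's "1 sec., 2 * rand, 4 * rand" sequence this sketch also applies the random factor to the first wait.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Random, Try}

object RetrySketch {
  // Wait before the n-th retry (n = 0, 1, 2, ...): base * 2^n, scaled by a
  // random factor in [0, 1) when jitter is enabled.
  def backoffMillis(attempt: Int, baseMillis: Long, jitter: Boolean, rand: Random): Long = {
    val exponential = baseMillis * (1L << attempt)
    if (jitter) (exponential * rand.nextDouble()).toLong else exponential
  }

  // Hypothetical classifier: which failures are worth retrying.
  def isRetryable(e: Throwable): Boolean = e match {
    case _: java.io.IOException                   => true
    case _: java.util.concurrent.TimeoutException => true
    case _                                        => false
  }

  // Runs `body`, retrying up to `maxRetry` times on retryable failures.
  // Non-retryable failures and exhausted retries are returned as Failure.
  def retry[A](maxRetry: Int, baseMillis: Long, jitter: Boolean = true)(body: => A): Try[A] = {
    val rand = new Random()
    @tailrec
    def loop(attempt: Int): Try[A] = Try(body) match {
      case Failure(e) if attempt < maxRetry && isRetryable(e) =>
        Thread.sleep(backoffMillis(attempt, baseMillis, jitter, rand))
        loop(attempt + 1)
      case result => result
    }
    loop(0)
  }
}
```

Jitter spreads retries from many clients over time so that they do not hit a recovering server in the same instant, and the classifier keeps the loop from repeating requests that can never succeed.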

airframe-http-recorder
● Testing against actual web services is time-consuming
● Record & Replay HTTP responses
● Reproducible results
● Runnable on small machines (e.g., Travis CI)
Record HTTP requests to make testing web services more efficient

[Figure for airframe-http-recorder. Recording mode: an HTTP request goes through the HTTP Recorder to the real web service, and the response is recorded on its way back. Replay mode: the HTTP Recorder answers the request from the recorded responses]
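The record-and-replay idea can be illustrated without any HTTP library. The sketch below is an assumption-laden model, not the airframe-http-recorder API: Request, Response, Recorder, and Replayer are hypothetical types, the "real service" is just a function, and recordings live in memory rather than in a file or database.

```scala
import scala.collection.mutable

object HttpRecorderSketch {
  final case class Request(method: String, path: String)
  final case class Response(status: Int, body: String)

  // Recording mode: forward each new request to the real service and
  // remember the response; repeated requests are served from the recording.
  class Recorder(realService: Request => Response) {
    private val recorded = mutable.LinkedHashMap.empty[Request, Response]
    def apply(req: Request): Response = recorded.getOrElseUpdate(req, realService(req))
    def snapshot: Map[Request, Response] = recorded.toMap
  }

  // Replay mode: answer only from recorded responses and never call the
  // real service; unknown requests get a 404 in this sketch.
  class Replayer(recorded: Map[Request, Response]) {
    def apply(req: Request): Response =
      recorded.getOrElse(req, Response(404, s"no recording for ${req.method} ${req.path}"))
  }
}
```

Tests that run against a Replayer give the same result on every run and need no network access, which is what makes them suitable for small CI machines.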
Data Analysis with Scala
Data-Driven System Optimization
● TD is one of the biggest users of TD
● Query logs
● Collecting all Presto query logs since 2015
● Query statements, performance statistics, logs, etc.
● Logs are our valuable assets
● To understand user activities and enable data-driven optimizations
[Figure: cycle. User queries -> Collect Query Logs -> Logs -> Analyze Query Logs (Machine Learning) -> Query Optimization -> Optimize System]
Collecting and analyzing logs is important for optimizing the system

airframe-fluentd
● Collect Scala Application Logs To Fluentd
● Scala Objects -> MessagePack -> Fluentd
Fluentd accepts MessagePack, so the output of airframe-codec can be passed to it

[Figure for airframe-fluentd: Scala Objects -> airframe-codec -> airframe-fluentd feed the Collect Query Logs step of the cycle Collect Query Logs -> Analyze Query Logs (Machine Learning) -> Query Optimization -> Optimize System]
airframe-jmx
● Add @JMX annotation to your application metrics
● It’s also useful to check the application version, configurations, etc.
● JMX clients can check these metrics
● e.g., jconsole
With JMX, check the application state from outside the JVM and collect metrics

airframe-metrics
● Human-Readable Data Format (ElapsedTime, DataSize, etc.)
● Handy Time Window String Support
Time spans, intervals, and data sizes are put into human-friendly forms to make log analysis more efficient

Taking Snapshots of Data Analysis Tasks
● Save Long-Running Task Results As MessagePack (binary)
● Save the cost of re-computation
[Figure: First run: the task runs and computes (e.g., 10 min) Result: Seq[A], which is packed into MessagePack and saved to storage as a snapshot. Second run: the snapshot is loaded from storage and unpacked, skipping the computation]
Leverage Airframe assets to cache data analysis results and work more efficiently
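The snapshot pattern (compute once, save, load on later runs) can be sketched with the standard library. This is a simplified stand-in: SnapshotSketch and cached are hypothetical names, the result type is fixed to Seq[String], and the snapshot is stored as newline-separated UTF-8 text rather than MessagePack, so values containing newlines are not supported.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object SnapshotSketch {
  // Returns the snapshot stored at `path` if it exists; otherwise runs
  // `compute`, saves its result to `path`, and returns it.
  def cached(path: Path)(compute: => Seq[String]): Seq[String] =
    if (Files.exists(path)) {
      // Second run: load the saved result and skip the computation.
      val text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
      if (text.isEmpty) Seq.empty else text.split("\n", -1).toSeq
    } else {
      // First run: compute (the expensive part) and save a snapshot.
      val result = compute
      Files.write(path, result.mkString("\n").getBytes(StandardCharsets.UTF_8))
      result
    }
}
```

Because `compute` is a by-name parameter, the expensive task is not evaluated at all when a snapshot already exists.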

[Figure: the Airframe module map from the earlier slide, shown again]
Code assets have formed around Airframe

Resolving Technical Debts with Airframe Upgrade
● Migrate common programming patterns into Airframe
● Upgrade Airframe Version
● YY.MM.patch versioning: 19.5.x, 19.6.x, …
■ Easy to see how far behind the latest version a project is
● Reduce code and logic duplication across components
[Figure: Programming (Knowledge, Experiences, Design Decisions) -> OSS -> Outcome (Products, 24/7 Services, Business Values)]
Resolve technical debt as you upgrade Airframe
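With YY.MM.patch version strings, the age of a dependency can be read off by simple arithmetic. The sketch below illustrates that point; CalVer, Version, parse, and monthsBehind are hypothetical names, not part of Airframe or sbt.

```scala
import scala.util.Try

object CalVer {
  final case class Version(yy: Int, mm: Int, patch: Int)

  // Parses strings of the form "YY.MM.patch", e.g. "19.6.0".
  def parse(s: String): Option[Version] = s.split('.') match {
    case Array(y, m, p) => Try(Version(y.toInt, m.toInt, p.toInt)).toOption
    case _              => None
  }

  // Number of months between the release series of `current` and `latest`.
  def monthsBehind(current: Version, latest: Version): Int =
    (latest.yy - current.yy) * 12 + (latest.mm - current.mm)
}
```

For example, a project pinned to 18.12.0 while the latest release is 19.6.2 is six months behind, which is visible from the version numbers alone.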

Scala At Arm Treasure Data
● Scala is now an official language at Arm Treasure Data
● 0 -> 10+ engineers who can write Scala
● Use cases are growing:
● Query optimization, API, Spark, data analysis, storage systems, service operation, etc.
● We are happy to share our Scala assets through Airframe!
Add Your GitHub Star! wvlet/airframe
Arm Treasure Data now has plenty of Scala engineers, and the range of Scala applications is expanding

Presto Conference Tokyo 2019
● July 11 (Thu), 2019, 13:30 ~ (Free)
● https://techplay.jp/event/733772
● Inviting Presto Creators (Martin, Dain, David)
● Presto Software Foundation
● Talks from big Presto users in Japan
● Yahoo! JAPAN, LINE, Arm Treasure Data
● Presto Source Code Navigation
Presto Conference Tokyo 2019 will be held on July 11 (Thu) from 13:30 (free admission)

Thank You!
Danke!
Merci!
谢谢!
ありがとう!
Gracias!
Kiitos!

More Related Content

PDF
Airframe Meetup #3: 2019 Updates & AirSpec
PDF
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
PDF
Presto At Arm Treasure Data - 2019 Updates
PDF
Reading The Source Code of Presto
PDF
Airframe RPC
PDF
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020
PDF
Unifying Frontend and Backend Development with Scala - ScalaCon 2021
PDF
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
Airframe Meetup #3: 2019 Updates & AirSpec
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Presto At Arm Treasure Data - 2019 Updates
Reading The Source Code of Presto
Airframe RPC
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020
Unifying Frontend and Backend Development with Scala - ScalaCon 2021
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020

What's hot (20)

PDF
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17
PDF
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018
PDF
PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engine
PDF
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17
PDF
Managing Machine Learning workflows on Treasure Data
PDF
Recent Changes and Challenges for Future Presto
PDF
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki Kondo
PDF
201810 td tech_talk
PDF
Leveraging open source for large scale analytics
PDF
Introduction to Flink Streaming
PDF
Flink Forward Berlin 2017: Roberto Bentivoglio, Saverio Veltri - NSDB (Natura...
PDF
Productionalizing a spark application
PDF
Functional APIs with Absinthe GraphQL
PDF
Improve data engineering work with Digdag and Presto UDF
PDF
Migrating batch ETLs to streaming Flink
PDF
BlackRay - The open Source Data Engine
PDF
P4 Introduction
PDF
Introduction to Structured streaming
PPT
HDF5 In Support of Database Applications
PPTX
Enabling Java: Windows on Arm64 - A Success Story!
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018
PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engine
Custom Script Execution Environment on TD Workflow @ TD Tech Talk 2018-10-17
Managing Machine Learning workflows on Treasure Data
Recent Changes and Challenges for Future Presto
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki Kondo
201810 td tech_talk
Leveraging open source for large scale analytics
Introduction to Flink Streaming
Flink Forward Berlin 2017: Roberto Bentivoglio, Saverio Veltri - NSDB (Natura...
Productionalizing a spark application
Functional APIs with Absinthe GraphQL
Improve data engineering work with Digdag and Presto UDF
Migrating batch ETLs to streaming Flink
BlackRay - The open Source Data Engine
P4 Introduction
Introduction to Structured streaming
HDF5 In Support of Database Applications
Enabling Java: Windows on Arm64 - A Success Story!
Ad

Similar to How To Use Scala At Work - Airframe In Action at Arm Treasure Data (20)

PDF
Five cool ways the JVM can run Apache Spark faster
PDF
Apache Big Data Europe 2016
PPTX
Kubernetes is hard! Lessons learned taking our apps to Kubernetes - Eldad Ass...
PDF
Revisit Dependency Injection in scala
PDF
Apache Spark Performance Observations
PDF
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
PPTX
Make your data fly - Building data platform in AWS
PDF
Make your PySpark Data Fly with Arrow!
PPTX
Optimizing your SparkML pipelines using the latest features in Spark 2.3
PDF
Build Low-Latency Applications in Rust on ScyllaDB
PDF
IBM Runtimes Performance Observations with Apache Spark
PPTX
Interactive Analytics using Apache Spark
PDF
NFF-GO (YANFF) - Yet Another Network Function Framework
PPTX
Using LLVM to accelerate processing of data in Apache Arrow
PDF
Separation of Concerns through APIs: the Essence of #SmartDB
PPTX
Functions and DevOps
PDF
Spark Intro @ analytics big data summit
PDF
Running Spark In Production in the Cloud is Not Easy with Nayur Khan
PPTX
Building Machine Learning Inference Pipelines at Scale (July 2019)
PDF
“Quantum” Performance Effects: beyond the Core
Five cool ways the JVM can run Apache Spark faster
Apache Big Data Europe 2016
Kubernetes is hard! Lessons learned taking our apps to Kubernetes - Eldad Ass...
Revisit Dependency Injection in scala
Apache Spark Performance Observations
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
Make your data fly - Building data platform in AWS
Make your PySpark Data Fly with Arrow!
Optimizing your SparkML pipelines using the latest features in Spark 2.3
Build Low-Latency Applications in Rust on ScyllaDB
IBM Runtimes Performance Observations with Apache Spark
Interactive Analytics using Apache Spark
NFF-GO (YANFF) - Yet Another Network Function Framework
Using LLVM to accelerate processing of data in Apache Arrow
Separation of Concerns through APIs: the Essence of #SmartDB
Functions and DevOps
Spark Intro @ analytics big data summit
Running Spark In Production in the Cloud is Not Easy with Nayur Khan
Building Machine Learning Inference Pipelines at Scale (July 2019)
“Quantum” Performance Effects: beyond the Core
Ad

More from Taro L. Saito (17)

PDF
Tips For Maintaining OSS Projects
PDF
Learning Silicon Valley Culture
PDF
Presto At Treasure Data
PDF
Scala at Treasure Data
PDF
Introduction to Presto at Treasure Data
PDF
Workflow Hacks #1 - dots. Tokyo
PDF
Presto @ Treasure Data - Presto Meetup Boston 2015
PDF
Presto As A Service - Treasure DataでのPresto運用事例
PPTX
JNuma Library
PDF
Presto as a Service - Tips for operation and monitoring
PDF
Treasure Dataを支える技術 - MessagePack編
PDF
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
PPTX
Spark Internals - Hadoop Source Code Reading #16 in Japan
PPTX
Streaming Distributed Data Processing with Silk #deim2014
PDF
Silkによる並列分散ワークフロープログラミング
PDF
2011年度 生物データベース論 2日目 木構造データ
PDF
Relational-Style XML Query @ SIGMOD-J 2008 Dec.
Tips For Maintaining OSS Projects
Learning Silicon Valley Culture
Presto At Treasure Data
Scala at Treasure Data
Introduction to Presto at Treasure Data
Workflow Hacks #1 - dots. Tokyo
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto As A Service - Treasure DataでのPresto運用事例
JNuma Library
Presto as a Service - Tips for operation and monitoring
Treasure Dataを支える技術 - MessagePack編
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Spark Internals - Hadoop Source Code Reading #16 in Japan
Streaming Distributed Data Processing with Silk #deim2014
Silkによる並列分散ワークフロープログラミング
2011年度 生物データベース論 2日目 木構造データ
Relational-Style XML Query @ SIGMOD-J 2008 Dec.

Recently uploaded (20)

PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
Approach and Philosophy of On baking technology
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPT
Teaching material agriculture food technology
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
KodekX | Application Modernization Development
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
Network Security Unit 5.pdf for BCA BBA.
PPTX
Big Data Technologies - Introduction.pptx
PPTX
Programs and apps: productivity, graphics, security and other tools
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Approach and Philosophy of On baking technology
Dropbox Q2 2025 Financial Results & Investor Presentation
Teaching material agriculture food technology
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Digital-Transformation-Roadmap-for-Companies.pptx
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
KodekX | Application Modernization Development
Chapter 3 Spatial Domain Image Processing.pdf
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Understanding_Digital_Forensics_Presentation.pptx
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
Unlocking AI with Model Context Protocol (MCP)
MIND Revenue Release Quarter 2 2025 Press Release
Reach Out and Touch Someone: Haptics and Empathic Computing
Review of recent advances in non-invasive hemoglobin estimation
Network Security Unit 5.pdf for BCA BBA.
Big Data Technologies - Introduction.pptx
Programs and apps: productivity, graphics, security and other tools

How To Use Scala At Work - Airframe In Action at Arm Treasure Data

  • 1. Taro L. Saito, Ph.D. Arm Treasure Data June 29, 2019 Scala Matsuri 2019 - Tokyo How To Use Scala At Work Airframe In Action At Arm Treasure Data 1calaを仕事で使おう - Arm reasure DataでのAirframe活用事例

  • 2. Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved. About Me: Taro L. Saito (Leo) 2 ● Principal Software Engineer at Arm Treasure Data ● Building distributed query engine service ● Living in US for 4 years ● DBMS & Data Science Background ● Ph.D. of Computer Science ● Database Systems and Genome Sciences Research ● Assistant Professor at the University of Tokyo ● OSS Projects Around Scala ● sbt-sonatype: used for releasing 3000+ Scala projects ● snappy-java: a compression library used in Spark, Parquet, etc. 自己紹介

  • 3. Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved. New Release from O’Reilly Japan ● Helped Japanese translation of Data-Intensive Application Design ● Techniques and concepts around distributed data processing systems ● Available at Amazon.co.jp and O’Reilly Japan web sites ● will be published on July 18, 2019 3 分散データシステム入門の決定版の翻訳が来月発売

  • 4. 400+ Customers Founded in 2011 Raised $54M Security Acquired by Arm / Softbank 2018 Arm Treasure Data Arm reasure Dataの概要

  • 5. Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved. The Architecture of Arm Treasure Data 5 DataLogs Device Data Batch Data PlazmaDB Table Schema Data Collection Cloud Storage Distributed Data Processing 2 million records / sec. 130 trillion records 1 billion rows processed / sec. Jobs Job Management SQL Editor Scheduler Workflows Machine Learning Treasure Data OSS Third Party OSS reasure Dataのシステム構成。 calaはどこに?

  • 6. Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved. Module Mix-InPackaging HTTP Requests and Responses Data airframe-launcher > _ airframe-log production: port: 10010 user: xxxx ... airframe-config airframe-codec sbt-pack airframe-fluentd Scala Objects Table Data (CSV, TSV) JSON airframe-jsonairframe-surface airframe-tablet airframe-jmx Monitor Runtime States Generate Mapping Codec Metrics & Log Data JDBC ResultSets airframe-jdbc airframe-http airframe-http-finagle Launch HTTP Services airframe DI Debug Logs Schema-On-Read Mapping Airframe サービスの裏側で使われているAirframe ( cala製 ) のモジュール群

  • 7. Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved. Our OSS Strategy Around Scala ● Gather the best practices of Scala into Airframe OSS ● Get the real experiences by operating 24/7 services 7 Knowledge Experiences Design Decisions Products 24/7 Services Business Values Programming OSS Outcome Airframeを核にした cala周辺の 戦略
 Airframe
  • 8. Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved. ● Various internal and third-party Scala/Java libraries ● Managed in different repositories, different release cycles ● High-learning cost ■ The knowledge is confined to engineers’ brains 3 Years Ago... 8 Knowledge Experiences Design Decisions Products 24/7 Services Business Values Programming Various Libraries Outcome 3年前、Airframeは存在せず、様々なライブラリが混在していた
 logger launcher object mapper JDBC reader json4s jackson ….
  • 9. Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved. 5 Years Ago... ● No Scala engineer in the company ● Scala in 2014: Scala 2.9.x ● Was not good enough to use: ■ e.g., no string interpolation like s”... ${x}...” 9 Knowledge Experiences Design Decisions Products 24/7 Services Business Values Programming Ruby, Java Outcome 5年前には calaのエンジニアも、 calaのコードもなかった

  • 10. Today’s Agenda ● How to introduce Scala to your company ● Learn the best practices of using Scala at work ● From 20 Airframe modules What this talk covers.
  • 11. How Can We Introduce Scala? ● Saying “I want to use Scala” ● It will not work, especially if you or your team are not familiar with Scala ● Your managers need more information to judge whether it is good enough ● Even if you are a tech lead: ● You need some confidence in using Scala in production ● How can we establish such confidence in using Scala? How should we introduce Scala? How do we gain the confidence that it is fine to use Scala?

  • 12. Start With A Small Investment in Scala ● Guidelines ● Think about how you can save time with Scala ● If you can save 1 minute a day, you can spend 6 hours on that improvement ■ Save 1 minute / day = 365 minutes / year = 6 hour investment ■ Save 10 minutes / week = 520 minutes / year = 8.7 hour investment ■ Save 1 hour / week = 52 hours / year = 2.2 day investment ● Time is your most valuable asset ● Save your time by using Scala Start making small investments that save time by using Scala.
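The break-even arithmetic on this slide can be checked with a few lines of Scala. This is only an illustrative sketch: the object and method names are invented for this example and do not come from the talk, and it assumes the investment should pay for itself within one year.

```scala
// Hypothetical helper (not from the talk): converts a recurring time saving
// into the yearly budget you could invest up front and still break even.
object TimeInvestment {
  /** Minutes saved in one year when `minutes` are saved `timesPerYear` times. */
  def minutesPerYear(minutes: Double, timesPerYear: Int): Double =
    minutes * timesPerYear

  def toHours(minutes: Double): Double = minutes / 60.0
  def toDays(minutes: Double): Double  = minutes / (60.0 * 24.0)

  def main(args: Array[String]): Unit = {
    val daily  = minutesPerYear(1, 365) // 1 minute / day
    val weekly = minutesPerYear(10, 52) // 10 minutes / week
    val hourly = minutesPerYear(60, 52) // 1 hour / week
    println(f"${toHours(daily)}%.1f hours, ${toHours(weekly)}%.1f hours, ${toDays(hourly)}%.1f days")
  }
}
```

The days figure uses 24-hour days, which matches the slide's "52 hours = 2.2 days".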

  • 13. The First Scala Code in TD ● prestop (presto + top) ● Not production service code ● A handy query monitoring tool for Presto, written in Scala ● Displays complex JSON data with fancy ANSI colors The first Scala program at Treasure Data.

  • 14. airframe-log ● Scala 2.10: my small investment to try Scala macros and string interpolation ● A Modern Logging Library for Scala (at Medium) ● ANSI color and source code location display ● Just add the LogSupport trait to your class Speed up program development with log messages.

  • 15. airframe-launcher ● Needed to handle complex command-line options and nested commands ● e.g., $ prestop -e production monitor (other options …) ● Enabled annotation-based command-line definitions Makes complex command-line programs easy to build.

  • 16. airframe-config: Application Configuration Flow ● YAML config (embedded into Docker) ● Override credentials, then bind to config objects [Diagram: airframe-launcher passes the environment (command: -e production) to select a section of the YAML config (development: addr: api-dev.com; production: addr: api.com). The selected section (production: addr: api.com) is merged with credentials and local configurations, and object mapping applies default parameters (e.g., port = 8080) to produce an immutable config object: case class ServerConfig(addr: String, port: Int = 8080, password: String).] The application configuration flow, turned into a library.
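The select, merge, and bind steps of this flow can be sketched in plain Scala. This is an assumption-laden illustration, not airframe-config's API: nested maps stand in for the parsed YAML, and `ConfigFlow` and `load` are hypothetical names. Only `ServerConfig` and its fields come from the slide.

```scala
// Config object from the slide: immutable, with a default port
final case class ServerConfig(addr: String, port: Int = 8080, password: String)

// Hypothetical sketch: plain maps stand in for parsed YAML (not airframe-config's API)
object ConfigFlow {
  // Stand-in for the YAML file embedded into the Docker image, keyed by environment
  val yaml: Map[String, Map[String, String]] = Map(
    "development" -> Map("addr" -> "api-dev.com"),
    "production"  -> Map("addr" -> "api.com")
  )

  /** Select an environment, overlay credentials and local settings, then bind to an object. */
  def load(env: String, overrides: Map[String, String]): ServerConfig = {
    // Later entries win, so overrides replace values from the YAML section
    val merged = yaml.getOrElse(env, Map.empty[String, String]) ++ overrides
    // Throws NoSuchElementException if addr or password is missing after the merge
    val base = ServerConfig(addr = merged("addr"), password = merged("password"))
    // Keep the case class default (port = 8080) unless a port was configured
    merged.get("port").fold(base)(p => base.copy(port = p.toInt))
  }
}
```

For example, `ConfigFlow.load("production", Map("password" -> "secret"))` selects the production section, adds the credential, and leaves the port at its default.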
  • 17. sbt-pack plugin ● An sbt plugin to create standalone Scala packages ● A single folder package with bin and lib folders containing all dependent JARs ● Generates command-line launcher scripts ● My small investment in 2012 to save packaging time [Diagram: airframe-launcher, airframe-config, and a YAML config file -> sbt-pack -> standalone Scala package -> Dockerfile.] Package programs with sbt-pack and build Docker images with little effort.
  • 18. Medium-Size Investment: Find A Common Pattern ● Extract a common problem pattern and create a solution ● Data -> Object Mapping ● How many data readers and object mappers do we need? ● How can we save time handling such varied data types? [Diagram: YAML -> YAML parser + object mapper -> config object; JDBC ResultSet -> object-relation mapper -> table object; JSON -> JSON parser + object mapper -> object.] Mapping input data to Scala objects is a frequent need. It calls for a medium-term investment.

  • 19. airframe-msgpack: MessagePack as Universal Data Format ● MessagePack (msgpack.org) ● Compact JSON-like binary format ● Describes data types and data values at the same time (self-describing) [Diagram: JDBC ResultSet, YAML, and JSON are packed into MessagePack; objects are packed to and unpacked from MessagePack.] With MessagePack as the intermediate format, a single object mapper implementation is enough.
  • 20. PlazmaDB: MessagePack DBMS ● Fluentd -> MessagePack -> Arm Treasure Data ● Automatically generates table schema from MessagePack data ● Applies schema-on-read to provide table data for Presto/Hive/Spark, etc. [Diagram: schema-free data from data collection updates the table schema; the schema generates a reader set (Int column reader, String column reader, ...) that forms the table reader used by distributed data processing.] Arm Treasure Data is a MessagePack-based schema-on-read system.

  • 21. Schema-On-Read Data Processing with MessagePack ● Users can store arbitrarily typed data (no table design is required) ● Data can be read as the target type required by the application (e.g., a SQL query) [Diagram: data from logs, CSV, and command-line arguments is packed as Int, Float, Boolean, String, Array, Map, or Binary values. IntCodec unpacks them as a SQL BigInt: a String goes through parseInt ("100" (string) -> 100 (int)), a Float through toInt, a Boolean becomes 0 or 1, and values that cannot be converted yield an error or null.] When reading data, convert it to the type the application requires (schema-on-read).
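The coercion rules in this diagram can be sketched as a single pattern match. This is a minimal illustration, not the airframe-codec implementation: the `Value` ADT is a hypothetical stand-in for self-describing MessagePack values, and `None` stands in for the "error or null" case.

```scala
import scala.util.Try

// Hypothetical sketch of schema-on-read: the reader decides the target type
object SchemaOnRead {
  // A tiny stand-in for self-describing MessagePack values
  sealed trait Value
  final case class IntValue(v: Long)         extends Value
  final case class FloatValue(v: Double)     extends Value
  final case class BoolValue(v: Boolean)     extends Value
  final case class StringValue(v: String)    extends Value
  final case class ArrayValue(v: Seq[Value]) extends Value

  /** Read any stored value as an integer, whatever type it was written with. */
  def readAsLong(value: Value): Option[Long] = value match {
    case IntValue(v)    => Some(v)
    case FloatValue(v)  => Some(v.toLong)              // truncates the fraction
    case BoolValue(v)   => Some(if (v) 1L else 0L)     // 0 or 1
    case StringValue(s) => Try(s.trim.toLong).toOption // parse, or give up
    case ArrayValue(_)  => None                        // no sensible integer
  }
}
```

The writer never declares a schema here. The same stored value could equally be read as a string or a double by a different reader function.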
  • 22. airframe-codec: Schema-On-Read Pack/Unpack Interface ● Apply schema-on-read to Scala objects [Diagram: input and output are connected through MessagePack by pack and unpack operations.] Brings the schema-on-read data conversion interface to Scala by way of MessagePack.

  • 23. Pre-defined Codecs in airframe-codec ● Primitive Codecs ● ByteCodec, CharCodec, ShortCodec, IntCodec, LongCodec ● FloatCodec, DoubleCodec ● StringCodec ● BooleanCodec ● TimeStampCodec ● Collection Codecs ● ArrayCodec, SeqCodec, ListCodec, IndexSeqCodec, MapCodec, etc. ● OptionCodec ● JsonCodec (airframe-json) ● Java-specific Codecs ● FileCodec, ZonedDateTimeCodec, JDBCResultSetCodec, etc. ● Adding Custom Codecs ● Implement the MessageCodec[X] interface Supports mapping to almost every data type needed in Scala.

  • 24. MessageCodec.of[A]: Combination of Codecs [Diagram: MessageCodec.of[A] combines IntCodec, StringCodec, and DoubleCodec to pack to and unpack from MessagePack.] Codecs are composed to match the type of the object.
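The idea of building an object codec out of primitive codecs can be sketched as follows. This is a toy model, not airframe-codec: a list of text fields stands in for MessagePack bytes, the `Codec` trait and `Person` class are hypothetical, and the composition is written by hand, whereas the slide's MessageCodec.of[A] selects the member codecs from the type of A.

```scala
import scala.util.Try

// Hypothetical sketch of codec composition (text fields stand in for MessagePack)
object CodecComposition {
  trait Codec[A] {
    def pack(a: A): List[String]
    /** Returns the decoded value and the fields that were not consumed. */
    def unpack(fields: List[String]): Option[(A, List[String])]
  }

  /** A codec for a single field, given a parser for its text form. */
  def primitive[A](parse: String => A): Codec[A] = new Codec[A] {
    def pack(a: A): List[String] = List(a.toString)
    def unpack(fields: List[String]): Option[(A, List[String])] = fields match {
      case head :: tail => Try(parse(head)).toOption.map(v => (v, tail))
      case Nil          => None
    }
  }

  val intCodec: Codec[Int]       = primitive[Int](_.toInt)
  val stringCodec: Codec[String] = primitive[String](identity)
  val doubleCodec: Codec[Double] = primitive[Double](_.toDouble)

  final case class Person(id: Int, name: String, score: Double)

  // The object codec delegates each member to the codec for that member's type
  val personCodec: Codec[Person] = new Codec[Person] {
    def pack(p: Person): List[String] =
      intCodec.pack(p.id) ++ stringCodec.pack(p.name) ++ doubleCodec.pack(p.score)

    def unpack(fields: List[String]): Option[(Person, List[String])] =
      for {
        (id, r1)    <- intCodec.unpack(fields)
        (name, r2)  <- stringCodec.unpack(r1)
        (score, r3) <- doubleCodec.unpack(r2)
      } yield (Person(id, name, score), r3)
  }
}
```

Because each member codec reports the fields it did not consume, codecs can be chained in order without knowing about each other.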

  • 25. airframe-surface ● Reading Type Signatures From ScalaSig ● The Scala compiler embeds Scala type signatures (ScalaSig) into class files ● Surface.of[A] ■ returns A’s parameter names and types [Diagram: for class A(data: List[B]), javac produces class A with data: List[java.lang.Object], because type erasure removes generic type information. scalac produces the same erased class plus ScalaSig: data: List[B]. Surface.of[A] returns data: List[B]; scala.reflect.runtime.universe.TypeTag is shown for comparison.] Obtains the type information of objects from ScalaSig.
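The erasure problem that motivates airframe-surface can be observed with plain Java reflection. The sketch below does not use Surface.of[A]; it only shows what the raw parameter types look like at runtime. `ErasureDemo` and `erasedParamTypes` are names made up for this example.

```scala
// Hypothetical demo: what the JVM's raw reflection API reports after type erasure
object ErasureDemo {
  class B
  class A(val data: List[B])

  /** Raw constructor parameter types of `cls`, as fully qualified class names. */
  def erasedParamTypes(cls: Class[_]): Seq[String] =
    cls.getConstructors.head.getParameterTypes.map(_.getName).toSeq

  def main(args: Array[String]): Unit =
    // The element type B does not appear in the result; only the raw List class does
    println(erasedParamTypes(classOf[A]))
}
```

A mapper that relies only on these raw types cannot tell which codec to use for the list elements, which is why the element type has to be recovered from another source such as ScalaSig.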

  • 26. [WIP] Scala.js RPC ● Scala.js ● Compiles Scala code into JavaScript for web browsers ● airframe-codec: passes model class data between Scala and Scala.js [Diagram: UserInfo on the Scala server side is packed into MessagePack and unpacked as UserInfo on the Scala.js client side, and in the reverse direction. Labels: XML RPC.] airframe-codec can also be used to exchange data with Scala.js on the browser side.

  • 27. [WIP] airframe-sql ● Universal stream SQL engine ● Processes various types of data through MessagePack [Diagram: JDBC ResultSet, YAML, and JSON are packed into a MessagePack stream; SQL query processing (filter, aggregation, join, etc.) runs over the stream and produces MessagePack.] Process any data format with SQL by way of MessagePack.
  • 28. Scala In Production
  • 29. A Technical Debt In TD (2015-2016) ● Prestogres: PostgreSQL gateway to Presto ● Enabled using PostgreSQL JDBC/ODBC drivers to access Presto ● So-called magic by Sada (founder) ● Was good for the first use cases ● Many problems: ● Hacks around pgpool-II were hard to debug ● Hard to support customers upon errors ● SQL incompatible with Presto ● Nobody could fix these issues ■ including the creator! The hack called Prestogres had become technical debt.

  • 30. Replacing Prestogres with Prestobase ● Prototyped in Scala within a week after a quick chat with Sada ● Utilizing Airframe assets ● Deployed as a production service in 3 months Prototyped Prestobase in Scala and released it as a service three months later.
  • 31. airframe-di ● Created a dependency injection library for Scala ● For Prestobase development ● Scala-friendly syntax ● Useful for combining hundreds of modules ● Based on airframe-surface, airframe-log ● See also: ● Airframe Meetup #1 Report (2018) Airframe DI for Scala was born during the development of Prestobase.

  • 32. Airframe OSS ● Lightweight Building Blocks for Scala ● A collection of our investments in Scala ● Repackaged into wvlet.airframe in 2016 ● airframe-log ● airframe-launcher ● airframe-config ● airframe-surface ● airframe-di ● airframe-codec ● ... ● As of 2019, Airframe has 20 modules ● 35+ releases in 2018 ● Already 17+ releases in 2019 ● Contributing to the Scala Community Build ● To test the latest Scala versions In 2016 we consolidated our tools into Airframe: 20 modules and a frequent release cycle.
  • 33. Monorepo ● Cross build ● For 3 + 1 Scala versions ■ 2.13, 2.12, 2.11, and Scala.js ● 20 modules ■ 4 x 20 = 80 artifacts! ● Challenge ● Publishing took 3 hours with sbt-release ● Bottleneck ● Sequential run of compile -> test -> publish for all artifacts Airframe uses a single repository to centralize maintenance.

  • 34. Release Automation on Travis CI ● Single-Step Release ● Triggered by a git tag ● Running Tasks In Parallel ● Run tests for each Scala version ● Update docs & release notes ■ Generate release notes from git logs ● Publish ■ sbt-pgp & sbt-sonatype ○ GPG signature ○ Copy to Maven Central ● Finishes in around 10 to 20 minutes ● Blog: 3 Tips For Maintaining Scala Projects Releases are fully automated on Travis CI, which makes frequent releases possible.

  • 35. sbt-sonatype plugin ● An sbt plugin for releasing projects to Maven Central ● open staging repository -> verify -> close -> promote -> drop ● A small investment ● Made during the 2015 New Year holiday => paid off by saving Airframe release time ● 3000+ Scala projects are using sbt-sonatype sbt-sonatype was built over the New Year holiday and is now used by many Scala libraries.

  • 36. airframe-http ● Created a simple HTTP framework ● Based on Airframe modules: ■ airframe-surface ■ airframe-codec ■ airframe-msgpack ■ etc. ● Blog ● Building Low-Friction Web Service Over Finagle ● Saves the time of choosing a web framework: ● Many frameworks exist: ● e.g., Finatra, Finch, akka-http, spring, RESTeasy, open-api, swagger, etc. With the Airframe assets in place, building a web framework was easy as well.

  • 37. airframe-http-client ● Error handling of HTTP requests is difficult ● 4xx, 5xx status codes ● Should we retry the request? ■ IOException, EOFException ■ TimeoutException ■ InterruptedException ■ SSLException ■ InvocationTargetException ● HTTP client ● Request retries ● Response mapping ■ JSON, MessagePack format ● airframe-codec Error handling for HTTP requests is easy to get wrong, so we turned it into a library.

  • 38. airframe-control ● Everything can fail … ● Network disconnection ● Server crash ● ... ● Retry ● Exponential backoff ■ 2x, 4x, ... ● Jittering ■ 1 sec., 2 * rand, 4 * rand, … ● Customize error type classifiers ● Retryable failures ● Non-retryable failures Retry handling, captured as a reusable pattern.
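The three ingredients on this slide (exponential backoff, jitter, and an error classifier) fit in a short standard-library sketch. It is not airframe-control's API: `RetrySketch`, `backoffMillis`, and `retry` are hypothetical names, and this version applies the random factor to every wait, including the first one.

```scala
import scala.util.{Failure, Random, Try}

// Hypothetical sketch of retry with exponential backoff and jitter
object RetrySketch {
  /** Wait before retry number `attempt` (0, 1, 2, ...): base * 2^attempt * random factor in [0, 1). */
  def backoffMillis(attempt: Int, baseMillis: Long, rand: Random): Long =
    (baseMillis * math.pow(2.0, attempt.toDouble) * rand.nextDouble()).toLong

  /** Run `body`, retrying only failures accepted by `isRetryable`, at most `maxRetry` times. */
  def retry[A](
      maxRetry: Int,
      baseMillis: Long,
      isRetryable: Throwable => Boolean,
      rand: Random = new Random()
  )(body: => A): Try[A] = {
    def loop(attempt: Int): Try[A] = Try(body) match {
      case Failure(e) if attempt < maxRetry && isRetryable(e) =>
        Thread.sleep(backoffMillis(attempt, baseMillis, rand))
        loop(attempt + 1)
      case result => result // success, non-retryable failure, or retries exhausted
    }
    loop(0)
  }
}
```

A caller might classify `java.io.IOException` as retryable and everything else as fatal, for example `RetrySketch.retry(3, 1000L, _.isInstanceOf[java.io.IOException]) { callService() }`, where `callService` is a placeholder for the real request.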

  • 39. airframe-http-recorder ● Testing against actual web services is time-consuming ● Record & replay HTTP responses ● Reproducible results ● Runnable on small machines (e.g., Travis CI) [Diagram: Recording mode: HTTP request -> HTTP recorder -> request -> real web service; the response is returned and stored in the recorded responses. Replay mode: HTTP request -> HTTP recorder, which answers the request from the recorded responses.] Record HTTP requests to make testing against web services more efficient.
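The two modes in this diagram can be modeled with an in-memory map. This is a conceptual sketch only, not airframe-http-recorder: `Request`, `Response`, and `Recorder` are hypothetical types, and a real recorder would persist the responses so that later test runs can replay them.

```scala
import scala.collection.mutable

// Hypothetical sketch of record & replay for HTTP responses
object RecorderSketch {
  final case class Request(method: String, path: String)
  final case class Response(status: Int, body: String)

  class Recorder(realService: Request => Response) {
    private val recorded = mutable.Map.empty[Request, Response]

    /** Recording mode: forward to the real service and remember the response. */
    def record(req: Request): Response = {
      val res = realService(req)
      recorded.update(req, res)
      res
    }

    /** Replay mode: answer from the recorded responses without touching the service. */
    def replay(req: Request): Option[Response] = recorded.get(req)
  }
}
```

Because replay never calls the real service, tests become reproducible and can run on a small CI machine.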
  • 40. Data Analysis with Scala
  • 41. Data-Driven System Optimization ● TD is one of the biggest users of TD ● Query logs ● Collecting all Presto query logs since 2015 ● Query statements, performance statistics, logs, etc. ● Logs are our valuable assets ● To understand user activities and enable data-driven optimizations [Diagram: user query -> collect query logs -> analyze query logs -> machine learning / query optimization -> optimize system.] Collecting and analyzing logs is essential for optimizing the system.

  • 42. airframe-fluentd ● Collect Scala application logs into Fluentd ● Scala objects -> MessagePack -> Fluentd [Diagram: Scala objects -> airframe-codec -> airframe-fluentd -> collect query logs -> analyze query logs -> machine learning / query optimization -> optimize system.] Fluentd accepts MessagePack, so the output of airframe-codec can be passed to it directly.
  • 43. airframe-jmx ● Add the @JMX annotation to your application metrics ● Also useful for checking the application version, configurations, etc. ● JMX clients can check these metrics ● e.g., jconsole With JMX, check the application state and collect metrics from outside the JVM.

  • 44. airframe-metrics ● Human-readable data formats (ElapsedTime, DataSize, etc.) ● Handy time window string support Puts time spans, intervals, and data sizes into a form that is easy for humans to handle, which speeds up log analysis.

  • 45. Taking Snapshots of Data Analysis Tasks ● Save long-running task results as MessagePack (binary) ● Save the cost of re-computation [Diagram: First run: the task computes (e.g., 10 min) a result: Seq[A], which is packed and saved as a snapshot in MessagePack storage. Second run: the snapshot is loaded and unpacked instead of being recomputed.] Use the Airframe assets to cache data analysis results and work more efficiently.
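The first-run and second-run behavior can be sketched with a file-backed cache. This is a simplified stand-in for the approach on the slide: it stores lines of UTF-8 text rather than MessagePack, it assumes the elements contain no newlines and are not empty strings, and `SnapshotSketch.cached` is a hypothetical name.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical sketch of task snapshots (text lines stand in for MessagePack)
object SnapshotSketch {
  /** Load the saved result if `path` exists; otherwise compute it once and save it. */
  def cached(path: Path)(compute: => Seq[String]): Seq[String] =
    if (Files.exists(path)) {
      // Second run: load the snapshot instead of recomputing
      val text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
      if (text.isEmpty) Seq.empty else text.split("\n").toSeq
    } else {
      // First run: pay the computation cost, then save the snapshot
      val result = compute
      Files.write(path, result.mkString("\n").getBytes(StandardCharsets.UTF_8))
      result
    }
}
```

Since `compute` is a by-name parameter, the expensive block is not evaluated at all when a snapshot already exists.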

  • 46. [Diagram: the overview of Airframe modules from slide 6, shown again: airframe-launcher, airframe-log, airframe-config, airframe-codec, airframe-surface, airframe-json, airframe-tablet, airframe-jdbc, airframe-http, airframe-http-finagle, airframe-jmx, airframe-fluentd, airframe DI, and sbt-pack, arranged around Scala objects.] Our code assets have formed around Airframe.

  • 47. Resolving Technical Debts with Airframe Upgrade ● Migrate common programming patterns into Airframe ● Upgrade the Airframe version ● YY.MM.patch versioning: 19.5.x, 19.6.x, … ■ Easy to see how far a project is behind the latest version ● Reduce code and logic duplication across components [Diagram: Knowledge, Experiences, Design Decisions -> Programming (Airframe OSS) -> Outcome: Products, 24/7 Services, Business Values.] Technical debt is paid down each time Airframe is upgraded.
  • 48. Scala At Arm Treasure Data ● Scala is now an official language at Arm Treasure Data ● 0 -> 10+ engineers who can write Scala ● Use cases are growing: ● Query optimization, API, Spark, data analysis, storage systems, service operation, etc. ● We are happy to share our Scala assets through Airframe! ● Add your GitHub star: wvlet/airframe Arm Treasure Data now has a solid group of Scala engineers, and the range of Scala use cases keeps growing.

  • 49. Presto Conference Tokyo 2019 ● July 11 (Thu), 2019, 13:30 ~ (Free) ● https://techplay.jp/event/733772 ● Inviting Presto creators (Martin, Dain, David) ● Presto Software Foundation ● Talks from big Presto users in Japan ● Yahoo! JAPAN, LINE, Arm Treasure Data ● Presto Source Code Navigation Presto Conference Tokyo 2019 will be held on July 11 (Thu) from 13:30 (free admission).

  • 50. Thank You! Danke! Merci! 谢谢! ありがとう! Gracias! Kiitos!