Open Source Software,
Distributed Systems,
Database as a Cloud Service
106th Open Source Salon, General Meeting Commemorative Lecture
Jul 29, 2016
Satoshi Tagomori (@tagomoris)
Satoshi "Moris" Tagomori
(@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
Topics
• What is Treasure Data?
• Who is tagomoris?
• Treasure Data: Database as a Service
• DB as a Service and Distributed Systems
• Distributed Systems and Open Source Software
• Open Source Software and Developers
http://rubybiz.jp/prize.html
[Diagram: Without Treasure Data]
• Data sources: Sensor, ERP, CRM, RDBMS, Mobile, Web Server
• 1 Disparate data silos
• 2 Time consuming integration
• 3 Complex ETL
• 4 End User System
• Access: API, Data Marts, ODBC / JDBC
• Consumers: Advanced Analytics, Reporting, BI
[Diagram: With Treasure Data]
• 300+ Data Sources: Sensor, ERP, CRM, RDBMS, Mobile, Web Server
• 1 Easy to Collect: IoT Connectors, Data Connectors, JavaScript SDK, Serverside collector, Bulk Loader, Mobile SDK
• 2 Zero Management: Multi-Tenant Cloud Service (Schema-flexible, Access via SQL, Unlimited Users, Queries)
• 3 Easy to Integrate: 50+ Data Outputs, 50+ Integrations
• Access: API, Data Marts, ODBC / JDBC
• Consumers: Advanced Analytics, Reporting, BI
[Map labels: HQ, Branch, Matsue]
Treasure Data, Inc.
• Since Nov 2011
• Headquarters: Mountain View, CA, US
• Japan Branch: Marunouchi, Chiyoda, Tokyo
• Korea Branch: Gangnam, Seoul
• Some remote workers - US, UK, Costa Rica
Developers in TD
• Daily development in each office
• Communication over Internet
• Slack, JIRA, Confluence & Zoom
• Frontend Team: mainly in US
• Console, Web services, etc
• Backend Team: mainly in JP
• Database, Distributed processing systems, etc
Satoshi "Moris" Tagomori
(@tagomoris)
Born in Matsue, Shimane
Living in Tokyo since 1999
Started to work
as an OSS developer
1. Asahi Net
Internal system developer
2. NTT DATA Intellilink
System consultant
3. livedoor - NHNJ - LINE
Infrastructure engineer
Data analytics platform
engineer
4. Treasure Data
Backend engineer
OSS developer
@tagomoris as
an Open Source Software Developer
• Author
• Norikra, Woothee, xbuild, Shib, Yabitz, Focuslight
• Many fluent-plugin-*
• And many libraries, tools, etc
• Committer, Maintainer
• Fluentd, MessagePack-Ruby, etc
• Contributor
• Docker (logging driver), etc
@tagomoris as
an Open Source Software Developer
• Talks
• Many programming conferences (local, global)
• Many small meetups
• Articles
• WEB+DB Magazine, Software Design
• Many blog posts
• Event created: ISUCON
OSS Developers in TD
• MessagePack, Fluentd, Embulk & Digdag founder
• Ruby committer
• Ruby & JRuby committer
• Fluentd & D-language committer
• Hadoop/Spark contributor, pyenv author, ...
Why Are OSS Developers
So Prominent in TD?
Treasure Data:
Database as a Cloud
Service
[Diagram: With Treasure Data (data sources, collectors, Multi-Tenant Cloud Service, data outputs and integrations)]
Database as a Cloud Service
• Collect data
• from remote site - customer side
• Store/Process data
• beyond cloud
• Integrate data
• to remote site - customer side
Two OSS Patterns in TD
• OSS to collect/integrate data from/to remote site
• OSS to store/process data
[Diagram: With Treasure Data (data sources, collectors, Multi-Tenant Cloud Service, data outputs and integrations)]
Make Input/Output Easy
• Agent installed in our customers' systems
• OSS + Plugin to connect various systems
• No barrier to use TD
1. Make a great OSS product to do it
2. Make it popular
3. Potential customers already use it :)
• very easy to switch to use Treasure Data!
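The "agent + plugin" idea on this slide can be sketched as a small registry of input and output plugins. This is only an illustrative sketch, not how Fluentd or any Treasure Data agent is implemented; the names `register_input`, `register_output`, `run_agent`, `app_log` and `cloud_db` are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]

# Hypothetical plugin registries: plugin name -> callable.
INPUTS: Dict[str, Callable[[], Iterable[Record]]] = {}
OUTPUTS: Dict[str, Callable[[List[Record]], int]] = {}

# Stand-in for a remote service; a real output plugin would send over the network.
SENT: List[Record] = []


def register_input(name: str):
    """Decorator that registers a function as an input plugin."""
    def wrap(fn):
        INPUTS[name] = fn
        return fn
    return wrap


def register_output(name: str):
    """Decorator that registers a function as an output plugin."""
    def wrap(fn):
        OUTPUTS[name] = fn
        return fn
    return wrap


@register_input("app_log")
def read_app_log() -> Iterable[Record]:
    # Stand-in for tailing a log file on the customer's server.
    yield {"path": "/", "status": 200}
    yield {"path": "/login", "status": 500}


@register_output("cloud_db")
def send_to_cloud_db(records: List[Record]) -> int:
    # Stand-in for uploading records; returns how many were accepted.
    SENT.extend(records)
    return len(records)


def run_agent(input_name: str, output_name: str, tag: str) -> int:
    """Read from one input plugin, tag each record, hand it to one output plugin."""
    records = [dict(r, tag=tag) for r in INPUTS[input_name]()]
    return OUTPUTS[output_name](records)
```

With this shape, switching the destination means changing only the output plugin name passed to `run_agent`, which is the "easy to switch" point the slide makes.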
Multi-Tenant Cloud Service
[Diagram: With Treasure Data (data sources, collectors, Multi-Tenant Cloud Service, data outputs and integrations)]
Database as a Service
and
Distributed Systems
Many Customers in a System
• Share computing resources
• Provide much more computing resources
• Reduce total cost :-)
Big Data in a System
• Manage big data from many customers
• Manage computing power for many customers
• Create a distributed system!
• for fast query processor
• for resource scheduler
• for high availability
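As one illustration of what a resource scheduler for many customers has to decide, here is a minimal sketch of max-min fair sharing of a fixed number of slots. It is an assumption chosen for illustration, not the policy Treasure Data uses; `fair_share`, `capacity` and the tenant names are hypothetical.

```python
from typing import Dict


def fair_share(capacity: int, demands: Dict[str, int]) -> Dict[str, int]:
    """Split `capacity` slots among tenants using max-min fairness.

    Tenants that need less than an equal share get what they ask for,
    and the leftover is redistributed among the tenants that want more.
    """
    alloc = {tenant: 0 for tenant in demands}
    remaining = capacity
    active = {tenant for tenant, demand in demands.items() if demand > 0}
    while remaining > 0 and active:
        # Equal share of what is left, at least one slot per round.
        share = max(remaining // len(active), 1)
        for tenant in sorted(active):
            if remaining == 0:
                break
            give = min(share, demands[tenant] - alloc[tenant], remaining)
            alloc[tenant] += give
            remaining -= give
        # Tenants whose demand is satisfied drop out of the next round.
        active = {t for t in active if alloc[t] < demands[t]}
    return alloc
```

A small tenant is never starved by large ones, and unused capacity flows to whoever still has work, which is how sharing one system can give each customer more resources at lower total cost.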
Distributed Systems
and
Open Source Software
Distributed Systems
Distributed System Software
• The major software is all OSS
• Hadoop, Presto, Kafka, Storm, ...
• Concept and Implementation
• MapReduce concept was from Google
• Yahoo! engineers implemented it as Hadoop
• Many others made Hadoop better
• Data is always growing -> Software MUST be growing too
Deploying Distributed System
• Many things make it hard to fix issues
• Big data, many computers, complex queries, ...
• We MUST fix our issues as soon as possible
• for our customers
• for our operation costs
DO IT YOURSELF! → OSS
Updating Distributed System
• It's very hard to update distributed systems
• many servers, no data loss, no downtime, ...
• Use OSS as-is without dirty fixes
• to keep it easy to upgrade "software"
• Contribute your patch to the community
• to use patched mainstream software as-is
Open Source Software
and
Developers
DIY Policy Makes "Tech" Company
• Do it yourself "At Your Own Risk": OSS
• Taking risk: more OSS
• OSS: more controllable than proprietary software
• We can read/contribute source code :)
• Technology problem: Can we take a risk? Or not?
Tech Company and Developers
• Taking risk for business success: more focus on technology
• Quality of OSS depends on each developer
• Who is the committer of that product?
• Who can review quality of that product?
• A tech company seriously needs great developers!
OSS and Developers
• "OSS Committer", not "OSS Committing Company"
• the initiative comes from the developer, not the company
• Commit logs show everything about the shared work
• Who contributed to that software?
• Who developed that feature?
• Who fixed that problem?
• People can know who is a good software engineer
• it makes good developers happy!
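The point that the commit log shows who did what can be tried on any repository. The sketch below counts commits per author from the output of `git log --format=%an`; the helper names `count_authors` and `authors_in_repo` are hypothetical, and commit counts are only a rough signal of contribution.

```python
import subprocess
from collections import Counter


def count_authors(log_text: str) -> Counter:
    """Count commits per author from text with one author name per line."""
    return Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )


def authors_in_repo(path: str = ".") -> Counter:
    """Run `git log` in the repository at `path` and count commits per author."""
    result = subprocess.run(
        ["git", "-C", path, "log", "--format=%an"],
        check=True,
        capture_output=True,
        text=True,
    )
    return count_authors(result.stdout)
```

Calling `authors_in_repo(".").most_common(5)` inside a cloned project lists its five most frequent committers.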
Developers love OSS Company
• OSS Company: a kind of tech company
• easy to find it: see committers/contributors
• Developers love:
• challenging "technical" tasks/issues to be solved
• great coworkers, like committers of great software
• nice salary brought by taking risk :P
MOST IMPORTANT THING:
Enjoy Engineering!
Thanks!
