A Case Study of Using Hadoop for a Social Game Log Analysis Platform

gumi




           1
:
Twitter:@CkReal
gumi 2




        AWS S3,EMR


                     2
1.

2. Elastic MapReduce EMR

3.

4. EMR




                      3
4
2

             ( 1   )




fluentd




         5
1                        1,000      /
      AP
     APAP
       AP                                           DB
d         fluentd   fluentd   mongos               mongod(PRIMARY)

                                                    DB
                            config
                                               mongod(SECONDARY)

                                                    DB
                   fluentd   mongos
                                               mongod(SECONDARY)

                            config            ReplicaSets & Sharding
    NFS



                              6
EMR




      7
8.5GB             1.4GB /



                                                       ID



Nov 1 23:59:59 hogehoge-ap1 hogehoge ADD_MONEY 12345 [BeforeMoney] 67979 [AfterMoney] 68024 [Money] 45

Nov 1 23:59:59 hogehoge-ap2 hogehoge CONSUME_POWER 12345 [BeforePower] 25 [AfterPower] 20 [ConsumePower] 5
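
The two sample lines above follow a syslog-like layout: timestamp, host, application, action name, a numeric ID, then bracketed key/value pairs. A minimal parsing sketch in Python 3 follows. The field names (`host`, `app`, `action`, `userid`, `params`) and the reading of `12345` as the user ID are assumptions inferred from the samples, not something the slides state.

```python
import re

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> <app> <ACTION> <userid> [Key] value ..."
LOG_PATTERN = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<app>\S+)\s+(?P<action>\S+)\s+(?P<userid>\S+)\s*(?P<rest>.*)$"
)
# Bracketed parameters such as "[BeforeMoney] 67979"
PARAM_PATTERN = re.compile(r"\[(\w+)\]\s+(\S+)")


def parse_log_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    rest = fields.pop("rest")
    params = {}
    for key, value in PARAM_PATTERN.findall(rest):
        # Keep non-numeric values as strings rather than failing.
        params[key] = int(value) if value.lstrip("-").isdigit() else value
    fields["params"] = params
    return fields
```

Lines that do not match the assumed layout return `None`, so a caller can count or skip them instead of aborting a whole batch.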


                         8
NFS

NFS




      9
MongoDB
     MongoDB                   (           )

     {
          "app" : "hogehoge",
          "userid" : "12345",
          "dateint" : 20111101,
          "hourint" : 23,
          "actions" : [
               "CONSUME_POWER",
               "ADD_MONEY"
          ],
          "records" : [
               {
                    "action" : "ADD_MONEY",
                    "timeint" : 235959,
                    ...
               }
          ]
     }

     ID
     MongoDB   Sharding
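
The document above appears to collect one user's events for one hour: `dateint` and `hourint` identify the bucket, `actions` lists the distinct action names, and `records` holds the individual events. A sketch of how parsed events might be grouped into that shape follows. The grouping key (app, user, date, hour) and the helper name `build_hourly_documents` are assumptions for illustration. The slides do not show the actual loader, which presumably wrote the result with pymongo.

```python
def build_hourly_documents(events):
    """Group events into one document per (app, userid, dateint, hourint).

    Each event is a dict with the keys app, userid, dateint, timeint
    (HHMMSS as an integer) and action. This bucketing is an assumption
    inferred from the sample document, not the author's confirmed logic.
    """
    docs = {}
    for event in events:
        hourint = event["timeint"] // 10000  # 235959 -> 23
        key = (event["app"], event["userid"], event["dateint"], hourint)
        doc = docs.get(key)
        if doc is None:
            doc = {
                "app": event["app"],
                "userid": event["userid"],
                "dateint": event["dateint"],
                "hourint": hourint,
                "actions": [],
                "records": [],
            }
            docs[key] = doc
        if event["action"] not in doc["actions"]:
            doc["actions"].append(event["action"])
        doc["records"].append(
            {"action": event["action"], "timeint": event["timeint"]}
        )
    return list(docs.values())
```

Keeping the distinct action names in a separate `actions` array would let a query select the users who performed a given action in an hour without scanning every entry in `records`.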

                                          10
EMR
 Hive Pig        Hadoop Streaming




                             Hadoop
                            Streaming
                             (Python)
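
Hadoop Streaming runs any program that reads lines on standard input and writes tab-separated key/value lines on standard output, which is what makes Python usable for the mapper and reducer. The sketch below counts events per action name. The slides do not say what the real job computed, so the counting task and the function names `map_lines` and `reduce_lines` are illustrative assumptions. It is written for Python 3, whereas the deck mentions Python 2.7.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit "<action>\t1" for every well-formed log line."""
    for line in lines:
        parts = line.split()
        # Assumed layout: month day time host app ACTION userid ...
        if len(parts) < 7:
            continue
        yield f"{parts[5]}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts per key; input must be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(value) for _, value in group)
        yield f"{key}\t{total}"
```

In a real streaming job each function would sit in its own script, iterate over `sys.stdin`, and print every yielded line. Hadoop sorts the mapper output by key before the reducer sees it, which `sorted()` can stand in for during a local test.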




            11
m2.4xlarge × 1           4.9GB    85

EMR(m2.xlarge) × 5       4.9GB    44

m2.4xlarge × 1           7.2GB    138

EMR(m2.xlarge) × 5       7.2GB    69

(Macbook Air)            3.6GB    30 …
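
The unit of the last column did not survive extraction, but the ratio between the single-instance and EMR rows does not depend on it. A small sketch of that arithmetic, using only the figures listed above (the name `speedup` is an illustrative choice):

```python
# (input size in GB, time on m2.4xlarge x 1, time on EMR m2.xlarge x 5)
# The time unit is not preserved in the source; the ratio is unit-free.
BENCHMARKS = [
    (4.9, 85, 44),
    (7.2, 138, 69),
]


def speedup(single_time, cluster_time):
    """How many times faster the cluster run was than the single instance."""
    return single_time / cluster_time


for size_gb, single_time, cluster_time in BENCHMARKS:
    ratio = speedup(single_time, cluster_time)
    print(f"{size_gb} GB: {ratio:.2f}x")
```

For both input sizes the five-node EMR cluster finishes in roughly half the time of the single m2.4xlarge.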


                          12
EMR        CPU




      13
14
( )

NFS   Amazon S3                   EMR

S3

                                 EMR




                          S3             EMR

                                               config
                       MongoDB                 mongos

                  15
S3
     boto

S3
     s3cmd               S3

EMR
     Mapper,Reducer,Python2.7

MongoDB
     pymongo                  MongoDB

EMR
     Client Tool(Ruby)             EMR

                                        16
EMR




      17
EMR
S3
 EC2⇔S3            20MB/sec



 Hadoop



 HadoopStreaming

EMR
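
The 20MB/sec figure for EC2⇔S3 gives a rough feel for how long moving a day of logs takes. The estimate below is a back-of-the-envelope sketch: it assumes the rate is sustained, that 1 GB is 1024 MB, and that the 4.9GB and 7.2GB inputs from the benchmark slide are what gets transferred. None of these is confirmed by the slides.

```python
def transfer_seconds(size_gb, rate_mb_per_sec=20.0):
    """Estimated transfer time in seconds at a constant rate (1 GB = 1024 MB)."""
    return size_gb * 1024 / rate_mb_per_sec


for size_gb in (4.9, 7.2):
    seconds = transfer_seconds(size_gb)
    print(f"{size_gb} GB at 20 MB/sec: about {seconds / 60:.1f} minutes")
```

At this rate a few gigabytes move in a handful of minutes, so the transfer is a visible but bounded part of each batch run.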



                      18
GB/




      19
20
21


Uploaded by 知教 本間


A Case Study of Using Hadoop for a Social Game Log Analysis Platform

