Coexistence and Migration of Vendor HPC-based Infrastructure to Hadoop Ecosystem/YARN Solution

S&P Capital IQ
Friday 22nd May, 2015

Agenda
The HPC Inheritance
The Need
Integrating the Hadoop Ecosystem
Integration of vendor-based HPC and the Hadoop ecosystem
via YARN AM
Advantages and Potential Drawbacks
Closing & Questions

The HPC Inheritance
Preexisting
HPC distributed computing infrastructure established
2003-2007
Usually 500 to 5,000 cores; some installations reach 100K
cores, but not under a single (HA) RM
Vendor products, (usually) closed source
No separate resource schedulers, with one notable exception
(EGO, Platform Computing)

The HPC Inheritance
Preexisting
The HPC applications
HPC systems built with: MPICH, OpenMPI, ACE-TAO or
Sockets
A few applications consume 80% of the computational
resources (the 80/20 Pareto principle)
Designed for computation-heavy apps with low I/O, with
demand concentrated over a range of hours
Low latency/high throughput, with some variance
Built against a particular (vendor) API implementation, using
callbacks
Continuous optimization cycles, at both the algorithmic and
the infrastructure level

The Need
Engineer a new system for distributed computing & data at a
reasonable cost
Reuse the infrastructure
Reuse already built internal knowledge
Current HPC applications should not experience noticeable
slowness
Growing awareness that heavy compute- and data-oriented
applications need to be built in a distributed fashion, sharing
resources
Efficient resource utilization

Integrating the Hadoop Ecosystem
Apache Hadoop on the existing HPC infrastructure: hardware
coexistence, resource mapping one-to-one
.bashrc user account profile to set up the environment for
both systems
Using YARN as a resource scheduler for both systems
May need OS optimization due to I/Os

Integration of vendor-based HPC and the Hadoop ecosystem via YARN AM
Building a YARN AM as a valve for the computation flow to HPC
The AM uses the vendor API to control the HPC
computational processes, allocating on demand with the
HPC specifics in mind
AMRMClientAsync handles AM communication with the RM;
it needs a CallbackHandler implementation
Depending on the HPC API, queue polling or event
notification
up/down process: memory-efficient, but process start is
slow
open/close: fast, but with a larger memory footprint
The HPC YARN AM uses YARN API calls and the HPC
management API; a variety of resource allocation/release
combinations is possible
fixed, fixed + incremental, incremental only
scheduling based on job patterns, predictive scheduling (an art)
combinations of the above
Handling YARN's callbacks for resource management
Recoverable on AM crash: simple state based on config
parameters and the state of the HPC scheduler queues
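
The three allocation modes named above (fixed, fixed + incremental, incremental only) can be sketched as a single target function. This is only an illustrative model in Python, not the AM from the talk: the names ValveConfig, target_slots, fixed_slots, max_slots and tasks_per_slot are hypothetical, and the queue depth is assumed to come from the vendor's HPC management API.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    FIXED = "fixed"
    FIXED_PLUS_INCREMENTAL = "fixed + incremental"
    INCREMENTAL_ONLY = "incremental only"


@dataclass(frozen=True)
class ValveConfig:
    """Hypothetical config parameters for the HPC AM valve."""
    policy: Policy
    fixed_slots: int = 0      # slots always held for HPC work
    max_slots: int = 0        # cap on slots the AM may hold
    tasks_per_slot: int = 1   # queued HPC tasks served by one slot


def target_slots(cfg: ValveConfig, queued_tasks: int) -> int:
    """Return how many YARN containers the AM should hold for HPC work."""
    # Ceiling division: slots needed to serve the current queue depth.
    demand = -(-queued_tasks // cfg.tasks_per_slot)
    if cfg.policy is Policy.FIXED:
        return cfg.fixed_slots
    # Fixed + incremental keeps a floor; incremental only starts from zero.
    floor = cfg.fixed_slots if cfg.policy is Policy.FIXED_PLUS_INCREMENTAL else 0
    return min(cfg.max_slots, max(floor, demand))


if __name__ == "__main__":
    cfg = ValveConfig(Policy.FIXED_PLUS_INCREMENTAL,
                      fixed_slots=4, max_slots=10, tasks_per_slot=5)
    for depth in (0, 33, 500):
        print(depth, target_slots(cfg, depth))
```

In this sketch the target depends only on the config and the current queue depth, so a restarted AM could recompute it without persisted state, which is one way to read the "recoverable on AM crash" point.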

[Figure: four diagram frames, at t, t+1, t+2 and t+n. Labels in each frame: HPC Scheduler; YARN RM1 and YARN RM2; NNs; HA; ZK; nodes R1, R2, R4, R5; hpc AM; HDFS.]
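
The progression t, t+1, t+2, ..., t+n can be read as the AM valve opening and closing between polls of the HPC scheduler queue. The toy trace below illustrates that reading only. It is a Python model rather than YARN code, and it assumes one slot per queued task, a hypothetical max_slots cap, and that the RM grants every request before the next poll. The names Tick and trace_valve are illustrative.

```python
from typing import Iterable, List, NamedTuple


class Tick(NamedTuple):
    step: int         # poll index: 0 for t, 1 for t+1, ...
    queued: int       # HPC queue depth seen at this poll
    requested: int    # containers the AM would request from the RM
    released: int     # containers the AM would hand back to the RM
    held_after: int   # containers held once the step settles


def trace_valve(queue_depths: Iterable[int], max_slots: int,
                start_held: int = 0) -> List[Tick]:
    """Trace requests and releases over successive queue polls."""
    held = start_held
    ticks = []
    for step, queued in enumerate(queue_depths):
        target = min(max_slots, queued)    # one slot per task, capped
        requested = max(0, target - held)  # open the valve
        released = max(0, held - target)   # close the valve
        held = target                      # assumes requests are granted
        ticks.append(Tick(step, queued, requested, released, held))
    return ticks


if __name__ == "__main__":
    for tick in trace_valve([0, 3, 8, 1], max_slots=5):
        print(tick)
```

Containers released in a step become available to other YARN applications, which is the sharing effect the coexistence setup is after.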

Advantages and Potential Drawbacks
Advantages
Sharing resources: Apache Hadoop coexists with HPC,
increasing resource utilization
Pattern changes are visible in the nodes' compute resources,
in contrast to the network, which can have quite complex
topology and behavior
Allows new infrastructure to grow out of the existing one
Potential Drawbacks
Sharing resources: the HPC AM logic adds complexity,
which in some cases may be considerable
The work is somewhat slower: changes are made gradually
while observing system behavior under the job patterns
May impose additional data block transfers on the network

Closing & Questions
Integrating the HPC RM/schedulers with the Hadoop
ecosystem via a custom AM valve is an optimal way to make
the HPC aware of YARN
Slowing hardware expansion & efficient resource utilization
Q&A
