Making Hadoop Realtime by Dr. William Bain of Scaleout Software

Enabling Operational Intelligence:
Making Hadoop Real-Time
Copyright © 2014 by ScaleOut Software, Inc.
Los Angeles Big Data Users Group
August 6, 2014
Bill Bain, CEO (wbain@scaleoutsoftware.com)

• Operational Intelligence vs. Business Intelligence
• Data-Parallel Computation
• Implementing Using an In-Memory Data Grid:
• Distributing the Data Across a Cluster
• Running the Computation using “Parallel Method Invocation”
• An Example in Financial Services
• Implementing In-Memory Hadoop MapReduce
• Video Demo
• Examples of Applications in Operational Intelligence
Agenda

• Develops and markets In-Memory Data Grids,
software middleware for:
• Scaling application performance and
• Providing operational intelligence using
• In-memory data storage and computing
• Dr. William Bain, Founder & CEO
• Career focused on parallel computing – Bell Labs, Intel, Microsoft
• 3 prior start-ups, last acquired by Microsoft and product now ships as
Network Load Balancing in Windows Server
• Eight years in the market; 400 customers, 10,000 servers
• Sample customers:
About ScaleOut Software

Goal: Provide immediate feedback to a system handling live data.
A few examples:
• Equity trading: to minimize risk during a trading day
• Ecommerce: for real-time recommendations
• Reservations systems: to identify issues, reroute, etc.
• Credit cards & wire transfers: to detect fraud in real time
• Smart grids: to optimize power distribution & detect issues
Online Systems Need Operational
Intelligence

Big Data Analytics
Real-Time vs. Batch Analytics
Batch (“Business Intelligence”):
• Static data sets
• Petabytes
• Disk storage
• Hours to minutes
• Best uses:
• Analyzing warehoused data
• Mining for long-term trends
• Tools: Hadoop, IBM, Teradata, SAS, SAP
Real-time (“Operational Intelligence”):
• Live data sets
• Gigabytes to terabytes
• In-memory storage
• Minutes to seconds
• Best uses:
• Tracking live data
• Immediately identifying trends and capturing opportunities
• Providing immediate feedback
• Tools: Analytics Server, hServer

• Operational intelligence can co-exist with business intelligence:
• Processes streaming data close to its sources.
• Provides real-time, “tactical” feedback (e.g., recommendations, alerts).
• Translates data for storage in the data warehouse (ETL).
• Data warehouse provides “strategic” guidance.
• Using the same tool set (e.g., Hadoop MapReduce) lowers TCO:
• Leverages common skill set.
• Simplifies design (e.g., loading data into HDFS).
Integrated View of Analytics

• To keep up with fast-growing
“live” workloads &
maintain fast response times:
• Ex. Handle incoming data
streams in real time.
• Ex. Process updates to data
set based on incoming data.
• To identify and respond to
trends in fast-changing data:
• Ex. Evaluate data set changes in
real time.
• Ex. Respond to identified
patterns within seconds.
Challenges for Operational Intelligence
[Chart: Growth in Web Servers (millions), 1995-2010. Source: Netcraft]
[Chart: Growth in “Big Data” (exabytes), 2000-2013]
“More data has been created in the past three years than in the past 40,000.”

The Solution: Data-Parallel Computing
• Straightforward, well understood model of parallel computation
• An alternative to task-parallel computation (e.g., Storm)
• Simple: runs the same code on multiple, in-memory data items.
• Powerful: maintains a “live,” in-memory model of a real-world system
• Fast: avoids data motion, which would otherwise limit speedup.
Analyze Data (Eval)
Combine Results
(Merge)
Server Cluster

Track “live” data and analyze in real-time:
Implementing OI
In-Memory
State
NoSQL
Storage
Real-Time
Data Parallel
Analysis

• Storm implements pipelined execution of tasks by “bolts” on
incoming data streams.
• Streams can be distributed to bolts with configurable mappings.
• Developer controls the number of tasks per bolt.
• Storm uses a centralized master node and Zookeeper for fault-
tolerance.
• Key strength: continuous
processing of input
streams
• Issues:
• Complexity / tuning
• Minimizing data motion
• Managing global state
Quick Comparison to Storm

Data-Parallel Enables Linear Speedup
Avoids data motion (network or disk I/O) which limits throughput:

Data-Parallel Computing Is Not New
• 1980’s: Special Purpose Hardware: “SIMD”
(pictured: Thinking Machines Connection Machine 5)
• 1990’s: General Purpose Parallel Supercomputers:
“Domain Decomposition”, “SPMD”
(pictured: Intel iPSC/2, IBM SP1)

Data-Parallel Computing Is Not New
• 1990’s – early 2000’s: HPC on Clusters: “MPI”
(pictured: HP blade servers)
• Since 2003: Clusters, the Cloud, and IMDGs: “MapReduce”
(pictured: Amazon EC2, Windows Azure)

• In-memory data grid
(IMDG) holds active
entities undergoing
state changes in
memory.
• Backing store
optionally holds large
population of entities.
• IMDG processes
incoming stream of
state changes.
• Analytics engine
examines entities in real
time and generates
alerts within seconds
as needed.
A Data-Parallel Architecture for
Operational Intelligence

In-Memory Data Grid (IMDG) stores “live” data in a cluster:
• Fits in the business logic layer:
• Follows object-oriented view of data
(vs. relational view).
• Stores collections of Java/.NET
objects shared by multiple clients.
• Uses create/read/update/delete
and query APIs to access data.
• Implemented across a cluster of
servers or VMs:
• Scales storage and throughput
by adding servers.
• Provides high availability
in case a server fails.
In-Memory Data Grid for Live Data

• IMDG’s collections of objects act like in-
memory collections:
• Unstructured, typically instances of a class
(stored as serialized blobs)
• Individually accessible / update-able
• IMDG adds attributes:
• Accessible by global key
• Query-able by properties
• Highly available
• Optional timeouts
• Distributed locking
• Integration with a backing store
• Optional dependency relationships
• Asynchronous event handling
IMDGs Store “Live” Data
Basic “CRUD” APIs:
• Create(key, obj, tout)
• Read(key)
• Update(key, obj)
• Delete(key)
and…
• Lock(key)
• Unlock(key)
Object
key
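The CRUD calls above can be pictured with a minimal single-process sketch. It is not the ScaleOut API: the class name `LiveStoreSketch`, the method signatures, and the explicit `now` clock argument (passed in to keep the behaviour deterministic) are all assumptions made for illustration, and locking, replication, and distribution across servers are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical single-process stand-in for an IMDG collection; not the ScaleOut API.
class LiveStoreSketch<V> {
    // A stored object plus the time at which its timeout expires.
    private static final class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();

    // Create(key, obj, tout): succeeds only if no unexpired object holds the key.
    boolean create(String key, V obj, long timeoutMillis, long now) {
        Entry<V> fresh = new Entry<V>(obj, now + timeoutMillis);
        Entry<V> stored = entries.merge(key, fresh,
                (old, candidate) -> old.expiresAt <= now ? candidate : old);
        return stored == fresh;
    }

    // Read(key): returns null for a missing or expired object.
    V read(String key, long now) {
        Entry<V> e = entries.get(key);
        return (e == null || e.expiresAt <= now) ? null : e.value;
    }

    // Update(key, obj): replaces the value and keeps the original expiry time.
    boolean update(String key, V obj, long now) {
        return entries.computeIfPresent(key, (k, old) ->
                old.expiresAt <= now ? null : new Entry<V>(obj, old.expiresAt)) != null;
    }

    // Delete(key): reports whether an entry was removed.
    boolean delete(String key) {
        return entries.remove(key) != null;
    }

    public static void main(String[] args) {
        LiveStoreSketch<String> store = new LiveStoreSketch<>();
        store.create("portfolio-1", "v1", 1000, 0);
        store.update("portfolio-1", "v2", 10);
        System.out.println(store.read("portfolio-1", 500));   // still live
        System.out.println(store.read("portfolio-1", 1000));  // timed out
    }
}
```

In a real grid the key would also determine which server holds the object, which is what makes each of these calls addressable from any client.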

Spark / Spark Streaming from U.C.
Berkeley AMPLab:
• In-memory computing to accelerate and
extend Hadoop MapReduce using data-
parallel operators in Scala.
• Stores data as “resilient
distributed datasets” (RDDs):
• Distributed across cluster
• Immutable
• Hold data from/output to HDFS.
• Store data stream as a sequence of RDDs.
• Comparison to IMDG:
• Not designed for “live” data:
• Lacks CRUD on individual objects.
• Lacks high availability.
• Designed for “data parallel” operators.
Quick Comparison to Spark

Data-Parallel Computing Using PMI
“Parallel Method Invocation” (PMI): an object-oriented version of data-
parallel computing from the HPC community:
• Serves as a platform for MapReduce and other data-parallel operators.
• Selects objects using a parallel query on data hosted in the IMDG.
• Runs user-defined methods in parallel across the cluster.
Analyze Data (Eval)
Combine Results
(Merge)
In-Memory Data Grid Runs
Data-Parallel Computation.
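The three PMI steps (select, eval, merge) can be tried out locally with a small stand-in. This is only a sketch under assumptions: `PmiSketch` and its `invoke` signature are hypothetical, each inner list plays the role of one grid server's locally stored objects, and a parallel stream replaces the grid's scheduler. It needs Java 9 or later, and the merge function is assumed to be associative.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Predicate;

// Hypothetical local stand-in for a PMI engine; not the ScaleOut API.
class PmiSketch {
    // Each inner list stands for the objects stored on one grid server.
    static <T, P, R> Optional<R> invoke(List<List<T>> partitions,
                                        Predicate<T> query,
                                        P param,
                                        BiFunction<T, P, R> eval,
                                        BinaryOperator<R> merge) {
        return partitions.parallelStream()            // one task per "server"
                .map(local -> local.stream()
                        .filter(query)                // select matching objects locally
                        .map(obj -> eval.apply(obj, param))
                        .reduce(merge))               // merge within the server
                .flatMap(Optional::stream)            // skip servers with no matches
                .reduce(merge);                       // merge across servers
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = List.of(
                List.of(5, 12, 40), List.of(7, 30), List.of(3));
        // Select values above 6, scale each by the parameter, and sum the results.
        Optional<Integer> total = invoke(partitions, v -> v > 6, 2,
                (v, factor) -> v * factor, Integer::sum);
        System.out.println(total.orElse(0));
    }
}
```

The point of the shape is that `eval` only ever touches objects in its own partition, so nothing moves between servers until the much smaller merged results are combined.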

Integrate analysis into a stock trading platform:
• The IMDG holds market data and hedging strategies.
• Updates to market data
continuously flow through
the IMDG.
• The IMDG performs
repeated data-parallel
analysis on hedging
strategies and alerts
traders in real time.
• IMDG automatically and dynamically
scales its throughput to handle new
hedging strategies by adding servers.
Example in Financial Services

Selects all relevant objects in a distributed collection.
• Query spec matches data’s object-oriented properties.
• Selected objects are fed to the analysis engine on the local server.
Step 1: Select with Parallel Query

Java Example: Parallel Query
public class Portfolio {
    private long id;
    private Set<Stock> longPositions;
    private Set<Stock> shortPositions;
    private double totalValue;
    private Region region;
    private boolean alerted; // alert for trading

    @SossIndexAttribute // query-able property
    public double getTotalValue() {…}

    @SossIndexAttribute // query-able property
    public Region getRegion() {…}

    public Set<Long> evalPositions(MarketSnapshot ms) {…}
}

NamedCache pset = CacheFactory.getCache("portfolios");
Set<Portfolio> res = pset.queryObjects(Portfolio.class,
    and(greaterThan("totalValue", 1000000),
        equals("region", Region.US)));

• Create method to analyze a queried portfolio and another method to
pair-wise merge the result sets of alerted portfolios:
Java Example: Parallel Method Invocation
public class PortfolioAnalysis implements
        Invokable<Portfolio, MarketSnapshot, Set<Long>>
{
    public Set<Long> eval(Portfolio p, MarketSnapshot ms)
            throws InvokeException {
        // update portfolio and return id if alerted:
        return p.evalPositions(ms);
    }

    public Set<Long> merge(Set<Long> set1, Set<Long> set2)
            throws InvokeException {
        set1.addAll(set2);
        return set1; // merged set of alerted portfolio ids
    }
}

• Run a parallel method invocation on a queried set of portfolios and
return set of ids for alerted portfolios:
Java Example: Parallel Method Invocation
NamedCache pset = CacheFactory.getCache("portfolios");
InvokeResult alertedPortfolios = pset.invoke(
    PortfolioAnalysis.class,
    Portfolio.class,
    and(greaterThan("totalValue", 1000000), // query spec
        equals("region", Region.US)),
    marketSnapshot, // parameters
    ...
);
System.out.println("The alerted portfolios are " +
    alertedPortfolios.getResult());

• IMDG ships user’s code and libraries to its servers.
• IMDG automatically schedules analysis operations across all grid
servers and cores:
• The analysis runs on all objects selected
by the parallel query.
• Each grid server analyzes its locally stored
objects to minimize data motion.
• Parallel execution ensures fast
completion time:
• IMDG automatically distributes
workload across servers/cores.
• Scaling the IMDG automatically
handles larger data sets.
Running the Analysis

• The IMDG automatically merges all analysis results:
• The IMDG first merges all results within each grid server in parallel.
• It then merges results across all grid servers to create one combined
result.
• Efficient parallel merge
minimizes the delay in
combining all results.
• The IMDG delivers the
combined result to the
invoking application as
one object.
Merging the Results
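One way to picture the cross-server merge is as a pairwise tree: results are combined two at a time in rounds, so the number of rounds grows with the logarithm of the server count. The sketch below assumes that scheme; the slides do not say exactly how the IMDG orders its merges, and `TreeMergeSketch`, `mergeAll`, and `rounds` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

class TreeMergeSketch {
    // Pairwise-merges per-server results in rounds until one combined result remains.
    static <R> R mergeAll(List<R> perServerResults, BinaryOperator<R> merge) {
        if (perServerResults.isEmpty()) {
            throw new IllegalArgumentException("no results to merge");
        }
        List<R> level = new ArrayList<>(perServerResults);
        while (level.size() > 1) {
            List<R> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                // The merges within one round are independent, so a grid could run them in parallel.
                next.add(i + 1 < level.size()
                        ? merge.apply(level.get(i), level.get(i + 1))
                        : level.get(i)); // an unpaired result carries over to the next round
            }
            level = next;
        }
        return level.get(0);
    }

    // Number of pairwise rounds needed to combine results from the given number of servers.
    static int rounds(int servers) {
        int count = 0;
        while (servers > 1) {
            servers = (servers + 1) / 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(mergeAll(List.of(1, 2, 3, 4, 5), Integer::sum));
        System.out.println(rounds(75));
    }
}
```

Because the merge method is applied in an arbitrary pairing order, it should be associative, as a set union or a sum is.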

• Measured a similar financial services application (back testing stock
trading strategies on stock histories)
• Hosted IMDG in Amazon EC2 using 75 servers holding 1 TB of stock
history data in memory
• IMDG handled a continuous stream of updates (1.1 GB/s)
• Results: analyzed 1 TB in 4.1 seconds (250 GB/s) with linear scaling
Sample Performance Results for PMI
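As a quick check of the throughput figure, and assuming 1 TB is counted as 1024 GB (the slide does not say which convention it uses), the aggregate and per-server rates work out roughly as follows:

```latex
\[
\frac{1024\ \text{GB}}{4.1\ \text{s}} \approx 250\ \text{GB/s},
\qquad
\frac{250\ \text{GB/s}}{75\ \text{servers}} \approx 3.3\ \text{GB/s per server},
\qquad
\frac{1024\ \text{GB}}{75\ \text{servers}} \approx 13.7\ \text{GB per server}
\]
```

With linear scaling, the per-server rate stays about the same as servers are added, so the aggregate rate grows with the cluster.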

Benefits:
• Enables use of Hadoop MapReduce for operational intelligence.
• Accelerates data access by holding data in memory.
• Analyzes and updates “live” data.
• Reduces overheads of standard
Hadoop distributions:
• Batch scheduling
• Disk access
• Data shuffling
• Mandatory key sorting
• Enables new features, e.g.:
• Global combining, optional sorting
Using PMI to Implement
“In-Memory” Hadoop MapReduce

• A Hadoop distribution does not have to be installed unless HDFS is used.
• The developer starts MapReduce applications from a remote workstation.
• The IMDG automatically builds a reusable “invocation grid” of JVMs on the
grid’s servers for PMI and ships the application’s jars.
• Results are stored in the IMDG, HDFS, or optionally globally merged and
returned to the remote workstation.
Running MapReduce on the IMDG

Run In-Memory MR with YARN
• YARN transparently integrates batch and in-memory MapReduce into
a single execution framework with shared access to HDFS.
• For example, hServer can transparently run Apache Hive in-memory.
Example of ScaleOut hServer with Hortonworks
Example of Hive
Running on hServer

Run MapReduce as two PMI
phases:
• Data can be input from either the
IMDG or an external data source.
• Works with any input/output format
compatible with the Apache
distribution.
• IMDG uses its data-parallel
execution engine (PMI) to invoke
the mappers and the reducers.
• Eliminates batch scheduling
overhead.
• Intermediate results are stored
within the IMDG.
• Minimizes data motion between the
mappers and reducers.
• Allows optional sorting.
• Output of a single reducer/combiner
optionally can be globally merged.
Implementing MapReduce
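The two-phase structure can be illustrated with a plain-Java word count in which the intermediate key/value pairs never leave memory. This is a sketch of the idea only: it uses Java streams rather than the Hadoop or hServer APIs, and `InMemoryWordCountSketch`, `mapPhase`, and `reducePhase` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class InMemoryWordCountSketch {
    // Phase 1 (map): each line emits (word, 1) pairs, which are grouped by key in memory.
    static Map<String, List<Integer>> mapPhase(List<String> lines) {
        return lines.parallelStream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(word -> word,
                        Collectors.mapping(word -> 1, Collectors.toList())));
    }

    // Phase 2 (reduce): the values of each key are summed independently of other keys.
    static Map<String, Integer> reducePhase(Map<String, List<Integer>> grouped) {
        return grouped.entrySet().parallelStream()
                .collect(Collectors.toMap(Map.Entry::getKey,
                        e -> e.getValue().stream().mapToInt(Integer::intValue).sum()));
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or", "not to be");
        Map<String, Integer> counts = reducePhase(mapPhase(lines));
        // Sorting happens only here, for display; neither phase depends on key order.
        System.out.println(new TreeMap<>(counts));
    }
}
```

Note that the grouped intermediate data is held in an unsorted hash map, which mirrors the slide's point that key sorting can be made optional.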

• IMDG adds grid input format for
accessing key/value pairs held in
the IMDG.
• MapReduce programs optionally
can output results to IMDG with
grid output format.
• Grid Record Reader optimizes
access to key/value pairs to
eliminate network overhead.
• Applications can access and
update key/value pairs as
operational data during analysis.
Accessing IMDG Data for M/R

• IMDG adds Dataset Record Reader (wrapper) to cache HDFS
data during program execution.
• Hadoop automatically retrieves data from ScaleOut IMDG on
subsequent runs.
• Dataset Record Reader
stores and retrieves data
with minimum network
and memory overheads.
• Tests with Terasort
benchmark have
demonstrated 11X
faster access latency
over HDFS without IMDG.
Optional Caching of HDFS Data
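The caching behaviour described here is essentially a read-through cache wrapped around a slower reader. A minimal sketch of that pattern follows; it is not the Dataset Record Reader itself, and `CachingReaderSketch`, the split identifiers, and the `Supplier`-based source are assumptions chosen so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical read-through cache around a slow record source (standing in for HDFS).
class CachingReaderSketch {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private int sourceReads = 0;

    // The first request for a split reads the slow source; later requests are served from memory.
    List<String> records(String splitId, Supplier<Iterator<String>> slowSource) {
        return cache.computeIfAbsent(splitId, id -> {
            sourceReads++;
            List<String> loaded = new ArrayList<>();
            slowSource.get().forEachRemaining(loaded::add);
            return Collections.unmodifiableList(loaded);
        });
    }

    // How many times the slow source was actually consulted.
    int sourceReads() {
        return sourceReads;
    }

    public static void main(String[] args) {
        CachingReaderSketch reader = new CachingReaderSketch();
        Supplier<Iterator<String>> hdfsStandIn = () -> List.of("row-1", "row-2").iterator();
        reader.records("split-0", hdfsStandIn); // first run: reads the source
        reader.records("split-0", hdfsStandIn); // second run: served from the cache
        System.out.println(reader.sourceReads());
    }
}
```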

IMDG needs multiple in-memory
storage models:
• Named cache, optimized for
rich semantics on large
objects:
• Property-based query
• Distributed locking
• Access from remote grids
• Named map, optimized for
efficient storage and bulk
analysis (e.g., MapReduce):
• Highly efficient object storage
• Pipelined, bulk-access
mechanisms
Optimized In-Memory Storage

In-Memory Named Map:
• Stores key/value pairs in chunks.
• Allows CRUD operations on kvps.
• Automatically organizes chunks into
splits.
• Uses per-split hash table to access
keys and manage multi-valued
keys.
• Stores shuffled data set between
mappers and reducers.
• Pipelines chunks to mappers and
from reducers.
• Optionally uses memory mapped
files to reduce access latency.
• Provides support for sorting keys.
Named Map Optimizations

• Measured performance:
• Startup times reduced to a few milliseconds
• Word count benchmark shows 20X speedup.
• Real-world example shows >40X speedup.
• MapReduce optimizations:
• Optional sorting
• Optional multicast of parameters to mappers
• Optional O(logN) global combining (avoids
single reducer)
• Optional HDFS caching
• Optional reuse of JVMs across jobs
• Current limitations:
• No specific security for multi-tenancy
• Intermediate data must fit in the IMDG
Performance & Optimizations

• Invocation grids can be re-used across MapReduce jobs:
Accelerating Start-Up Times
public static void main(String argv[]) throws Exception {
    //Configure and load the invocation grid
    InvocationGrid grid = HServerJob.getInvocationGridBuilder("myGrid").
        // Add JAR files as IG dependencies
        addJar("main-job.jar").addJar("first-library.jar").
        // Add classes as IG dependencies
        addClass(MyMapper.class).addClass(MyReducer.class).
        // Define custom JVM parameters
        setJVMParameters("-Xms512M -Xmx1024M").
        load();
    //Run 10 jobs on the same invocation grid
    for (int i = 0; i < 10; i++) {
        Configuration conf = new Configuration();
        //The preloaded invocation grid is passed as the parameter to the job
        Job job = new HServerJob(conf, "Job number " + i, false, grid);
        //......Configure the job here.........
        //Run the job
        job.waitForCompletion(true);
    }
    //Unload the invocation grid when we are done
    grid.unload();
}

• IMDG can run Apache Hive distribution
unchanged.
• Accelerates queries for datasets hosted in
HDFS or the IMDG:
• Intermediate data must fit within the IMDG.
• Challenges we faced:
• Requires YARN to transparently invoke
MapReduce on IMDG.
• IMDG must use multiple JVMs per server
since Hive tasks are not thread-safe.
• IMDG must support Hadoop’s distributed
cache (required by Hive).
Running Hive on In-Memory Data

• Assume we have a named map called “customers” of
customer objects:
Example: Querying a Named Map
public class Customer implements Serializable
{
private int customerId;
private String firstName;
private String lastName;
private String login;
public int getCustomerId() {
return customerId;}
public String getFirstName() {
return firstName;}
...
}

• Create a table view of a named map:
• Associates class properties with columns.
• Allows properties to be omitted.
• Allows use of custom serialization.
hive> CREATE TABLE
customers (customerid int, firstname string, lastname
string, login string)
STORED BY
'com.scaleoutsoftware.soss.hserver.hive.HServerHiveStorageHandler'
TBLPROPERTIES ("hserver.map.name" = "customers");
OK
Time taken: 0.508 seconds

• Now query the named map:
hive> SELECT * FROM customers;
..............................
1 Eduardo Hazelrigg ehazelrigg
13 Serena Sadberry ssadberry
9 Ermelinda Manganaro emanganaro
5 Edda Speir espeir
17 Tomeka Stovall tstovall
21 Luciano Perkinson lperkinson
25 Jacob Garrow jgarrow
33 Quincy Kreutzer qkreutzer
37 Iona Speir ispeir
41 Ermelinda Thielen ethielen
Time taken: 0.475 seconds, Fetched: 100 row(s)

The Challenge: Operational intelligence to quickly evaluate
and respond to sub-second market changes:
• Hedge fund tracks a set of hedging strategies:
• Strategies can cover various market
sectors, such as high-tech, automotive,
energy, consumer, real estate, etc.
• Each strategy contains list of holdings
and rules for managing the holdings
(such as target allocations).
• Updates to market data
continuously arrive during
the trading day.
• Challenge: The hedge fund must be able to quickly update and
analyze its hedging strategies and provide alerts to traders.
Demo of the Finserv Application
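To make the "target allocations" rule concrete, here is a minimal sketch of how one strategy might be checked against a market update. It is not the demo's code: the class `AllocationAlertSketch`, the map-based representation of holdings, and the fixed tolerance are all assumptions made for illustration.

```java
import java.util.Map;

class AllocationAlertSketch {
    // Returns true when any holding's share of total value drifts from its
    // target allocation by more than the given tolerance.
    static boolean needsAlert(Map<String, Double> shares,
                              Map<String, Double> prices,
                              Map<String, Double> targets,
                              double tolerance) {
        double total = 0.0;
        for (Map.Entry<String, Double> h : shares.entrySet()) {
            total += h.getValue() * prices.get(h.getKey());
        }
        if (total == 0.0) {
            return false; // nothing held, nothing to rebalance
        }
        for (Map.Entry<String, Double> h : shares.entrySet()) {
            double actual = h.getValue() * prices.get(h.getKey()) / total;
            double target = targets.getOrDefault(h.getKey(), 0.0);
            if (Math.abs(actual - target) > tolerance) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Double> shares = Map.of("TECH", 10.0, "ENERGY", 10.0);
        Map<String, Double> targets = Map.of("TECH", 0.5, "ENERGY", 0.5);
        // Balanced prices keep the strategy on target; a price swing triggers an alert.
        System.out.println(needsAlert(shares, Map.of("TECH", 50.0, "ENERGY", 50.0), targets, 0.05));
        System.out.println(needsAlert(shares, Map.of("TECH", 80.0, "ENERGY", 20.0), targets, 0.05));
    }
}
```

A check like this is the kind of per-object method that would be run in parallel across all strategies each time new market data arrives.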

• Delivers a stream of alerts to traders
within a few seconds.
• Enables the trader to examine strategy details in real time:
Output: Real-Time Alerts

• Video Link
Video

Fast map/reduce reconciles inventory and order systems
for an online retailer:
• Challenge: Inventory and online
order management are handled
by different applications.
• Reconciled once per day.
• Inaccurate orders reduce margins.
• Solution:
• Host SKUs in IMDG updated in real
time by order & inventory systems.
• Use MapReduce to reconcile in two minutes.
• Results: Real-time reconciliation ensures accurate orders.
Example in Ecommerce: Inventory
Management

• IMDG holds customer
information for active
Web users.
• IMDG saves/retrieves
customer information
from backing store.
• Web browsers send
activity information to
analytics engine.
• IMDG updates customer history and
preferences.
• Analytics engine identifies browsing and
buying patterns.
• Analytics engine makes suggestions in
real-time. Also sends email follow-ups.
Example: Web Shopping

• Track
connectivity
issues.
• Obtain time-
sensitive
business data.
• Offer enhanced
services.
• Increase
security.
Example: Telecommunications
Optimize Operations
Customer Experience
Historical queries
for real-time data
enrichment
Stream
persistence for
future analysis
Network
Elements

• Online systems need operational
intelligence on “live” data for
immediate feedback.
• Operational intelligence can be
implemented using standard
data-parallel computing
techniques, such as M/R.
• In-memory data grids provide
an excellent platform for
operational intelligence:
• Host and update “live” data.
• Implement high availability.
• Offer fast, data-parallel
computation for immediate
feedback.
Recap

• ScaleOut StateServer®
• In-Memory Data Grid for Windows and
Linux
• Scales application performance.
• Industry-leading performance and ease of use
• ScaleOut GeoServer® adds
• WAN based data replication for DR
• Breakthrough technology for global
data access
• ScaleOut Analytics Server® adds
• Real-time data analysis for “live” data
• Comprehensive management tools
• ScaleOut hServer®
• Full Hadoop Map/Reduce engine (>20X faster*)
• Hadoop Map/Reduce on live, in-memory data
ScaleOut Software Products
[Diagram: ScaleOut StateServer In-Memory Data Grid made up of four grid services]
*in benchmark testing

Many Use Cases:
• Authorizations / Payment
Processing / Mobile Payments
• Service Activation
• Inventory Management
• Sensor Data / SCADA
• Real Time Tracking
• Fraud Detection
• Situational Awareness
• Churn Management
• Market Feed / Event Handlers
• Execution Rules
• Financial: Risk, P&L, Pricing
• Operational Risk Compliance
The Need for Real-Time Analytics
Across Key Industries:
• CPG
• Financial
• Telco
• Retail
• Utilities
• Manufacturing
• Logistics
• IC / DoD
• Life Sciences
• Government
• Health Care
• Law enforcement

• Brick and mortar stores need to compete with online experience.
• Point-of-sale identifies opt-in customers to analytics engine.
• RFID tags identify product selection and availability in showroom.
• Analytics engine sends real-time advisories to sales staff via tablet.
Example: Retail Shopping

• Typically used for very large, static, offline datasets
• Data must be copied from disk-based storage (e.g., HDFS) into
memory for analysis.
• Hadoop Map/Reduce adds lengthy batch scheduling and data
shuffling overhead.
Problem: Hadoop Cannot Efficiently
Perform Real-Time Analytics

// This job will run using the Hadoop
// job tracker:
public static void main(String[] args)
        throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(
        TextInputFormat.class);
    job.setOutputFormatClass(
        TextOutputFormat.class);
    FileInputFormat.addInputPath(
        job, new Path(args[0]));
    // The slide truncates this call; args[1] as the output path is assumed.
    FileOutputFormat.setOutputPath(
        job, new Path(args[1]));
    job.waitForCompletion(true);
}

// This job will run using ScaleOut hServer:
public static void main(String[] args)
        throws Exception {
    Configuration conf = new Configuration();
    Job job = new HServerJob(conf, "wordcount");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(
        TextInputFormat.class);
    job.setOutputFormatClass(
        TextOutputFormat.class);
    // Both calls are truncated on the slide; arguments mirror the job above.
    FileInputFormat.addInputPath(
        job, new Path(args[0]));
    FileOutputFormat.setOutputPath(
        job, new Path(args[1]));
    job.waitForCompletion(true);
}
Configuring MapReduce for the IMDG
• Without YARN, subclass the Hadoop Job class with a one-line change (below).
• With YARN, just replace the MapReduce execution framework.

• Mark class properties as indexes for query:
• Define a query using these properties:
Parallel Query Example (C#)
class Stock {
[SossIndex]
public string Ticker { get; set; }
public decimal TotalShares { get; set; }
public decimal Price { get; set; }}
NamedCache cache = CacheFactory.GetCache("Stocks");
var q = from s in cache.QueryObjects<Stock>()
where s.Ticker == "GOOG" || s.Ticker == "ORCL"
select s;
Console.WriteLine("{0} Stocks found", q.Count());

• Create method to analyze each queried stock object:
• Create method to pair-wise merge the results:
Example of Analysis Code (C#)
// "params" is a reserved word in C#, so the parameter is named calcParams.
static decimal eval(Stock stock, StockCalcParams calcParams)
{
    return stock.Price * stock.TotalShares;
}
static decimal merge(decimal r1, decimal r2)
{
    return r1 + r2;
}

• Run a parallel method invocation:
Invoking the Parallel Analysis (C#)
NamedCache cache = CacheFactory.GetCache("Stocks");
decimal valueOfSelectedStocks =
(from s in cache.QueryObjects<Stock>()
where s.Ticker == "GOOG" || s.Ticker == "ORCL"
select s)
.Invoke(new StockCalcParams(…),
new Func<Stock, StockCalcParams, decimal>(eval))
.Merge(new Func<decimal, decimal, decimal>(merge));
Console.WriteLine("The value of selected stocks is {0}",
valueOfSelectedStocks);
