Welcome to Redefining Perspectives
November 2012
Capital Markets Risk Management
And Hadoop
Kevin Samborn and
Nitin Agrawal




       © COPYRIGHT 2012 SAPIENT CORPORATION | CONFIDENTIAL   2
Agenda


• Risk Management
• Hadoop
• Monte Carlo VaR Implementation
• Q&A




Risk Management



What is Risk Management?
•   Risk is a tool – the goal is to optimize and understand risk
    o Too much risk is locally and systemically dangerous
    o Too little risk means the firm may be “leaving profit on the table”
•   Portfolio exposure
    o Modern portfolios contain many different types of assets
    o Simple instruments, complex instruments, and derivatives
•   Many types of risk measures
    o Defined scenario-based stress testing
    o Value at Risk (VaR)
    o “Sensitivities”
•   Key is valuation under different scenarios
•   VaR is used in banking regulations, margin calculations and risk
    management
Value at Risk (VaR)
• VaR is a statistical measure of risk – the amount of loss at a given
  confidence level, e.g. a 97.5% chance that the firm will not lose more than
  USD 1 million over the next 5 days
• Computing VaR is a challenging data-sourcing and compute-intensive process
• VaR calculation:
  o Generate statistical scenarios of market behavior
  o Revalue the portfolio for each scenario, compare returns to today’s value
  o Sort results and select the desired percentage return: VALUE AT RISK
• Different VaR techniques:
  o Parametric – analytic approximation
  o Historical – captures real (historical) market dynamics
  o Monte Carlo – many scenarios, depends on statistical distributions
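Whatever the technique, the last two calculation steps above (sort results, pick the desired percentile) reduce to the same arithmetic. A minimal sketch in plain Java (not from the deck; method and parameter names are illustrative):

```java
import java.util.Arrays;

public class VarCalc {
    // Given simulated scenario P&L outcomes (losses negative), return the VaR
    // at the given confidence level, expressed as a positive loss amount.
    static double valueAtRisk(double[] pnl, double confidence) {
        double[] sorted = pnl.clone();
        Arrays.sort(sorted);                             // worst outcomes first
        int idx = (int) Math.floor((1.0 - confidence) * sorted.length);
        return -sorted[idx];                             // loss at the (1-c) percentile
    }

    public static void main(String[] args) {
        // 100 equally likely scenario P&Ls: -50, -49, ..., +49
        double[] pnl = new double[100];
        for (int i = 0; i < 100; i++) pnl[i] = i - 50;
        // 97.5% VaR: the loss not exceeded in 97.5% of scenarios
        System.out.println(valueAtRisk(pnl, 0.975));
    }
}
```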
VaR Graphically




Source: An Introduction To Value at Risk (VAR), Investopedia, May 2010

Complexities
• For modern financial firms, VaR is complex. Calculation requirements:
  o Different types of assets require different valuation models
    •   Risk-based approach
    •   Full revaluation
  o With large numbers of scenarios, many thousands of calculations are required
  o Monte Carlo simulations require significant calibration, depending on large historical
    data
• Many different reporting dimensions
  o VaR is not additive across dimensions (product/asset class, currency)
  o Portfolio – including “what-if” and intraday activity
• Intraday market changes requiring new simulations
• Incremental VaR – how does a single (new) trade contribute to the total?
Backtesting VaR




Hadoop Core


HDFS (storage):
• Data stored with REDUNDANCY on a Distributed File System
• Abstracts H/W FAILURES, delivering a highly available service on COMMODITY H/W
• SCALES from a single node to thousands of nodes
• Data stored WITHOUT A SCHEMA
• Tuned for SEQUENTIAL DATA ACCESS

MapReduce (processing):
• Provides an EASY ABSTRACTION for processing large data sets
• Infrastructure for PARALLEL DATA PROCESSING across huge commodity clusters
• Infrastructure for TASK and LOAD MANAGEMENT
• Framework achieves DATA-PROCESS LOCALITY

Makes two critical assumptions though:
• Data doesn’t need to be updated
• Data doesn’t need to be accessed randomly
A Simple Map Reduce Job
   Problem Statement: From historical price data, create a frequency distribution of the
   1-day percentage change for various stocks
Stock   Date     Open     Close
BP      23-Nov   435.25   435.5
NXT     23-Nov   3598     3620
MKS     23-Nov   378.5    380.7
BP      22-Nov   434.8    433.6
NXT     22-Nov   3579     3603
MKS     22-Nov   377.8    378
BP      21-Nov   430.75   433
NXT     21-Nov   3574     3582
MKS     21-Nov   375      376
BP      20-Nov   430.9    432.25
NXT     20-Nov   3592     3600
MKS     20-Nov   373.7    375.3
BP      19-Nov   422.5    431.6
NXT     19-Nov   3560     3600
MKS     19-Nov   368.5    372.6
BP      16-Nov   423.9    416.6
NXT     16-Nov   3575     3542
MKS     16-Nov   370.3    366.4
BP      15-Nov   422      425.4
NXT     15-Nov   3596     3550
MKS     15-Nov   376.5    370.6

[Diagram: the input table feeds Map 1 … Map M, whose output passes through SORT/SHUFFLE to Reduce 1 … Reduce N, producing counts such as BP|1, 33; BP|2, 64; NXT|81, 2; NXT|-20, 5.]

public void map(LongWritable key, Text value, Context context)
    throws IOException, InterruptedException {
  SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
  context.write(new Text(sa.getTicker()),
      new IntWritable(sa.getPercentChange()));
}

public void reduce(Text key, Iterable<IntWritable> values, Context context)
    throws IOException, InterruptedException {
  Map<Integer, Long> freqDist = buildFreqDistribution(values);
  Set<Integer> percentChanges = freqDist.keySet();
  for (Integer percentChange : percentChanges) {
    context.write(new Text(key.toString() + "|" + percentChange.toString()),
        new LongWritable(freqDist.get(percentChange)));
  }
}
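For intuition, the same frequency distribution can be computed locally in plain Java. This sketch (helper names are ours, not the deck's) mirrors what the map/reduce pair does, using a few rows of the BP data from the table:

```java
import java.util.HashMap;
import java.util.Map;

public class FreqDist {
    // Percentage change of close over open for one row, rounded to the
    // nearest whole percent (the bucketing the job appears to use).
    static int percentChange(double open, double close) {
        return (int) Math.round((close - open) / open * 100.0);
    }

    // Local equivalent of map + reduce: bucket rows by "TICKER|change"
    // and count occurrences per bucket.
    static Map<String, Long> freqDist(String[] tickers, double[] opens, double[] closes) {
        Map<String, Long> dist = new HashMap<>();
        for (int i = 0; i < tickers.length; i++) {
            String key = tickers[i] + "|" + percentChange(opens[i], closes[i]);
            dist.merge(key, 1L, Long::sum);
        }
        return dist;
    }

    public static void main(String[] args) {
        String[] tickers = {"BP", "BP", "BP"};
        double[] opens  = {435.25, 434.8, 422.5};
        double[] closes = {435.5, 433.6, 431.6};
        System.out.println(freqDist(tickers, opens, closes));
    }
}
```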
Hadoop Ecosystem | How/Where These Fit

[Diagram: Hadoop ecosystem stack — LOAD tools (Sqoop, hiho, Scribe, Flume) feed STORAGE; PROCESSING sits on top of storage; SUPPORT tools (ZooKeeper, HUE) run alongside; a DATA WAREHOUSE and VISUALIZATION TOOLS serve USERS.]
Monte-Carlo VaR Implementation



Monte Carlo VaR
Two steps:

1. SIMULATION – for each instrument (IBM, MSFT, IBM.CO, …), generate simulated
   values V1, V2, V3, … V10,000.
2. AGGREGATION – for each simulation i, combine positions (amount Ai times
   value Vi) into a hierarchy-level value:
   HLV1 = (∑AiVi)1, HLV2 = (∑AiVi)2, … HLV10k = (∑AiVi)10k

Challenges
• Daily trade data could be massive
• Valuations are compute intensive
• VaR is not a simple arithmetic sum across hierarchies
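The simulation step's "random walks" are typically geometric Brownian motion draws. The deck does not show its model, so the drift and volatility below are illustrative assumptions:

```java
import java.util.Random;

public class GbmSim {
    // One-step geometric Brownian motion:
    // S * exp((mu - sigma^2/2)*dt + sigma*sqrt(dt)*z), z ~ N(0,1)
    static double step(double s, double mu, double sigma, double dt, double z) {
        return s * Math.exp((mu - 0.5 * sigma * sigma) * dt + sigma * Math.sqrt(dt) * z);
    }

    // Generate n simulated one-day prices for one underlyer (V1 ... Vn).
    static double[] simulate(double spot, double mu, double sigma, double dt,
                             int n, long seed) {
        Random rng = new Random(seed);
        double[] v = new double[n];
        for (int i = 0; i < n; i++) v[i] = step(spot, mu, sigma, dt, rng.nextGaussian());
        return v;
    }

    public static void main(String[] args) {
        // 10,000 simulated prices for a 191.23 spot, 5% drift, 25% vol, 1 trading day
        double[] v = simulate(191.23, 0.05, 0.25, 1.0 / 252, 10_000, 42L);
        System.out.println(v.length);
    }
}
```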
Simulation Step - MapReduce

[Diagram: the instrument table (IBM, MSFT, IBM.CO, …) feeds the SIMULATION, producing V1, V2, V3, …]

MAP
- Read through portfolio data
- Emit (K,V) as (Underlyer, InstrumentDetails), e.g. (IBM, IBM.CO.DEC14.225)

REDUCE
- For the underlyer, perform 10k random walks in parallel
- For each random walk output, simulate derivative prices
- Emit 10k sets of simulated prices of the stock and associated derivatives, i.e.
  IBM,              [V1, V2, … V10000]
  IBM.CO.DEC14.225, [V1, V2, … V10000]

Job job = new Job(getConf());
job.setJobName("RandomValuationGenerator");
job.setMapperClass(SecurityAttributeMapper.class);
job.setReducerClass(PriceSimulationsReducer.class);
…

public void map(LongWritable key, Text value, Context context)
    throws IOException, InterruptedException {
  SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
  context.write(new Text(sa.getUnderlyer()), sa);
}

// Reducer body (excerpt)
SecurityAttributes stockAttrib = (SecurityAttributes) iter.next();
simPricesStock = getSimPricesForStock(stockAttrib);
writeReducerOutput(stockAttrib, simPricesStock, context);
bsmp = new BlackScholesMertonPricingOption();
while (iter.hasNext()) {
  SecurityAttributes secAttribs = iter.next();
  writeReducerOutput(secAttribs,
      getSimPricesForOptions(simPricesStock, bsmp, secAttribs), context);
}
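BlackScholesMertonPricingOption is referenced in the reducer but not shown. The standard Black-Scholes call price it would need can be sketched as follows (this is our sketch, using the Zelen-Severo polynomial approximation of the normal CDF, not the deck's implementation):

```java
public class Bsm {
    // Standard normal CDF via the Zelen & Severo polynomial approximation
    // (Abramowitz & Stegun 26.2.17, |error| < 7.5e-8).
    static double cdf(double x) {
        double t = 1.0 / (1.0 + 0.2316419 * Math.abs(x));
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        double tail = Math.exp(-x * x / 2.0) / Math.sqrt(2.0 * Math.PI) * poly;
        return x >= 0 ? 1.0 - tail : tail;
    }

    // Black-Scholes price of a European call:
    // C = S*N(d1) - K*exp(-rT)*N(d2)
    static double call(double s, double k, double r, double sigma, double t) {
        double d1 = (Math.log(s / k) + (r + 0.5 * sigma * sigma) * t)
                / (sigma * Math.sqrt(t));
        double d2 = d1 - sigma * Math.sqrt(t);
        return s * cdf(d1) - k * Math.exp(-r * t) * cdf(d2);
    }

    public static void main(String[] args) {
        // ATM call: S=100, K=100, r=5%, vol=20%, T=1y
        System.out.println(call(100, 100, 0.05, 0.2, 1.0)); // ~10.45
    }
}
```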
Aggregation Step - MapReduce

[Diagram: the aggregation produces HLV1 = (∑AiVi)1, HLV2 = (∑AiVi)2, …]
MAP
- Read through de-normalized portfolio data
- Emit (K,V) as (Hierarchy-level, PositionDetails), e.g.
  US,           [IBM, 225, 191.23]
  US|Tech,      [IBM, 400, 191.23]
  US|Tech|Eric, [IBM, 400, 191.23]

REDUCE
• For the hierarchy level (e.g. US|ERIC), perform ∑AiVi for each simulation and
  get the simulated portfolio values HLVi
• Sort HLVi, find the 1%, 5% and 10% values, and emit position and VaR data
protected void map(LongWritable key, HoldingWritable value, Context context)
    throws java.io.IOException, InterruptedException {
  SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
  Set<String> hierarchyLevels = sa.getHierarchyLevels();
  for (String hierarchyLevel : hierarchyLevels) {
    context.write(new Text(hierarchyLevel), new Text(sa.getPositionDtls()));
  }
}

// Reducer body (excerpt)
Map<String, Double> portfolioPositionData = combineInputForPFPositionData(rows);
Map<String, Double[]> simulatedPrices =
    loadSimulatedPrices(portfolioPositionData.keySet());
for (long i = 0; i < NO_OF_SIMULATIONS; i++) {
  simulatedPFValues.add(getPFSimulatedValue(i, portfolioPositionData, simulatedPrices));
}
Collections.sort(simulatedPFValues);
emitResults(portfolioPositionData, simulatedPFValues);
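Outside Hadoop, the reducer's arithmetic is just ∑AiVi per simulation followed by a percentile pick over the sorted results. A self-contained sketch (names are illustrative, not from the deck):

```java
import java.util.Arrays;

public class Aggregate {
    // amounts[p] = position size A of instrument p; prices[p][i] = simulated
    // value V of instrument p in simulation i. Returns HLV_i = sum_p A_p * V_p,i.
    static double[] hierarchyLevelValues(double[] amounts, double[][] prices) {
        int nSim = prices[0].length;
        double[] hlv = new double[nSim];
        for (int p = 0; p < amounts.length; p++)
            for (int i = 0; i < nSim; i++)
                hlv[i] += amounts[p] * prices[p][i];
        return hlv;
    }

    // Simulated portfolio value at the given tail percentile (e.g. 0.01, 0.05, 0.10).
    static double percentileValue(double[] hlv, double pct) {
        double[] sorted = hlv.clone();
        Arrays.sort(sorted);
        return sorted[(int) Math.floor(pct * sorted.length)];
    }

    public static void main(String[] args) {
        // Two positions across four simulations
        double[] amounts = {2, 10};
        double[][] prices = {{10, 12, 8, 11}, {5, 4, 6, 5}};
        double[] hlv = hierarchyLevelValues(amounts, prices);
        System.out.println(percentileValue(hlv, 0.25));
    }
}
```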
DEMO RUN




Observations
• As expected, the processing time of the Map jobs increased only marginally
  when the input data volume was increased
• The process was I/O-bound in the simulation step's Reduce job because the
  intermediate data emitted was huge
• The data replication factor needs to be chosen carefully
• MapReduce jobs should be designed so that Map/Reduce output is not huge




Questions?


Thank You!


Appendix




Let’s build a Simple Map Reduce Job
    Problem Statement: Across a huge set of documents, we need to find all locations (i.e.
    document, page, line) for all words having more than 10 characters.




[Diagram: documents are stored in blocks across DATA NODE 1 and DATA NODE 2 (STORAGE); each node runs Map tasks against its local blocks (Store → Map).]
