Low Latency Data Grids in Finance
Jags Ramnarayan, Chief Architect, GemStone Systems
[email_address]
Background on GemStone Systems
  • Known for its Object Database technology since 1982
  • Now specializes in memory-oriented distributed data management
      • Over 200 installed customers in the Global 2000
  • Grid focus driven by:
      • Very high performance with predictable throughput, latency and availability
      • Capital markets
      • Large e-commerce portals (real-time fraud detection)
      • Federal intelligence
Use of Grid computing in finance
  • Two primary areas in tier 1 investment banks
      • Risk Analytics
      • Pricing
State of affairs: Risk Analytics
  • Deluge of data (market data, trade data, etc.)
  • Overnight batch job doesn't cut it
      • Want intra-day risk metrics
      • In some cases, real-time risk
  • Explosion in simulation scenarios
      • More accurate risk exposure
      • Compliance
  • Increasing number of smaller calculations
State of affairs: Pricing (derivatives)
  • Too many products
  • Increasing complexity in products
      • Too many underliers
      • Many relationships
  • Hunger for latency reduction
      • Calculating the new price with the lowest possible latency
      • Pushing the prices to distributed applications
Where is the problem?
(Diagram labels: Grid scheduler, compute farm, file system, relational databases, data warehouses)
  • Database/file access contention
      • Too many concurrent connections
      • Large database server bottlenecks on the network
      • Query results are large, causing CPU bottlenecks
      • Even a parallel file system is throttled by disk speeds
  • Too much data transfer
      • Between tasks and jobs
      • Between the Grid and file systems or databases
  • Data consistency issues
  • A CPU-bound job turns into an I/O-bound job
Data Fabric for Risk Analytics
  • When data is stored, it is transparently replicated and/or partitioned; redundant storage can be in memory and/or on disk, which ensures continuous availability
  • Keep reference data replicated on many nodes; partition trade data
  • Machine nodes can be added dynamically to expand storage capacity or to handle increased client load
  • Pool memory (and disk) across the cluster; parallelize data access and computation to achieve very high aggregate throughput
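The replicate-versus-partition choice described on this slide can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not GemFire's API: the `Fabric` class, its method names, and the CRC32-modulo placement are all hypothetical, and rebalancing when nodes join or leave is deliberately left out.

```python
import zlib


class Fabric:
    """Toy data fabric (hypothetical): partitioned data with redundant copies,
    plus reference data replicated to every node."""

    def __init__(self, node_names, redundant_copies=1):
        self.order = list(node_names)
        self.nodes = {name: {} for name in self.order}  # node -> local store
        self.redundant_copies = redundant_copies

    def owners(self, key):
        """Primary node plus redundant copies, chosen by a stable hash of the key."""
        start = zlib.crc32(str(key).encode()) % len(self.order)
        count = min(self.redundant_copies + 1, len(self.order))
        return [self.order[(start + i) % len(self.order)] for i in range(count)]

    def put_partitioned(self, key, value):
        # Trade-style data: stored only on the owning nodes.
        for node in self.owners(key):
            self.nodes[node][key] = value

    def put_replicated(self, key, value):
        # Reference-style data: stored on every node so reads are always local.
        for store in self.nodes.values():
            store[key] = value

    def get(self, key, failed=()):
        """Read from the first surviving node that holds the key."""
        for node in self.owners(key) + self.order:
            if node not in failed and key in self.nodes[node]:
                return self.nodes[node][key]
        raise KeyError(key)


fabric = Fabric(["n1", "n2", "n3", "n4"], redundant_copies=1)
fabric.put_partitioned("trade-42", {"notional": 1_000_000})
fabric.put_replicated("USD", {"rate": 1.0})
primary = fabric.owners("trade-42")[0]
# The redundant copy still answers when the primary is treated as failed.
print(fabric.get("trade-42", failed={primary}))
```

With one redundant copy, each trade lives on two of the four nodes, so total capacity grows as nodes are added, while the small reference data set is available locally everywhere.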
Data Fabric for Risk Analytics
  • TaskFlow: as results are generated, push events to compute nodes to initiate subsequent computation
      • Avoid bulk data transfer across tasks or jobs
  • Thousands of compute nodes can maintain a local cache of the most frequently used data; optionally use local disk for overflow
      • Move reference data to the local cache
  • Synchronous read-through and write-through, or asynchronous write-behind, to other data sources and sinks
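The three cache-to-store policies named on this slide can be sketched in a few lines. This is an illustrative model only: `CachedRegion` and its fields are hypothetical names, a plain dict stands in for the database, and `flush()` is called explicitly where a real system would drain the queue on a background thread.

```python
from collections import deque


class CachedRegion:
    """Hypothetical local cache in front of a slower backing store."""

    def __init__(self, backing_store, write_behind=False):
        self.cache = {}
        self.store = backing_store      # a dict stands in for a database
        self.write_behind = write_behind
        self.pending = deque()          # queued writes (write-behind only)
        self.store_reads = 0            # how often we had to hit the store

    def get(self, key):
        # Read-through: on a miss, load synchronously from the store and cache it.
        if key not in self.cache:
            self.store_reads += 1
            self.cache[key] = self.store[key]
        return self.cache[key]

    def put(self, key, value):
        self.cache[key] = value
        if self.write_behind:
            # Write-behind: acknowledge now, persist later.
            self.pending.append((key, value))
        else:
            # Write-through: persist before returning.
            self.store[key] = value

    def flush(self):
        """Drain queued writes in arrival order; returns how many were written."""
        written = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.store[key] = value
            written += 1
        return written


database = {"EURUSD": 1.10}
region = CachedRegion(database, write_behind=True)
region.get("EURUSD")         # first read goes to the store
region.get("EURUSD")         # second read is served from memory
region.put("EURUSD", 1.12)   # the store still holds the old value here
region.flush()               # now the store catches up
```

Write-behind keeps the store off the latency-critical path at the cost of a window in which the store lags the cache; write-through closes that window but makes every update as slow as the store.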
Move business logic to data
(Diagram labels: f1, f2, ... fn; FIFO Queue; Data fabric resources; Exec functions; Sept Trades; Function (f1); Function (f2))
Submit (f1) -> AggregateHighValueTrades(<input data>, "where trades.month='Sept'")
  • Principle: move the task to the computational resource with most of the relevant data before considering other nodes where data transfer becomes necessary
  • Parallel function execution service ("Map Reduce")
  • Data dependency hints
      • Routing key, collection of keys, "where clause(s)"
  • Serial or parallel execution
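The principle on this slide, sending the function to the nodes that already hold the data and combining small partial results, can be sketched as follows. Everything here is an assumption for illustration: the node names, the trade records, the `ROUTING` table keyed by month, and the helper functions are hypothetical, and threads stand in for remote nodes. Only the function name echoes the slide's `AggregateHighValueTrades` example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layout: trades are partitioned by month, the routing key.
PARTITIONS = {
    "node-a": [{"id": 1, "month": "Sept", "value": 5_000_000},
               {"id": 2, "month": "Sept", "value": 250_000}],
    "node-b": [{"id": 3, "month": "Aug", "value": 9_000_000}],
    "node-c": [{"id": 4, "month": "Sept", "value": 7_500_000}],
}
ROUTING = {"Sept": ["node-a", "node-c"], "Aug": ["node-b"]}


def aggregate_high_value_trades(trades, month, threshold):
    """Map step: runs next to the data and returns only a small partial result."""
    values = [t["value"] for t in trades
              if t["month"] == month and t["value"] >= threshold]
    return {"count": len(values), "total": sum(values)}


def execute_on_fabric(function, routing_key, **kwargs):
    """Run the function in parallel, only on nodes that own the routing key."""
    targets = ROUTING.get(routing_key, [])
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {node: pool.submit(function, PARTITIONS[node], **kwargs)
                   for node in targets}
        return {node: future.result() for node, future in futures.items()}


def reduce_partials(partials):
    """Reduce step: combine the per-node partial results."""
    return {"count": sum(p["count"] for p in partials.values()),
            "total": sum(p["total"] for p in partials.values())}


partials = execute_on_fabric(aggregate_high_value_trades, "Sept",
                             month="Sept", threshold=1_000_000)
print(sorted(partials))           # node-b is never contacted
print(reduce_partials(partials))
```

The routing key plays the role of the slide's data dependency hint: it lets the caller skip nodes with no relevant data, and only the per-node counts and totals cross the network rather than the trades themselves.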
Key lessons
  • Apps should capitalize on memory across the Grid (it is abundant)
      • Keep I/O cycles to a minimum through main-memory caching of operational data sets
      • Scavenge Grid memory and avoid data source access
  • Achieve linear scaling for your Grid apps by horizontally partitioning your data and behavior
      • Read Pat Helland's "Life beyond Distributed Transactions" ( http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf )
  • Get more info on the GemFire data fabric: http://www.gemstone.com/gemfire

Grid Asia2008 Low Latency Data Grid
