VoltDB: Technical Overview 20101019

Tim Callaghan, VoltDB Field Engineer
tcallaghan@voltdb.com
Before I Begin…

VoltDB is available in community and commercial editions

Runs on Linux + Mac*

I am “Mark Callaghan’s lesser-known but nonetheless smart brother” [C. Monash]*

*http://www.dbms2.com/2010/05/25/voltdb-finally-launches/

                                      March 3, 2009   |
Scaling Traditional OLTP Databases

• Sharding improves performance but introduces…
  • Management complexity
     + disjointed backup/recovery and replication
     + manual effort to re-partition data
  • Application complexity
     + shard awareness
     + cross-partition joins
     + cross-partition transactions
  • And each shard still suffers from traditional OLTP performance limitations
• If you can shard, your application is probably a great fit for VoltDB.
Technical Overview
• “OLTP Through the Looking Glass”
  http://cs-www.cs.yale.edu/homes/dna/papers/oltpperf-sigmod08.pdf

• VoltDB avoids the overhead of traditional databases
   • K-safety for fault tolerance
      - no logging
   • In-memory operation for maximum throughput
      - no buffer management
   • Partitions operate autonomously and single-threaded
      - no latching or locking

• Built to scale horizontally


Partitions (1/4)
• 1 partition per physical CPU core
   • Each physical server has multiple VoltDB partitions
• Data - Two types of tables
   • Partitioned
       + Single column serves as partitioning key
       + Rows are spread across all VoltDB partitions by partition column
       + Transactional data (high frequency of modification)
   • Replicated
       + All rows exist within all VoltDB partitions
       + Relatively static data (low frequency of modification)
• Code - Two types of work – both ACID
   • Single-Partition
       + All insert/update/delete operations within single partition
       + Majority of transactional workload
   • Multi-Partition
       + CRUD against partitioned tables across multiple partitions
       + Insert/update/delete on replicated tables
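
The difference between the two table types can be sketched in a few lines of plain Java. This is only an illustration: the class and method names are hypothetical, not VoltDB APIs, and the routing rule (hash of the partitioning column modulo the partition count) is an assumption, since the slides do not say how VoltDB maps keys to partitions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are VoltDB APIs.
final class TablePlacement {
    // One list of stored rows per partition.
    private final List<List<String>> partitions = new ArrayList<>();

    TablePlacement(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Assumed routing rule: hash of the partitioning column modulo the partition count.
    int partitionFor(long partitionKey) {
        return Math.floorMod(Long.hashCode(partitionKey), partitions.size());
    }

    // Partitioned table: the row is stored in exactly one partition.
    void insertPartitioned(long partitionKey, String row) {
        partitions.get(partitionFor(partitionKey)).add(row);
    }

    // Replicated table: every partition stores its own copy of the row.
    void insertReplicated(String row) {
        for (List<String> partition : partitions) {
            partition.add(row);
        }
    }

    int rowCount(int partition) {
        return partitions.get(partition).size();
    }

    public static void main(String[] args) {
        TablePlacement db = new TablePlacement(3);
        db.insertReplicated("product 1: knife");
        db.insertPartitioned(5L, "order 501 for customer 5");
        for (int p = 0; p < 3; p++) {
            System.out.println("partition " + p + " holds " + db.rowCount(p) + " row(s)");
        }
    }
}
```

The replicated insert has to touch every partition, which is why writes to replicated tables count as multi-partition work.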



Partitions (2/4)

  • Single-partition vs. Multi-partition

      select count(*) from orders where customer_id = 5
          single-partition
      select count(*) from orders where product_id = 3
          multi-partition
      update products set product_name = 'spork' where product_id = 3
          multi-partition
      insert into orders (customer_id, order_id, product_id) values (3,303,2)
          single-partition

  table orders (partitioned): customer_id (partition key), order_id, product_id

            Partition 1             Partition 2                Partition 3
             1     101     2         2     201      1            3     201     1
             1     101     3         5     501      3            6     601     1
             4     401     2         5     502      2            6     601     2

  table products (replicated): product_id, product_name

            Partition 1             Partition 2                Partition 3
             1     knife             1     knife                 1     knife
             2     spoon             2     spoon                 2     spoon
             3     fork              3     fork                  3     fork
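
To make the classification concrete, here is a small Java sketch that loads the orders rows from the diagram and counts how many partitions each query has to visit. All names are hypothetical, and the placement rule `(customer_id - 1) mod 3` is an assumption picked only because it reproduces the layout in the diagram; it is not VoltDB's hashing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the orders table from the diagram; not VoltDB code.
final class PartitionedOrders {
    // Each row is {customer_id, order_id, product_id}; customer_id is the partition key.
    private final List<List<int[]>> partitions = new ArrayList<>();
    private int partitionsVisited = 0;

    PartitionedOrders(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Assumed rule, chosen only because it reproduces the diagram's layout.
    int partitionFor(int customerId) {
        return Math.floorMod(customerId - 1, partitions.size());
    }

    // Single-partition: the partition key in the row picks exactly one partition.
    void insert(int customerId, int orderId, int productId) {
        partitions.get(partitionFor(customerId)).add(new int[] {customerId, orderId, productId});
    }

    // Single-partition: filtering on the partition key touches one partition.
    long countByCustomer(int customerId) {
        partitionsVisited = 1;
        return partitions.get(partitionFor(customerId)).stream()
                .filter(row -> row[0] == customerId)
                .count();
    }

    // Multi-partition: product_id says nothing about placement, so every partition is scanned.
    long countByProduct(int productId) {
        partitionsVisited = partitions.size();
        return partitions.stream()
                .flatMap(List::stream)
                .filter(row -> row[2] == productId)
                .count();
    }

    int partitionsVisited() {
        return partitionsVisited;
    }

    // Loads the nine orders rows shown in the diagram.
    static PartitionedOrders fromDiagram() {
        PartitionedOrders orders = new PartitionedOrders(3);
        int[][] rows = {
            {1, 101, 2}, {1, 101, 3}, {4, 401, 2},
            {2, 201, 1}, {5, 501, 3}, {5, 502, 2},
            {3, 201, 1}, {6, 601, 1}, {6, 601, 2}
        };
        for (int[] r : rows) {
            orders.insert(r[0], r[1], r[2]);
        }
        return orders;
    }

    public static void main(String[] args) {
        PartitionedOrders orders = fromDiagram();
        System.out.println("customer_id = 5: " + orders.countByCustomer(5)
                + " rows, partitions visited: " + orders.partitionsVisited());
        System.out.println("product_id = 3: " + orders.countByProduct(3)
                + " rows, partitions visited: " + orders.partitionsVisited());
    }
}
```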

Partitions (3/4)

• Looking inside a VoltDB partition…
  • Each partition contains data and an execution engine.
  • The execution engine contains a queue for transaction requests.
  • Requests are executed sequentially (single-threaded).

  [Diagram: one partition, with a work queue feeding the execution engine, plus table data and index data]
        - Complete copy of all replicated tables
        - Portion of rows (about 1/partitions) of all partitioned tables
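
The queue-plus-single-thread idea can be sketched with the JDK's single-thread executor. This is a minimal illustration, not VoltDB's execution engine: `PartitionEngine`, `adjust`, and the inventory map are hypothetical names. The point it shows is that data touched by only one worker thread needs no locks or latches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of one partition; not VoltDB's execution engine.
final class PartitionEngine implements AutoCloseable {
    // A single-thread executor is a work queue plus one worker thread.
    private final ExecutorService queue = Executors.newSingleThreadExecutor();
    // Only the worker thread touches this map, so it needs no locks or latches.
    private final Map<String, Integer> inventory = new HashMap<>();

    // Enqueue a request that adds delta to an item's stock level.
    Future<Integer> adjust(String item, int delta) {
        return queue.submit(() -> inventory.merge(item, delta, Integer::sum));
    }

    // Stop accepting requests; already queued requests still run.
    @Override
    public void close() {
        queue.shutdown();
    }

    // Submit n increments and return the final stock level.
    static int runIncrements(int n) {
        try (PartitionEngine partition = new PartitionEngine()) {
            Future<Integer> last = partition.adjust("spoon", 0);
            for (int i = 0; i < n; i++) {
                last = partition.adjust("spoon", 1);
            }
            return last.get(); // requests run in submission order, so this is the final value
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("spoon stock after 1000 requests: " + runIncrements(1000));
    }
}
```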
Partitions (4/4)

Partition your tables to maximize the frequency of single-partition transactions and minimize multi-partition transactions.
   + In 1 unit of time, nine partitions can run nine single-partition transactions (s1 to s9)
          s1    s2   s3    s4    s5    s6    s7       s8         s9
     or a single multi-partition transaction (m1) that occupies all of them
          m1    m1   m1    m1    m1    m1    m1       m1         m1

   … now imagine this on a 12-node cluster with 96 partitions
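
A back-of-the-envelope model shows how quickly multi-partition work eats into throughput. The assumptions are mine, not measurements from the talk: every transaction takes one unit of time, a single-partition transaction occupies one partition, a multi-partition transaction occupies all of them, and single-partition work is spread evenly. Under those assumptions, with P partitions and a fraction m of multi-partition transactions, throughput per unit of time is P / (m·P + 1 − m).

```java
// Back-of-the-envelope model; the cost assumptions are illustrative, not measured.
final class ThroughputModel {
    // Transactions completed per unit of time, assuming every transaction takes
    // one unit, a single-partition transaction occupies one partition, and a
    // multi-partition transaction occupies all of them.
    static double perTimeUnit(int partitions, double multiPartitionFraction) {
        double m = multiPartitionFraction;
        return partitions / (m * partitions + (1.0 - m));
    }

    public static void main(String[] args) {
        for (double m : new double[] {0.0, 0.01, 0.10, 1.0}) {
            System.out.printf("96 partitions, %.0f%% multi-partition: %.1f transactions per unit%n",
                    m * 100, perTimeUnit(96, m));
        }
    }
}
```

In this model, 96 partitions with no multi-partition work complete 96 transactions per unit of time, while a workload that is 10% multi-partition completes roughly 9.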




Compiling Your Application
• The database is constructed from
  • The schema (DDL)
  • The work load (Java stored procedures)
  • The Project (users, groups, partitioning)
• VoltCompiler creates application catalog
  • Copy to servers along with 1 .jar and 1 .so
  • Start servers

Schema

    CREATE TABLE HELLOWORLD (
       HELLO CHAR(15),
       WORLD CHAR(15),
       DIALECT CHAR(15),
       PRIMARY KEY (DIALECT)
    );

Stored Procedures (cut off at the right edge of the slide)

    import org.voltdb.*;

    @ProcInfo(
       partitionInfo = "HELLOWORLD.DIA…
       singlePartition = true
    )
    public class Insert extends VoltPr…
       public final SQLStmt sql =
          new SQLStmt("INSERT INTO HELLO…
       public VoltTable[] run( String hel…

Project.xml (cut off at the right edge of the slide)

    <?xml version="1.0"?>
    <project>
      <database name='data…
        <schema path='ddl.…
        <partition table='…
      </database>
    </project>

Clusters/Durability

• Scalability
   • Increase RAM in servers to add capacity
   • Add servers to increase performance / capacity
   • Consistently measuring 90% of single-node
     performance increase per additional node
• High availability
   • K-safety for redundancy
• Snapshots
   • Scheduled, continuous, on demand
• Spooling to external systems
• Disaster Recovery/WAN replication (Future)
   • Asynchronous replication
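
The scaling bullet can be turned into a quick estimate. This sketch reads "90% of single-node performance increase per additional node" as: each node after the first adds 0.9 times the single-node throughput. That reading, the method name, and the 50,000 TPS single-node figure are all assumptions made for illustration.

```java
// Illustrative arithmetic only; the single-node figure below is made up.
final class ScalingEstimate {
    // Each node after the first is assumed to add a fixed fraction
    // (0.9 on the slide) of single-node throughput.
    static double clusterTps(double singleNodeTps, int nodes, double efficiency) {
        if (nodes < 1) {
            throw new IllegalArgumentException("nodes must be >= 1");
        }
        return singleNodeTps * (1.0 + efficiency * (nodes - 1));
    }

    public static void main(String[] args) {
        double singleNodeTps = 50_000; // hypothetical single-node throughput
        for (int nodes : new int[] {1, 2, 6, 12}) {
            System.out.printf("%d node(s): about %.0f TPS%n",
                    nodes, clusterTps(singleNodeTps, nodes, 0.9));
        }
    }
}
```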
Interfacing with VoltDB

• Client applications interface with VoltDB via stored
  procedures
   • Java stored procedures – Java and SQL
   • No ODBC/JDBC
   • Wire protocol: client libraries available for Java, C++, Python,
     PHP, Ruby, and Erlang.
   • HTTP/JSON interface also available.




Asynchronous Communications

• Client applications communicate asynchronously with VoltDB
   • Stored procedure invocations are placed “on the wire”
   • Responses are pulled from the server
   • Allows a single client application to generate > 100K TPS
   • Our client library can simulate synchronous calls if needed


   Traditional
       salary := get_salary(employee_id);

   VoltDB
       callProcedure(asyncCallback, "get_salary", employee_id);
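
The calling pattern can be sketched with the JDK's `CompletableFuture`. This is a stand-in, not the VoltDB client library: the class, both method signatures, and the fake salary rule are hypothetical. It shows the asynchronous call returning immediately with a callback, and a synchronous call simulated by blocking on the same asynchronous path.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical stand-in for a client library; not the VoltDB client API.
final class AsyncClientSketch implements AutoCloseable {
    // One thread plays the role of the remote server.
    private final ExecutorService server = Executors.newSingleThreadExecutor();

    // Asynchronous call: returns immediately; the callback runs when the response arrives.
    CompletableFuture<Void> callProcedure(Consumer<Integer> callback, String procedure, int employeeId) {
        return CompletableFuture.supplyAsync(() -> execute(procedure, employeeId), server)
                .thenAccept(callback);
    }

    // Synchronous call simulated by blocking on the asynchronous response.
    int callProcedureSync(String procedure, int employeeId) {
        return CompletableFuture.supplyAsync(() -> execute(procedure, employeeId), server).join();
    }

    // Fake "stored procedure": salaries are made-up values derived from the id.
    private int execute(String procedure, int employeeId) {
        if (!"get_salary".equals(procedure)) {
            throw new IllegalArgumentException("unknown procedure " + procedure);
        }
        return 50_000 + 1_000 * employeeId;
    }

    @Override
    public void close() {
        server.shutdown();
    }

    public static void main(String[] args) {
        try (AsyncClientSketch client = new AsyncClientSketch()) {
            // Both invocations are queued before either response is handled.
            CompletableFuture<Void> first =
                    client.callProcedure(s -> System.out.println("employee 1: " + s), "get_salary", 1);
            CompletableFuture<Void> second =
                    client.callProcedure(s -> System.out.println("employee 2: " + s), "get_salary", 2);
            CompletableFuture.allOf(first, second).join();
            System.out.println("synchronous call: " + client.callProcedureSync("get_salary", 3));
        }
    }
}
```

Because the caller never waits between invocations, many requests can be in flight at once, which is what lets one client drive a high transaction rate.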


Transaction Control

• VoltDB does not support client-side transaction control
   • Client applications cannot issue:
       + insert into t_colors (color_name) values ('purple');
       + rollback;
   • Stored procedures commit if successful, roll back if failed
   • Client code in a stored procedure can call for a rollback
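
The "one procedure, one transaction" rule can be sketched as follows. This is an illustration under stated assumptions, not how VoltDB implements rollback (the slides do not describe that): the procedure runs against a private working copy, which is kept only if the procedure returns normally. `ProcedureTransaction` and `AbortException` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch; VoltDB's real rollback mechanism is not described in the slides.
final class ProcedureTransaction {
    private Map<String, Integer> committed = new HashMap<>();

    // Thrown by procedure code that wants to roll back (name is illustrative).
    static final class AbortException extends RuntimeException {
        AbortException(String message) {
            super(message);
        }
    }

    // Run one procedure as one transaction: keep its writes only if it returns normally.
    boolean run(Consumer<Map<String, Integer>> procedure) {
        Map<String, Integer> working = new HashMap<>(committed); // private working copy
        try {
            procedure.accept(working);
            committed = working; // success: commit
            return true;
        } catch (RuntimeException e) {
            return false; // failure or explicit abort: the working copy is discarded
        }
    }

    int get(String key) {
        return committed.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        ProcedureTransaction db = new ProcedureTransaction();
        boolean first = db.run(t -> t.put("purple", 1));
        boolean second = db.run(t -> {
            t.put("green", 1);
            throw new AbortException("procedure asked for rollback");
        });
        System.out.println("first committed: " + first + ", purple = " + db.get("purple"));
        System.out.println("second committed: " + second + ", green = " + db.get("green"));
    }
}
```

The client only chooses which procedure to invoke; there is no point at which it could issue a separate commit or rollback.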




Lack of concurrency

• Single-threaded execution within partitions (single-partition) or across partitions (multi-partition)
• No need to worry about locking/deadlocks
   • Great for “inventory” type applications
      + checking inventory levels
      + creating line items for customers
• Because of this, transactions execute in microseconds
• However, single-threaded execution comes at a price
   • Other transactions wait for the running transaction to complete
   • Don’t do anything crazy in a stored procedure (request a web page, send email)
   • Useful for OLTP, not OLAP*
       + However, certain use cases allow for quite a bit of work to be done in a single stored procedure (pipeline, staging/transformation, few concurrent users).

Throughput vs. Latency

• VoltDB is built for throughput over latency
• Latency measured in mid single-digit milliseconds in a properly sized cluster
• Do not estimate latency as (1 / TPS)
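
A short calculation shows why 1 / TPS is not latency. Little's law relates the two: requests in flight = throughput × latency. The figures in the sketch are illustrative assumptions (100,000 TPS and a 5 ms round trip), not measurements from the talk.

```java
// Illustrative arithmetic; the TPS and latency figures are assumptions.
final class LatencyVsThroughput {
    // Little's law: requests in flight = throughput * latency.
    static double inFlight(double tps, double latencySeconds) {
        return tps * latencySeconds;
    }

    // The naive estimate the slide warns against: this is the average time
    // between completions across the cluster, not the time one request takes.
    static double naiveLatencySeconds(double tps) {
        return 1.0 / tps;
    }

    public static void main(String[] args) {
        double tps = 100_000;          // assumed cluster throughput
        double latencySeconds = 0.005; // assumed per-request latency (5 ms)
        System.out.printf("1 / TPS = %.0f microseconds%n", naiveLatencySeconds(tps) * 1_000_000);
        System.out.printf("requests in flight at 5 ms latency = %.0f%n", inFlight(tps, latencySeconds));
    }
}
```

With these numbers the naive estimate is 10 microseconds, yet each request still takes 5 ms; the throughput comes from about 500 requests being in flight at once.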




SQL Support

• SELECT, INSERT (using values), UPDATE, and DELETE
• Aggregate SQL supports AVG, COUNT, MAX, MIN, SUM
• Materialized views using COUNT and SUM
• Hash and Tree Indexes
• SQL functions and functionality will be added over time; for now I do it in Java
• Execution plan for all SQL is created at compile time and available for analysis



SQL in Stored Procedures
• SQL can be parameterized, but not dynamic

     “select * from foo where bar = ?;”      (YES)

     “select * from ? where bar = ?;”        (NO)
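
One way to live with this restriction is to declare a fixed statement per table and pick between them in Java. The sketch below uses plain strings in place of precompiled statements; `StatementCatalog`, `statementFor`, and the second table `baz` are hypothetical and only illustrate the pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; plain strings stand in for precompiled statements.
final class StatementCatalog {
    // Every statement is fixed ahead of time; only the ? values vary per call.
    private final Map<String, String> statements = new LinkedHashMap<>();

    StatementCatalog() {
        statements.put("foo", "select * from foo where bar = ?;");
        statements.put("baz", "select * from baz where bar = ?;"); // hypothetical second table
    }

    // The table cannot be a parameter, so choose among predeclared statements in Java.
    String statementFor(String table) {
        String sql = statements.get(table);
        if (sql == null) {
            throw new IllegalArgumentException("no precompiled statement for table " + table);
        }
        return sql;
    }

    public static void main(String[] args) {
        StatementCatalog catalog = new StatementCatalog();
        System.out.println(catalog.statementFor("foo"));
        System.out.println(catalog.statementFor("baz"));
    }
}
```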




Connecting to the Cluster
• Clients connect to one or more nodes in the VoltDB cluster; transactions are forwarded to the correct node
  • Clients are not aware of the partitioning strategy
  • In the future we may send back data in the response indicating whether the transaction was sent to the correct node.




Schema Changes
• Traditional OLTP
  • add table…
  • alter table…
• VoltDB
  • modify schema and stored procedures
  • build catalog
  • deploy catalog
• V1.0: Add/drop users, stored procedures
• V1.1: Add/drop tables
• Future: Add/drop column, Add/drop index
Table/Index Storage
• VoltDB is entirely in-memory
• Cluster must collectively have enough RAM to hold all tables/indexes (k + 1 copies)
• Even data distribution is important
• Tables do not return the memory of deleted rows; they mark the rows, add them to a free list, and reuse them
• Tree indexes return memory on delete; hash indexes do not
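
The mark, free-list, and reuse behavior described for tables can be sketched as a small data structure. This is a generic illustration of the technique, not VoltDB's storage code; `FreeListTable` and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch of mark-and-reuse row storage; not VoltDB's storage code.
final class FreeListTable {
    private final List<String> slots = new ArrayList<>();      // null marks a deleted row
    private final Deque<Integer> freeList = new ArrayDeque<>(); // slots available for reuse

    // Insert reuses a freed slot when one exists; otherwise the table grows.
    int insert(String row) {
        Objects.requireNonNull(row, "row");
        Integer slot = freeList.poll();
        if (slot == null) {
            slots.add(row);
            return slots.size() - 1;
        }
        slots.set(slot, row);
        return slot;
    }

    // Delete marks the slot and remembers it; the allocation itself is kept.
    void delete(int slot) {
        if (slots.get(slot) != null) {
            slots.set(slot, null);
            freeList.push(slot);
        }
    }

    // Number of slots ever allocated, including deleted ones.
    int allocatedSlots() {
        return slots.size();
    }

    public static void main(String[] args) {
        FreeListTable table = new FreeListTable();
        table.insert("knife");
        int spoon = table.insert("spoon");
        table.insert("fork");
        table.delete(spoon);
        System.out.println("allocated after delete: " + table.allocatedSlots());
        System.out.println("spork stored in slot: " + table.insert("spork"));
        System.out.println("allocated after reuse: " + table.allocatedSlots());
    }
}
```

The table's footprint never shrinks after a delete, but it also does not grow again until the freed slots are used up.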



Future…
• WAN replication, asynchronously
• Increased SQL support
• Client libraries
• Performance enhancements, especially multi-partition
• Export via JDBC
• Expand cloud support
• Increase cluster k-safety (grow)
• Grow/shrink cluster dynamically (repartition)
Q&A

• Visit http://voltdb.com to…
  • Download VoltDB
  • Get sample app code

• Join the VoltDB community
  • VoltDB user groups: www.meetup.com/voltdb
  • Follow VoltDB on Twitter @voltdb

• Contact me
  • tcallaghan@voltdb.com (email)
  • @tmcallaghan (Twitter)


