MySQL at Sabre

Alan Walker
Sabre Labs

February 2004
Confidential
Agenda
• Sabre Holdings Overview
• Business drivers for MySQL & Open Source
• Shopping for fares
• Air Travel Shopping Engine (ATSE)
• Data replication strategy
• ESQL precompiler for MySQL
• Other MySQL users at Sabre

Who is Sabre Holdings?

A world leader in travel commerce,
retailing travel products, and
providing distribution and
technology solutions for the
travel industry

Sabre Holdings Businesses

Sabre Holdings Fast Facts

• Industry leader in multiple travel channels

• Revenues of $2.06 billion in 2002
• S&P 500 company

• NYSE:TSG
• Headquarters in Dallas/Fort Worth, Texas
• 6,500 employees in 45 countries

Business drivers

Over 3 billion
fare combinations
for a single customer request

Multiple airlines, flights, fare types, dates,
prices, taxes, surcharges
Business drivers
• No direct revenue for shopping queries
• Revenue for booking, but not looking (searching)
• Look-to-book ratio increasing
• Competition requires staying on the “leading edge”
• Highly reliable and scalable database
• Fast processors
• Large real memory
• Smart algorithms

• Shopping is a good fit for horizontal scale
• Pricing requires higher precision
Business drivers
[Figure: the computing stack, listing Application, DB / Middleware, Operating System, and Hardware, with a "Commodity Point" marker]


Hardware, operating system, database and middleware are
becoming commodities. This drives the cost down rapidly.
Open source software is a major driver of this effect.
Business Solution
• Linux servers alongside HP NonStop servers to create
“hybrid” Air Travel Shopping Engine (ATSE) platform
• HP NonStop delivers high availability and reliability
– Better than or equal to legacy, but at significantly lower cost
– Best fit for critical workloads and master database
management
• Linux / MySQL delivers 64-bit memory and faster CPUs
– Lower availability and reliability than HP NonStop but at
significantly lower cost
– Best fit for CPU-intensive shopping workloads

Most cost-effective platform for the shopping workload
Business drivers
• Sabre’s legacy
• World’s first commercial OLTP system in 1960
• Mainframe clusters running TPF
• Operating system customized to our needs
• True 7x24 application, with zero scheduled downtime
• Most application code in assembler
• Sabre’s future
• Higher-level languages
• Relational databases
• Internet
• Open systems
• Reduce specialized training
• Use off-the-shelf software
• HP NonStop with OSS is a key component (LINUX?)

Shopping
• Finding cheap air fares is hard!
• With 50+ connect points to consider, and >100 fares per
leg, we need to evaluate >3 billion combinations
• Up to a million fares can change every day
• Availability changes continuously
• Solve it >100 times per second
• Other functions
• Price 250 tickets per second
• Process 1000 flight routing requests per second

Pricing
• Shopping vs. Pricing
• Shopping is the problem of finding low fares
• Pricing is used to print the ticket
• Pricing has to be accurate, or we pay the difference to the
airline
• Many internet search engines still rely on mainframes to
actually print the ticket
• Pricing also requires additional functions, such as refunds,
exchanges and auditing

Algorithms
• Fare-led search
• Graph-based algorithm that searches all fare
combinations across 50+ connect points
• Can generate up to a 4-segment connection
• Search space of >3 billion fare combinations
• Match or exceed any competitor in finding lowest fare
• Only loses to competitors that have access to exclusive
private fares and/or other discounts
• Search actually checks Direct Connect Availability, so that
low fare options are actually bookable
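
As a rough illustration of the graph-search idea above, the sketch below enumerates every connection of up to four segments between two airports in a small route graph. The `RouteGraph` type, the `enumerate_connections` function, and the airport data are assumptions made for this example; the actual ATSE algorithm also drives the search by fares and checks availability, which this sketch leaves out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical route graph: airport code -> airports reachable in one segment.
using RouteGraph = std::map<std::string, std::vector<std::string>>;
using Path = std::vector<std::string>;

// Depth-first extension of `path` toward `dest`. A path with N airports
// uses N-1 segments; airports are never revisited.
static void extend(const RouteGraph& g, const std::string& dest,
                   std::size_t max_segments, Path& path, std::vector<Path>& out) {
    const std::string here = path.back();  // copy: path may reallocate below
    if (here == dest) { out.push_back(path); return; }
    if (path.size() - 1 >= max_segments) return;
    const auto it = g.find(here);
    if (it == g.end()) return;
    for (const std::string& next : it->second) {
        if (std::find(path.begin(), path.end(), next) != path.end()) continue;
        path.push_back(next);
        extend(g, dest, max_segments, path, out);
        path.pop_back();
    }
}

// All connections from `origin` to `dest` using at most `max_segments` segments.
std::vector<Path> enumerate_connections(const RouteGraph& g, const std::string& origin,
                                        const std::string& dest,
                                        std::size_t max_segments = 4) {
    assert(max_segments >= 1);
    std::vector<Path> out;
    Path path{origin};
    extend(g, dest, max_segments, path, out);
    return out;
}
```

With a toy graph in which DFW reaches ORD, ATL and LHR, ORD reaches LHR and ATL, and ATL reaches LHR, `enumerate_connections(g, "DFW", "LHR")` returns four connections. Each of those would then be multiplied by the fares available on every leg, which is where the combination count grows so quickly.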

Algorithms
• Dynamic schedules
• Connections are not generated overnight and stored
• Not limited to routes explicitly set up by airlines or other
marketing staff
• Availability Manager
• Flexible rules to access airline availability
• Current methods
– Direct Connect
– Host Availability
– Teletype (AVS)
• Can also use
– Cached DCA
– Inventory proxy

ATSE Hybrid
• Air shopping for desirable itineraries
• Must search through multiple airlines, flights, fare types,
dates, adjacent airports, etc.
• Must calculate prices, taxes, surcharges
• Complexity
• Single round-trip request can have over 3 billion fare
combinations
• Search is CPU and memory intensive

• Business driver
• No direct revenue for shopping transactions
• Increasing look-to-book ratio
ATSE Hybrid
• Combine Linux servers and HP NonStop servers
• HP NonStop delivers high availability and reliability
• Better than or equal to TPF at significantly lower cost
• Master database management
• Data replicated in real-time to Linux servers
• PNR pricing, schedules and availability
• Linux delivers 64-bit memory model and faster CPUs
• Lower availability and reliability than HP NonStop but at
significantly lower cost
• Horizontally scaled server farm with spare capacity
• Best fit for CPU-intensive shopping workloads
ATSE Hybrid
[Figure: ATSE hybrid architecture. Labels: IBM PSS and IBM MVS hosts; Fare and Rule Updates; Schedule and Availability Updates; HP NonStop servers with Naming Service and Load Balancing, DB Image Load and Updates (E/R), and Logging and Billing; Air Shopping Transactions; Shopping Transactions; Availability Requests; Load Information; Linux Server Farm (24 Linux servers shown)]

ATSE Linux servers
• In production since July 2003
• Started with HP rp5405 servers (Unix PA-RISC)
– Migrated to Itanium in December 2003
• Using 45 HP rx5670 servers
– 4-way, 1.5 GHz, 6MB L3 cache, 32GB RAM, 4x72GB SCSI

• Software
• MySQL 4.0.15
• GNU compilers – g++ 3.2.3 and glibc 2.3.2
• TAO object request broker
• Red Hat Advanced Server (RHAS) 2.1
• GoldenGate Extractor/Replicator
• Monitoring – Prognosis, CA Unicenter, scripts
ATSE Software
• Extensive use of open source software
• MySQL 4.0.15
• GNU compilers – g++ 3.2.3 and glibc 2.3.2
• TAO object request broker
• Red Hat Linux AS 3.0

• Third party software
• GoldenGate Extractor/Replicator
• Monitoring – Prognosis, CA Unicenter, scripts
• Internally developed applications and scripts

Data replication
• HP NonStop (Tandem) is master database
• GoldenGate Software used to replicate to MySQL
– Extracts data from undo/redo logs on the NonStop server
– Performs INSERT / UPDATE / DELETE on MySQL
– Software performs catch-up / resync in case of crashes or
other failures
• Each Linux server has an identical copy of the database
– 50GB database on each server, all InnoDB

• Replication volume
• 150 tables replicated (over 300 on NonStop server)
• Can replicate 1M fare changes / hour
• Data updates on a 7x24 basis
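
To make the "performs INSERT / UPDATE / DELETE on MySQL" step concrete, here is a minimal sketch of turning one captured change into an SQL statement. The `Change` record, `quote`, and `to_sql` are hypothetical names for this example and do not reflect GoldenGate's internal formats, which the slides do not describe. For scale, 1M fare changes per hour works out to roughly 278 changes per second applied on every server.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical change record, as a log-based replicator might capture it.
enum class Op { Insert, Update, Delete };

struct Change {
    Op op;
    std::string table;
    std::pair<std::string, std::string> key;                   // key column, value
    std::vector<std::pair<std::string, std::string>> columns;  // other columns, values
};

// Quote a value as an SQL string literal, doubling embedded single quotes.
// Illustration only: real code should use the client library's escaping.
std::string quote(const std::string& v) {
    std::string out = "'";
    for (char c : v) {
        if (c == '\'') out += '\'';
        out += c;
    }
    return out + "'";
}

// Build the statement that replays one change on the target database.
std::string to_sql(const Change& c) {
    assert(!c.table.empty() && !c.key.first.empty());
    const std::string where = " WHERE " + c.key.first + " = " + quote(c.key.second);
    switch (c.op) {
    case Op::Delete:
        return "DELETE FROM " + c.table + where;
    case Op::Update: {
        std::string set;
        for (const auto& col : c.columns) {
            if (!set.empty()) set += ", ";
            set += col.first + " = " + quote(col.second);
        }
        return "UPDATE " + c.table + " SET " + set + where;
    }
    case Op::Insert: {
        std::string names = c.key.first;
        std::string values = quote(c.key.second);
        for (const auto& col : c.columns) {
            names += ", " + col.first;
            values += ", " + quote(col.second);
        }
        return "INSERT INTO " + c.table + " (" + names + ") VALUES (" + values + ")";
    }
    }
    return "";
}
```

Keying UPDATE and DELETE on the primary key keeps each replayed change independent of the others, which is one reason a replicator can catch up or resync after a failure by replaying from a known log position.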
Data replication
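[Figure: replication data flow. On HP NonStop, the SQL/MP DB writes the TMF log; Extract reads the log into a queue, and a Data Pump forwards it. On Linux IA-64, Receive writes to a queue, and the Updater applies the changes to the MySQL DB. A legend marks the GoldenGate Software components.]
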
Data Replication
[Figure: replication fan-out over Server-Net. 6 Extract processes with queues, 12 Data Pumps, and 12 Linux servers, each with an Extract Collector, a queue, a Replicator, and a MySQL instance]

Results
• Reduced development costs
• Decreased fare loading cycle times
• Competitive advantage
• Increased functionality
• Reduced runtime costs (over 80% compared to legacy)

Hybrid
• Horizontal scalability
• Ability to throw inexpensive CPUs at the problem
• Tolerate failure of a single server
• How do we get there from here?
• Database and network functions remain on Himalaya
• C++ code readily ports to Linux
• Publish/subscribe metaphor for data in memory
• 64-bit addressing to avoid memory constraints

Connectivity
• CORBA
• Major functions use CORBA internally
• CORBA requests to TPF for availability
• CORBA to CTS for DCA this Summer (bypass TPF)
• Asynchronous messaging via MQ Series

• XML
• Currently uses XML requests from TPF (over RPPC) for
pricing functions
• Working on direct access from Travelocity to ATSE
– Will be used for BIP
– Already working over HTTP (development systems)
– Working on security & billing for production
Timeline
• 2000
• Proof Of Concept, April – August
• 5 core developers, partnership with Compaq
• 2001
• Development & training began in February
• Initial hardware delivered
• 2002
• Phase 1 in production since July
• Zero downtime since implementation
• Rapidly developing additional functionality
• Wow – this is from an ancient slide, huh?
Precompiler
• Challenge
• 500K lines of C/C++, 150+ files with embedded SQL
• We did not want to rewrite ESQL / C code by hand
• Solution
• Wrote a precompiler that converts ESQL to inline MySQL calls
• About 1000 lines of awk
• We are willing to share this code with others

EXEC SQL BEGIN DECLARE SECTION;
int host_a;
double host_b;
char host_c;
EXEC SQL END DECLARE SECTION;

EXEC SQL DECLARE csr1 CURSOR FOR
    SELECT a, b, c
    FROM table1
    WHERE x = :hostvar1;
EXEC SQL OPEN csr1;
while (rc >= 0 && rc != 100) {
    EXEC SQL FETCH csr1 INTO
        :host_a, :host_b, :host_c;
    printf("Fetch %d, %lf, %s\n",
           host_a, host_b, host_c);
}
EXEC SQL CLOSE csr1;

Precompiler
• How it works
• Convert C / ESQL to C++ code
• Polymorphism matches data types in the declare section
• Can ignore the declare section

EXEC SQL BEGIN DECLARE SECTION;
int host_a;
double host_b;
char host_c;
EXEC SQL END DECLARE SECTION;

// EXEC SQL BEGIN DECLARE SECTION;
int host_a;
double host_b;
char host_c;
// EXEC SQL END DECLARE SECTION;
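
The declare-section step can be sketched as a simple line filter. The real precompiler is written in awk and is not shown in the slides, so the C++ below is only an illustration of the transformation; `strip_declare_markers` and `starts_with` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// True if `line`, ignoring leading blanks and tabs, begins with `prefix`.
static bool starts_with(const std::string& line, const std::string& prefix) {
    assert(!prefix.empty());
    const std::size_t first = line.find_first_not_of(" \t");
    return first != std::string::npos &&
           line.compare(first, prefix.size(), prefix) == 0;
}

// Comment out the DECLARE SECTION markers and pass every other line through,
// so the host variable declarations become ordinary C++ declarations.
std::string strip_declare_markers(const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        if (starts_with(line, "EXEC SQL BEGIN DECLARE SECTION") ||
            starts_with(line, "EXEC SQL END DECLARE SECTION")) {
            out << "// " << line << '\n';
        } else {
            out << line << '\n';
        }
    }
    return out.str();
}
```

Because the host variables keep their C++ types after this step, the generated code can rely on overload resolution to pick the right conversion for each column, and the precompiler never has to parse the declarations itself.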

Precompiler

Cursor declarations (SELECT statements) are converted to a static
struct. The struct has the text of the SQL, as well as statement
handles for doing prepare / execute (where applicable)

EXEC SQL DECLARE csr1 CURSOR FOR
SELECT a, b, c
FROM table1
WHERE x = :hostvar1;

// EXEC SQL DECLARE csr1
static e2mysql csr1 = {
" SELECT a,b,c FROM table1 WHERE x = :hostvar1"
, NULL , 0};
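
The slides do not show the definition of `e2mysql`. One layout that is consistent with the three-field initializer above and with the `rslt` and `row` members used by the generated fetch code might look like the sketch below. The `sql` member name, the `ResultStandIn` type (standing in for the client library's `MYSQL_RES` so the example compiles without `mysql.h`), and the two helper functions are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the MySQL client library's result-set type.
struct ResultStandIn {};

// Hypothetical cursor struct. `rslt` and `row` are named as in the generated
// fetch code; `sql` is an assumed name for the statement text.
struct e2mysql {
    const char*    sql;   // statement text, host variables still written as :name
    ResultStandIn* rslt;  // result set while the cursor is open, otherwise NULL
    unsigned long  row;   // index of the next row to fetch
};

// What the precompiler output for DECLARE csr1 would initialize.
static e2mysql csr1 = {
    " SELECT a,b,c FROM table1 WHERE x = :hostvar1",
    NULL, 0};

// A cursor counts as open only while it holds a result set.
inline bool is_open(const e2mysql& c) { return c.rslt != NULL; }

// Record that one row was consumed; only meaningful on an open cursor.
inline void advance(e2mysql& c) {
    assert(is_open(c));
    ++c.row;
}
```

Making the struct `static` at file scope, as the slide does, lets the generated OPEN, FETCH and CLOSE functions share the cursor state without changing the signature of any hand-written function.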

Precompiler
The OPEN, FETCH and CLOSE statements are converted into
function calls. The precompiler generates the code for these calls
and puts it at the end of the source module.

EXEC SQL FETCH csr1 INTO :host_a, :host_b, :host_c;

// EXEC SQL FETCH csr1
static int16 fetch_csr1()
{
    if ( ! csr1.rslt )
        return SQL_ERROR;
    if ( csr1.row >= mysql_num_rows(csr1.rslt) )
        return SQL_NO_DATA;
    MYSQL_ROW row = mysql_fetch_row(csr1.rslt);
    SQLBindColPoly(row[0], host_a, sizeof(host_a));
    SQLBindColPoly(row[1], host_b, sizeof(host_b));
    SQLBindColPoly(row[2], host_c, sizeof(host_c));
    ++csr1.row;
    return SQL_SUCCESS;
}
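
Only the FETCH function is shown on the slide. Since the stored statement text still contains `:hostvar1`, the generated OPEN function presumably has to substitute host variable values before running the query. The sketch below shows one way that substitution could work; `bind_host_vars` is a hypothetical helper, not the precompiler's actual output. In real code the resulting text would then go to the MySQL C API (`mysql_query` followed by `mysql_store_result`), with values escaped through `mysql_real_escape_string` rather than the simple quote-doubling used here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Replace each :name placeholder in `sql` with the quoted value bound to it.
// Placeholders inside string literals are not recognised; this is a sketch.
std::string bind_host_vars(const std::string& sql,
                           const std::map<std::string, std::string>& values) {
    std::string out;
    for (std::size_t i = 0; i < sql.size();) {
        if (sql[i] != ':') { out += sql[i++]; continue; }
        std::size_t j = i + 1;
        while (j < sql.size() &&
               (std::isalnum(static_cast<unsigned char>(sql[j])) || sql[j] == '_')) {
            ++j;
        }
        if (j == i + 1) { out += sql[i++]; continue; }  // lone ':' is copied as-is
        const std::string name = sql.substr(i + 1, j - i - 1);
        const auto it = values.find(name);
        assert(it != values.end() && "unbound host variable");
        out += '\'';
        for (char c : it->second) {
            if (c == '\'') out += '\'';  // double embedded single quotes
            out += c;
        }
        out += '\'';
        i = j;
    }
    return out;
}
```

For the cursor on the earlier slide, binding `hostvar1` to `DFW` produces `WHERE x = 'DFW'` at the end of the statement text.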

Precompiler

A lightweight wrapper around the database API lets us
use polymorphism to convert to the types specified in the
declare section. There is a wrapper function for each
simple C++ type that we handle.

inline int32
SQLBindColPoly(const char* value, int32& parm, uint16 size)
{
    parm = atoi(value);
    return SQL_SUCCESS;
}
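
The slide shows only the `int32` wrapper. A sketch of how the overload set might extend to `double` and to a fixed-size `char` buffer follows. The two extra overloads, the typedefs, and the value of `SQL_SUCCESS` are assumptions for this example, and NULL column values are not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

typedef std::int32_t  int32;
typedef std::uint16_t uint16;
const int32 SQL_SUCCESS = 0;  // value assumed for this sketch

// int32 host variable, as on the slide.
inline int32 SQLBindColPoly(const char* value, int32& parm, uint16 /*size*/) {
    parm = std::atoi(value);
    return SQL_SUCCESS;
}

// double host variable (assumed overload).
inline int32 SQLBindColPoly(const char* value, double& parm, uint16 /*size*/) {
    parm = std::atof(value);
    return SQL_SUCCESS;
}

// Fixed-size char buffer (assumed overload): copy at most size-1 bytes
// and always null-terminate.
inline int32 SQLBindColPoly(const char* value, char* parm, uint16 size) {
    assert(size > 0);
    std::strncpy(parm, value, size - 1);
    parm[size - 1] = '\0';
    return SQL_SUCCESS;
}
```

With these in place, the same generated call, `SQLBindColPoly(row[i], host_x, sizeof(host_x))`, compiles for any supported host variable type, and the compiler selects the conversion from the variable's declared type.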

Precompiler
• Notes
• Light-weight C++ wrapper to MySQL API
• The precompiler understands some SQL syntax and does
some modifications of NonStop SQL/MP statements
• We have also used our precompiler to target other DBMS
– ODBC API
– Oracle
– PostgreSQL
• Since we convert C to C++, this may be problematic for
ESQL programs that used deprecated K&R syntax
– C++ compilers are stricter than C compilers
– However, we did not have this problem with our application
Other MySQL applications at Sabre
• ATSE is our largest and most mission-critical MySQL application
• We have other production systems that rely on MySQL
• Site59.com is the most visible
• MySQL also used for some internal databases
• More under development
• MySQL / Linux / SATA drives make cheap data marts
• Sometimes cheaper to replicate to a data mart than to
upgrade a central data warehouse
• Currently testing with a 1.5B row database

Site59
• Last minute travel packages
• Acquired by Travelocity in
March 2002
• Sales volume?
• Transaction rates?
• All dynamic content generated
using PHP & MySQL

Site59
Site59 implements a fairly “classic” dynamic website using MySQL.
Dynamic content is generated at about 30 Mbit/s. Extensive
use is made of single- and dual-processor Linux machines (IA-32).

[Figure: Site59 architecture. Labels: Internet, HTTP, Presentation (Apache/PHP), Application Server, Reservations System Gateway, XML/HTTP, Frontend DB (MySQL, Linux), Replication, Backend DB (Oracle, Sun)]

Travel Commerce Processing Chain

Session → Shop → Price → Sell → Fulfill



Sabre presentation for MySQL user conference 2004
