Instant Search API
Build Unique Search Experiences
Sylvain Utard
VP of Engineering
sylvain@algolia.com
@sylvainutard
Enterprise Search and Analytics
@algolia
Who am I?
5 years @ Exalead, leading the core-engine & NLP teams
• C++
• ExaScript (RIP)
• Java
2 years @ Algolia, VP of Engineering
• C++
• Ruby
• Java
• and 10+ other languages…
A hosted search API
Replies in milliseconds
From anywhere
With intuitive relevance
Algolia Today
13 locations
800+ customers in 80+ countries
40B+ write operations per month
4B+ user-generated queries per month
Performance is our DNA
Speed matters
A half-second delay caused a 20% drop in traffic
Every 100 ms of latency costs them 1% in sales
Behind the scenes
Unique set of constraints
High volume of Read & Write operations
High-availability
Worldwide data distribution
API Software Stack
Started as a mobile offline SDK
Written in C++
Search code embedded in Nginx as a module
Indexing is done in a separate process
Two Redis instances
API Hardware
Fast CPU (Xeon E5, >3.5 GHz)
In memory (128 GB)
Backed by high-end SSDs in RAID-0 (800 GB)
Specific kernel settings
Scaling horizontally
Several clusters per location
A user is assigned to one master cluster
A user can be replicated to N replica clusters
What is a cluster
Master-Master
Stream of writes via Consensus
At least 3 machines
A write in practice
One of the machines accepts
the write operation via the API (HTTPS)
/1/indexes/MyFirstIndex/batch
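As an illustration of what such a write could look like from the client side, here is a minimal sketch that builds (but does not send) a batch request for the path above. The host pattern `<application-id>.algolia.net`, the two `X-Algolia-*` headers and the `addObject` action are assumptions taken from Algolia's public REST API, not from the slides; the credentials and records are placeholders.

```python
import json
import urllib.request


def build_batch_request(app_id, api_key, index_name, records):
    """Build (but do not send) a batch write request for one index."""
    # Host pattern and header names are assumptions, not taken from the slides.
    url = f"https://{app_id}.algolia.net/1/indexes/{index_name}/batch"
    payload = {
        "requests": [{"action": "addObject", "body": record} for record in records]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_batch_request(
    "YourApplicationID",
    "YourAPIKey",
    "MyFirstIndex",
    [{"name": "iPhone 6"}, {"name": "Galaxy S6"}],
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send the write; the sketch stops before that so it can run without credentials.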
A write in practice
The file is saved on the three machines
as a temporary file
tmp1265
tmp7864
tmp2357
A write in practice
Launch the consensus by contacting
the RAFT master
startConsensus(tmp2357, tmp7864, tmp1265)
A write in practice
1 - Master sends the commit order to all nodes
2 - Each node returns the next job ID to master
3 - If there is a majority, the file is committed
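The three steps above can be sketched as a toy simulation. This is a deliberately simplified model, not Raft itself and not Algolia's implementation: there are no terms, no leader election and no log matching, and the `Node` class and `run_consensus` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    next_job_id: int
    up: bool = True

    def on_commit_order(self, tmp_file):
        """Reply with the job ID this node would give the file, or None if it is down."""
        return self.next_job_id if self.up else None


def run_consensus(nodes, tmp_file):
    """Return the committed job ID, or None when no majority agrees."""
    # 1 - the master sends the commit order to all nodes
    replies = [node.on_commit_order(tmp_file) for node in nodes]
    # 2 - each reachable node has returned its next job ID
    votes = [job_id for job_id in replies if job_id is not None]
    # 3 - commit only if a majority of the whole cluster agrees on the same ID
    majority = len(nodes) // 2 + 1
    for job_id in set(votes):
        if votes.count(job_id) >= majority:
            for node in nodes:
                if node.up:
                    node.next_job_id = job_id + 1
            return job_id
    return None


cluster = [Node("host-a", 42), Node("host-b", 42), Node("host-c", 42, up=False)]
committed = run_consensus(cluster, "tmp2357")
print(committed)  # 42: two of three hosts agree, which is a majority
```

With three machines the majority is two, which is why a cluster of at least three hosts can keep committing writes while one host is unavailable.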
A write in practice
Same job ID on all hosts
Sent to replica clusters in parallel
Processed in parallel on all hosts
job42
job42
job42
In case one host is down
Continue to accept writes
The two other hosts keep jobs
Jobs are sequential; the host will catch up at restart
job42 job42
Distribution
Replicate jobs, not the result
Send to all machines in parallel
Consistent, with a delay of a few seconds
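A minimal sketch of replicating jobs rather than results, including the catch-up of a host that was down. It assumes jobs are numbered from 1 and stored in an ordered log; the `Host` class and `replicate` function are hypothetical, and the loop is sequential for readability where the slides describe sending in parallel.

```python
class Host:
    """Applies jobs strictly in job-ID order to its own copy of the index."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.last_applied = 0
        self.index = {}

    def apply(self, job_id, job):
        if job_id != self.last_applied + 1:
            raise ValueError("jobs must be applied sequentially")
        object_id, record = job
        self.index[object_id] = record  # every host rebuilds the result itself
        self.last_applied = job_id


def replicate(job_log, hosts):
    """Send every missing job to every reachable host."""
    for host in hosts:
        if not host.up:
            continue
        for job_id in range(host.last_applied + 1, len(job_log) + 1):
            host.apply(job_id, job_log[job_id - 1])


job_log = []  # job N is stored at position N - 1
hosts = [Host("host-a"), Host("host-b"), Host("host-c")]

job_log.append(("sku-1", {"name": "iPhone 6"}))
replicate(job_log, hosts)

hosts[2].up = False  # one host goes down; writes are still accepted
job_log.append(("sku-2", {"name": "Galaxy S6"}))
replicate(job_log, hosts)

hosts[2].up = True  # at restart the host replays the jobs it missed
replicate(job_log, hosts)
print([host.last_applied for host in hosts])  # [2, 2, 2]
```

Because every host applies the same jobs in the same order, all copies of the index converge without any index data being shipped between machines.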
High availability
Multiple regions in one location
High availability
13 fully independent locations
Network Optimisations
API usage is moving from servers to browsers and mobile apps
Get close to end users
Distributed Search Network - Worldwide Synchronization
• 13 locations = 25 datacenters
• No ideal worldwide provider
• AWS is not in India, Eastern EU, Africa…
• Need to handle several providers
• Anticipate long deliveries / customs
• Keep as few providers as possible
DNS is key
Used to find the closest location
Several DNS providers
Good anycast network
API Clients
DNS health checks are not enough
Smart retry logic in all our API Clients
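One way such client-side retry logic could look is sketched below. The slides do not describe the actual strategy, so the host list, the `search_with_retry` function and the `fake_send` transport are all hypothetical; the idea shown is simply to fall through to the next host when one fails, instead of relying on DNS alone.

```python
def search_with_retry(hosts, send, max_rounds=2):
    """Try each host in order and return the first successful response."""
    last_error = None
    for _ in range(max_rounds):
        for host in hosts:
            try:
                return send(host)
            except (ConnectionError, TimeoutError) as error:
                last_error = error  # this host is unhealthy, try the next one
    raise RuntimeError("all hosts failed") from last_error


def fake_send(host):
    # Stand-in transport for the example: only the third host answers.
    if host != "fallback-2.example.com":
        raise ConnectionError(f"{host} unreachable")
    return {"host": host, "hits": []}


hosts = ["dsn.example.com", "fallback-1.example.com", "fallback-2.example.com"]
response = search_with_retry(hosts, fake_send)
print(response["host"])  # fallback-2.example.com
```

A production client would typically add per-attempt timeouts and remember which hosts recently failed; both are left out to keep the sketch short.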
Analytics
• What are my users searching for?
• Top searches
• Top searches without hits
• Top refinements
• Where do they search from?
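These questions boil down to counting over a query log. A minimal sketch with an in-memory log follows; the tuple layout `(query, number_of_hits, country)` and the sample data are invented for the example, and refinements are left out.

```python
from collections import Counter

# Hypothetical log entries: (query, number_of_hits, country).
query_log = [
    ("iphone", 120, "US"),
    ("iphone", 118, "FR"),
    ("iphone", 120, "US"),
    ("galaxy", 64, "US"),
    ("nokia 3310", 0, "US"),
    ("nokia 3310", 0, "DE"),
    ("blackberry", 0, "FR"),
]

top_searches = Counter(query for query, _, _ in query_log)
top_without_hits = Counter(query for query, hits, _ in query_log if hits == 0)
top_countries = Counter(country for _, _, country in query_log)

print(top_searches.most_common(2))      # [('iphone', 3), ('nokia 3310', 2)]
print(top_without_hits.most_common(1))  # [('nokia 3310', 2)]
print(top_countries.most_common(1))     # [('US', 4)]
```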
Analytics
• Billions of user-generated queries per month
• As-you-type aggregation
• ~3 months of retention
• Storing all of them in…
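The slides do not explain how as-you-type aggregation works. One plausible reading, assumed here, is that the intermediate queries sent on every keystroke are collapsed into the final query before counting. The sketch below does that for one user's queries in time order; `collapse_as_you_type` is a hypothetical name, and a real pipeline would presumably also use timestamps.

```python
def collapse_as_you_type(queries):
    """Keep only the last query of each run where every query extends the previous one."""
    collapsed = []
    for query in queries:
        if collapsed and query.startswith(collapsed[-1]):
            collapsed[-1] = query  # the user is still typing the same search
        else:
            collapsed.append(query)
    return collapsed


keystrokes = ["i", "ip", "iph", "iphone", "g", "ga", "galaxy"]
final_queries = collapse_as_you_type(keystrokes)
print(final_queries)  # ['iphone', 'galaxy']
```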
Analytics
• Elasticsearch o/
• … without full-text search (FTS) :)
• but with aggregations
Analytics
• No FTS
• No source
• Doc values everywhere
• SSD only
• Custom aggregations
(deprecated since ES 1.1.0)
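To make the first three bullets concrete, here is a sketch of what such a mapping might look like, written as a Python dictionary. It assumes Elasticsearch 1.x-era mapping syntax (`string` fields with `not_analyzed`), and the type and field names are hypothetical; the slides do not show the real mapping.

```python
import json

# Field and type names are hypothetical; the syntax follows Elasticsearch 1.x mappings.
analytics_mapping = {
    "search_event": {
        "_source": {"enabled": False},  # "No source": do not store the original JSON
        "_all": {"enabled": False},     # "No FTS": no catch-all full-text field
        "properties": {
            "query": {"type": "string", "index": "not_analyzed", "doc_values": True},
            "country": {"type": "string", "index": "not_analyzed", "doc_values": True},
            "nb_hits": {"type": "integer", "doc_values": True},
            "timestamp": {"type": "date", "doc_values": True},
        },
    }
}

print(json.dumps(analytics_mapping, indent=2))
```

Keeping every field unanalyzed and backed by doc values means aggregations read column-oriented data from disk rather than building field data on the heap, which fits the SSD-only setup listed above.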
Top-k Aggregation
• Before
  • Linear memory consumption
  • Exhaustive
• After
  • Constant memory consumption
  • Approximate, but good enough
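The slides do not name the algorithm behind the "after" column. As an illustration of the trade-off only, the sketch below uses Space-Saving, a well-known streaming top-k technique that keeps a fixed number of counters; it is swapped in for the example and may differ from what Algolia actually deployed.

```python
def space_saving_top_k(stream, capacity):
    """Approximate counts using at most `capacity` counters (Space-Saving algorithm)."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Evict the smallest counter; the newcomer inherits its count plus one.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return sorted(counters.items(), key=lambda pair: pair[1], reverse=True)


queries = ["iphone"] * 5 + ["galaxy"] * 3 + ["nokia", "htc", "lumia"]
top = space_saving_top_k(queries, capacity=3)
print(top[:2])  # [('iphone', 5), ('galaxy', 3)]
```

Memory stays bounded by `capacity` however many distinct queries arrive. The price is that counts for rare items can be overestimated (here `lumia` ends with a count of 3 after a single occurrence), which matches the "approximate, but good enough" trade-off.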
Building your worldwide infra
- Is a long and difficult quest
- Is a real asset & differentiator
The Future of APIs is Distributed
Want to know more?
All the details of our architecture are on HighScalability.com
THANK YOU!
sylvain@algolia.com
We are hiring in SF, NYC and Paris 😊

Algolia - Hosted Search API