How One Billion Salesforce Records Can Be Replicated with Minimal API Usage
Baruch Oxman 
R&D Manager, Implisit 
@implisithq, @baruchoxman
In this session… 
• Implisit - Intro & Motivation
• Salesforce APIs Usage & Limits - Overview
• Efficient use of Salesforce APIs
• Scale and limitations
• Other pitfalls and tips
Implisit – The End of CRM Data Entry
• Implisit uses data mining and machine learning to keep Salesforce updated:
– Logging emails and calendar events in Salesforce automatically
– Creating and updating Accounts, Opportunities, Contacts, Leads
– Keeping the team informed of all client communications
• Using text analysis:
– Creating meaningful business insights
– Improving forecasting and sales pipeline management
• Requires Salesforce data replication for offline processing
Data Replication Goals 
• Minimize your API usage
– Avoid reaching the API limit
– API limits are shared between all API-connected apps, so exhausting them can block other apps
• Minimize sync cycle time
– Don’t make our customers wait too long
Salesforce API Limits 
• Daily API limits for Salesforce Editions:
– Unlimited/Performance: # of users x 5,000, up to 1,000,000
– Enterprise/Professional: # of users x 1,000
– Developer: 15,000
– Sandbox: 5,000,000
• Limit on concurrent API calls (25 in production, 5 in dev)
Source & more info: https://help.salesforce.com/HTViewHelpDoc?id=integrate_api_rate_limiting.htm
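As a worked example of the arithmetic above, here is a minimal Python sketch that computes the daily limit for an org. The multipliers and caps are the figures quoted on this slide; the function names, the dictionary layout, and the `sync_share` helper are hypothetical and only illustrate the calculation.

```python
# Per-user multipliers and caps as quoted on the slide; they may have changed since.
PER_USER_LIMITS = {
    "unlimited": {"per_user": 5000, "cap": 1_000_000},
    "performance": {"per_user": 5000, "cap": 1_000_000},
    "enterprise": {"per_user": 1000, "cap": None},
    "professional": {"per_user": 1000, "cap": None},
}
# Editions with a fixed daily allowance on the slide.
FIXED_LIMITS = {"developer": 15_000, "sandbox": 5_000_000}


def daily_api_limit(edition: str, users: int = 0) -> int:
    """Return the daily API call allowance for an edition and user count."""
    edition = edition.lower()
    if edition in FIXED_LIMITS:
        return FIXED_LIMITS[edition]
    rule = PER_USER_LIMITS[edition]
    limit = users * rule["per_user"]
    return limit if rule["cap"] is None else min(limit, rule["cap"])


def sync_share(edition: str, users: int, calls_per_sync: int, syncs_per_day: int) -> float:
    """Fraction of the daily allowance that a replication schedule would consume."""
    return calls_per_sync * syncs_per_day / daily_api_limit(edition, users)
```

For instance, `daily_api_limit("enterprise", 50)` gives 50,000 calls per day, and a hypothetical schedule of six 400-call syncs per day would use about 5% of that allowance.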
Performance Stats 
• Keeping over one billion Salesforce records replicated and in sync
– 27 Salesforce object types are replicated (e.g. Accounts, Contacts)
• Initial sync
– 600-1000 API calls in total
• Updates sync
– 200-400 API calls in total
– Performed every few hours
Salesforce API Types
• Bulk (Async) API
– Large numbers of records in a single request (fewer API calls)
– Slow, requires polling for results
– Implements internal retries
– Does not support some objects (e.g. OpportunityHistory)
• REST API
– Fast, synchronous queries
– Up to 2,000 records per request
– Each request counts as a single API call
– Simple to use
https://developer.salesforce.com/blogs/tech-pubs/2011/10/salesforce-apis-what-they-are-when-to-use-them.html
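To see why the per-request record cap matters at scale, the following Python sketch estimates how many REST calls a full fetch would take. The 2,000-record figure is the one quoted above; the function name is hypothetical, and the estimate ignores retries and any other overhead.

```python
REST_MAX_RECORDS_PER_REQUEST = 2000  # per-request cap quoted on the slide


def rest_calls_needed(record_count: int) -> int:
    """Minimum number of REST query calls to page through record_count records."""
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-record_count // REST_MAX_RECORDS_PER_REQUEST)
```

Under these assumptions, paging through one billion records at 2,000 per call needs `rest_calls_needed(1_000_000_000)`, i.e. 500,000 calls, which is the motivation for moving large fetches to the Bulk API.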
Are you ready to replicate?
Replication method 
Initial Fetching 
Changes Fetching
Replication method – Initial fetching
• Using Bulk API as much as possible
• Fetch all records for each relevant object type
– Lots of data
– Only non-deleted records
• Paginate by CreatedDate
• Example:
– 1st query: “… ORDER BY CreatedDate LIMIT 100000”
– Subsequent: “… WHERE CreatedDate > 2014-08-31T02:29:29Z ORDER BY CreatedDate LIMIT 100000”
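The pagination scheme above can be sketched in Python as a small query builder. The function name, its parameters, and the field list are hypothetical; only the ORDER BY / LIMIT / CreatedDate pattern comes from the slide.

```python
from typing import Optional

PAGE_SIZE = 100_000  # LIMIT value used in the slide's example


def initial_fetch_query(object_name: str, fields: list[str],
                        last_created: Optional[str] = None) -> str:
    """Build the SOQL for one page of the initial fetch.

    last_created is the CreatedDate of the last row on the previous page,
    or None for the first page. SOQL datetime literals are written unquoted.
    """
    soql = f"SELECT {', '.join(fields)} FROM {object_name}"
    if last_created is not None:
        soql += f" WHERE CreatedDate > {last_created}"
    return f"{soql} ORDER BY CreatedDate LIMIT {PAGE_SIZE}"
```

One thing to keep in mind with this pattern: if several records share the exact CreatedDate at a page boundary, a strict `>` would skip the ones that did not fit on the previous page, so a real implementation may need to handle ties (for example, `>=` plus de-duplication by Id).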
Replication method – Changes fetching
• Fetch only records that changed since the previous fetch time
– Less data: only changes
– Take care of updates and deletions
• Using SystemModstamp as the indicator that a record changed
• Same pagination logic as in initial fetching
• Example:
– 1st query: “… WHERE SystemModstamp > 2014-07-31T02:29:29Z ORDER BY CreatedDate LIMIT 100000”
– Subsequent: “… WHERE SystemModstamp > 2014-07-31T02:29:29Z AND CreatedDate > 2014-08-31T02:29:29Z ORDER BY CreatedDate LIMIT 100000”
• Bulk changes fetching vs. getUpdated()
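A minimal Python sketch of the changes-fetching loop follows. It assumes a hypothetical `run_query` callable that submits SOQL (via the Bulk API or otherwise) and returns rows as dictionaries that include `CreatedDate`; none of these names come from the talk.

```python
from typing import Callable, Iterator, Optional


def changes_query(object_name: str, fields: list[str], since: str,
                  last_created: Optional[str] = None,
                  page_size: int = 100_000) -> str:
    """SOQL for one page of records modified after `since`."""
    where = [f"SystemModstamp > {since}"]
    if last_created is not None:
        where.append(f"CreatedDate > {last_created}")
    return (f"SELECT {', '.join(fields)} FROM {object_name} "
            f"WHERE {' AND '.join(where)} ORDER BY CreatedDate LIMIT {page_size}")


def fetch_changes(run_query: Callable[[str], list[dict]], object_name: str,
                  fields: list[str], since: str,
                  page_size: int = 100_000) -> Iterator[dict]:
    """Yield every changed row, paging by CreatedDate until a short page arrives."""
    last_created = None
    while True:
        rows = run_query(changes_query(object_name, fields, since,
                                       last_created, page_size))
        yield from rows
        if len(rows) < page_size:
            return  # a short (or empty) page means there is nothing left
        last_created = rows[-1]["CreatedDate"]
```

Passing the query runner in as a parameter keeps the paging logic independent of which Salesforce API actually executes the query.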
Deleted items 
• Motivation:
– Required to maintain consistent sync
• Two implementation options
– Use getDeleted() call in SOAP API (our choice)
– Use queryAll (isDeleted = True) call in REST API
• Potentially more API calls
• Some objects can become “undeleted”!
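How deletions might be applied to a local replica can be sketched as below. This is an in-memory illustration only: the replica is a hypothetical dictionary keyed by record Id, `deleted_ids` stands in for whatever the deletion call returned, and it assumes that an undeleted record shows up again in a later changes fetch.

```python
def apply_deletions(replica: dict[str, dict], deleted_ids: list[str]) -> int:
    """Remove deleted records from the replica; return how many were removed."""
    removed = 0
    for record_id in deleted_ids:
        # Ids that were never replicated are ignored rather than treated as errors.
        if replica.pop(record_id, None) is not None:
            removed += 1
    return removed


def apply_changes(replica: dict[str, dict], changed_rows: list[dict]) -> None:
    """Upsert changed rows; rows flagged IsDeleted are dropped instead."""
    for row in changed_rows:
        if row.get("IsDeleted"):
            replica.pop(row["Id"], None)
        else:
            # An upsert also restores a record that was deleted and later undeleted.
            replica[row["Id"]] = row
```

Because the upsert path does not care whether a record was previously removed, an "undeleted" record is restored without special-case handling.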
Getting all fields 
• No “SELECT *” support
• Get all fields for table using “describe”
– Optionally, filter the fields (skip custom fields, etc…)
– Non-visible fields (due to security restrictions)
• Use the field names in the query
• Limitation: query length cannot exceed 20,000 characters*
* http://www.salesforce.com/us/developer/docs/soql_sosl/Content/sforce_api_calls_soql_select.htm
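The steps above can be sketched in Python as follows. The field names are assumed to have been obtained from a describe call already; the function name and the `skip_custom` flag are hypothetical, and the length limit is the figure quoted on this slide.

```python
MAX_SOQL_LENGTH = 20_000  # limit quoted on the slide


def select_all_fields(object_name: str, field_names: list[str],
                      skip_custom: bool = False) -> str:
    """Build an explicit SELECT over the given fields, enforcing the length limit."""
    if skip_custom:
        # Salesforce custom field API names end in "__c".
        field_names = [name for name in field_names if not name.endswith("__c")]
    soql = f"SELECT {', '.join(field_names)} FROM {object_name}"
    if len(soql) > MAX_SOQL_LENGTH:
        raise ValueError(
            f"query for {object_name} is {len(soql)} characters; "
            f"limit is {MAX_SOQL_LENGTH}, so the field list must be filtered or split"
        )
    return soql
```

Failing early on an oversized query makes the problem visible before an API call is spent on it; the WHERE / ORDER BY clauses added later also count toward the limit, so a real check would leave some headroom.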
User Access Restrictions 
• Full access rights are strongly encouraged
– Full view of all objects
– Limited access rights → slower queries
• Reference Fields – special case
– Tasks / Events - WhoId, WhatId
– Attachment - ParentId
– Reference fields make access checks in Salesforce even slower
– Limited to 100,000 different values per query
– Solution: query in smaller chunks
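The slide does not say how the chunks are chosen. One possible approach, shown here purely as an assumption, is to split the fetch into CreatedDate windows so that each query touches fewer distinct reference values; the helper names and the window size are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator

SOQL_DATETIME = "%Y-%m-%dT%H:%M:%SZ"


def date_windows(start: datetime, end: datetime,
                 step: timedelta) -> Iterator[tuple[datetime, datetime]]:
    """Yield consecutive half-open [lower, upper) windows covering start..end."""
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


def chunked_queries(base_soql: str, start: datetime, end: datetime,
                    step: timedelta) -> list[str]:
    """One query per window; base_soql must not already contain a WHERE clause."""
    return [
        f"{base_soql} WHERE CreatedDate >= {lower.strftime(SOQL_DATETIME)} "
        f"AND CreatedDate < {upper.strftime(SOQL_DATETIME)}"
        for lower, upper in date_windows(start, end, step)
    ]
```

If a window still exceeds the limit, the same helper can be called again on that window with a smaller step.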
Error handling 
• Nothing is fail-safe
• Different APIs produce different errors
• Examples:
– Query too long (too many fields)
– Scale limitations
– Communication errors
– Salesforce maintenance windows
• Add support for anything you encounter
– “Rare” becomes “frequent” once you scale
• ABR (Always Be Retrying)
• Remember to clean up upon errors
– Close open bulk jobs
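The "always be retrying" and "clean up upon errors" points can be combined in a small retry helper. This is a generic sketch, not the speaker's implementation: `operation` and `cleanup` are hypothetical callables (for example, running a bulk query and closing its job), and the exponential backoff schedule is an assumption.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(operation: Callable[[], T],
                 cleanup: Optional[Callable[[], None]] = None,
                 attempts: int = 5,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on failure with exponential backoff.

    cleanup runs after every failed attempt (e.g. to close an open bulk job).
    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:  # a real client would catch only transient error types
            if cleanup is not None:
                cleanup()
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("attempts must be at least 1")
```

The `sleep` parameter exists so the delay can be replaced in tests; in production the default `time.sleep` is used.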
Unavailable Salesforce objects 
• Some orgs make some of the objects unavailable
– Using security restriction
– For example, Lead or Opportunity
• Check using describeSObjects for each object, before fetching
• Safely skip when not supported
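A minimal sketch of the check-then-skip step, in Python. The `describe` callable is a hypothetical wrapper around describeSObjects that raises `ObjectUnavailable` (also hypothetical) when the org does not expose an object to the connected user.

```python
from typing import Callable


class ObjectUnavailable(Exception):
    """Hypothetical error raised when an object cannot be described in this org."""


def split_available(wanted: list[str],
                    describe: Callable[[str], dict]) -> tuple[list[str], list[str]]:
    """Return (available, skipped) object names, preserving the input order."""
    available, skipped = [], []
    for name in wanted:
        try:
            describe(name)
        except ObjectUnavailable:
            skipped.append(name)  # skip safely instead of failing the whole sync
        else:
            available.append(name)
    return available, skipped
```

Returning the skipped names alongside the available ones makes it easy to log which objects were left out of a given sync.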
Summary 
• Implisit - Intro & Motivation
• Salesforce APIs Overview
• Efficient use of API
• Scale and limitations
• Other pitfalls and tips
Additional Resources: 
• API Call Basics
• Salesforce App Limits Cheat Sheet
• Understanding Execution Governors and Limits
• Query & Search Optimization Cheat Sheet
• Bulk Query Details