http://speakerscore.com/7KKX
bobward@microsoft.com
@bobwardms
#bobsql
The What and Why
Data, Indexes, and Natively Compiled Stored Procedures
Concurrency and Transactions
Logging and Checkpoint
Based on SQL Server 2016
Check out the Bonus Material Section
ETL IOT Tempdb
Project Verde • 2007
“Early” Hekaton • 2009/2010
Hekaton becomes In-Memory OLTP • SQL Server 2014
In-Memory OLTP Unleashed • SQL Server 2016
OLTP Through the Looking Glass, and What We Found There by Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, Michael Stonebraker, SIGMOD 2008
XTP = eXtreme Transaction Processing
HK = Hekaton
ἑκατόν
“Feels” like normal disk-based tables but in memory
Internally completely different
Hash or non-clustered index choices (at least one required)
Inside sql server in memory oltp sql sat nyc 2017
[Diagram: In-Memory OLTP threads]
Per node (Node 0 to Node <n>): USER threads for user transactions, plus GC (garbage collection, one per node)
Per database: CHECKPOINT CONTROLLER, CHECKPOINT CLOSE, LOG FLUSH threads
In-Memory Thread Pools: Data Serializer threads, CHECKPOINT I/O
XTP_OFFLINE_CKPT
Legend: U = user scheduler, H = hidden scheduler, P = preemptive
You could see session_id > 50
dm_xtp_threads
Your data. “bytes” to HK engine
Points to another row in the “chain”. One pointer for each index on the table
Timestamp of INSERT
Halloween protection
Number of pointers = # indexes
Rows don’t have to be contiguous in memory
Timestamp of DELETE. ∞ = “not deleted”
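The row header described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the engine's actual layout: the class name `RowVersion`, the field names, and the helper `new_row` are assumptions made for this sketch, and the Halloween protection field is left out.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional

INFINITY = math.inf  # end timestamp of a row version that has not been deleted


@dataclass
class RowVersion:
    begin_ts: float                 # timestamp of the INSERT that created this version
    end_ts: float = INFINITY        # timestamp of the DELETE; infinity means "not deleted"
    # One chain pointer per index on the table (hypothetical field name).
    index_links: List[Optional["RowVersion"]] = field(default_factory=list)
    payload: bytes = b""            # "your data": opaque bytes as far as the engine is concerned


def new_row(begin_ts: float, payload: bytes, index_count: int) -> RowVersion:
    """Create a version with one empty chain pointer per index."""
    return RowVersion(begin_ts=begin_ts, payload=payload, index_links=[None] * index_count)


def is_visible(row: RowVersion, read_ts: float) -> bool:
    """Visible when inserted at or before read_ts and not yet deleted at read_ts."""
    return row.begin_ts <= read_ts < row.end_ts
```

A reader at timestamp 150 would see a row inserted at 100, and would keep seeing it even after a DELETE stamps the end timestamp with 200, because the version was still valid at 150.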
• The HK Engine doesn’t understand row formats or key value comparisons
• So when a table is built or altered, we create, compile, and build a DLL bound to the table used for interop queries
dbid
object_id
Symbol file
Rebuilt when database comes online or table altered
cl.exe output
Generated C source
t = schema DLL
p = native proc
create table letsgomavs
(
    col1 int not null identity primary key nonclustered hash with (bucket_count = 1000000),
    col2 int not null,
    col3 varchar(100)
)
with (memory_optimized = on, durability = schema_and_data)
struct NullBitsStruct_2147483654
{
    unsigned char hkc_isnull_3:1;
};
struct hkt_2147483654
{
    long hkc_1;
    long hkc_2;
    struct NullBitsStruct_2147483654 null_bits;
    unsigned short hkvdo[2];
};
Header | hkc_1 | hkc_2 | Null bits | Offset array | col3
“your data”
nullable
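To make the generated struct more concrete, here is a rough Python sketch that packs and unpacks a payload shaped like it. Several things are assumptions for illustration only: that `long` is 4 bytes (as with the Microsoft compiler), that one padding byte follows the null bits, and that the two `hkvdo` entries hold the start and end offsets of col3. The function names `pack_row` and `unpack_row` are hypothetical.

```python
import struct

# Fixed part: hkc_1 (4 bytes), hkc_2 (4 bytes), null-bits byte, 1 pad byte,
# then two 2-byte offsets standing in for hkvdo[2]. Layout is an assumption.
FIXED = struct.Struct("<iiBxHH")


def pack_row(col1, col2, col3):
    """Serialize one row; col3 is the nullable varchar column."""
    is_null = col3 is None
    data = b"" if is_null else col3.encode("ascii")
    start = FIXED.size            # variable-length data begins right after the fixed part
    end = start + len(data)
    return FIXED.pack(col1, col2, 1 if is_null else 0, start, end) + data


def unpack_row(buf):
    """Decode a row the way a schema DLL would: it knows the format, the engine does not."""
    col1, col2, null_bits, start, end = FIXED.unpack_from(buf)
    col3 = None if null_bits & 1 else buf[start:end].decode("ascii")
    return col1, col2, col3


print(unpack_row(pack_row(1, 42, "go mavs")))
```

The point of the sketch is the division of labor: only code that knows the table schema can interpret the bytes, which is why the schema DLL exists.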
columns = name char(100), city char(100)
hash index on name bucket_count = 8
hash index on city bucket_count = 8
Rows (begin ts, end ts, name, city):
100, ∞, Bob, Dallas
200, ∞, Bob, Fort Worth
90, ∞, Ginger, Austin
50, ∞, Ryan, Arlington
70, ∞, Ginger, Houston
chain count = 3
Hash collision
duplicate key != collision
Begin ts reflects order of inserts
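The bucket and chain idea can be tried out with a small Python sketch. It uses a toy hash (the code point of the first letter) chosen only so that Bob and Ryan land in the same bucket, as in the slide; the real hash function and bucket numbers are not known from the deck, and the class `HashIndex` is a made-up name. Chains are plain Python lists here rather than pointers embedded in the rows.

```python
def toy_hash(key: str) -> int:
    """Stand-in hash function (assumption): code point of the first character."""
    return ord(key[0])


class HashIndex:
    def __init__(self, bucket_count, hash_fn=toy_hash):
        self.bucket_count = bucket_count
        self.hash_fn = hash_fn
        self.buckets = [[] for _ in range(bucket_count)]  # one chain per bucket

    def bucket_of(self, key):
        return self.hash_fn(key) % self.bucket_count

    def insert(self, key, row):
        self.buckets[self.bucket_of(key)].append((key, row))

    def lookup(self, key):
        # Walk the whole chain and compare keys, because colliding keys share a chain.
        return [row for k, row in self.buckets[self.bucket_of(key)] if k == key]

    def chain_length(self, key):
        return len(self.buckets[self.bucket_of(key)])


rows = [
    (100, "Bob", "Dallas"),
    (200, "Bob", "Fort Worth"),
    (90, "Ginger", "Austin"),
    (50, "Ryan", "Arlington"),
    (70, "Ginger", "Houston"),
]
name_index = HashIndex(bucket_count=8)
for r in rows:
    name_index.insert(r[1], r)

print(name_index.chain_length("Bob"))   # Bob, Bob and Ryan share one chain
print(name_index.lookup("Ryan"))
```

The two Bob rows are duplicates of the same key, while Ryan sharing their chain is the actual collision. Long chains from either cause make every lookup walk more rows, which is why bucket_count matters.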
Mixed Abstract Tree (MAT) built
• Abstraction with flow and SQL specific info
• Query trees into MAT nodes
• XML file in the DLL dir
Converted to Pure Imperative Tree (PIT)
• General nodes of a tree that are SQL agnostic. C like data structures
• Easier to turn into final C code
Final C code built and written
• C instead of C++ is simpler and faster to compile
• No user defined strings or SQL identifiers to prevent injection
Call cl.exe to compile, link, and generate DLL
• Much of what is needed to execute is in the DLL
• Some calls into the HK Engine
All files in BINN\XTP
VC has compiler files, DLLs and libs
Gen has HK header and libs
Call cl.exe to compile and link
Done in memory
xtp_matgen XEvent
In-memory OLTP versioning
[Diagram: the same timeline (Time 100 to 106) shown for a disk-based table and an in-memory table. The table starts with 1, ‘Row 1’. T1 and T2 each BEGIN TRAN; 2, ‘Row 2’ is inserted and committed, with a SELECT before and after that COMMIT.]
Disk-based table:
read committed and SERIALIZABLE = blocked
RCSI = Row 1
Snapshot = Row 1
RCSI = Row 1 and 2
Snapshot = Row 1
In-memory table:
SERIALIZABLE = Msg 41325
Snapshot always used = Row 1
• Code executing transactions in the Hekaton Kernel is lock, latch, and spinlock free
Locks
• Only SCH-S needed for interop queries
• Database locks
Latches
• No pages = No page latches
• Latch XEvents don’t fire inside HK
Spinlocks
• Spinlock class never used
The “host” may wait – tLog, SQLOS
We use Thread Local Storage (TLS) to “guard” pointers
We can “walk” lists lock free. Retries may be needed
Atomic “Compare and Swap” (CAS) to modify
IsXTPSupported() = cmpxchg16b
Spinlocks use CAS (‘cmpxchg’) to “acquire” and “release”
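The compare-and-swap retry pattern can be sketched in Python, with one big caveat: Python has no hardware CAS, so the `AtomicRef` class below fakes atomicity with a lock. It is a stand-in to show the shape of the loop, not how the engine does it; all names here are hypothetical.

```python
import threading


class AtomicRef:
    """Simulated atomic reference; a real engine relies on a CPU instruction instead."""

    def __init__(self, value=None):
        self._value = value
        self._guard = threading.Lock()  # stand-in for hardware atomicity

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Install `new` only if the current value is still `expected`."""
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, value, next_node):
        self.value = value
        self.next = next_node


def push(head: AtomicRef, value) -> int:
    """Add a node at the head of the list; returns how many retries were needed."""
    retries = 0
    while True:
        current = head.load()
        node = Node(value, current)
        if head.compare_and_swap(current, node):
            return retries
        retries += 1  # someone else changed the head first, so try again


def walk(head: AtomicRef):
    """Traverse the list without taking any lock."""
    out, node = [], head.load()
    while node is not None:
        out.append(node.value)
        node = node.next
    return out
```

Writers never wait on each other; a writer that loses the race simply rebuilds its node against the new head and retries, which is the "retries may be needed" part.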
Great blog post here
*XTP* wait types not in transaction code
deadlock free
CMED_HASH_SET fix
Release latches and locks
Update index pages (more locks and latches)
Modify page
Latch page
Obtain locks
INSERT LOG record
In-Memory OLTP INSERT
Maintain index in memory
Insert ROW into memory
COMMIT Transaction = Log Record and Flush
COMMIT Transaction = Insert HK Log Record and Flush
Page Split
Sys Tran = Log Flush
Spinlocks
No index logging
SCHEMA_ONLY no logging
No latch
No spinlock
LOP_BEGIN_XACT
LOP_INSERT_ROWS – heap page
LOP_INSERT_ROWS – ncl index
LOP_BEGIN_XACT
LOP_MODIFY_ROW – PFS
LOP_HOBT_DELTA
LOP_FORMAT_PAGE
LOP_COMMIT_XACT
LOP_SET_FREE_SPACE - PFS
LOP_COMMIT_XACT
100 rows
Log flush
PFS latch multiple times
213 log records @ 33Kb
PFS latch
Metadata access
1 pair for every row
Alloc new page twice
LOP_BEGIN_XACT
LOP_HK
LOP_COMMIT_XACT
144
11988
84
HK_LOP_BEGIN_TX
HK_LOP_INSERT_ROW
HK_LOP_COMMIT_TX
100
Only inserted into log cache at commit.
Log flush
ROLLBACK = No log records for HK
If LOP_HK too big we may need more than one
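A quick back-of-the-envelope comparison of the two 100-row inserts, using the numbers from the slides. The reading of 144, 11988 and 84 as the byte lengths of the three in-memory log records is an assumption, as is treating "33Kb" as 33 × 1024 bytes.

```python
# Disk-based table, 100 rows: figures taken from the slide.
DISK_LOG_RECORDS = 213
DISK_LOG_BYTES = 33 * 1024          # "33Kb", interpreted here as 33 KiB (assumption)

# In-memory table, 100 rows: one record each, lengths assumed to be in bytes.
HK_RECORD_BYTES = {
    "LOP_BEGIN_XACT": 144,
    "LOP_HK": 11988,                # holds the HK_LOP_* entries for all 100 rows
    "LOP_COMMIT_XACT": 84,
}

hk_log_records = len(HK_RECORD_BYTES)
hk_log_bytes = sum(HK_RECORD_BYTES.values())

print(f"disk-based: {DISK_LOG_RECORDS} records, {DISK_LOG_BYTES} bytes")
print(f"in-memory:  {hk_log_records} records, {hk_log_bytes} bytes")
print(f"byte ratio: {hk_log_bytes / DISK_LOG_BYTES:.2f}")
```

Under those assumptions the in-memory insert writes 3 log records instead of 213 and roughly a third of the bytes, since index changes are not logged at all.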
Typically 8 MB but can be 128 MB
Why CHECKPOINT?
All data written in pairs of data and delta files
Typically 128 MB but can be 1 GB
No WAL protocol
CHECKPOINT FILE types and states
PRECREATED
ACTIVE
UNDER CONSTRUCTION
WAITING..TRUNCATION
ROOT
FREE
After first CREATE TABLE
DELTA
DATA
ROOT
FREE
INSERT data rows
DELTA
DATA
ROOT
ROOT
CHECKPOINT event
DELTA
DATA
0 to 100 TS
0 to 100 TS
ROOT
ROOT
DELTA
DATA
INSERT more
data rows
DATA
DELTA
101 to 200 TS
101 to 200 TS
These will be used to populate table on startup and then apply log
Checkpoint File Pair (CFP) and are what is in tlog before CHECKPOINT
This can be reused or deleted at log truncation
We constantly keep PRECREATED, FREE files available
Any tran in this range
FREEFREE
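How checkpoint file pairs feed recovery can be sketched roughly as follows: load the rows from each data file, skip the ones its delta file marks as deleted, then replay the tail of the log. The class `CheckpointFilePair`, the tuple shapes, and the `recover` function are all invented for this illustration; the real file formats are far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointFilePair:
    low_ts: int                                  # start of the timestamp range covered
    high_ts: int                                 # end of the timestamp range covered
    data: list = field(default_factory=list)     # (row_id, begin_ts, payload) inserted in range
    delta: set = field(default_factory=set)      # row_ids in the data file deleted later


def recover(cfps, log_tail):
    """Rebuild the in-memory table: populate from the CFPs, then apply the log."""
    table = {}
    for cfp in cfps:
        for row_id, begin_ts, payload in cfp.data:
            if row_id not in cfp.delta:          # the delta file filters out deleted rows
                table[row_id] = (begin_ts, payload)
    for op, row_id, ts, payload in log_tail:     # changes made after the last checkpoint
        if op == "insert":
            table[row_id] = (ts, payload)
        elif op == "delete":
            table.pop(row_id, None)
    return table


cfp_a = CheckpointFilePair(0, 100, data=[(1, 10, "a"), (2, 20, "b")], delta={2})
cfp_b = CheckpointFilePair(101, 200, data=[(3, 150, "c")])
tail = [("insert", 4, 210, "d"), ("delete", 1, 220, None)]
print(recover([cfp_a, cfp_b], tail))
```

Because each pair covers its own timestamp range, the pairs are independent of each other during load.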
Automatic Checkpoint
Log Truncation
• SQL Server In-Memory OLTP Internals for SQL Server 2016
• In-Memory OLTP Videos: What it is and When/How to use it
• Explore In-Memory OLTP architectures and customer case studies
• Review In-Memory OLTP in SQL Server 2016 and Azure SQL Database
• In-Memory OLTP (In-Memory Optimization) docs
• Blog post on In-Memory OLTP and checkpoint files
• Retry logic for In-Memory OLTP Transactions
http://speakerscore.com/7KKX
Always On Availability Groups
HTAP applications
Azure SQL Database
Cross container transactions
Table variables
here
BACKUP/RESTORE
Transaction Performance Analysis Report
Hear the case
studies
How do you find data rows in a normal SQL table?
• Heap = Use IAM pages
• Clustered Index = Find root page and traverse index
What about an in-memory table?
• Hash index table pointer known in HK metadata for a table. Hash the index key, go to bucket pointer, traverse chain to find row
• Page Mapping Table used for range indexes and has a known pointer in HK metadata. Traverse the range index which points to data rows
• Data exists in memory as pointers to rows (aka a heap). No page structure
All data rows have known header but data is opaque to HK engine
• Schema DLL and/or Native Compiled Proc DLL knows the format of the row data
• Schema DLL and/or Native Compiled Proc DLL knows how to find “key” inside the index
Compute your estimated row size here
“bag of bytes”
Hash index scan is possible
TS = 243
SELECT Name, City FROM T1
“write set” = logged records
“read set”
BEGIN TRAN TX3 -- TS 246
SELECT City FROM T1 WITH (REPEATABLEREAD) WHERE Name = 'Jane';
UPDATE T1 WITH (REPEATABLEREAD) SET City = 'Helsinki' WHERE Name = 'Susan';
COMMIT TRAN -- commits at timestamp 255
Greg, Lisbon
Susan, Bogota
Jane, Helsinki
FAIL = ‘Jane’ changed after I started but before I committed and I’m REPEATABLEREAD. With SQL, the update to Jane would have been blocked
Commit dependency
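The commit-time check for REPEATABLE READ can be sketched as below: at commit, every row version in the read set must still be current. The class names, the begin timestamp of 200, and the competing update at timestamp 250 are hypothetical values picked for the example; only the 255 commit timestamp comes from the slide.

```python
import math


class CommitValidationError(Exception):
    """Raised when a row in the read set changed before the reader committed."""


class Version:
    def __init__(self, name, city, begin_ts, end_ts=math.inf):
        self.name, self.city = name, city
        self.begin_ts, self.end_ts = begin_ts, end_ts


def validate_repeatable_read(read_set, commit_ts):
    """Every version read must still be valid (not ended) at commit time."""
    for version in read_set:
        if version.end_ts < commit_ts:
            raise CommitValidationError(
                f"'{version.name}' changed before commit at {commit_ts}"
            )


# TX3 reads Jane's row and remembers that version in its read set.
jane = Version("Jane", "(previous city)", begin_ts=200)   # hypothetical begin ts
read_set = [jane]

# Hypothetical: another transaction updates Jane and commits at ts 250,
# which stamps the old version's end timestamp.
jane.end_ts = 250

try:
    validate_repeatable_read(read_set, commit_ts=255)
except CommitValidationError as err:
    print("FAIL:", err)
```

Nothing blocked while TX3 was running; the conflict only surfaces at commit, so the application is expected to retry the transaction.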
create procedure cowboys_proc_scan
with native_compilation, schemabinding as
begin atomic with
(transaction isolation level = snapshot, language = N'English')
select player_number, player_name from dbo.starsoftheteam
..
end
Compile this into a DLL
Required. No referenced object/column can be dropped or altered. No SCH lock required
Everything in block a single tran
These are required. There are other options
Iso levels still use MVCC
Your queries
Hekaton implements its own memory management system built on SQLOS
• MEMORYCLERK_XTP (DB_ID_<dbid>) uses SQLOS Page allocator
• Variable heaps created per table and range index
• Hash indexes using partitioned memory objects for buckets
• “System” memory for database independent tasks
• Memory only limited by the OS (24TB in Windows Server 2016)
• Details in dm_db_xtp_memory_consumers and dm_xtp_system_memory_consumers
In-Memory does recognize SQL Server Memory Pressure
• Garbage collection is triggered
• If OOM, no inserts allowed but you may be able to DELETE to free up space
Allocated at create index time
Locked and Large apply
Remember this is ALL memory
size
Binding to your own Resource Pool
What about CPU and I/O?
no classifier
function
Not the same
as memory
Check out this blog
Upgrade from 2014 to 2016 can take time
Large checkpoint files for 2016
https://support.microsoft.com/en-us/kb/3090141
From the CSS team
Client Access
· Backup/
Restore
· DBCC
· Bulk load
· Exception
handling
· Event logging
· XEvents
SOS
· Process model (scheduling & synchronization)
· Memory management & caching
· I/O
Txn’s
Lock
mgr
Buffer
Pool
Access Methods
Query Execution & Expression
Services
UCS
Query
Optimization
TSQL Parsing & Algebrizer
Metadata
File
Manager
Database
Manager
Procedure
Cache
· TDS handler
· Session Mgmt
Security
Support Utilities
UDTs &
CLR
Stored
Procs
SNI
Service
Broker
Logging
&
Recovery
10%
10%
Network, TDS
T-SQL interpreter
Query Execution
Expressions
Access Methods
Transaction, Lock, Log Managers
SOS, OS and I/O
35%
45%
SQLOS task and worker threads are the foundation
“User” tasks to run transactions
Hidden schedulers used for critical background tasks
Some tasks dedicated while others use a “worker” pool
SQLOS workers dedicated to HK
NON-PREEMPTIVEPOOL
Background workers needed for on-demand
Parallel ALTER TABLE
WORKERSPOOL
Other workers needed for on-demand (non-preemptive)
Parallel MERGE operation of checkpoint files
PREEMPTIVEPOOL
Background workers needed on-demand - preemptive
Serializers to write out checkpoint files
Command = XTP_THREAD_POOL wait_type = DISPATCHER_QUEUE_SEMAPHORE
Command = UNKNOWN TOKEN wait_type = XTP_PREEMPTIVE_TASK
Ideal count = <# schedulers>; idle timeout = 10 secs
H = hidden scheduler
U = user scheduler
• Multi-version Optimistic Concurrency prevents all blocking
• ALL UPDATEs are DELETE followed by INSERT
• DELETED rows not automatically removed from memory
• Deleted rows not visible to active transactions become stale
• Garbage Collection process removes stale rows from memory
• TRUNCATE TABLE not supported
Page deallocation
in SQL Server
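The update-as-delete-plus-insert rule and the notion of a stale row can be shown with a short Python sketch. Versions are plain dictionaries here, and the `update` and `collect_garbage` helpers are hypothetical names; the real garbage collector is cooperative and works per node, which this toy version ignores.

```python
import math


def update(versions, key, new_value, ts):
    """An UPDATE ends the current version (DELETE) and adds a new one (INSERT)."""
    for v in versions:
        if v["key"] == key and v["end_ts"] == math.inf:
            v["end_ts"] = ts
    versions.append({"key": key, "value": new_value, "begin_ts": ts, "end_ts": math.inf})


def collect_garbage(versions, oldest_active_ts):
    """Drop stale versions: those that ended at or before the oldest active transaction began."""
    return [v for v in versions if v["end_ts"] > oldest_active_ts]


versions = [{"key": "Bob", "value": "Dallas", "begin_ts": 100, "end_ts": math.inf}]
update(versions, "Bob", "Fort Worth", ts=200)
print(len(versions))                                        # old and new version both in memory

print(len(collect_garbage(versions, oldest_active_ts=150)))  # a reader from ts 150 still needs the old one
print(len(collect_garbage(versions, oldest_active_ts=250)))  # nobody can see it any more
```

This is why a heavy UPDATE workload grows memory until garbage collection catches up, even though the row count stays flat.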
Query Plans and Stats
here
Query Store
XEvent and SQLTrace
Read the
docs
Multi-row insert test
Remember SCHEMA_ONLY has no logging or I/O
BULK INSERT for in-mem executes and is logged just like INSERT
Minimally logged BULK INSERT took 271 log records @ 27 KB
Latches required for GAM, PFS, and system table pages
Multiple Log Writers in SQL Server 2016
You could go to delayed durability
Log at the speed of memory
video
Over time we could have many CFPs
Editor's Notes

  • #8: Following readme.txt in demo1_inmem_oltp_justshowus
  • #11: CHECKPOINT controller will look at committed log records and use a data serializer thread from the pool. These pool threads put work in queues and the LOG FLUSH threads actually perform the checkpoint I/O (which is continuous). The CHECKPOINT CLOSE thread is the one that actually executes a checkpoint event. A request with command = XTP_CKPT_AGENT is used to keep FREE files replenished for CHECKPOINT files.
  • #13: We will discuss the compilation process when talking about natively compiled DLLs.
  • #15: Hash collision on name index with Ryan because it is a different value than Bob. The first two Bob rows are technically not a collision because they are the same key value.
  • #17: Follow the instructions in readme.txt in demo2_inmem_oltp_debugging
  • #19: T-SQL IsXTPSupported() = Windows API IsProcessorFeaturePresent(PF_COMPARE_EXCHANGE128) = _InterlockedCompareExchange128() = Cmpxchg16b instruction
  • #20: Notice that log records are not even created until COMMIT.
  • #23: We can switch to “large” files (1 GB) if checkpoint has determined it can read 200 MB/sec and the machine has 16 cores and 128 GB for SQL Server. What is the disk space required for checkpoint files? It is roughly the total size of memory consumed by your table data. I need to make comments here about log truncation and checkpoint files “hanging around”. This has been a problem for some customers.