Apache BookKeeper
DISTRIBUTED STORE
a Salesforce Use Case
Venkateswararao Jujjuri (JV)
Cloud Storage Architect
vjujjuri@salesforce.com
jujjuri@gmail.com
@jvjujjuri | Twitter
https://www.linkedin.com/in/jvjujjuri
Agenda
Salesforce needs and requirements
Hunt and Selection
BookKeeper Introduction
Improvements and Enhancements
As a Service at Scale @ Salesforce
Performance
Community
Q & A
Salesforce Application Storage Needs
Store for persistent WAL, data, and objects
Low, constant write latencies
•  Transaction log, smaller writes
Low, constant random-read latencies
Highly available
Append-only entries
•  Objects
Highly consistent for immutable data
Long-term storage
Distributed and linearly scalable
On commodity hardware
Low operating cost
What Did We Consider?
Build vs. Buy
•  Time-to-market, resources, cost.
Finalists
•  Ceph
  •  A CP system.
  •  With unreliable reads, the read path can behave like an AP system.
  •  A lot of effort is needed to get AP behavior on the write path.
  •  Remember: immutable data.
•  BookKeeper
  •  A CAP system, because the data is immutable/append-only.
  •  Came close to what we wanted.
  •  Almost there, but not everything.
Apache BookKeeper
A highly consistent, available, replicated, distributed log service.
Immutable, append-only store.
Thick client; simple and elegant placement policy
•  No central master
•  No complicated hashing/computing for placement
Low latency on both writes and reads.
Runs on commodity hardware.
Built for the WAL use case, but can be expanded to broader storage needs.
Uses ZooKeeper as the consensus service and metadata store.
Awesome community.
Enter Apache BookKeeper
Apache BookKeeper
A system to reliably log streams of records.
Designed to store write-ahead logs for database-like applications.
Inspired by, and designed to solve, HDFS NameNode availability deficiencies.
Open-source chronology
•  2008: Open-sourced as a contribution to ZooKeeper
•  2011: Sub-project of ZooKeeper
•  2012: Production
Terminology
Journal: Write-ahead log.
Ledger: Log stream.
Entry: Each record of the log stream.
Client: Library linked with the application.
Bookie: Server.
Ensemble: Set of bookies across which a ledger is striped.
Cluster: All bookies belonging to a given instance of BookKeeper.
Write Quorum Size: Number of replicas of each entry.
Ack Quorum Size: Number of responses needed before the client's write is satisfied.
LAC: Last Add Confirmed.
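To make the ensemble and quorum terms concrete, here is a minimal sketch. It assumes a simple round-robin placement, which the slides do not spell out; the function names and bookie names are hypothetical, not BookKeeper API.

```python
def write_set(entry_id, ensemble, write_quorum):
    """Pick the bookies that hold replicas of entry_id.

    Assumption: entries are striped round-robin across the ensemble,
    starting at position entry_id mod ensemble size.
    """
    start = entry_id % len(ensemble)
    return [ensemble[(start + i) % len(ensemble)] for i in range(write_quorum)]


def is_acknowledged(responses, ack_quorum):
    """A write is satisfied once ack_quorum bookies have responded."""
    return responses >= ack_quorum


ensemble = ["bookie-1", "bookie-2", "bookie-3"]  # hypothetical bookie names
for entry_id in range(4):
    print(entry_id, write_set(entry_id, ensemble, write_quorum=2))
```

With an ensemble of 3 and a write quorum of 2, each entry lands on two of the three bookies, so the ledger is both replicated and striped.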
Major Components
• Thick client: carries most of the weight in the protocol.
• Thin server (bookie): bookies never initiate any interaction with ZooKeeper or fellow bookies.
• ZooKeeper monitors bookies.
• Metadata is stored on ZooKeeper.
• Auditor: monitors bookies and identifies under-replicated ledgers.
• Replication workers: re-replicate under-replicated ledger copies.
Basic Operations
Create Ledger
• Gets Writer Ledger Handle
Add an entry to the Ledger
• Write to the Ledger
Open Ledger
• Gives ReadOnly Ledger Handle.
• May ask for non-recovery read handle.
Get an entry from the Ledger
• Read from the Ledger
Close Ledger
Delete Ledger
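The six operations can be sketched with a toy in-memory store. This is not the BookKeeper client API: the class and method names are hypothetical, there are no bookies or ZooKeeper, and operations take a ledger ID rather than a handle.

```python
class LedgerClosedError(Exception):
    """Raised when adding to a ledger that has been closed."""


class ToyLedgerStore:
    """Hypothetical single-process stand-in for an append-only ledger store."""

    def __init__(self):
        self._ledgers = {}  # ledger_id -> {"entries": list, "closed": bool}
        self._next_id = 0

    def create_ledger(self):
        ledger_id = self._next_id
        self._next_id += 1
        self._ledgers[ledger_id] = {"entries": [], "closed": False}
        return ledger_id

    def add_entry(self, ledger_id, data):
        ledger = self._ledgers[ledger_id]
        if ledger["closed"]:
            raise LedgerClosedError(f"ledger {ledger_id} is closed")
        ledger["entries"].append(data)
        return len(ledger["entries"]) - 1  # entry ID of the new entry

    def open_ledger(self, ledger_id):
        # Read-only view: a tuple snapshot cannot be appended to.
        return tuple(self._ledgers[ledger_id]["entries"])

    def read_entry(self, ledger_id, entry_id):
        return self._ledgers[ledger_id]["entries"][entry_id]

    def close_ledger(self, ledger_id):
        self._ledgers[ledger_id]["closed"] = True

    def delete_ledger(self, ledger_id):
        del self._ledgers[ledger_id]


store = ToyLedgerStore()
lid = store.create_ledger()
store.add_entry(lid, b"txn-1")
store.add_entry(lid, b"txn-2")
store.close_ledger(lid)
print(store.read_entry(lid, 1))
```

Entries are only ever appended and a closed ledger rejects further writes, which mirrors the immutable, append-only model described earlier.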
Salesforce Application with BookKeeper
[Diagram: on the user side, the Application sits on a Store Interface with the BookKeeper client library; on the server machines, Bookies and ZooKeeper.]
Commitment
Guarantees
• If an entry has been acknowledged, it must be readable.
• If an entry is read once, it must always be readable.
• If the write of entryID ‘n’ is successful, all entries up to ‘n’ are successfully committed.
Consistencies
• Last Add Confirmed (LAC) provides consistency among readers.
• Fencing provides consistency among writers.
Enhancements - In the internal branch, working to push upstream
Out-of-order write and in-order ack
• Application has the liberty to pre-allocate entryIDs.
• Multiple application threads can write in parallel.
User-defined ledger names
• Not restricted to BK-generated ledger names.
Explicit LAC updates
• Added ReadLac and WriteLac to the protocol.
• Maintains both piggy-backed LAC and explicit LAC simultaneously.
Enhancements - Future
Conventional namespace
• User-defined names
• Treat LedgerId as an i-node in a file system.
Disk scrubbers and repairs
• Actively hunt for and repair bit rot and corruption.
Scalable metadata store
• Separate and dedicated metadata store
• Not restricted by ZK limitations
Out-of-order write and in-order ack
[Figure: a ledger holding entries 0 to 5, with three writers adding in parallel: App A writes entry 6, App B writes entry 8, App C writes entry 7.]
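A minimal sketch of how in-order acknowledgement could work when entries arrive out of order. The class and method names are hypothetical, and the real client logic is more involved; the point is only that an entry is acknowledged once every lower entry ID has also been stored.

```python
class InOrderAcker:
    """Hypothetical tracker: entries may become durable in any order,
    but acknowledgements are released strictly in entry-ID order."""

    def __init__(self):
        self.lac = -1          # Last Add Confirmed; -1 means nothing confirmed yet
        self._durable = set()  # entry IDs stored but not yet acknowledged

    def on_durable(self, entry_id):
        """Record that entry_id is stored; return the IDs acknowledged now."""
        self._durable.add(entry_id)
        acked = []
        # Advance the LAC while the next entry ID in sequence is already stored.
        while self.lac + 1 in self._durable:
            self.lac += 1
            self._durable.remove(self.lac)
            acked.append(self.lac)
        return acked


acker = InOrderAcker()
for entry_id in range(6):        # entries 0 to 5 are already in the ledger
    acker.on_durable(entry_id)
print(acker.on_durable(8))       # entry 8 stored first: nothing can be acked
print(acker.on_durable(6))       # entry 6 stored: 6 is acked
print(acker.on_durable(7))       # entry 7 fills the gap: 7 and 8 are acked
```

Entry 8 is held back until 6 and 7 are stored, which preserves the guarantee that a successful write of entry ‘n’ implies all earlier entries are committed.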
Last Add Confirmed
[Figure: the same ledger and writers (App A: entry 6, App B: entry 8, App C: entry 7), with LAC markers on the ledger and a reader, App D; one position is marked X.]
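One reading of this figure is that a reader only sees entries up to the LAC, even when later entries are already stored on bookies. A small sketch under that assumption (the function name is hypothetical):

```python
def visible_to_reader(stored_entry_ids, lac):
    """Return the entry IDs a reader may see: those at or below the LAC.

    Assumption: entries above the LAC may exist on bookies but are not
    yet confirmed, so a non-recovery reader must not return them.
    """
    return [entry_id for entry_id in sorted(stored_entry_ids) if entry_id <= lac]


stored = {0, 1, 2, 3, 4, 5, 6, 8}   # entry 7 has not been stored yet
print(visible_to_reader(stored, lac=5))
```

Once entry 7 is stored and the LAC advances to 8, the same call with `lac=8` would expose entries 6 through 8 as well.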
Things Do Break
What Can Happen?
Client
•  Client restarts.
•  Client loses connection with ZooKeeper.
•  Client loses connection with bookies.
Bookie
• Bookie goes down.
• Disk(s) on the bookie go bad; I/O issues.
• Bookie gets disconnected from the network.
ZooKeeper
• Gets disconnected from the rest of the cluster.
Writing Client Crash
[Diagram: three bookies and ZooKeeper.]
What is the last entry?
•  Nothing happens until a reader attempts to read.
•  The recovery process is initiated when a process opens the ledger for reading.
•  Close the ledger on ZooKeeper.
•  Identify the last entry of the ledger.
•  Update the metadata on ZooKeeper with the Last Add Confirmed (LAC).
Client gets disconnected from bookies.
Either the bookie is down or the network between client and bookie has issues.
Contact ZooKeeper to get the list of available bookies.
Update the ensemble set and register it with ZooKeeper.
Continue with the new set.
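The ensemble update can be sketched as follows. The metadata layout (a map from the first entry ID of each fragment to its ensemble) and all names are assumptions for illustration; a plain dict stands in for the ledger metadata kept on ZooKeeper.

```python
def replace_failed_bookie(ensembles, failed, available, first_entry_id):
    """Return updated metadata with a new ensemble starting at first_entry_id.

    ensembles: dict mapping the first entry ID of a fragment to its bookie list
               (hypothetical layout).
    available: bookies currently reported as available.
    """
    current = ensembles[max(ensembles)]            # most recent ensemble
    candidates = [b for b in available if b not in current]
    if not candidates:
        raise RuntimeError("no spare bookie available")
    new_ensemble = [candidates[0] if b == failed else b for b in current]
    updated = dict(ensembles)                      # earlier fragments stay untouched
    updated[first_entry_id] = new_ensemble
    return updated


metadata = {0: ["bookie-1", "bookie-2", "bookie-3"]}
metadata = replace_failed_bookie(
    metadata, failed="bookie-2",
    available=["bookie-1", "bookie-3", "bookie-4"], first_entry_id=10)
print(metadata)
```

Keeping the old ensemble for earlier entries matters because readers still need to know which bookies hold entries written before the change.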
Client gets disconnected from ZooKeeper.
Tries to reestablish the connection.
Can continue to read and write to the ledger.
Until the connection is reestablished, no metadata operations can be performed:
•  Cannot create a ledger
•  Cannot open a ledger
•  Cannot close a ledger
Reader opens while writer is active.
Application control
BK guarantees correctness.
Reader initiates the recovery process.
•  Fences the ledger on ZooKeeper.
•  Informs all bookies in the ensemble that recovery has started.
•  After these steps, the writer gets write errors (if actively writing).
•  Reader contacts all bookies to learn the last entry.
•  Replicates the last entry if it doesn’t have enough replicas.
•  Updates ZooKeeper with the LAC and closes the ledger.
Fencing and Recovery
Recovery begins when the ledger is opened by the reader in recovery mode.
• Check if the ledger needs recovery (not closed).
• Fence the ledger first, then initiate recovery:
• Step 1: Flag that the ledger is in recovery by updating ZooKeeper state.
• Step 2: Fence the bookies.
• Step 3: Recover the ledger.
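The three recovery steps can be sketched with toy objects. Everything here is hypothetical: a dict stands in for ledger metadata on ZooKeeper, every bookie is assumed to hold every entry (no striping), and the real protocol works with quorums rather than all bookies.

```python
class FencedError(Exception):
    """Raised when a normal write hits a fenced ledger."""


class ToyBookie:
    def __init__(self):
        self.entries = {}    # entry_id -> data
        self.fenced = False

    def add(self, entry_id, data, recovery=False):
        # Normal writes fail after fencing; recovery writes are still allowed.
        if self.fenced and not recovery:
            raise FencedError("ledger is fenced")
        self.entries[entry_id] = data

    def fence(self):
        """Fence the ledger and report the highest entry ID stored (-1 if none)."""
        self.fenced = True
        return max(self.entries, default=-1)


def recover(metadata, bookies):
    if metadata["state"] == "CLOSED":               # no recovery needed
        return metadata["last_entry"]
    metadata["state"] = "IN_RECOVERY"               # Step 1: flag in metadata
    last_entry = max(b.fence() for b in bookies)    # Step 2: fence the bookies
    if last_entry >= 0:                             # Step 3: recover the ledger
        data = next(b.entries[last_entry] for b in bookies
                    if last_entry in b.entries)
        for b in bookies:
            if last_entry not in b.entries:         # re-replicate the last entry
                b.add(last_entry, data, recovery=True)
    metadata["state"] = "CLOSED"
    metadata["last_entry"] = last_entry
    return last_entry


bookies = [ToyBookie(), ToyBookie(), ToyBookie()]
for b in bookies:
    b.add(0, b"e0")
bookies[0].add(1, b"e1")                            # writer crashed mid-write
metadata = {"state": "OPEN", "last_entry": None}
print(recover(metadata, bookies), metadata["state"])
```

After `recover` returns, any writer that is still active gets `FencedError` on its next `add`, which is how fencing keeps a stale writer from diverging from the closed ledger.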
Ledger Fencing
[Diagram: the BookKeeper distributed store holding a ledger, with arrows labeled Write, Non-Recovery Read, Recovery Read (Fence & Recover), and Attempt to write.]
Auto Recovery Components
[Diagram: a ZooKeeper cluster alongside a BookKeeper cluster of Machine-1 through Machine-N. Each machine runs a bookie (Bookie-1, Bookie-2, ..., Bookie-N), an Auditor, and a Replicator Worker; one Auditor is the lead and the others are followers.]
Bookie Crashes - Auto Recovery
Auditor
• Starts on every bookie machine; a leader gets elected through ZooKeeper.
• One active auditor per cluster.
• Watches for bookie failures and manages the list of under-replicated ledgers.
Replication Workers
• Responsible for performing replication to maintain quorum copies.
• Can run on any machine in the cluster; usually run on each bookie machine.
• Work on the under-replicated ledgers list published by the Auditor.
• Pick one ledger at a time, create a lock on ZooKeeper, and replicate to the local bookie.
• If the local bookie is already part of the ensemble, drop the lock and move to the next ledger in the list.
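The replication worker loop described above can be sketched as below. All names are hypothetical; a Python set stands in for ZooKeeper locks, and "replicating" is reduced to adding the local bookie to the ledger's list of copy holders.

```python
def run_replication_worker(local_bookie, under_replicated, ensembles, locks):
    """Process the under-replicated list once; return the ledgers replicated.

    under_replicated: list of ledger IDs published by the auditor.
    ensembles: dict ledger_id -> list of bookies currently holding copies.
    locks: set of ledger IDs locked by some worker (stand-in for ZooKeeper).
    """
    replicated = []
    for ledger_id in list(under_replicated):    # iterate over a copy
        if ledger_id in locks:
            continue                            # another worker is on it
        locks.add(ledger_id)                    # take the lock
        try:
            if local_bookie in ensembles[ledger_id]:
                continue                        # already a member: skip this ledger
            ensembles[ledger_id].append(local_bookie)   # copy to the local bookie
            under_replicated.remove(ledger_id)
            replicated.append(ledger_id)
        finally:
            locks.discard(ledger_id)            # always drop the lock
    return replicated


ensembles = {1: ["bookie-1", "bookie-2"], 2: ["bookie-2", "bookie-3"]}
pending = [1, 2]
done = run_replication_worker("bookie-1", pending, ensembles, locks=set())
print(done, pending)
```

Ledger 1 stays on the list because the local bookie already holds a copy; a worker on a different machine would pick it up.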
Heterogeneous Stores and Tiered Architecture
Log Store
Data Store
Archival Store
Clusters of storage serving App Instances
[Diagram: clusters of Log Store, Data Store, and Archival Store serving eight App Instances.]
Performance
Community Update
Projects built on BookKeeper
•  Twitter Distributed Log: Manhattan, Pub/Sub, DeferredRPC
•  Yahoo Cloud Messaging
•  Salesforce Distributed Store.
•  Huawei – HDFS NameNode
•  HubSpot – WAL
•  Majordodo – Distributed Resource Manager
Community
•  6 PMC members
•  8 Committers
•  20-25 active members
•  5 Enterprises actively using/contributing
More Info
https://cwiki.apache.org/confluence/display/BOOKKEEPER/BookKeeper+papers+and+presentations
