Remote BLOB Storage (RBS) Best Practices in SharePoint 2010 - EPC Group
   Overview
   Understanding Unstructured Data Storage
    ◦ SQL BLOB
    ◦ RBS
    ◦ FILESTREAM
   RBS Structure and Mechanics
   Planning BLOB Storage
   Deployment and Migration
   Support
   EBS
   Summary
   Remote BLOB Storage is designed to delineate
    structured (metadata) and unstructured (BLOB
    data) data
   Enables organizations to deploy more efficient
    content storage models based on commodity
    storage
    ◦ Does not address capacity – database is the sum of the
      unstructured and structured data regardless of location
   Provides an upgrade path for WIDE customers
   Improves existing BLOB storage scenarios (EBS)
   On average 20% of data is structured, 80% is
    unstructured or semi-structured
   Does not adhere to specific format or
    sequence
   Is not tied to rules and is unpredictable
   Examples:
    ◦   Text
    ◦   Video
    ◦   Audio
    ◦   Images
    ◦   Word, PowerPoint, code, etc.
   Organized in semantic chunks (entities)
   Tied to relationships and has attributes
   Associated with a defined schema:
    ◦ All entities have the defined format
    ◦ Have a predefined length
   BLOB = Binary Large OBject
   BLOB is the data stream associated with a file
    ◦ SharePoint file metadata and BLOBs are stored in
      SQL databases
    ◦ BLOBs do not participate in query operations
    ◦ Sample BLOB operations: Get, Put, Read range, etc.
   SharePoint is built around the file
    ◦ Document libraries, Record Centers
   BLOBs generally represent 80% of total
    content
[Diagram: Web Server connected to the Database Server, which hosts the Content Database]
   SharePoint stores BLOBs and associated
    metadata in the content database
   Storage
    ◦ SQL storage is usually more expensive
      SAN versus CAS stores
   Performance
    ◦ Impacts load on SQL Server box
   Policy requirements
    ◦ Expunge, BLOB immutability
[Diagram: Web Server connected to the Database Server (Content Database) and to BLOB Store X, BLOB Store Y, and BLOB Store Z]

•   Independent component that can be registered for a SharePoint farm
•   Data store for all BLOBs added to the content databases where the Provider is set
    Active
•   Customers can select BLOB store providers
[Diagram: three homes for unstructured data: Dedicated BLOB Stores / Remote File Servers; SQL BLOB; Integrated File + Database]
   Binary large objects stored in data tables
    (varbinary(MAX))
   Traditional method of storing and retrieving
    binary large objects with SharePoint Products
    and Technologies
   SQL Server 2008 Add-on
   Remote Blob Storage (RBS) is a library API set
    that is designed to move storage of large
    binary data (BLOBs) from Microsoft SQL Server
    to external storage solutions.
   RBS gives applications the ability to use rich
    relational capabilities of SQL Server for their
    structured data along with capabilities of
    dedicated storage solutions for their
    unstructured data in a transactionally
    consistent manner.
   Leverages NTFS FS by storing varbinary(max)
    BLOB data as files on the FS
   Addresses performance by enabling more
    memory access for query processing
   Suited for scenarios where BLOB
    (unstructured) data is 1 MB or larger and fast
    read access is desired (i.e. RM scenarios)
[Diagram: Unstructured Data held as Integrated Files & Database]
   Storage attribute on VARBINARY(MAX)
   Unstructured data stored directly in the file system (requires NTFS)
   Dual Programming Model
   Data Consistency
   Size limit is the file system volume size
   Integrated Manageability
   SQL Server Security Stack (same as SQL BLOB)
[Chart: Throughput (Mb/sec.), axis 0 to 4000, by Data Size: 0 KB, 240 KB, 480 KB, 1 MB, 2 MB, 4 MB, 8 MB, 16 MB]
Local FILESTREAM
   Unstructured data is stored in a Filegroup with the associated
    Content Database with the related structured data on the local
    SQL Server instance
   Supports integrated management, i.e. backup and restore

Remote FILESTREAM
   Unstructured data is stored in a Filegroup in a separate database
    or SQL Server instance from the associated Content Database with
    the related structured data
   Does not support integrated management; structured and
    unstructured data are managed separately
   FILESTREAM Provider is limited to local storage
    ◦ DAS, NAS, SAN are considered remote storage
      regardless of disk presentation
    ◦ Does not support compression, TDE, and other
      SQL Server capabilities
    ◦ Special constraints and limitations apply to BCM
      scenarios such as Database Mirroring and Log
      Shipping (see FAQ)
   3rd party ISV solutions require SQL Server Enterprise Edition
    ◦ NAS storage devices require 20ms TTFB
   RBS is a downloadable component in the SQL
    Server 2008 R2 Feature Pack
    ◦ Includes a set of libraries and interface specifications
   Defines and exposes 3 views for interaction
    ◦ Application View
      Interacts with SharePoint Web Front-end, Provider
       Library, SQL DB
      Implemented by SharePoint 2010 – Transparent to the user
    ◦ Administrator View
      Windows PowerShell CmdLets – Call Stored Procedures and
       Functions
      Installation, configuration, provisioning, RBS Maintainer etc.
    ◦ Provider View
      Defines an interface that should be implemented by each
       BLOB store provider
   RBS storage contains two main features: blob
    storage and blob retrieval. Blobs are
    immutable so edits are translated in the
    backend into new blobs.
   Blobs are only deleted by garbage
    collection, which scans the content database
    and blob library, deleting blobs (of a certain
    age) that are no longer referenced.
   Where to store items (RBS or inline) is
    determined at the content database level on
    the back-end, not the WFE
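
The immutability and garbage-collection behavior described above can be illustrated with a small toy model. This is only a sketch and does not use the RBS API: the function names (Save-ToyBlob, Invoke-ToyGarbageCollection), the in-memory hashtables, and the 30-day age threshold are all hypothetical stand-ins for the BLOB store, the content database references, and the maintainer's age setting.

```powershell
# Toy model only: not the RBS API. All names below are hypothetical.
$blobStore  = @{}   # blob id -> @{ Data; Created }  (stands in for the BLOB store)
$references = @{}   # document -> blob id            (stands in for the content database)

function Save-ToyBlob {
    param([string]$Document, [string]$Data, [datetime]$Now)
    # Blobs are immutable: every save writes a NEW blob and repoints the reference.
    $id = [guid]::NewGuid().ToString()
    $script:blobStore[$id] = @{ Data = $Data; Created = $Now }
    $script:references[$Document] = $id
    return $id
}

function Invoke-ToyGarbageCollection {
    param([datetime]$Now, [timespan]$MinimumAge)
    # Scan the references, then delete blobs that are unreferenced and old enough.
    $live = @($script:references.Values)
    foreach ($id in @($script:blobStore.Keys)) {
        $age = $Now - $script:blobStore[$id].Created
        if (($live -notcontains $id) -and ($age -ge $MinimumAge)) {
            $script:blobStore.Remove($id)
        }
    }
}

$t0     = Get-Date
$first  = Save-ToyBlob -Document 'report.docx' -Data 'v1' -Now $t0
$second = Save-ToyBlob -Document 'report.docx' -Data 'v2' -Now $t0.AddMinutes(1)

# The edit did not overwrite the first blob; it stays until garbage collection.
Invoke-ToyGarbageCollection -Now $t0.AddDays(31) -MinimumAge ([timespan]::FromDays(30))

$blobStore.ContainsKey($first)    # False: unreferenced and older than the threshold
$blobStore.ContainsKey($second)   # True: still referenced by the document
```

The point of the model is that an edit never rewrites a stored blob, so storage is only reclaimed when a separate maintenance pass runs.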
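[Diagram: BLOB read path. A User Request reaches the Web Server (Application, Business Logic). Labeled steps: Get BLOB Id (Content Db on the Database Server), Read BLOB and Return BLOB (BLOB Store), then Response to the user. A Config Db is also shown on the Database Server.]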
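[Diagram: BLOB write path. A User Request reaches the Web Server (Application, Business Logic). Labeled steps: Save BLOB Data, Commit BLOB, and Return BLOB Id (BLOB Store), Commit BLOB Id & Metadata (Content Db on the Database Server), then Response to the user. A Config Db is also shown on the Database Server.]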
Content Database
   mssql_resources.rbs_internal_tables: [internal rbs data]
    ◦ RBS providers store metadata in their own tables inside the
      content database.
   RBS provider = []
   AllDocStreams

      Content             RbsId
      NULL                [ id # ]    -> RBS Blob
      [Inline Content]    NULL
DatabaseInformation
   RBS Enabled: [1|0]
   RBS Provider: [providername]
   The provider is set at the database level.
AllSites
   RBS Collection ID: [ id # ] -> RBS Provider Logic
   Each site has a separate collection (top-level storage unit).
    This is the default location for new blobs, and it is consulted
    when adding new blobs.
AllDocStreams
   RBS ID: [ rbs id (bin data) ] -> RBS Provider Logic -> Blob
   This is a lookup-only operation. The location of a blob in the
    AllDocStreams table need not be in the same collection as the
    default collection in the AllSites table.
   Select storage solutions designed to support
    BLOB data storage
    ◦ Write Once Read Many (WORM) devices can increase orphan
      occurrences
      Prevents deletion of the BLOB-related SharePoint metadata row
    ◦ SAN devices support BLOB data replication, locally and
      over wide area networks using bit mapping. Database
      mirroring and/or Log Shipping can augment these
      solutions, but add operation complexity.
   RBS Maintainer is instrumental in detecting and
    resolving orphans in the event of failover
    ◦ Provides Reference Scan (RS), Delete Propagation
      (DP), and Orphan Cleanup (OC)
   RBS can be used in conjunction with commodity
    storage, i.e. DAS; however, collaborative scenarios
    should design for IO (RAID 10)
    ◦ CAPEX and OPEX considerations are critical to realizing ROI
      Ensure operationally a multi-tiered storage subsystem can be
       maintained and supported
      Ensure BCM plans keep metadata and BLOB data synchronized
    ◦ Consider RBS when the following is true:
      BLOB data files are larger than 256KB on average
      BLOB data files are at least 80KB and the DB server I/O is a
       bottleneck
    ◦ A large number of small BLOBs can decrease performance
      RBS provides maximum value in archiving and DAM
       scenarios, particularly large files with infrequent access
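
The sizing guidance above can be written down as a simple check. The sketch below is illustrative only: Test-RbsCandidate is a hypothetical helper that encodes the two thresholds from this slide (256 KB average, or 80 KB with a database I/O bottleneck). The second part assumes a SharePoint 2010 server and a content database named WSS_Content (a placeholder); it uses the MinimumBlobStorageSize setting on RemoteBlobStorageSettings to keep small BLOBs inline, and the 80KB value is an example, not a recommendation from the deck.

```powershell
# Hypothetical helper that encodes the two rules of thumb from the slide.
function Test-RbsCandidate {
    param(
        [Parameter(Mandatory = $true)][double]$AverageBlobSizeKB,
        [bool]$DatabaseIoIsBottleneck = $false
    )
    if ($AverageBlobSizeKB -gt 256) { return $true }
    if (($AverageBlobSizeKB -ge 80) -and $DatabaseIoIsBottleneck) { return $true }
    return $false
}

Test-RbsCandidate -AverageBlobSizeKB 512                                  # True
Test-RbsCandidate -AverageBlobSizeKB 100 -DatabaseIoIsBottleneck $true    # True
Test-RbsCandidate -AverageBlobSizeKB 40                                   # False

# Only runs where the SharePoint cmdlets are loaded.
if (Get-Command Get-SPContentDatabase -ErrorAction SilentlyContinue) {
    $database = Get-SPContentDatabase 'WSS_Content'   # placeholder database name
    # BLOBs smaller than this many bytes stay inline in the content database.
    $database.RemoteBlobStorageSettings.MinimumBlobStorageSize = 80KB
    $database.Update()
}
```

Keeping small BLOBs inline is one way to act on the warning that a large number of small BLOBs can decrease performance.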
Solution        Dedicated BLOB Store          File System/File Server       SQL BLOB
Advantages      Lower cost per GB at scale    Lowest cost per GB            Integrated management
                Scalability & expandability   Streaming performance         Data-level consistency
Disadvantages   Complex application           Complex application           Poor data streaming support
                 development & deployment      development & deployment     File size limitations
                Separate data management      Integration with structured   Highest cost per GB
                Enterprise scale only          data
Example         EMC Centera                   Windows File Servers          SQL Server VARBINARY(MAX)
                Fujitsu Nearline
   Ensure the provider is working within the
    scope of provided APIs and support policies
   Evaluate provider characteristics, features,
    and capabilities
   Map provider offering to scenario
   Required
    ◦   Implementation of RBS provider interface
    ◦   Enable multiple provider instances
    ◦   Guarantee BLOB persistence
    ◦   Guarantee link-level consistency
   Recommended
    ◦ Backup, HA and Disaster recovery capability
    ◦ Data de-duplication
    ◦ Expunge, Immutability of BLOBs
1.   Enable FILESTREAM
2.   Provision BLOB Store
3.   Download and install SQL Server Remote
     BLOB Store on each database server
4.   Download and install SQL Server Remote
     BLOB Store on each front-end Web server
5.   Enable RBS
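
Steps 1 and 2 above can be sketched as follows for the FILESTREAM provider. This is a minimal sketch under stated assumptions: the server name, database name, BLOB path, and master key password are placeholders, and the filegroup and file names (RBSFilestreamProvider, RBSFilestreamFile) follow the naming commonly used in Microsoft's FILESTREAM provider guidance. FILESTREAM must also be enabled for the instance in SQL Server Configuration Manager, which this sketch does not do. Steps 3 to 5 (installing the RBS package and enabling RBS) are not covered here.

```powershell
# Placeholder values: adjust for your environment.
$sqlInstance = 'SQLSERVER01'
$contentDb   = 'WSS_Content'
$blobPath    = 'C:\Blobstore'

# Step 1: allow T-SQL and Win32 streaming access to FILESTREAM data.
$enableFilestream = @"
EXEC sp_configure filestream_access_level, 2;
RECONFIGURE;
"@

# Step 2: provision the BLOB store as a FILESTREAM filegroup in the content database.
$provisionStore = @"
USE [$contentDb];
IF NOT EXISTS (SELECT * FROM sys.symmetric_keys WHERE name = N'##MS_DatabaseMasterKey##')
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = N'<strong password here>';
IF NOT EXISTS (SELECT groupname FROM sysfilegroups WHERE groupname = N'RBSFilestreamProvider')
    ALTER DATABASE [$contentDb] ADD FILEGROUP RBSFilestreamProvider CONTAINS FILESTREAM;
ALTER DATABASE [$contentDb]
    ADD FILE (NAME = RBSFilestreamFile, FILENAME = '$blobPath')
    TO FILEGROUP RBSFilestreamProvider;
"@

if (Get-Command Invoke-Sqlcmd -ErrorAction SilentlyContinue) {
    Invoke-Sqlcmd -ServerInstance $sqlInstance -Query $enableFilestream
    Invoke-Sqlcmd -ServerInstance $sqlInstance -Query $provisionStore
} else {
    # Without the SQL Server cmdlets, print the T-SQL so it can be run in Management Studio.
    $enableFilestream, $provisionStore
}
```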
   Enable/Disable
    ◦ Enables/Disables usage of RBS with SharePoint
   GetProviderNames
    ◦ List of all registered Providers
    ◦ Registered Providers are kept track of in Config DB
   SetActiveProvider
    ◦ One active provider/BLOB store per Content DB
    ◦ Other BLOB stores can be used for read operations

    $database = Get-SPContentDatabase -WebApplication http://<server>
    $rbs = $database.RemoteBlobStorageSettings
    $rbs.Installed()
    $rbs.Enable()
    $rbs.SetActiveProviderName($rbs.GetProviderNames()[0])
    $rbs
   Use Windows PowerShell Migrate CmdLet
    ◦ Moves BLOBs from current location to the current
      Active RBS Provider store.
    ◦ Performs a “deep copy” of BLOB, sequentially
    ◦ Live migration – does not require downtime
    ◦ Migration can be stopped and resumed
   Migrate – can be used for upgrade from EBS
    to RBS
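
A migration into the active provider might look like the sketch below. It assumes a SharePoint 2010 server where RBS is already installed for the content database; the database name WSS_Content is a placeholder, and picking the first registered provider mirrors the earlier enable example rather than any recommendation.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

if (Get-Command Get-SPContentDatabase -ErrorAction SilentlyContinue) {
    $database = Get-SPContentDatabase 'WSS_Content'   # placeholder database name
    $rbs = $database.RemoteBlobStorageSettings

    # Migration needs RBS installed and at least one registered provider.
    if (-not $rbs.Installed()) {
        throw "RBS is not installed for $($database.Name)."
    }
    if ($rbs.GetProviderNames().Count -eq 0) {
        throw "No RBS providers are registered for $($database.Name)."
    }

    $rbs.Enable()
    $rbs.SetActiveProviderName($rbs.GetProviderNames()[0])

    # Deep-copies existing BLOBs, one after another, into the active provider's store.
    $rbs.Migrate()
} else {
    Write-Warning 'SharePoint cmdlets not available; run this on a SharePoint server.'
}
```

Because the copy is sequential, a large content database can take a long time to migrate; the slide notes that the operation can be stopped and resumed.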
[Diagram: backup timeline from Backup Start to Both Backups Complete; restore timeline from Restore Start to Both Restores are Complete]
Provider                            SQL Server 2008   SQL Server 2008 R2
2008                                Not Supported     Not Supported
2008 R2 with FILESTREAM Provider    Supported         Supported
EBS
   SPFarm scoped
   Exposed as a COM interface (unmanaged)
   Does not provide a configurable maintainer
   Supports single provider
   Supports Object Model and PRIME backup/restore
   Deprecated in SharePoint 2010
    ◦ Roadmap based on RBS

RBS
   SPContentDatabase scoped
   Exposed as a .NET interface (managed)
   Provides a configurable maintainer
   Supports multiple providers
   Supports Object Model, PRIME, and *SQL Server backup/restore
   Introduced in SharePoint 2010
    ◦ Implemented by SQL Server, supported with SharePoint
   Deep copy the binary large objects from EBS
    either inline or into RBS
   Internalize EBS binary large objects, re-
    externalize with RBS
   Used by Word and PowerPoint Web
    Applications
    ◦ Implemented per Web Application
   Database Mirroring does not support
    FILESTREAM
    ◦ Log Shipping should augment Database Mirroring
      for BLOB protection
   Performs maintenance tasks associated with
    Remote BLOB Storage
    ◦ Garbage Collection
    ◦ Reference Scan
    ◦ Delete Propagation
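
The maintenance tasks above are typically run through the RBS Maintainer executable. The sketch below is an assumption-heavy illustration: the install path is the usual default for the SQL Server 2008 R2 RBS package, and the connection string name (RBSMaintainerConnection), operations, and phase letters (r, d, o for reference scan, delete propagation, and orphan cleanup) are given as they appear in Microsoft's RBS Maintainer documentation. Verify all of them against the installed version before scheduling anything.

```powershell
# Assumed default install path; verify on your server.
$maintainer = 'C:\Program Files\Microsoft SQL Remote Blob Storage 10.50\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe'

# Arguments passed to the Maintainer; names and values should be checked locally.
$maintainerArgs = @(
    '-ConnectionStringName', 'RBSMaintainerConnection',
    '-Operation', 'GarbageCollection', 'ConsistencyCheck', 'ConsistencyCheckForStores',
    '-GarbageCollectionPhases', 'rdo',   # reference scan, delete propagation, orphan cleanup
    '-ConsistencyCheckMode', 'r',
    '-TimeLimit', '120'                  # minutes
)

if (Test-Path $maintainer) {
    & $maintainer $maintainerArgs
} else {
    Write-Warning "RBS Maintainer not found at $maintainer"
}
```

In practice a command like this is usually wrapped in a scheduled task per content database.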
   Successful binary large object
    implementations require careful planning to
    ensure expected ROI is realized
   Binary large object externalization is
    designed to reduce capital expenditures; it
    does not resolve capacity or performance
    constraints
   Standardized API set allows choice of
    providers

Remote Blog Storage (RBS) Best Practices in SharePoint 2010 - EPC Group

  • 4. Overview  Understanding Unstructured Data Storage ◦ SQL BLOB ◦ RBS ◦ FILESTREAM  RBS Structure and Mechanics  Planning BLOB Storage  Deployment and Migration  Support  EBS  Summary
  • 5. Remote BLOB Storage is designed to delineate structured (metadata) and unstructured (BLOB data) data  Enables organizations to deploy more efficient content storage models based on commodity storage ◦ Does not address capacity – database is the sum of the unstructured and structured data regardless of location  Provides an upgrade path for WIDE customers  Improves existing BLOB storage scenarios (EBS)
  • 6. On average 20% of data is structured, 80% is unstructured or semi-structured
  • 7. Does not adhere to specific format or sequence  Is not tied to rules and unpredictable  Examples: ◦ Text ◦ Video ◦ Audio ◦ Images ◦ Word, PowerPoint, code, etc.
  • 8. Organized in semantic chunks (entities)  Tied to relationships and has attributes  Associated with a defined schema: ◦ All entities have the defined format ◦ Have a predefined length
  • 9. BLOB  Binary Large OBject  BLOB is the data stream associated with a file ◦ SharePoint file metadata and BLOBs are stored in SQL databases ◦ BLOBs do not participate in query operations ◦ Sample BLOB operations: Get, Put, Read range, etc.  SharePoint is built around the file ◦ Document libraries, Record Centers  BLOBs generally represent 80% of total content
  • 10. Web Server Database Server Content Database  SharePoint stores BLOBs and associated metadata in the content database
  • 11. Storage ◦ SQL storage is usually more expensive  SAN versus CAS stores  Performance ◦ Impacts load on SQL Server box  Policy requirements ◦ Expunge, BLOB immutability
  • 12. Web Server Database Server Content Database BLOB BLOB BLOB Store X Store Y Store Z • Independent component that can be registered for a SharePoint farm • Data store for all BLOBs added to the content databases where the Provider is set Active • Customers can select BLOB store providers
  • 13. Unstructured Data Unstructured Data Unstructured Data Dedicated BLOB Stores SQL BLOB Integrated File + Remote File Servers Database
  • 14. Binary large objects stored in data tables (varbinary(MAX))  Traditional method of storing and retrieving binary large objects with SharePoint Products and Technologies
  • 15. SQL Server 2008 Add-on  Remote Blob Storage (RBS) is a library API set that is designed to move storage of large binary data (BLOBs) from Microsoft SQL Server to external storage solutions.  RBS gives applications the ability to use rich relational capabilities of SQL Server for their structured data along with capabilities of dedicated storage solutions for their unstructured data in a transactionally consistent manner.
  • 16. Leverages NTFS FS by storing varbinary(max) BLOB data as files on the FS  Addresses performance by enabling more memory access for query processing  Suited for scenarios where BLOB (unstructured) data is 1 MB or larger and fast read access is desired (i.e. RM scenarios)
  • 17. Storage attribute on Unstructured Data VARBINARY(MAX)  Unstructured data stored directly in the file system (requires NTFS)  Dual Programming Model  Data Consistency  Size limit is the file system volume size  Integrated Manageability  SQL Server Security Stack (Same as Integrated Files & Database SQL BLOB)
  • 18. Throughput (Mb/sec.) 4000 Throughput (Mb/sec.) 3500 3000 2500 2000 1500 Throughput 1000 (Mb/sec.) 500 0 0KB 240 KB 480 KB 1 MB 2 MB 4 MB 8 MB 16 MB Data Size
  • 19. Unstructured data is  Unstructured data is stored in a Filegroup stored in a Filegroup in a separate database or SQL with the associated Server instance from the Content Database with associated Content the related structured Database with the related data on the local SQL structured data Server instance  Does not supported integrated  Supports integrated management, structured management, i.e. and unstructured data is backup and restore managed separately Local FILESTREAM Remote FILESTREAM
  • 20. FILESTREAM Provider is limited local storage ◦ DAS, NAS, SAN are considered remote storage regardless of disk presentation ◦ Does not support compression, TDE, and other SQL Server capabilities ◦ Special constraints and limitations apply to BCM scenarios such as Database Mirroring and Log Shipping (see FAQ)  3rd party ISV solutions require SQL Server Enterprise Edition ◦ NAS storage devices require 20ms TTFB
  • 21. RBS is a downloadable component in the SQL Server 2008 R2 Feature Pack ◦ Includes a set of libraries and interface specifications  Defines and exposes 3 views for interaction ◦ Application View  Interacts with SharePoint Web Front-end, Provider Library, SQL DB  Implemented by SharePoint 2010 – Transparent to the user ◦ Administrator View  Windows PowerShell CmdLets – Call Stored Procedures and Functions  Installation, configuration, provisioning, RBS Maintainer etc. ◦ Provider View  Defines an interface that should be implemented by each BLOB store provider
  • 22. RBS storage contains two main features: blob storage and blob retrieval. Blobs are immutable so edits are translated in the backend into new blobs.  Blobs are only deleted by garbage collection, which scans the content database and blob library, deleting blobs (of a certain age) that are no longer referenced.  Where to store items (RBS or inline) is determined at the content database level on the back-end, not the WFE
  • 23. [Diagram: BLOB read path across the Web Server (Business Logic, Application), the Database Server (Content Db, Config Db), and the BLOB Store. Labeled steps: User Request, Get BLOB Id, Read BLOB, Return BLOB, Response]
  • 24. [Diagram: BLOB write path across the Web Server (Business Logic, Application), the Database Server (Content Db, Config Db), and the BLOB Store. Labeled steps: User Request, Save BLOB, Return BLOB Id, Commit BLOB Id & Metadata, Commit Data, Response]
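  • 25. RBS providers store metadata in their own tables inside the content database.
    [Diagram: a Content Database holding mssql_resources.rbs_internal_tables ([internal rbs data]), the setting "RBS provider = []", and the AllDocStreams table, whose rows carry either inline content or an RBS id (other columns omitted):]
      Content             RbsId
      NULL                [ id # ]  (points to the RBS blob)
      [Inline Content]    NULL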
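The read and write paths from the two diagrams, together with the Content/RbsId split in AllDocStreams, can be sketched roughly as follows. This is a simplified model under stated assumptions: an in-memory SQLite table stands in for the content database, a dictionary stands in for the BLOB store, the three-column table layout is reduced to the columns named on the slide, and `MIN_EXTERNAL_SIZE`, `save`, and `load` are hypothetical names.

```python
import sqlite3
import uuid

blob_store = {}  # stands in for the external BLOB store: blob id -> bytes

db = sqlite3.connect(":memory:")  # stands in for the content database
db.execute("CREATE TABLE AllDocStreams (Doc TEXT PRIMARY KEY, Content BLOB, RbsId TEXT)")

MIN_EXTERNAL_SIZE = 1024  # hypothetical database-level threshold, in bytes


def save(doc: str, data: bytes) -> None:
    if len(data) >= MIN_EXTERNAL_SIZE:
        # Write path: save the BLOB first, get its id back, then commit id and metadata.
        blob_id = uuid.uuid4().hex
        blob_store[blob_id] = data
        row = (doc, None, blob_id)
    else:
        row = (doc, data, None)  # small content stays inline, RbsId is NULL
    with db:  # commits the row on success
        db.execute("INSERT OR REPLACE INTO AllDocStreams VALUES (?, ?, ?)", row)


def load(doc: str) -> bytes:
    # Read path: get the BLOB id from the content database, then read the BLOB.
    content, rbs_id = db.execute(
        "SELECT Content, RbsId FROM AllDocStreams WHERE Doc = ?", (doc,)
    ).fetchone()
    return blob_store[rbs_id] if rbs_id is not None else content


save("small.txt", b"hello")
save("large.bin", b"x" * 4096)
```

The ordering in `save` mirrors the diagram: the BLOB is written to the store before the row that references it is committed, so a committed row never points at a missing BLOB.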
  • 26. [Diagram: how the content database tables lead to a blob through the RBS provider logic]
    ▪ DatabaseInformation (… RBS Enabled [1|0], RBS Provider [ providername ] …): the provider is set at the database level.
    ▪ AllSites (… RBS Collection ID [ id # ] …): each site has a separate RBS collection (top-level storage unit). This is the default location for new blobs and is consulted when adding new blobs. This is a lookup-only operation.
    ▪ AllDocStreams (… RBS ID [ rbs id (bin data) ] …): a blob referenced from the AllDocStreams table need not be in the same collection as the default collection in the AllSites table.
  • 27. Select storage solutions designed to support BLOB data storage
      ◦ Write Once Read Multiple (WORM) devices can make orphan occurrences more pronounced
        - Prevents deletion of BLOB related SharePoint metadata row
      ◦ SAN devices support BLOB data replication, locally and over wide area networks, using bit mapping. Database Mirroring and/or Log Shipping can augment these solutions, but add operational complexity.
    ▪ RBS Maintainer is instrumental in detecting and resolving orphans in the event of failover
      ◦ Provides Reference Scan (RS), Delete Propagation (DP), and Orphan Cleanup (OC)
  • 28. RBS can be used in conjunction with commodity storage, i.e. DAS; however, collaborative scenarios should design for I/O (RAID 10)
      ◦ CAPEX and OPEX considerations are critical to realizing ROI
    ▪ Ensure that a multi-tiered storage subsystem can be maintained and supported operationally
    ▪ Ensure that BCM plans keep metadata and BLOB data synchronized
      ◦ Consider RBS when the following is true:
        - BLOB data files are larger than 256 KB on average
        - BLOB data files are at least 80 KB and the DB server I/O is a bottleneck
      ◦ A large number of small BLOBs can decrease performance
    ▪ RBS provides maximum value in archiving and DAM scenarios, particularly large files with infrequent access
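The sizing criteria on this slide can be written down as a small check. This is only a sketch: the function name and parameters are hypothetical, and it assumes the two criteria are alternatives (either one is enough) and that sizes are averages measured in bytes.

```python
KB = 1024


def should_consider_rbs(avg_blob_size_bytes: int, db_io_is_bottleneck: bool) -> bool:
    """Apply the slide's two criteria, treated here as alternatives (an assumption)."""
    if avg_blob_size_bytes > 256 * KB:
        return True  # BLOBs larger than 256 KB on average
    # BLOBs of at least 80 KB, but only when database server I/O is the bottleneck
    return avg_blob_size_bytes >= 80 * KB and db_io_is_bottleneck
```

For example, a library averaging 100 KB per file qualifies only if the database server is I/O bound, while one averaging 40 KB does not qualify either way, which matches the warning that many small BLOBs can decrease performance.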
  • 29. Unstructured data storage options
    ▪ Dedicated BLOB Store (remote BLOB store solution)
      ◦ Advantages: lower cost per GB at scale; scalability and expandability
      ◦ Disadvantages: complex application development and deployment; separate data management; enterprise-scales only
      ◦ Examples: EMC Centera, Fujitsu Nearline
    ▪ File System/File Server (remote file servers)
      ◦ Advantages: lowest cost per GB; streaming performance
      ◦ Disadvantages: complex application development and deployment; integration with structured data
      ◦ Example: Windows File Servers
    ▪ SQL BLOB
      ◦ Advantages: integrated management; data-level consistency
      ◦ Disadvantages: poor data streaming support; file size limitations; highest cost per GB
      ◦ Example: SQL Server VARBINARY(MAX)
  • 30. Ensure the provider is working within the scope of provided APIs and support policies
    ▪ Evaluate provider characteristics, features, and capabilities
    ▪ Map provider offering to scenario
  • 31. Required
      ◦ Implementation of RBS provider interface
      ◦ Enable multiple provider instances
      ◦ Guarantee BLOB persistence
      ◦ Guarantee link-level consistency
    ▪ Recommended
      ◦ Backup, HA, and disaster recovery capability
      ◦ Data de-duplication
      ◦ Expunge, immutability of BLOBs
  • 32. 1. Enable FILESTREAM 2. Provision BLOB Store 3. Download and install SQL Server Remote BLOB Store on each database server 4. Download and install SQL Server Remote BLOB Store on each front-end Web server 5. Enable RBS
  • 33. Enable/Disable
      ◦ Enables/disables usage of RBS with SharePoint
    ▪ GetProviderNames
      ◦ Lists all registered providers
      ◦ Registered providers are tracked in the Config DB
    ▪ SetActiveProviderName
      ◦ One active provider/BLOB store per Content DB
      ◦ Other BLOB stores can be used for read operations
    Example (Windows PowerShell):
      # Get the content database for the Web application
      $database = Get-SPContentDatabase -WebApplication http://<server>
      $rbs = $database.RemoteBlobStorageSettings
      $rbs.Installed()   # reports whether RBS is installed for this database
      $rbs.Enable()      # enables usage of RBS
      # Make the first registered provider the active provider
      $rbs.SetActiveProviderName($rbs.GetProviderNames()[0])
      $rbs               # display the resulting settings
  • 34. Use Windows PowerShell Migrate CmdLet
      ◦ Moves BLOBs from their current location to the current active RBS provider store
      ◦ Performs a "deep copy" of each BLOB, sequentially
      ◦ Live migration; does not require downtime
      ◦ Migration can be stopped and resumed
    ▪ Migrate can be used for upgrade from EBS to RBS
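Why a sequential deep copy can be stopped and resumed is easier to see in a sketch. The following is a toy model and not the SharePoint migration routine: the `migrate` function, its `limit` parameter (used here only to simulate an interruption), and the two dictionaries standing in for the source and target stores are hypothetical.

```python
def migrate(source: dict, target: dict, limit=None) -> int:
    """Copy blobs from source to target one by one; returns how many were moved.

    Each blob is removed from the source only after its copy exists in the
    target, so an interrupted run can simply be called again to resume.
    """
    moved = 0
    for blob_id in sorted(source):  # snapshot of the keys, processed sequentially
        if limit is not None and moved >= limit:
            break  # simulate stopping the migration early
        target[blob_id] = bytes(source[blob_id])  # deep copy of the BLOB bytes
        del source[blob_id]
        moved += 1
    return moved


inline = {"a": b"1", "b": b"2", "c": b"3"}
active_store = {}
first = migrate(inline, active_store, limit=2)  # stopped after two blobs
second = migrate(inline, active_store)          # resumed; moves the remainder
```

Because every blob is always readable from exactly one of the two locations, nothing in this model requires taking the content offline while the copy runs.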
  • 35. [Diagram: backup and restore sequence. Backup: Backup Start, then Both Backups Complete. Restore: Restore Start, then Both Restores are Complete]
  • 36. Provider versus SQL Server version:
      Provider                            SQL Server 2008    SQL Server 2008 R2
      2008                                Not Supported      Not Supported
      2008 R2 with FILESTREAM Provider    Supported          Supported
  • 37. EBS versus RBS
    ▪ EBS
      ◦ SPFarm scoped
      ◦ Exposed as a COM interface (unmanaged)
      ◦ Does not provide a configurable maintainer
      ◦ Supports single provider
      ◦ Supports Object Model and PRIME backup/restore
      ◦ Deprecated in SharePoint 2010
        - Roadmap based on RBS
    ▪ RBS
      ◦ SPContentDatabase scoped
      ◦ Exposed as a .NET interface (managed)
      ◦ Provides a configurable maintainer
      ◦ Supports multiple providers
      ◦ Supports Object Model, PRIME, and *SQL Server backup/restore
      ◦ Introduced in SharePoint 2010
        - Implemented by SQL Server, supported with SharePoint
  • 38. Deep copy the binary large objects from EBS either inline or into RBS
    ▪ Internalize EBS binary large objects, then re-externalize with RBS
  • 39. The Office Web Applications cache is used by the Word and PowerPoint Web Applications
      ◦ Implemented per Web Application
  • 40. Database Mirroring does not support FILESTREAM
      ◦ Log Shipping should augment Database Mirroring for BLOB protection
  • 41. The RBS Maintainer performs maintenance tasks associated with Remote BLOB Storage
      ◦ Garbage Collection
      ◦ Reference Scan
      ◦ Delete Propagation
  • 42. Successful binary large object implementations require careful planning to ensure expected ROI is realized
    ▪ Binary large object externalization is designed to reduce capital expenditures; it does not resolve capacity or performance constraints
    ▪ Standardized API set allows choice of providers

Editor's Notes

  • #6: BLOBs and relational database data are very different entities. Relational data usually consists of text or numbers and tends to be small. In contrast, BLOB data is most often pictures in .jpg, .tiff, or .bmp format—such as product images on a Web site—which can be quite large.
  • #14: There are three (3) approaches to storing unstructured data with SQL Server: RBS, SQL BLOB, and FILESTREAM.
      ◦ Remote BLOB Storage (RBS): SharePoint relies on a new layer in SQL Server to read or update BLOB data stored outside of the database on separate BLOB stores (file system or dedicated BLOB store).
      ◦ SQL BLOB: traditional BLOB storage with SharePoint; BLOB data is stored alongside the structured metadata in the Content Database.
      ◦ FILESTREAM storage: improving SQL BLOB.
    General guidance can be summarized as:
      ◦ Storing records that are on average smaller than 256 KB is optimized with traditional SQL BLOB.
      ◦ Storing large BLOB data such as large video files will benefit from FILESTREAM, which provides better streaming performance on these files.
      ◦ When storing TIFF images or large static files on remote BLOB stores, consider leveraging the new Remote BLOB API layer, i.e. RBS.
  • #15: varbinary is the data type designation for binary large objects stored in SharePoint 2010 Content Databases and refers to variable-length binary data. (MAX) indicates that the maximum storage size is 2^31-1 bytes, i.e. approximately 2 GB.
  • #16: RBS is a SQL Server 2008 add-on that uses auxiliary tables, stored procedures, and a managed client library to provide its services. A reference to the blob (provided by the BLOB store) is stored in RBS auxiliary tables and an RBS Blob ID is generated. ISVs create RBS Provider Libraries to enable custom BLOB stores using the RBS API set. The externalization of binary large objects in SharePoint 2010 is often referred to as RBS.
  • #17: FILESTREAM storage is implemented as a *varbinary(max) column in which the data is stored as BLOBs in the file system. Transact-SQL statements can insert, update, query, search, and back up FILESTREAM data. Win32 file system interfaces provide streaming access to the data.
    A FILESTREAM filegroup is a special filegroup that contains file system directories instead of the files themselves. These file system directories are called data containers. Data containers are the interface between Database Engine storage and file system storage.
    FILESTREAM uses the NT system cache for caching file data. This helps reduce any effect that FILESTREAM data might have on Database Engine performance. The SQL Server buffer pool is not used; therefore, this memory is available for query processing.
    Performance: The way the data is going to be used is a critical factor. If streaming access is needed, storing the data inside a SQL Server database may be slower than storing it externally in a location such as the NTFS file system. Using file system storage, the data is read from the file and passed to the client application (either directly or with additional buffering). When the BLOB is stored in a SQL Server database, the data must first be read into SQL Server’s memory (the buffer pool) and then passed back out through a client connection to the client application. Not only does this mean the data goes through an extra processing step, it also means that SQL Server’s memory is unnecessarily “polluted” with BLOB data, which can cause further performance problems for SQL Server operations.
    FILESTREAM is most appropriate when the following conditions are true:
      ◦ Objects that are being stored are, on average, larger than 1 MB.
      ◦ Fast read access is important.
    http://technet.microsoft.com/en-us/library/bb933993.aspx
    http://msdn.microsoft.com/en-us/library/cc949109(SQL.100).aspx
    *See slide
  • #18: Storage Attribute: A FILESTREAM BLOB is a SQL BLOB, varbinary(MAX).
    Dual Programming Model: All the existing code you have on BLOBs can work for a FILESTREAM BLOB. We leverage the Windows file system layer to enable the rich Win32 streaming programming model with the same transactional semantics as T-SQL. You can obtain a real Win32 handle to the FILESTREAM BLOB and call Win32 read/write APIs on it. This is the main advantage over SQL BLOB. With your large BLOB data now stored on the file system, the streaming performance can be as good as the file system’s. You can leverage the file system streaming optimization options that fit your app’s access patterns.
    Data Consistency: SQL Server is now on the I/O stack of the streaming that you do on the FILESTREAM unstructured data.
    Integrated Manageability: The FILESTREAM BLOB is still a SQL BLOB; FTS, indexing, replication, log shipping, etc. work for FILESTREAM BLOBs.
  • #19: SQL BLOB storage is suitable for BLOBs smaller than 256 kilobytes (KB), and BLOBs larger than 1 megabyte (MB) are best stored outside the database. For those sized between 256 KB and 1 MB, the more efficient storage solution depends on the read vs. write ratio of the data and the rate of “overwrite”. Storing BLOB data solely within the database (e.g., using the varbinary(max) data type) is limited to 2 gigabytes (GB) per BLOB; this limit is enforced through SharePoint 2010.
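The size bands in this note can be expressed as a short classification. This is a sketch only: the function name and the returned labels are hypothetical, and the 2 GB ceiling uses the varbinary(max) limit of 2^31-1 bytes.

```python
KB = 1024
MB = 1024 * KB
MAX_SQL_BLOB_BYTES = 2**31 - 1  # varbinary(max) limit: 2,147,483,647 bytes, about 2 GB


def suggest_storage(size_bytes: int) -> str:
    """Map a BLOB size to the guidance band described in the note."""
    if size_bytes > MAX_SQL_BLOB_BYTES:
        return "exceeds-limit"  # larger than the 2 GB per-BLOB limit
    if size_bytes < 256 * KB:
        return "sql-blob"  # small BLOBs: in-database storage is suitable
    if size_bytes > 1 * MB:
        return "external"  # large BLOBs: best stored outside the database
    return "depends"  # 256 KB to 1 MB: depends on read/write ratio and overwrite rate
```

The middle band deliberately returns no recommendation, since the note leaves that range to workload characteristics rather than size alone.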
  • #28: Write Once, Read Many (alternatively Write Once, Read Multiple, or WORM) refers to computer data storage systems, data storage devices, and data storage media that can be written to once but read from multiple times, e.g. DVD/CD-R.
    Reference scan: look through the application tables and find blobs that are no longer referenced by the application. A GC time window defines when BLOBs are deleted from the blob store.
    Orphans: some blobs are "orphans", which can be caused by aborted transactions, application misbehavior, or other failures. Orphan blobs created before the GC time window are deleted from the blob store.
    Get more info here: http://blogs.msdn.com/b/sqlrbs/archive/2008/08/08/rbs-garbage-collection-settings-and-rationale.aspx
  • #30: Enterprises have 3 options. Each option has advantages & disadvantages based on your scenario. What are the main tradeoffs?
    When I choose to store my JPEG images or video files on a Windows file system/file server configuration, I get great streaming performance. But my development cost is high because SharePoint needs to have the logic to keep the structured data in sync with my JPEG/MP3 files stored outside the db.
    When I choose to store my JPEG images or video files on a dedicated BLOB store, I can scale and expand my storage space easily. The more I expand it, the cheaper it gets to add to it. In this case streaming performance depends on your store, and keeping structured and unstructured data in sync requires a complex implementation at the app level.
    When I choose to store my JPEG images as SQL VARBINARY(MAX) BLOBs in a SQL database, my JPEG files are constantly in sync with the metadata. I can also use all the SQL Server database management features on my JPEG files. What’s the tradeoff in this case? SharePoint has to manage the file size limitation and poor streaming performance if my JPEGs grow larger (a paper by Jim Gray points out that SQL BLOB streaming performance is better than file system streaming performance for files smaller than 250 KB, while for unstructured data of around 1 MB and larger the file system performs much better than SQL BLOB streaming).
    I’ve also seen large enterprises design more than one of these options into the same app, with a central module deciding in which option to store each new BLOB depending on the storage purpose (how large the BLOB is, how it will be used).
    Show of hands:
      ◦ Who stores some or all BLOB data in SQL Server BLOBs (varbinary(max))?
      ◦ Who stores some of the BLOB data on the file system?
      ◦ Who stores some of the BLOB data on a remote dedicated store?
  • #40: The Office Web Applications cache is used by the Word and PowerPoint Web Applications to create a version of a document requested for viewing through the browser. It improves performance and reduces resource consumption on server machines by making cached versions of a document or presentation available when there are multiple requests for the same document.
    The Office Web Applications cache occurs in two (2) distinct tiers: on the server file system and within a “specialized” site collection hosted on a per Web application basis. Document or presentation requests made through the Office Web Applications are served through both caches as the images are rendered for client consumption. Both cache locations are used by all site collections within a Web application where the Office Web Applications feature is activated.
    For example, when a client requests a document, the document is rendered through an AppServerHost.exe process and is subsequently cached to the server file system cache located on each server, from which the content is propagated to the site collection cache, which exists on a per Web application basis. Subsequent requests for the document are rendered from the site collection cache.
  • #42: The maintenance tasks associated with Remote BLOB Storage (RBS) are mainly performed through the RBS Maintainer. The RBS Maintainer performs periodic garbage collection and other maintenance tasks for an RBS deployment.
    GC: Garbage collection is how unreferenced or deleted data is removed from the remote BLOB store. Garbage collection in RBS is performed passively. References to BLOBs are counted by looking at the list of BLOB IDs stored by the application in its RBS table columns at the time of garbage collection.
    RS: Look through the application tables and find blobs that are no longer referenced by the application. The list of registered RBS columns is used for this purpose. BlobIds must not be stored in any place other than the registered columns. The blobs that are no longer referenced by the application are marked to be deleted.
    DP: Blobs marked for deletion are actually deleted from the blob store. There is a gap between when the blobs get marked for deletion and when they are actually deleted. This gap duration can be configured using the "garbage_collection_time_window" config item and defaults to 30 days.
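The two passes described in this note, reference scan followed by delete propagation after a time window, can be modeled in a few lines. This is a toy model, not the RBS Maintainer: the function names, the `marked` dictionary, and the choice to unmark a blob that becomes referenced again are assumptions of this sketch. Only the 30-day default comes from the note.

```python
from datetime import datetime, timedelta

GC_TIME_WINDOW = timedelta(days=30)  # default gap stated in the note


def reference_scan(blob_store, referenced_ids, marked, now):
    """Mark blobs that nothing references; unmark any that are referenced again."""
    for blob_id in blob_store:
        if blob_id in referenced_ids:
            marked.pop(blob_id, None)  # sketch-only choice: clear a stale mark
        else:
            marked.setdefault(blob_id, now)  # keep the earliest mark time


def delete_propagation(blob_store, marked, now, window=GC_TIME_WINDOW):
    """Delete blobs whose deletion mark is at least one time window old."""
    expired = [blob_id for blob_id, t in marked.items() if now - t >= window]
    for blob_id in expired:
        del blob_store[blob_id]
        del marked[blob_id]
    return expired


store = {"a": b"1", "b": b"2"}
marked = {}
t0 = datetime(2010, 1, 1)
reference_scan(store, {"a"}, marked, t0)  # "b" is unreferenced and gets marked
early = delete_propagation(store, marked, t0 + timedelta(days=29))
late = delete_propagation(store, marked, t0 + timedelta(days=30))
```

Separating marking from deletion is what produces the gap the note describes: a blob that loses its last reference survives until the window has elapsed.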