2. Row vs. columnar relational databases
• All relational databases deal with tables, rows, and columns
• But there are sub-types:
  • row-oriented: they are internally organized around the handling of rows
  • columnar / column-oriented: they are internally organized around the handling of columns
• Both types usually offer SQL interfaces and produce tables (with rows and columns) as their result sets
• Both types can generally solve the same queries
• Both types have specific use cases that they're good for (and use cases that they're not good for)
3. Row vs. columnar relational databases
• In practice, row-oriented databases are often optimized for, and particularly good at, OLTP workloads
• whereas column-oriented databases are often well-suited for OLAP workloads
• This is due to the different internal designs of row- and column-oriented databases
5. Row-oriented storage
• When looking at a table's datafile, it could look as follows:
• Actual row values are stored at specific offsets within the row's values struct
• Offsets depend on the column types, e.g. 4 bytes for int32, 8 bytes for int64, etc. (see the sketch below)
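A minimal sketch of such a fixed-offset row layout (the field names, offsets, and page size are made-up illustrations, not any specific engine's on-disk format):

    import java.nio.ByteBuffer;

    // Fixed-offset row layout: each row has an int32 `id` at offset 0 and an
    // int64 `balance` at offset 4. Real engines also store record headers,
    // null bitmaps, variable-length fields, etc.
    public class RowLayout {
        static final int ID_OFFSET = 0;       // int32 -> 4 bytes
        static final int BALANCE_OFFSET = 4;  // int64 -> 8 bytes
        static final int ROW_SIZE = 12;

        public static void main(String[] args) {
            ByteBuffer page = ByteBuffer.allocate(3 * ROW_SIZE); // tiny "page" of 3 rows

            // Write rows 0 and 1 at their computed offsets.
            page.putInt(0 * ROW_SIZE + ID_OFFSET, 1);
            page.putLong(0 * ROW_SIZE + BALANCE_OFFSET, 5000L);
            page.putInt(1 * ROW_SIZE + ID_OFFSET, 2);
            page.putLong(1 * ROW_SIZE + BALANCE_OFFSET, 7500L);

            // Reading one column of row 1 still means locating the full row first.
            long balance = page.getLong(1 * ROW_SIZE + BALANCE_OFFSET);
            System.out.println("row 1 balance = " + balance); // 7500
        }
    }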
6. Row-oriented storage
• Row-oriented storage is good if we need to touch one row. This normally requires reading/writing a single page
• Row-oriented storage is beneficial if all or most columns of a row need to be read or written. This can be done with a single read/write.
• Row-oriented storage is very inefficient if not all columns are needed but a lot of rows need to be read:
  • Full rows are read, including columns not used by the query
  • Reads are done page-wise; not many rows may fit on a page when rows are big
  • Pages are normally not fully filled, which leads to reading lots of unused areas
  • Record (and sometimes page) headers need to be read too, but do not contain actual row data
7. Column-oriented storage
• Column-oriented databases primarily work on columns
  • All columns are treated individually
  • Values of a single column are stored contiguously
• This allows array-processing the values of a column
• Rows may be constructed from column values later if required
  • This means column stores can still produce row output (tables)
  • Values from multiple columns need to be retrieved and assembled for that, making the implementation a bit more complex
• Query processors in columnar databases work on columns, too
8. Column-oriented storage
• Column stores store data in column-specific files
  • Simplest case: one datafile per column
  • Row values for each column are stored contiguously (see the sketch below)
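A minimal sketch of this layout, with one in-memory array standing in for each column datafile (column names and data are illustrative):

    // Column-wise storage: values of each column are contiguous, row i sits
    // at index i of every column array.
    public class ColumnStoreSketch {
        public static void main(String[] args) {
            int[]    id      = {1, 2, 3};
            long[]   balance = {5000L, 7500L, 100L};
            String[] name    = {"alice", "bob", "carol"};

            // A column scan touches only one array (one file on disk) ...
            long sum = 0;
            for (long b : balance) sum += b;
            System.out.println("sum(balance) = " + sum);

            // ... while reconstructing a full row must assemble values
            // from every column at the same position.
            int row = 1;
            System.out.println(id[row] + " | " + name[row] + " | " + balance[row]);
        }
    }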
9. Column-oriented storage
• Since data is stored column-wise, a single block can hold many more values than in a row-oriented database
• Records per block:
  • Row-oriented: block size / total record size
  • Column-oriented: block size / column value size
• More data per block => fewer block reads => improved I/O efficiency (a worked example follows below)
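A hypothetical worked example (block and record sizes are made up): with an 8 KB block and 400-byte records, a row store fits 8192 / 400 ≈ 20 records per block, while a column store reading a 4-byte integer column fits 8192 / 4 = 2048 values per block, roughly 100 times more values per read.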
10. Column-oriented storage – Compression
• Almost all column stores perform compression
  – Compression further reduces the storage footprint of each column
  – Compression is tailored to the column's data type:
    • RLE (run-length encoding; a sketch follows after this list)
    • Integer packing
    • Dictionary and lookup string compression
    • Other (depends on the column store)
• Effective compression reduces storage cost
• I/O reduction yields decreased response times during queries as well
  – Queries may execute an order of magnitude faster compared to queries over the same data set on a row store
• 10:1 to 30:1 compression ratios may be seen
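As an illustration of the first technique, a minimal run-length encoding sketch (the encoding format is simplified; real engines add headers, framing, and type-specific variants):

    import java.util.ArrayList;
    import java.util.List;

    // Run-length encoding over a sorted or low-cardinality column:
    // consecutive equal values collapse into (value, runLength) pairs.
    public class RunLengthEncoding {
        static List<int[]> encode(int[] column) {
            List<int[]> runs = new ArrayList<>();
            int i = 0;
            while (i < column.length) {
                int value = column[i], run = 0;
                while (i < column.length && column[i] == value) { i++; run++; }
                runs.add(new int[] {value, run});
            }
            return runs;
        }

        public static void main(String[] args) {
            int[] column = {7, 7, 7, 7, 3, 3, 9, 9, 9};
            for (int[] r : encode(column))
                System.out.println("value=" + r[0] + " run=" + r[1]);
            // 9 stored values collapse to 3 (value, run) pairs.
        }
    }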
11. Column-oriented storage – Compression
• All data within each column datafile has the same type, making it ideal for compression
• Usually a much better compression factor can be achieved for single columns than for entire rows
• Compression allows reducing disk I/O when reading/writing column data, but has some CPU cost
• For data sets bigger than the memory size, compression is often beneficial because disk access is slower than decompression
12. Column-oriented storage – Compression
• A good use case for compression in column stores is dictionary compression for variable-length string values (see the sketch below)
  • Each unique string is assigned an integer number
  • The dictionary, consisting of integer number and string value, is saved as column meta data
  • Column values are then integers only, making them small and fixed-width
• This can save much space if string values are non-unique
• With dictionaries sorted by column value, this also allows range queries
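A minimal sketch of this dictionary scheme (data and structures are illustrative; a real store would persist the dictionary as column meta data):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Dictionary compression: each unique string gets an integer code;
    // the column itself then stores only small fixed-width integers.
    public class DictionaryEncoding {
        public static void main(String[] args) {
            String[] column = {"Budapest", "Berlin", "Budapest", "Budapest", "Berlin"};

            Map<String, Integer> dict = new HashMap<>(); // string -> code (the "meta data")
            List<String> reverse = new ArrayList<>();    // code -> string, for decoding
            int[] encoded = new int[column.length];

            for (int i = 0; i < column.length; i++) {
                Integer code = dict.get(column[i]);
                if (code == null) {
                    code = reverse.size();
                    dict.put(column[i], code);
                    reverse.add(column[i]);
                }
                encoded[i] = code;
            }
            // 5 variable-length strings become 5 small ints plus a 2-entry dictionary.
            System.out.println(java.util.Arrays.toString(encoded)); // [0, 1, 0, 0, 1]
            System.out.println("decoded row 2: " + reverse.get(encoded[2]));
        }
    }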
13. Column-oriented storage – I/O saving
• Column stores can greatly improve the performance of queries that only touch a small number of columns
• This is because they will only access those columns' content
• Simple math: table t has a total of 10 GB of data, with
  • column a: 4 GB
  • column b: 2 GB
  • column c: 3 GB
  • column d: 1 GB
• If a query only uses column d, at most 1 GB of data will be processed by a column store
  • It could read even less with compression
• In a row store, the full 10 GB will be processed
14. Column-oriented storage – segments
• Column data in column stores is often grouped into segments/packets of a specific size (e.g. 64 K values)
• Meta data is calculated and stored separately per segment, e.g.:
  • Min value in segment
  • Max value in segment
  • Number of NOT NULL values in segment
  • Histograms
  • Compression meta data
• Segment meta data can be checked during query processing when no indexes are available
• Segment meta data may show that a segment can be skipped entirely, reducing the number of values that need to be processed in the query (see the sketch below)
• Calculating segment meta data is a relatively cheap operation (it only needs to traverse the column values in the segment) but should still occur infrequently
  • In a read-only or read-mostly workload, this is tolerable
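A minimal sketch of min/max segment meta data used for skipping (segment size, predicate, and data are made up):

    // Per-segment min/max meta data ("zone maps"): a scan skips any segment
    // whose [min, max] range proves it cannot match the predicate.
    public class SegmentSkipping {
        static class Segment {
            final int[] values;
            final int min, max; // meta data, computed once when the segment is built
            Segment(int[] values) {
                this.values = values;
                int lo = Integer.MAX_VALUE, hi = Integer.MIN_VALUE;
                for (int v : values) { lo = Math.min(lo, v); hi = Math.max(hi, v); }
                this.min = lo; this.max = hi;
            }
        }

        // Count values equal to `target`, skipping provably irrelevant segments.
        static int count(Segment[] segments, int target) {
            int n = 0;
            for (Segment s : segments) {
                if (target < s.min || target > s.max) continue; // skip whole segment
                for (int v : s.values) if (v == target) n++;
            }
            return n;
        }

        public static void main(String[] args) {
            Segment[] col = {
                new Segment(new int[] {1, 4, 2}),
                new Segment(new int[] {90, 85, 99}), // skipped for target = 4
            };
            System.out.println(count(col, 4)); // 1, scanning only the first segment
        }
    }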
15. Column-oriented storage – processing
• Column values are not processed row-at-a-time, but block-at-a-time
• This reduces the number of function calls (one function call per block of values, not one per row)
• Operating on blocks allows compiler optimizations, e.g. loop unrolling, parallelization, pipelining
• Column values are normally positioned in contiguous memory locations, also allowing SIMD operations (vectorization)
• Working on many subsequent memory positions also improves cache usage (multiple values are in the same cache line) and reduces pipeline stalls
• All of this makes column stores ideal for batch processing (see the sketch below)
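A minimal sketch of block-at-a-time processing (block size is illustrative); the tight loop over contiguous values is exactly the shape compilers can unroll and auto-vectorize:

    // One function call handles a whole block of contiguous column values,
    // instead of one call per row.
    public class BlockProcessing {
        static long sumBlock(int[] block, int count) {
            long sum = 0;
            for (int i = 0; i < count; i++) sum += block[i]; // contiguous, cache-friendly
            return sum;
        }

        public static void main(String[] args) {
            int[] block = new int[1024]; // e.g. one block of 1024 column values
            for (int i = 0; i < block.length; i++) block[i] = i;
            System.out.println(sumBlock(block, block.length)); // 523776
        }
    }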
16. Column-oriented storage – processing
• Reading all columns of a row is an expensive operation in a column store, so full row tuple construction is avoided or delayed as much as possible internally
• Updating/deleting or inserting rows may also be very expensive and may cost much more time than in a row store
• Some column stores are hybrids, with read-optimized (column) storage and write-optimized OLTP storage
• Still, column stores are not really made for OLTP workloads, and if you need to work with many columns at once, you'll pay a price in a column store
17. OLTP
• Transactional processing
• Retrieve or modify individual records (mostly few records)
• Use indexes to quickly find relevant records
• Queries are often triggered by end-user actions and should complete instantly
• ACID properties may be important
• Mixed read/write workload; the working set should fit in RAM
18. OLAP
• Analytical processing / reporting
• Derive new information from existing data (aggregates, transformations, calculations)
• Queries often run on many records or the complete data set; the data set may easily exceed the size of RAM
• Mainly read or even read-only workload
• ACID properties often not important; data can often be regenerated
• Queries often run interactively
• Common: it is not known in advance which aspects are interesting, so pre-indexing "relevant" columns is difficult
23. HBase vs. RDBMS

HBase                                                    | RDBMS
---------------------------------------------------------|---------------------------------
Column-oriented                                          | Row-oriented (mostly)
Flexible schema, add columns on the fly                  | Fixed schema
Good with sparse tables                                  | Not optimized for sparse tables
Not optimized for joins – still possible with MapReduce  | Optimized for joins
Tight integration with MapReduce                         | Not really
Horizontal scalability – just add hardware               | Hard to shard and scale
Good for semi-structured as well as structured data      | Good for structured data
25. When not to use HBase?
• When you have only a few thousand/million rows
• When you rely on RDBMS commands that HBase lacks
• When you have fewer than 5 data nodes and the replication factor is 3
Note: HBase can run quite well in stand-alone mode on a laptop, but this should be considered a development configuration only
27. Column family
• In the HBase data model, columns are grouped into column families
• Column families must be defined up front, during table creation (see the sketch below)
• Column families are stored together on disk, which is why HBase is referred to as a column-oriented data store
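A minimal sketch of defining column families at table creation time with the HBase 2.x Java client (the table and family names are made up to mirror the data model example that follows):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

    // Creates a table with its two column families declared up front.
    public class CreateTable {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                admin.createTable(TableDescriptorBuilder
                    .newBuilder(TableName.valueOf("people"))
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("personal_data"))
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("demographic"))
                    .build());
            }
        }
    }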
29. Data model

Row key     | Personal_data            | Demographic
Personal_ID | Name       | Address     | Birth Date | Gender
1           | H. Houdini | Budapest    | 1926-10-31 | M
2           | D. Copper  |             | 1956-09-16 |
3           |            |             |            | M
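A minimal sketch of writing and reading one sparse row of this example with the HBase Java client (the family/qualifier names are assumptions matching the table above); the empty cells of rows 2 and 3 simply would not be stored:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Writes the cells of row "1" and reads one of them back.
    public class PutGetExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("people"))) {

                Put put = new Put(Bytes.toBytes("1")); // row key
                put.addColumn(Bytes.toBytes("personal_data"), Bytes.toBytes("name"),
                              Bytes.toBytes("H. Houdini"));
                put.addColumn(Bytes.toBytes("personal_data"), Bytes.toBytes("address"),
                              Bytes.toBytes("Budapest"));
                put.addColumn(Bytes.toBytes("demographic"), Bytes.toBytes("gender"),
                              Bytes.toBytes("M"));
                table.put(put);

                Result r = table.get(new Get(Bytes.toBytes("1")));
                System.out.println(Bytes.toString(
                    r.getValue(Bytes.toBytes("personal_data"), Bytes.toBytes("name"))));
            }
        }
    }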
32. Architecture – HMaster
• HBase HMaster is a lightweight process that assigns regions to region servers in the Hadoop cluster for load balancing
• Manages and monitors the Hadoop cluster
• Performs administration (interface for creating, updating and deleting tables)
• Controls failover
• DDL operations are handled by the HMaster
• Whenever a client wants to change the schema or any metadata, the HMaster is responsible for these operations
33. Architecture – Region Server
• Region servers are the worker nodes which handle read, write, update, and delete requests from clients
• The Region Server process runs on every node in the Hadoop cluster
• Block Cache – the read cache. The most frequently read data is kept in the block cache; when the cache is full, the least recently used data is evicted
• MemStore – the write cache; stores new data that has not yet been written to disk. Every column family in a region has its own MemStore
• Write Ahead Log (WAL) – a file that stores new data that is not yet persisted to permanent storage
• HFile – the actual storage file that stores the rows as sorted key-values on disk
39. CAP theorem in HBase
• HBase supports consistency and partition tolerance
• It compromises on the availability factor
• Partition tolerance
  • HBase runs on top of a Hadoop distribution
  • All HBase data is stored in HDFS
  • Hadoop is designed for fault tolerance, and therefore HBase inherits the partition tolerance capability
40. CAP theorem in HBase contd.
• Consistency
  • Access to row data is atomic and includes any number of columns being read or written to
  • This atomic access is a factor in the architecture being strictly consistent, as each concurrent reader and writer can make safe assumptions about the state of a row
  • When data is updated, it is first written to a commit log, called a write-ahead log (WAL) in HBase
  • Then it is stored in the in-memory MemStore, sorted by row key
  • Once the data in memory has exceeded a given maximum value, it is flushed as an HFile to disk
  • After the flush, the commit logs can be discarded up to the last unflushed modification (see the sketch below)
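A conceptual sketch of this write path, not actual HBase code (the flush threshold and in-memory types are made up): append to the commit log first, then insert into a sorted in-memory store, and flush once a threshold is exceeded:

    import java.util.TreeMap;

    // WAL-first write path: 1. append to the commit log, 2. insert into the
    // sorted in-memory store, 3. flush to a sorted "HFile" when full.
    public class WritePathSketch {
        static final int FLUSH_THRESHOLD = 3;
        static final StringBuilder wal = new StringBuilder();            // commit log (WAL)
        static final TreeMap<String, String> memstore = new TreeMap<>(); // sorted by row key

        static void write(String rowKey, String value) {
            wal.append(rowKey).append('=').append(value).append('\n'); // 1. WAL first
            memstore.put(rowKey, value);                               // 2. then MemStore
            if (memstore.size() >= FLUSH_THRESHOLD) flush();           // 3. flush when full
        }

        static void flush() {
            System.out.println("flushing sorted HFile: " + memstore);
            memstore.clear();
            wal.setLength(0); // log can now be discarded up to the flushed point
        }

        public static void main(String[] args) {
            write("row3", "c"); write("row1", "a"); write("row2", "b");
            // prints: flushing sorted HFile: {row1=a, row2=b, row3=c}
        }
    }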
41. CAP theorem in HBase contd.
• Availability
  • HBase compromises on the availability factor
  • But Cloudera Enterprise 5.9.x and Hortonworks Data Platform 2.2 implement a high-availability feature in HBase
  • They provide a feature called region replication to achieve high availability for reads (see the sketch below)
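A minimal sketch of opting into such replica reads with the HBase Java client (HBase 1.0+); everything except the Get/Consistency API is omitted here, and region replication being enabled on the table is assumed:

    import org.apache.hadoop.hbase.client.Consistency;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.util.Bytes;

    // With region replication enabled, a read can request TIMELINE consistency
    // and be served by a (possibly slightly stale) secondary replica, trading
    // strict consistency for read availability.
    public class TimelineRead {
        public static Get replicaTolerantGet(String rowKey) {
            Get get = new Get(Bytes.toBytes(rowKey));
            get.setConsistency(Consistency.TIMELINE); // default is Consistency.STRONG
            return get;
        }
    }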