Database Sizing

Idea used from: Lake (2014)

Why and When Size?
• Initially, to establish the scale of the required database to help select the OS environment and DBMS
• Establish HDD requirements
• To get a “feel” for the data:
  • which tables need special treatment: separate tablespaces? Partitioning? ...
• Generate statistics which help in physical design
• Continually monitor

Sizing Basics
Bit
“One of the two digits 0 and 1 used in binary notation. The word comes from Binary digit”
Byte
“A set of binary digits usually representing one character, which is treated by the computer as one unit”

Common Data Types
Type                Bytes   Range
bit                         0 or 1
tinyint             1       0 to 255
smallint            2       -2^15 to 2^15 - 1
integer             4       -2^31 to 2^31 - 1
decimal(m,n)        8       -10^38 to 10^38 - 1
datetime            8       depends on DBMS
char(n), string(n)  n       Maximum 255
varchar2(n)         n       Maximum 4000
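The integer ranges follow directly from the byte counts: a signed integer of b bytes has 8b bits, one of which carries the sign. A minimal Python sketch of that relationship (the function name `signed_range` is only illustrative, and it assumes two's-complement storage):

```python
def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of num_bytes bytes."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

print(signed_range(2))   # smallint: (-32768, 32767)
print(signed_range(4))   # integer:  (-2147483648, 2147483647)
print(2 ** 8 - 1)        # unsigned 1-byte tinyint maximum: 255
```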

Oracle Date Data Type
Dates are renowned for causing problems when transferring data between DBMSs, because the method used to store the data internally differs. For example:
In Oracle, the DATE datatype stores the century, year, month, day, hours, minutes, and seconds.
Paradox date fields can contain any valid date from January 1, 9999 BC to December 31, 9999 AD.

Oracle Date Data Type
Example with Oracle:
CREATE TABLE Birthdays_tab (Bname VARCHAR2(20), Bday DATE);
INSERT INTO Birthdays_tab (bname, bday) VALUES ('ANNIE', TO_DATE('13-NOV-92 10:56 A.M.', 'DD-MON-YY HH:MI A.M.'));
Oracle uses its own internal format to store dates. Date data is stored in fixed-length fields of seven bytes each, corresponding to century, year, month, day, hour, minute, and second.

Oracle Varchar2 Data Type
The VARCHAR2 datatype stores variable-length character strings.
You specify a maximum string length (in bytes or characters) between 1 and 4000 bytes for the VARCHAR2 column.
For each row, Oracle stores each value in the column as a variable-length field unless a value exceeds the column's maximum length, in which case Oracle returns an error.
Using VARCHAR2 saves on space used by the table.
For example, storing “PETER” in a column defined as VARCHAR2(50) will cost only 5 bytes of storage, not 50.
More efficient, but more difficult for sizing!

Oracle LOB Data Types
The LOB datatypes BLOB, CLOB, and BFILE enable you to store large blocks of unstructured data (such as text, graphic images, video clips, and sound waveforms) up to 4 gigabytes in size. They provide efficient, random access to the data.
CLOB is roughly equivalent to a MEMO in Paradox.
You can manipulate and search CLOB fields using special tools.
Again, sizing is difficult as only the space needed is taken.
There are lots of other data types, but these will do for the time being!

Row Sizing
Maximum row size can be determined by ascertaining the data types of the columns of the table and adding together their respective numbers of bytes.
CREATE TABLE SizeDemo (id INTEGER, Name VARCHAR2(20), Dayte DATE);
Max Row Length = 4 + 20 + 7 = 31 bytes
Max row size is a safe estimate, but it can considerably overestimate the space actually used.
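The same addition can be sketched in Python for tables with more columns. The byte widths below are the ones quoted in these slides (4 for an integer, 7 for an Oracle DATE, n for a character column of length n); the names `FIXED_WIDTHS` and `max_row_bytes` are hypothetical helpers, not part of any Oracle tool.

```python
# Byte widths as quoted in the slides; adjust for your own DBMS.
FIXED_WIDTHS = {"tinyint": 1, "smallint": 2, "integer": 4, "datetime": 8, "date": 7}

def max_row_bytes(columns):
    """Sum the maximum byte width of each (type_name, declared_length) column."""
    total = 0
    for type_name, length in columns:
        if type_name in ("char", "varchar2"):
            total += length                    # worst case: the full declared length is used
        else:
            total += FIXED_WIDTHS[type_name]   # fixed-width type
    return total

size_demo = [("integer", None), ("varchar2", 20), ("date", None)]
print(max_row_bytes(size_demo))  # 31
```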

Oracle Row Sizing (8i onwards)
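As an alternative to manual calculation, the average row size can be discovered by first gathering statistics with the ANALYZE statement:
ANALYZE TABLE Member ESTIMATE STATISTICS;
Then ask for the statistic:
SELECT AVG(NVL(VSIZE(SURNAME),1)) FROM member;
VSIZE returns the number of bytes used to store a value, so this query gives the average size of a single column (SURNAME here); repeat it for each column and add the results to estimate the average row size.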

Tablespace building blocks
• Data blocks are the finest level of granularity
• A data block is the smallest unit of Input/Output (I/O) used by the database.
• The block size itself will depend upon several things, including the OS block size, and is set when the database is created and is not altered thereafter.
• The data block size should be a multiple of the operating system's block size.
• For a decision support system (DSS) application, it is suggested that you choose a large value for the DB_BLOCK_SIZE. For an OLTP type application, a lower value (e.g., 2K or 4K) is suggested.
• There is no point in bringing back 32K of data from a disk if the user only needs 2K!

Data blocks
• Regardless of what the block is being used to store (it could be part of an extent in a table segment, or an index segment, or any other segment) the data block will be of a set format.
• The overhead: information about the block (type, count of entries, timestamp, pointers to items in the block, etc.). This is often no more than 100 bytes in size.
• The data section (or Row Data): contains the rows from the table, or branches of an index.
• Free Space: the area in a block not yet taken by row data.

Data blocks
• The PCTFREE parameter tells Oracle to stop inserting data into a block when its free space falls to the given percentage of fillable space (20% in the example below)
• Free space = Block size - overhead - row data
• Fillable space = Block size - overhead
• The block is then unusable for insertions, and will remain so until enough rows are deleted to bring the percentage of the block that is filled with rows below the PCTUSED parameter setting.
• You do not have to set either parameter: Oracle defaults to 10% for PCTFREE and 40% for PCTUSED
• PCTFREE + PCTUSED must not exceed 100
CREATE TABLE grades (g_id INTEGER, Grade VARCHAR2(12)) PCTFREE 20 PCTUSED 60;
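To see how the two thresholds interact, here is a rough Python sketch of the rule described above. It is a simplified model of the slide's description, not Oracle's actual space-management code; the function names and the 100-byte overhead are assumptions chosen for illustration.

```python
def free_pct(block_size, overhead, row_data):
    """Free space as a percentage of fillable space."""
    fillable = block_size - overhead
    free = block_size - overhead - row_data
    return 100.0 * free / fillable

def accepts_inserts(block_size, overhead, row_data, pctfree=10):
    """A block takes new rows only while free space is above PCTFREE."""
    return free_pct(block_size, overhead, row_data) > pctfree

def reopens_for_inserts(block_size, overhead, row_data, pctused=40):
    """A closed block becomes usable again once the filled share drops below PCTUSED."""
    used_pct = 100.0 - free_pct(block_size, overhead, row_data)
    return used_pct < pctused

# 2048-byte block, assumed 100 bytes of overhead, PCTFREE 20 / PCTUSED 60
print(accepts_inserts(2048, 100, 1600, pctfree=20))      # False: under 20% free
print(reopens_for_inserts(2048, 100, 1100, pctused=60))  # True: under 60% used
```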

Block Headers
Block headers vary in size according to the information in the block (i.e. index data or row data):
block header = fixed header + variable transaction header + table directory + row directory
Where:
Fixed header = 57 bytes
Variable transaction header = 23 * initrans bytes
Initrans is the number of transaction slots per block. By default it is 1 for data and 2 for indexes.
Table directory = 4 bytes
Row directory = 2 bytes for every row in the block
Ref: Oracle Metalink support
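The header formula is easy to evaluate for different settings. A small Python sketch using the constants quoted above (the constant and function names are illustrative only):

```python
FIXED_HEADER = 57         # bytes, table data block
TRANSACTION_SLOT = 23     # bytes per INITRANS slot
TABLE_DIRECTORY = 4       # bytes
ROW_DIRECTORY_ENTRY = 2   # bytes per row stored in the block

def block_header_bytes(rows_in_block, initrans=1):
    """Block header size for a table data block holding rows_in_block rows."""
    return (FIXED_HEADER
            + TRANSACTION_SLOT * initrans
            + TABLE_DIRECTORY
            + ROW_DIRECTORY_ENTRY * rows_in_block)

print(block_header_bytes(0))    # 84  (empty block, initrans = 1)
print(block_header_bytes(59))   # 202 (84 + 2 * 59)
```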

Block Space – Worked example
block header = fixed header + variable transaction header + table directory + row directory
block header = 57 + (23*1) + 4 + 2x = (84 + 2x) bytes, where x = number of rows in the block (assumes initrans=1)
available data space = (block size - total block header) - ((block size - total block header) * (PCTFREE/100))
For example, with PCTFREE = 10 and a block size of 2048, the total space for new data in a block is:
available data space = (2048 - (84 + 2x)) - ((2048 - (84 + 2x)) * (10/100))
 = (1964 - 2x) - ((1964 - 2x) * 0.1)
 = (1964 - 2x - 196.4 + 0.2x) bytes
 = (1767.6 - 1.8x) bytes
 ≈ (1768 - 1.8x) bytes
Ref: Oracle Metalink support
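The worked example can be checked numerically. The sketch below evaluates the available data space for a given number of rows x, using the header constants from these slides; `available_data_space` is a hypothetical helper name.

```python
def available_data_space(block_size, rows_in_block, pctfree=10, initrans=1):
    """Bytes left for row data after the block header and the PCTFREE reserve."""
    header = 57 + 23 * initrans + 4 + 2 * rows_in_block
    usable = block_size - header
    return usable - usable * pctfree / 100

# x = 0 gives the constant term of the formula; x = 59 matches 1767.6 - 1.8 * 59
print(round(available_data_space(2048, 0), 1))    # 1767.6
print(round(available_data_space(2048, 59), 1))   # 1661.4
```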

Sizing: Rows per Block
The next step is to take your average row size and calculate the average number of rows that can fit into a database block:
average number of rows per block = floor(available data space / average row size)
E.g., for an average row size of 28 bytes:
average number of rows per block = x = (1768 - 1.8x) / 28
28x = 1768 - 1.8x
29.8x = 1768
x ≈ 59.3, so the average number of rows per block is 59
Make sure you round x, the average number of rows per block, DOWN.
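Because x appears on both sides, the equation has to be rearranged before it can be evaluated. A minimal Python sketch of that rearrangement, assuming the table-block header constants from these slides (`rows_per_block` is an illustrative name):

```python
import math

def rows_per_block(block_size, avg_row_size, pctfree=10, initrans=1):
    """Solve x * avg_row_size = (block_size - fixed_overhead - 2x) * keep for x, rounded down."""
    fixed_overhead = 57 + 23 * initrans + 4   # header without the row directory
    keep = 1 - pctfree / 100                  # fraction of the block not reserved by PCTFREE
    x = (block_size - fixed_overhead) * keep / (avg_row_size + 2 * keep)
    return math.floor(x)

print(rows_per_block(2048, 28))  # 59
```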

Table Sizing
Once you know the number of rows that can fit inside the available space of a database block, you can calculate the number of blocks required to hold the proposed table:
number of blocks for the table = number of rows / average number of rows per block
Using 10,000 rows for table test:
number of blocks for table test = 10000 rows / 59 rows per block
≈ 169.5, so 170 blocks (rounded up, because a partly filled block still occupies a whole block)
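A short Python sketch of this step, rounding up because a partly filled block still takes a whole block on disk. The 2048-byte block size is the one assumed throughout the worked example; `table_blocks` is a hypothetical helper.

```python
import math

def table_blocks(num_rows, rows_per_block):
    """Blocks needed to hold num_rows, rounded up to whole blocks."""
    return math.ceil(num_rows / rows_per_block)

blocks = table_blocks(10_000, 59)
print(blocks)          # 170
print(blocks * 2048)   # 348160 bytes at a 2048-byte block size
```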

Index SizingThe method is the same, but there are some
differences in the numbers for Index Blocks:
INITRANS is usually = 2
Fixed Header = 113
So block header size = 113 + (23 * 2) bytes = 159
available data space is still= (block size - block
header size) - ((block size - block header size) *
(PCTFREE/100))
Assuming a block size of 2048 bytes and PCTFREE of 10:
available data space = (2048 bytes - 159 bytes) -
((2048 bytes - 159 bytes) * (10/100)) = 1889 bytes -
188.9 bytes = 1700.1 bytes

Index Sizing cont...
Now find the total average column widths of the columns used in the index.
E.g.: put an index on the NAME column of SizeDemo, assuming an average width of 22 bytes.
Take that into our calculation of bytes per index entry:
 bytes per entry = entry header + ROWID length + F + V + D
 entry header = 1 byte
 ROWID length = 6 bytes
 F = total length bytes of all columns with 1-byte column lengths (CHAR, NUMBER, DATE, and ROWID types)
 V = total length bytes of all columns with 3-byte column lengths (VARCHAR2 and RAW datatypes)
 D = combined average data width of the indexed columns = 22 (from above)

Index Sizing cont...
bytes per entry = 1 + 6 + (0 * 1) + (1 * 3) + 22 bytes = 32 bytes
To calculate the number of blocks and bytes required for the index, use:
number of blocks for index = 1.1 * ((number of not null rows * avg. entry size) / avail. data space)
The additional 10% added to this result accounts for branch blocks of the index.
number of blocks for index = 1.1 * ((10000 * 32 bytes) / 1700 bytes)
≈ 207.1, so 208 blocks (rounded up)
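The whole index calculation can be put together as a Python sketch. It uses the index-block constants quoted in these slides (fixed header 113 bytes, 23 bytes per INITRANS slot) and the 10% allowance for branch blocks; the function names are illustrative.

```python
import math

def index_entry_bytes(one_byte_len_cols, three_byte_len_cols, avg_data_width):
    """entry header (1) + ROWID (6) + F + V + D, as defined in the slides."""
    return 1 + 6 + one_byte_len_cols * 1 + three_byte_len_cols * 3 + avg_data_width

def index_blocks(num_rows, entry_bytes, block_size=2048, pctfree=10, initrans=2):
    """Leaf blocks plus 10% for branch blocks, rounded up to whole blocks."""
    header = 113 + 23 * initrans
    usable = block_size - header
    available = usable - usable * pctfree / 100
    return math.ceil(1.1 * (num_rows * entry_bytes) / available)

entry = index_entry_bytes(0, 1, 22)   # one VARCHAR2 column, average width 22
print(entry)                          # 32
print(index_blocks(10_000, entry))    # 208
```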

Database Sizing
Repeat this exercise for all your major tables and indexes.
The 80/20 rule applies: don’t waste time on lookup tables, for example; just make an appropriate, but safe, guess.
Add all the table and index sizes (in blocks) together and you have the disk space required.
To get this value in bytes, multiply by the database block size.
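As a final step, the totals can be summed in a few lines of Python. The segment names and block counts below are illustrative: 170 and 208 are the rounded-up figures for the 10,000-row test table and its index, and the lookup table figure is a made-up "safe guess".

```python
BLOCK_SIZE = 2048  # bytes, as assumed in the worked examples

# Estimated blocks per segment (illustrative values).
segment_blocks = {
    "test (table)": 170,
    "test name index": 208,
    "lookup tables (safe guess)": 10,
}

total_blocks = sum(segment_blocks.values())
total_bytes = total_blocks * BLOCK_SIZE

print(total_blocks)   # 388
print(total_bytes)    # 794624
```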
