Below article can explain in detail on how to estimate the space required for a table , index , temporary tablespace , rollback segment , redo log files.
a new oracle dba can find complete information along with the required formulas and advice here.
Subject: | Extent and Block Space Calculation and Usage in Oracle Databases | |||
Doc ID: | 10640.1 | Type: | BULLETIN | |
Modified Date: | 01-MAY-2008 | Status: | PUBLISHED |
The challenging role of database administrating can be made easy. Proper
planning and consideration taken when setting up the database can decrease
maintenance cost and prolong the life of the database. This paper presents
guidelines for the Oracle database administrator in order to plan efficient
use of the system's resources and prevent unnecessary maintenance down the
road.
In Oracle versions 7 and 8, accurate sizing of the data dictionary, user tables,
user indexes, rollback segments is important since the RDBMS deals more
efficiently
--> with a table or index stored in a single extent than one stored in
multiple extents
--> with rollback segments precisely created in number and size
In Oracle versions 9i and 10g, accurate sizing of the data dictionary, user
tables, user indexes, rollback segments is of lesser importantance since the
RDBMS deals
--> with the extents locally within the tablespace for tables or indexes
stored in locally managed tablespaces where the extent size is either
AUTOALLOCATEd by the system or UNIFORM (size defined by the DBA)
Note: except in 9.0 for data dictionary objects still stored within a
dictionary managed tablespace
Refer to Note 175434.1
Create/Upgrade/Migrate a Database to Have a Locally
Managed SYSTEM Tablespace
--> with the creation of undo segments automatically, in number and size, in
UNDO tablespaces in Automatic Undo Management.
Therefore the following explanations deal with
--> version 7 up to 8.1.5 for tables and indexes calculation
For version 8.1.6 up to 9.2 , refer to
Note 105120.1
Advantages of Using Locally Managed vs Dictionary Managed Tablespaces
--> version 7 up to 8.1.7 for rollback segments
For version 9.0 up to 9.2 , refer to
Note 135090.1
Managing Rollback/Undo Segments in AUM (Automatic Undo Management)
It should be noted, however, that many of the storage conventions
related to size of data, remain the same in all versions.
A - INTRODUCTION
=================
Accurate sizing of the data dictionary, user tables, user indexes,
rollback segments, and redo logs depends on accurate estimates of:
1. The number and size of rows stored in user tables.
2. The transaction mix.
3. The sizes and performance characteristics of database objects.
Considerations should be made for the larger tables and tables that will
grow considerably over time. When calculating this space you should account
for the fact that the RDBMS deals more efficiently with a table or index stored
in a single extent than one stored in multiple extents.
(See 'How to stop defragmenting and start living: the definitive word on
fragmentation' at /cgi-bin/cr/getfile.cgi?p_attid=10640.1:239049
for more information on this subject)
To size the transactions, you will need to ascertain the nature of your
transactions. These can fall into three basic categories:
1. Short update transactions affecting relatively small amounts of data.
2. Long-running update transactions that alter significant amounts of data
before committing or rolling back.
3. Long-running read-only transactions which only query data but require that
read-consistent snapshots of tables remain available until the last fetch
of the query is completed.
The transaction mix for different times should also be considered since it may
vary throughout the day. Both cases should be investigated and planned for in
the production system.
Once the row and transaction information is gathered from an analysis of the
application, object sizes for the database can be estimated.
B - DATA DICTIONARY / SYSTEM TABLESPACE
========================================
The data dictionary space should remain relatively constant, growing only as
the database inserts records tracking the inserted database objects. Whenever
the database is in operation, the Oracle RDBMS will update the data dictionary
in response to every DDL statement, reflecting changes in database structures,
auditing, grants, and data. It is critical to allow enough room for the data
dictionary to grow and for other objects such as deferred rollback segments to
exist in the SYSTEM tablespace.
To ensure that space remains available for the data dictionary and other
objects which must reside in the SYSTEM tablespace, place all other objects
in other user tablespaces. In addition, make the SYSTEM tablespace large enough
so that it has at least 50% - 75% free space. Finally, insure that your users
do not have 'resource' privileges on the SYSTEM tablespace for creating objects.
C - TABLES
===========
Table size increases proportionally with the number of rows in the table
assuming average row length remains constant. While the DBA Guide provides a
formula for calculating table size, it may not be as accurate as the following
formula when calculating space for tables with a large number of records.
Below is a formula that will provide more accurate results for both small and
large tables.
Note that this procedure estimates only the initial amount of space required
for the table. Space used by transaction entries and deleted entries does not
become free immediately after the commit due to delayed cleanout. Trailing
nulls are not stored in the database.
Below are methods for calculating space for tables. These are approximations,
so it is a good idea to round up. Also making the initial extent size 10 to
20 percent larger than the value from these calculations may avoid allocating
additional extents right away. Some values may be operating system dependent.
Knowing the type of transactions that will affect the data in the tables is
also very important. This will help you size pctfree and pctused accordingly.
For example, as the rows lengthen in a block, row chaining may result if
pctfree is not set sufficiently high. Also deletes and updates may empty out
blocks so that the table needs more blocks even though the number of rows in
the table is static. For details on pctfree and pctused, please refer to
the Oracle RDBMS Database Administrator's Guide for Version 7/ Version 8i.
First, calculate the total block header size:
block header = fixed header + variable transaction header +
table directory + row directory
where:
fixed header = 57 bytes (the 4 bytes at the end of the block have
already been taken account of in the 24 bytes
for the cache header)
variable transaction header = 23 * i where i is the value of INITRANS
for the table, or can grow
dynamically up to MAXTRANS.
table directory = 4 * n where n is the number of tables.
n = 1 for non-clustered tables.
row directory = 2 * x where x is the number of rows in the block.
The variable transaction header in the block header contains 23 byte-structures
that hold information about transactions accessing the block. By default,
space is reserved for only one transaction (INITRANS = 1) in the block header.
As more transactions access the block, space for additional transaction
entries is allocated out of the free space in the block or an old entry is
reused. As rows are inserted, the 2 bytes for each row in the row directory
are also allocated out of the free space. Once allocated, this space becomes
part of the block header. So, once the two bytes have been allocated in the
row directory, this space is not reclaimed when the row is deleted. For example,
a block that is currently empty but had up to 50 rows at one time will have 100
bytes allocated in the header for the row directory. This space will be reused
as new rows are inserted in the block.
Using the above formula, the initial block header size for a non-clustered
table with INITRANS = 1 is:
block header = 57 + 23 + 4 + 2x = (84 + 2x) bytes
The space reserved for data within the block, as specified by PCTFREE, is
calculated as a percentage of the block size minus the block header.
available data space = (block size - total block header) -
((block size - total block header) * (PCTFREE/100))
For example, with PCTFREE = 10 and a block size of 2048, the total space for
new data in a block is:
available data space = (2048 - (84 + 2x)) - ((2048 - (84 + 2x)) * (10/100))
= (1964 - 2x) - ((2048 - 84 - 2x) * (10/100))
= (1964 - 2x) - (1964 - 2x) * 0.1
= (1964 - 2x - 196 + 0.2x) bytes
= (1768 - 1.8x) bytes
Now, calculate the combined data space required for an average row. Calculating
this depends on the following:
1. The number of columns in the table definition.
2. The datatypes used for each column.
3. The average value size for variable length columns.
A test database similar to the production database will be helpful here. To
calculate the combined data space for an average row in a table, use the
following query:
SELECT AVG(NVL(VSIZE(col1), 1)) +
AVG(NVL(VSIZE(col2), 1)) +
... +
AVG(NVL(VSIZE(coln), 1)) "SPACE OF AVERAGE ROW"
FROM table_name;
col1, col2, ... , coln are the column names of the table and table_name is the
table being evaluated.
Note: This formula assumes that columns containing nulls are not trailing
columns. A column length of 1 is assumed (column length of a null in a
trailing column is 0).
For example, a test table created with the following statement:
CREATE TABLE test (
A CHAR(10),
B DATE,
C NUMBER(10, 2));
can have the space determined by the average row in the table with the query:
SELECT AVG(NVL(VSIZE(A), 1)) +
AVG(NVL(VSIZE(B), 1)) +
AVG(NVL(VSIZE(C), 1)) "SPACE OF AVERAGE ROW"
FROM test;
Alternatively, if a test database is not available, you can estimate the
combined data space for an average row in a table. This is done by examining
the datatype of each column in a table. If the column is of a fixed length
datatype, the value for the column in each row of the table is fixed. For
variable length columns, you must determine the average column value and
determine how much space is required to hold this average value.
For example, using the above table test, the column lengths of the first two
columns are of fixed lengths. So, column A requires 10 bytes (assuming a one
character set is being used) and column B requires 7 bytes. Column C can vary
for each row in the table because the NUMBER datatype is a variable length
datatype. By estimating your proposed data, you can determine the typical value
for this column. For NUMBER datatypes, a typical value may require approximately:
average length column C = (average_precision/2 + 1) bytes
= (8/2 + 1) bytes (estimating an average_precision of 8)
= 5 bytes
Calculating the average space required for the columns in an average row for
table test:
average row = (A + B + C)
= (10 + 7 + 5) bytes
= 22 bytes
Note: negative numbers consume one additional byte for the sign.
Using the results from above, we can calculate the total average row size (or
the minimum amount of space required by a row) in a non-clustered table with
the formula:
average row size = row header + F + V + D (bytes per row)
where:
row header = 3 bytes per row of a non-clustered table
F = total length bytes of all columns with 1 byte column
lengths (CHAR, NUMBER, DATE, and ROWID types)
V = total length bytes of all columns with 3 byte column
lengths (VARCHAR2, LONG, RAW, LONG RAW datatypes)
D = combined data space of all columns in average row
(from above)
Perhaps a simpler way to think about this is that there would
be one byte for columns with an actual length between
1 and 250. Then for lengths from 251 to 64K, we use 3 bytes
(0xFE marker byte, 2 bytes of actual length).
Note 231214.1 describes this aspect more accurately.
For example, the total average row size for table test is:
average row size = (3 + (3 * 1) + (3 * 0) + 22) bytes
= 28 bytes
Note: The minimum row size for a non-clustered table is 9 bytes. Therefore, if
the calculated value for an average row size is less than this absolute
minimum row size, use the minimum value as the average row size.
After the average row size is determined, you can calculate the average number
of rows that can fit into a database block (the values of PCTFREE and PCTUSED
will also determine the amount of space used):
average number of rows per block =
floor(available data space / average row size)
where:
available data space and average row size are calculated above.
Using the information for table test from above:
average number
of rows per block = x = (1768 - 1.8x)/28 bytes
28x = 1768 - 1.8x
29.8x = 1768
x ~ 59 = average number of rows per block
Make sure you round x or the average number of rows per block DOWN. Once you
know the number of rows that can fit inside the available space of a database
block, you can calculate the number of blocks required to hold the proposed
table:
number of blocks
for the table = number of rows / average number of rows per block
Using 10,000 rows for table test:
number of blocks
for table test = 10000 rows / 59 rows per block
~ 169 blocks
To get this value in bytes, multiply by the database block size. And again,
round the number of blocks from the above DOWN.
D - INDEXES
============
Indexes increase in size slightly faster than the corresponding table. So, you
should round up when estimating index size. Formulas are provided below for
calculating the size of an index:
First, calculate the space required by the database block header of a block to
contain index data. To calculate the total block header size:
block header size = fixed header + variable transaction header
where:
fixed header = 113 bytes
variable transaction header = 23*i where i is the value of INITRANS
for the index
Assuming the INITRANS = 2 (the default for indexes), the previous formula can
be reduced to:
block header size = 113 + (23 * 2) bytes
= 159 bytes
The space reserved in each database block for index data, as specified by PCTFREE,
is calculated as a percentage of the block size minus the block header.
available data space = (block size - block header size) -
((block size - block header size) * (PCTFREE/100))
Assuming a block size of 2048 bytes and PCTFREE of 10:
available data space = (2048 bytes - 159 bytes) -
((2048 bytes - 159 bytes) * (10/100))
= 1889 bytes - 188.9 bytes
= 1700.1 bytes
The calculation of the combined column lengths of an average index value is the
same as the calculation for table size, except you only need to calculate the
average combined column lengths of the columns in the index. This formula is
listed above.
Once the combined column length of an average index entry has been calculated,
the total average entry size can be calculated, using the formula:
bytes per entry = entry header + ROWID length + F + V + D
where:
entry header = 1 byte
ROWID length = 6 bytes
F = total length bytes of all columns with 1 byte column
lengths (CHAR, NUMBER, DATE, and ROWID types)
V = total length bytes of all columns with 3 byte column
lengths (VARCHAR2 and RAW datatypes)
D = combined data space of all columns (from above)
Here again it applies that there would be one byte for columns
with an actual length between 1 and 250. Then for lengths from
251 to 64K, we use 3 bytes (0xFE marker byte, 2 bytes of actual length).
Note 231214.1 describes this aspect more accurately.
For example, given that D is calculated to be 22 bytes (from the table
calculations above), and that the index is comprised of three CHAR columns, the
total average entry size of the index is:
bytes per entry = 1 + 6 + (3 * 1) + (3 * 0) + 22 bytes
= 32 bytes
To calculate the number of blocks and bytes required for the index, use:
number of blocks
for index = 1.1 * ((number of not null rows * avg. entry size) /
((floor (avail. data space / avg. entry size)) *
(avg. entry size))
Note: The additional 10% added to this result accounts for the extra space
required for branch blocks of the index.
number of blocks
for index = 1.1 * ((10000 * 32 bytes) / ((floor (1700/32 bytes)) *
(32 bytes))
= 208 blocks
The number of bytes can be calculated by multiplying the result by the database
block size. This method for indexes is more approximate than for tables.
An index, unlike a table, does not contain rows. Therefore, no query similar
to the table query above exists for indexes. Also, the VSIZE function will not
report information about index storage space.
Putting indexes for large tables in their own tablespaces makes space
management more efficient. It decreases fragmentation and also makes managing
the index growth easier if recreating the index is necessary.
E - TEMPORARY SEGMENTS
=======================
The amount of temporary segment space needed for a sort grows proportionally
with the amount of data being sorted. Most SQL operations do not require data
to be sorted, however, the few SQL operations which do require sorting are:
CREATE INDEX
SELECT ... ORDER BY
SELECT ... DISTINCT
SELECT ... GROUP BY
SELECT ... UNION
SELECT ... INTERSECT
SELECT ... MINUS
unindexed joins
certain correlated subqueries
For V7.3.X and +, please read the following:
Note 102339.1 Temporary Segments What Happens When a Sort Occurs
The init.ora SORT_AREA_SIZE parameter sets up an upper limit on the size of
a sort that can be performed in memory. Common settings of this parameter range
between 64K and 256K for larger systems. Sorts exceeding the SORT_AREA_SIZE
limit transparently acquire a temporary segment in the database.
Temporary segments have storage clauses of the tablespace that they reside in.
To ensure that the temporary segments can grow to handle large transactions,
the initial extent should be greater than SORT_AREA_SIZE and next extent and
maxextents should be large. A pctincrease (percent increase) of zero may
reduce fragmentation in the temporary tablespace. If the temporary tablespace
holds other large database objects, then defragmenting the tablespace might
involve more work.
Temporary segments are allocated as needed during a user session. The storage
characteristics for the extents of the temporary segment are determined by the
defaults of the tablespace in which the temporary segment is created. Temporary
segments are dropped when the statement completes.
A sort may require up to twice the amount of space needed to store the data
being sorted. Most statements issued by interactive transactions will not
generate sorts requiring a temporary segment, and of those that do, few will
require a large amount of temporary segment space. Reports and index creations,
however, often require large temporary segments.
Setting aside a tablespace for temporary segments may help you more accurately
measure the temporary segment requirements of different parts of your
application. To monitor the temporary segment usage, query the data dictionary
views USER_SEGMENTS, ALL_SEGMENTS, or DBA_SEGMENTS, where the value for the
column SEGMENT_TYPE equals TEMPORARY. Measurements may be inflated because
temporary segment space will not be reclaimed until needed or until the
database startup occurs.
F - ROLLBACK SEGMENTS
======================
Rollback segments are used to store undo transactions in case the actions
need to be "rolled back" or the system needs to generate a read-consistent
state from an earlier time.
As the amount of data and number of rows being modified by a statement
increases, the amount of rollback segment space required for the statement will
increase in size proportionally.
To choose the number of rollback segments for your system, refer to
the RDBMS Administrator's Guide for Version 7/ Version 8i
Information is guaranteed to remain in the rollback segment for the duration of
the transaction that generated it. Once a commit is issued, the database is
free to overwrite the undo information. A long-running SQL statement may need
this undo to construct a read-consistent snapshot. The larger a rollback
segment is, the longer the undo information is available (before being
overwritten). Therefore, it is wise to allocate more space than the initial
estimates may suggest because support for long-running transactions may be
needed. As a final test, simulate a heavy transaction load on the full-sized
database before going production.
G - REDO LOG FILES
===================
The issues in sizing redo log files are very different from issues for the
other objects discussed in this paper. The on-line redo logs should be on
devices that are not loaded with other database files. Ensure that enough redo
log space is available for the LGWR process to write out bursts of information
during heavy processing times, and for the ARCH process to catch up later if
necessary. Otherwise, database operations will be suspended until the ARCH can
make a redo log available.
If the redo log is not completely filled and that log is archived (by an
ALTER SYSTEM SWITCH LOGFILE), Oracle will just archive the part of the log that
has been written to. Therefore, you may have archive logs that are different sizes.
Tuning your redo logs
One efficient way to monitor how effectively your redo logs are performing is
to look for messages about switching logs, archiving logs and checkpointing in
your alert log. Everytime Oracle does a log switch it is recorded in the alert
log. Here is an example:
Mon Aug 21 13:53:31 1995
Thread 1 advanced to log sequence 68
Current log# 2 seq# 68 mem# 0: /u05/ylien/dbs/log2ween.dbf
The rate that the redo logs are switching reflects the activity on the
database. If you feel that the redo logs are switching too frequently, examine
the activity on the database. For example, doing hot backups, conventional
loads will generate more redo and it will be normal to see more redo log
switches.
The following example may cause performance degradation:
Mon Aug 21 13:40:56 1995
Thread 1 cannot allocate new log, sequence 68
Checkpoint not complete
Current log# 1 seq# 67 mem# 0: /u05/ylien/dbs/log1ween.dbf
These messages indicate that Oracle is ready to recycle the redo logs but
checkpoint has not been complete on the redo log Oracle needs. The checkpoint
process is absolutely necessary for Oracle to establish data consistency.
There are three ways to resolve this problem:
1- add more redo log groups
2- create bigger redo logs and remove the smaller ones
3- enable the CHECKPOINT_PROCESS = TRUE paramter to be set in the init.ora
1 and 2 will solve the problem because Oracle now has more time before it has
to recycle through the redo logs.
3 enables another background process(CKPT) that will help the checkpoint
process.
H - MAINTENANCE
================
Once the database is up and running, it is always good practice to monitor the
growth of the system. If you can anticipate how much more space a particular
database object will need, then you can plan where to acquire the additional
space.
See Note 162994.1 SCRIPT TO REPORT EXTENTS AND CONTIGUOUS FREE SPACE
I - CONCLUSION
===============
A well planned and designed database lends to easy maintenance and maximum
utilization of your system's resources. Accurate space calculations of the
objects in the database allow the DBA to acquire the disk space needed and
to plan an efficient layout of the physical datafiles. After the database
has been running or goes production, it is also essential to monitor the
growth of the database since it will vary over time and level out. Using the
various resources provided by Oracle, you can determine whether there is
enough space currently for any fast growing objects in the database to extend.
If the database is badly fragmented, then you can plan a convenient time to
resolve the fragmentation.
References:
===========
Oracle7 RDBMS Database Administrator's Guide
Oracle7 RDBMS Performance Tuning Guide
Oracle7 SQL Language Reference Guide
Oracle 8i Administrator's Guide
Note 105120.1 Advantages of Using Locally Managed vs Dictionary Managed
Tablespaces
Note 135090.1 Managing Rollback/Undo Segments in AUM (Automatic Undo Management)
Note 175434.1 Create/Upgrade/Migrate a Database to Have a Locally Managed
SYSTEM Tablespace
.