Estimating Table, Index, Temp, Rollback, Undo Space Usage – Reference for the New Oracle DBA





The article below explains in detail how to estimate the space required for a table, index, temporary tablespace, rollback segment, and the redo log files.

A new Oracle DBA can find complete information here, along with the required formulas and advice.

Subject: Extent and Block Space Calculation and Usage in Oracle Databases
  Doc ID: 10640.1 Type: BULLETIN
  Modified Date: 01-MAY-2008 Status: PUBLISHED
 
The challenging role of database administration can be made easier. Proper
planning and consideration when setting up the database can decrease
maintenance costs and prolong the life of the database.  This paper presents
guidelines for the Oracle database administrator in order to plan efficient
use of the system's resources and prevent unnecessary maintenance down the
road. 
 
In Oracle versions 7 and 8, accurate sizing of the data dictionary, user tables,
user indexes, and rollback segments is important since the RDBMS deals more
efficiently
   --> with a table or index stored in a single extent than one stored in
       multiple extents
   --> with rollback segments precisely created in number and size
 
 
In Oracle versions 9i and 10g, accurate sizing of the data dictionary, user
tables, user indexes, and rollback segments is of lesser importance since the
RDBMS deals
   --> with the extents locally within the tablespace for tables or indexes  
       stored in locally managed tablespaces where the extent size is either
       AUTOALLOCATEd by the system or UNIFORM (size defined by the DBA)
       Note: except in 9.0 for data dictionary objects still stored within a
             dictionary managed tablespace
             Refer to Note 175434.1
                      Create/Upgrade/Migrate a Database to Have a Locally  
                      Managed SYSTEM Tablespace
 
   --> with the creation of undo segments automatically, in number and size, in
       UNDO tablespaces in Automatic Undo Management.
 
Therefore the following explanations deal with
  --> version 7 up to 8.1.5 for tables and indexes calculation
 
      For version 8.1.6 up to 9.2 , refer to
          Note 105120.1
          Advantages of Using Locally Managed vs Dictionary Managed Tablespaces
 
  --> version 7 up to 8.1.7 for rollback segments
 
      For version 9.0 up to 9.2 , refer to
          Note 135090.1   
          Managing Rollback/Undo Segments in AUM (Automatic Undo Management)
 
It should be noted, however, that many of the storage conventions
related to the size of data remain the same in all versions.
 
 
A - INTRODUCTION
=================
 
Accurate sizing of the data dictionary, user tables, user indexes,
rollback segments, and redo logs depends on accurate estimates of:
 
 1. The number and size of rows stored in user tables.
 2. The transaction mix.
 3. The sizes and performance characteristics of database objects.
 
Considerations should be made for the larger tables and tables that will
grow considerably over time. When calculating this space you should account
for the fact that the RDBMS deals more efficiently with a table or index stored
in a single extent than one stored in multiple extents. 
(See 'How to stop defragmenting and start living: the definitive word on
fragmentation' at /cgi-bin/cr/getfile.cgi?p_attid=10640.1:239049
for more information on this subject)
 
To size the transactions, you will need to ascertain the nature of your 
transactions. These can fall into three basic categories:
 
 1. Short update transactions affecting relatively small amounts of data.
 2. Long-running update transactions that alter significant amounts of data
    before committing or rolling back.
 3. Long-running read-only transactions which only query data but require that
    read-consistent snapshots of tables remain available until the last fetch
    of the query is completed.
 
The transaction mix for different times should also be considered since it may 
vary throughout the day. Both cases should be investigated and planned for in
the production system.
 
Once the row and transaction information is gathered from an analysis of the 
application, object sizes for the database can be estimated.
 

B - DATA DICTIONARY / SYSTEM TABLESPACE
========================================

The data dictionary space should remain relatively constant, growing only as
the database inserts records to track newly created database objects. Whenever
the database is in operation, the Oracle RDBMS will update the data dictionary 
in response to every DDL statement, reflecting changes in database structures, 
auditing, grants, and data. It is critical to allow enough room for the data
dictionary to grow and for other objects such as deferred rollback segments to
exist in the SYSTEM tablespace. 
To ensure that space remains available for the data dictionary and other
objects which must reside in the SYSTEM tablespace, place all other objects
in other user tablespaces. In addition, make the SYSTEM tablespace large enough
so that it has at least 50% - 75% free space. Finally, ensure that your users
do not have 'resource' privileges on the SYSTEM tablespace for creating objects.
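 
One quick way to verify this is to look for segments in SYSTEM that belong to
ordinary users; a minimal sketch (run as a DBA user; the excluded owner list
is an assumption and may need extending on your system):
 
  SELECT owner, segment_name, segment_type
    FROM dba_segments
   WHERE tablespace_name = 'SYSTEM'
     AND owner NOT IN ('SYS', 'SYSTEM');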
 

C - TABLES
===========

Table size increases proportionally with the number of rows in the table 
assuming average row length remains constant. While the DBA Guide provides a
formula for calculating table size, it may not be as accurate as the following
formula when calculating space for tables with a large number of records. 
Below is a formula that will provide more accurate results for both small and 
large tables.
 
Note that this procedure estimates only the initial amount of space required
for the table. Space used by transaction entries and deleted entries does not
become free immediately after the commit due to delayed cleanout. Trailing 
nulls are not stored in the database. 
 
Below are methods for calculating space for tables.  These are approximations,
so it is a good idea to round up.  Also, making the initial extent size 10 to
20 percent larger than the value from these calculations may avoid allocating
additional extents right away.  Some values may be operating system dependent.
 
Knowing the type of transactions that will affect the data in the tables is 
also very important.  This will help you size pctfree and pctused accordingly.
  
For example, as the rows lengthen in a block, row chaining may result if 
pctfree is not set sufficiently high.  Also deletes and updates may empty out 
blocks so that the table needs more blocks even though the number of rows in 
the table is static.  For  details on pctfree and pctused, please refer to
the Oracle RDBMS Database Administrator's Guide for Version 7/ Version 8i.

First, calculate the total block header size:
 
  block header = fixed header + variable transaction header +
                 table directory + row directory
 
     where:
        fixed header = 57 bytes  (the 4 bytes at the end of the block have
                                  already been taken into account in the 24
                                  bytes for the cache header)
 
        variable transaction header = 23 * i  where i is the value of INITRANS
                                              for the table, or can grow
                                              dynamically up to MAXTRANS.
 
        table directory = 4 * n    where n is the number of tables.
                                   n = 1 for non-clustered tables.
 
        row directory = 2 * x    where x is the number of rows in the block.
 
The variable transaction header in the block header contains 23-byte structures
that hold information about transactions accessing the block.  By default, 
space is reserved for only one transaction (INITRANS = 1) in the block header. 
As more transactions access the block, space for additional transaction 
entries is allocated out of the free space in the block or an old entry is 
reused. As rows are inserted, the 2 bytes for each row in the row directory
are also allocated out of the free space. Once allocated, this space becomes
part of the block header.  So, once the two bytes have been allocated in the
row directory, this space is not reclaimed when the row is deleted. For example,
a block that is currently empty but had up to 50 rows at one time will have 100 
bytes allocated in the header for the row directory. This space will be reused 
as new rows are inserted in the block.
 
Using the above formula, the initial block header size for a non-clustered 
table with INITRANS = 1 is:
 
  block header = 57 + 23 + 4 + 2x = (84 + 2x) bytes
 
The space reserved for data within the block, as specified by PCTFREE, is 
calculated as a percentage of the block size minus the block header. 
 
  available data space = (block size - total block header) -
         ((block size - total block header) * (PCTFREE/100))
 
For example, with PCTFREE = 10 and a block size of 2048, the total space for 
new data in a block is:
 
  available data space = (2048 - (84 + 2x)) - ((2048 - (84 + 2x)) * (10/100)) 
                       = (1964 - 2x) - ((2048 - 84 - 2x) * (10/100))
                       = (1964 - 2x) - (1964 - 2x) * 0.1 
                       = (1964 - 2x - 196 + 0.2x) bytes
                       = (1768 - 1.8x) bytes
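 
The same arithmetic can be evaluated from SQL*Plus if you prefer to let the
database do it; a minimal sketch, assuming a 2K block, PCTFREE = 10, and
x = 59 rows per block (the value derived later in this section):
 
  SELECT (2048 - (84 + 2*59)) -
         (2048 - (84 + 2*59)) * (10/100) AS available_data_space
    FROM dual;
 
This returns 1661.4 bytes, which agrees (up to the rounding above) with
(1768 - 1.8x) for x = 59.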
 
Now, calculate the combined data space required for an average row. Calculating
this depends on the following:
 
  1. The number of columns in the table definition.
  2. The datatypes used for each column.
  3. The average value size for variable length columns.
 
A test database similar to the production database will be helpful here. To 
calculate the combined data space for an average row in a table, use the 
following query:
 
  SELECT AVG(NVL(VSIZE(col1), 1)) +
         AVG(NVL(VSIZE(col2), 1)) +
         ... +
         AVG(NVL(VSIZE(coln), 1))  "SPACE OF AVERAGE ROW"
         FROM table_name;
 
col1, col2, ... , coln are the column names of the table and table_name is the 
table being evaluated.
 
Note: This formula assumes that columns containing nulls are not trailing 
      columns. A column length of 1 is assumed (column length of a null in a
      trailing column is 0).
 
For example, a test table created with the following statement:
 
  CREATE TABLE test (
         A CHAR(10),
         B DATE,
         C NUMBER(10, 2));
 
can have the space determined by the average row in the table with the query:
 
  SELECT AVG(NVL(VSIZE(A), 1)) +
         AVG(NVL(VSIZE(B), 1)) +
         AVG(NVL(VSIZE(C), 1))   "SPACE OF AVERAGE ROW"
         FROM test;
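 
As a cross-check, if the table has already been analyzed, the data dictionary
stores a comparable figure; a sketch of this alternative (AVG_ROW_LEN may also
include per-row overhead, depending on version, so expect it to differ
slightly from the sum above):
 
  ANALYZE TABLE test COMPUTE STATISTICS;
 
  SELECT avg_row_len
    FROM user_tables
   WHERE table_name = 'TEST';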
 
Alternatively, if a test database is not available, you can estimate the 
combined data space for an average row in a table. This is done by examining 
the datatype of each column in a table. If the column is of a fixed length 
datatype, the value for the column in each row of the table is fixed. For
variable length columns, you must determine the average column value and 
determine how much space is required to hold this average value.
 
For example, using the above table test, the first two columns are of fixed
lengths. So, column A requires 10 bytes (assuming a single-byte character set
is being used) and column B requires 7 bytes. Column C can vary
for each row in the table because the NUMBER datatype is a variable length
datatype. By estimating your proposed data, you can determine the typical value
for this column. For NUMBER datatypes, a typical value may require approximately:
 
  average length column C = (average_precision/2 + 1) bytes
                          = (8/2 + 1) bytes (estimating an average_precision of 8)
                          = 5 bytes
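 
This rule of thumb is easy to verify with the VSIZE function; a quick sketch
for an 8-digit literal:
 
  SELECT VSIZE(12345678) AS number_bytes
    FROM dual;
 
This returns 5, matching the estimate above.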
 
Calculating the average space required for the columns in an average row for
table test:
 
  average row  = (A + B + C)
               = (10 + 7 + 5) bytes
               = 22 bytes
 
  Note: negative numbers consume one additional byte for the sign.
 
Using the results from above, we can calculate the total average row size (or
the minimum amount of space required by a row) in a non-clustered table with 
the formula:
 
   average row size = row header + F + V + D (bytes per row)  
 
      where:
         row header = 3 bytes per row of a non-clustered table
 
         F          = total length bytes of all columns with 1 byte column 
                      lengths (CHAR, NUMBER, DATE, and ROWID types)
 
         V          = total length bytes of all columns with 3 byte column 
                      lengths (VARCHAR2, LONG, RAW, LONG RAW datatypes)
 
         D          = combined data space of all columns in average row 
                      (from above)
 
         Perhaps a simpler way to think about this is that there would
         be one byte for columns with an actual length between
         1 and 250.  Then for lengths from 251 to 64K, we use 3 bytes
         (0xFE marker byte, 2 bytes of actual length). 
               Note 231214.1 describes this aspect more accurately.
 
For example, the total average row size for table test is:
 
  average row size = (3 + (3 * 1) + (3 * 0) + 22) bytes
                   = 28 bytes
 
Note: The minimum row size for a non-clustered table is 9 bytes. Therefore, if
      the calculated value for an average row size is less than this absolute
      minimum row size, use the minimum value as the average row size.
 
After the average row size is determined, you can calculate the average number 
of rows that can fit into a database block (the values of PCTFREE and PCTUSED 
will also determine the amount of space used):
 
  average number of rows per block = 
      
        floor(available data space / average row size)
 
      where:
         available data space and average row size are calculated above.
 
Using the information for table test from above:
 
  average number 
  of rows per block = x = (1768 - 1.8x)/28 bytes 
                    28x = 1768 - 1.8x
                  29.8x = 1768
                      x ~ 59 = average number of rows per block 
 
Make sure you round x or the average number of rows per block DOWN.  Once you 
know the number of rows that can fit inside the available space of a database 
block, you can calculate the number of blocks required to hold the proposed 
table:
 
  number of blocks 
  for the table     = number of rows / average number of rows per block
 
Using 10,000 rows for table test:
 
  number of blocks
  for table test   = 10000 rows / 59 rows per block
                   ~ 170 blocks
 
To get this value in bytes, multiply by the database block size.  Note that,
unlike the rows per block, the number of blocks must be rounded UP: a
partially filled block still occupies a whole block.
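 
Both rounding steps can be combined in one query; a minimal sketch, assuming
the 2K block figures used above:
 
  SELECT FLOOR(1768 / 29.8)                      AS rows_per_block,
         CEIL(10000 / FLOOR(1768 / 29.8))        AS blocks_needed,
         CEIL(10000 / FLOOR(1768 / 29.8)) * 2048 AS bytes_needed
    FROM dual;
 
Once the table is created and loaded, the estimate can be compared against
the BLOCKS column of USER_SEGMENTS.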
 
 
D - INDEXES
============

Indexes increase in size slightly faster than the corresponding table. So, you
should round up when estimating index size. Formulas are provided below for
calculating the size of an index:
 
First, calculate the space required by the database block header of a block to
contain index data. To calculate the total block header size:
 
  block header size = fixed header + variable transaction header 
 
     where:
        fixed header = 113 bytes
 
        variable transaction header = 23*i    where i is the value of INITRANS
                                              for the index
 
Assuming the INITRANS = 2 (the default for indexes), the previous formula can 
be reduced to:
 
  block header size = 113 + (23 * 2) bytes
                    = 159 bytes
 
The space reserved in each database block for index data, as specified by PCTFREE,
is calculated as a percentage of the block size minus the block header.
 
  available data space = (block size - block header size) -
                         ((block size - block header size) * (PCTFREE/100))
 
Assuming a block size of 2048 bytes and PCTFREE of 10:
 
  available data space = (2048 bytes - 159 bytes) -
                         ((2048 bytes - 159 bytes) * (10/100))
                       = 1889 bytes - 188.9 bytes
                       = 1700.1 bytes
 
The calculation of the combined column lengths of an average index value is the
same as the calculation for table size, except you only need to calculate the 
average combined column lengths of the columns in the index. This formula is
listed above.
 
Once the combined column length of an average index entry has been calculated,
the total average entry size can be calculated, using the formula:
 
  bytes per entry = entry header + ROWID length + F + V + D
 
     where:
        entry header = 1 byte
 
        ROWID length = 6 bytes
 
        F            = total length bytes of all columns with 1 byte column 
                       lengths (CHAR, NUMBER, DATE, and ROWID types)
 
        V            = total length bytes of all columns with 3 byte column 
                       lengths (VARCHAR2 and RAW datatypes)
 
        D            = combined data space of all columns  (from above)
 
         Here again it applies that there would be one byte for columns
         with an actual length between 1 and 250.  Then for lengths from
         251 to 64K, we use 3 bytes (0xFE marker byte, 2 bytes of actual length). 
               Note 231214.1 describes this aspect more accurately.
 
For example, given that D is calculated to be 22 bytes (from the table
calculations above), and that the index comprises three CHAR columns, the
total average entry size of the index is:
 
  bytes per entry = 1 + 6 + (3 * 1) + (3 * 0) + 22 bytes
                  = 32 bytes
 
To calculate the number of blocks and bytes required for the index, use:
 
  number of blocks
  for index        = 1.1 * ((number of not null rows * avg. entry size) /
                     ((floor (avail. data space / avg. entry size)) *
                     (avg. entry size)))
 
Note: The additional 10% added to this result accounts for the extra space
      required for branch blocks of the index.
 
  number of blocks
  for index        = 1.1 * ((10000 * 32 bytes) / ((floor (1700/32 bytes)) *
                           (32 bytes)))
                   = 208 blocks
 
The number of bytes can be calculated by multiplying the result by the database
block size.  This method for indexes is more approximate than for tables.
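 
The index worked example can be expressed the same way; a sketch using the
figures above (10,000 not null rows, 32-byte entries, 1700 bytes of available
data space):
 
  SELECT CEIL(1.1 * (10000 * 32) / (FLOOR(1700 / 32) * 32)) AS index_blocks
    FROM dual;
 
This returns 208, matching the result above.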
                      
An index, unlike a table, does not contain rows. Therefore, no query similar 
to the table query above exists for indexes. Also, the VSIZE function will not 
report information about index storage space.
 
Putting indexes for large tables in their own tablespaces makes space 
management more efficient.  It decreases fragmentation and also makes managing
the index growth easier if recreating the index is necessary.
 
 
E - TEMPORARY SEGMENTS
=======================

The amount of temporary segment space needed for a sort grows proportionally
with the amount of data being sorted. Most SQL operations do not require data
to be sorted; however, the SQL operations which do require sorting are:
 
  CREATE INDEX
  SELECT ... ORDER BY
  SELECT ... DISTINCT
  SELECT ... GROUP BY
  SELECT ... UNION
  SELECT ... INTERSECT
  SELECT ... MINUS
  unindexed joins
  certain correlated subqueries
 
For V7.3.X and later, please read the following:
 
Note 102339.1 Temporary Segments What Happens When a Sort Occurs

 
The init.ora SORT_AREA_SIZE parameter sets an upper limit on the size of
a sort that can be performed in memory. Common settings of this parameter range
between 64K and 256K for larger systems. Sorts exceeding the SORT_AREA_SIZE 
limit transparently acquire a temporary segment in the database.
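 
To see how often sorts are actually spilling to disk on a running instance,
check the cumulative sort statistics; a minimal sketch using V$SYSSTAT:
 
  SELECT name, value
    FROM v$sysstat
   WHERE name IN ('sorts (memory)', 'sorts (disk)');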
 
Temporary segments take the default storage clause of the tablespace in which
they reside.
  
To ensure that the temporary segments can grow to handle large transactions,
the initial extent should be greater than SORT_AREA_SIZE and next extent and 
maxextents should be large.  A pctincrease (percent increase) of zero may 
reduce fragmentation in the temporary tablespace.  If the temporary tablespace 
holds other large database objects, then defragmenting the tablespace might 
involve more work.
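 
A sketch of a dedicated temporary tablespace following these guidelines; the
file name and sizes are illustrative only, assuming a SORT_AREA_SIZE of 256K:
 
  CREATE TABLESPACE temp
    DATAFILE '/u03/oradata/temp01.dbf' SIZE 100M
    DEFAULT STORAGE (INITIAL 512K NEXT 512K
                     PCTINCREASE 0 MAXEXTENTS UNLIMITED)
    TEMPORARY;
 
The TEMPORARY keyword (available from 7.3) restricts the tablespace to
temporary segments only.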
 
Temporary segments are allocated as needed during a user session. The storage 
characteristics for the extents of the temporary segment are determined by the
defaults of the tablespace in which the temporary segment is created. Temporary
segments are dropped when the statement completes.  
 
A sort may require up to twice the amount of space needed to store the data
being sorted. Most statements issued by interactive transactions will not 
generate sorts requiring a temporary segment, and of those that do, few will 
require a large amount of temporary segment space. Reports and index creations,
however, often require large temporary segments.
 
Setting aside a tablespace for temporary segments may help you more accurately 
measure the temporary segment requirements of different parts of your 
application. To monitor the temporary segment usage, query the data dictionary
views USER_SEGMENTS, ALL_SEGMENTS, or DBA_SEGMENTS, where the value for the 
column SEGMENT_TYPE equals TEMPORARY. Measurements may be inflated because 
temporary segment space will not be reclaimed until needed or until the 
database startup occurs.
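 
For example, the query suggested above might look like this when run as a DBA:
 
  SELECT owner, tablespace_name, segment_name, blocks, bytes
    FROM dba_segments
   WHERE segment_type = 'TEMPORARY';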
 
 
F - ROLLBACK SEGMENTS
======================

Rollback segments are used to store undo information for transactions in case
the changes need to be "rolled back" or the system needs to generate a
read-consistent state from an earlier time.
 
As the amount of data and number of rows being modified by a statement
increases, the amount of rollback segment space required for the statement will
increase in size proportionally.
 
To choose the number of rollback segments for your system, refer to 
the RDBMS Administrator's Guide for Version 7/ Version 8i

Information is guaranteed to remain in the rollback segment for the duration of
the transaction that generated it. Once a commit is issued, the database is
free to overwrite the undo information. A long-running SQL statement may need
this undo to construct a read-consistent snapshot. The larger a rollback
segment is, the longer the undo information is available (before being
overwritten). Therefore, it is wise to allocate more space than the initial
estimates may suggest because support for long-running transactions may be
needed. As a final test, simulate a heavy transaction load on the full-sized
database before going into production.
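 
While simulating that load, rollback segment activity can be watched from the
dynamic performance views; a minimal sketch joining V$ROLLNAME and V$ROLLSTAT
(non-zero WAITS, or frequent EXTENDS and WRAPS, suggest the segments are too
few or too small):
 
  SELECT n.name, s.extents, s.rssize, s.xacts,
         s.waits, s.extends, s.wraps
    FROM v$rollname n, v$rollstat s
   WHERE n.usn = s.usn;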
 
 
G - REDO LOG FILES
===================

The issues in sizing redo log files are very different from issues for the
other objects discussed in this paper. The on-line redo logs should be on 
devices that are not loaded with other database files. Ensure that enough redo 
log space is available for the LGWR process to write out bursts of information 
during heavy processing times, and for the ARCH process to catch up later if 
necessary. Otherwise, database operations will be suspended until the ARCH can
make a redo log available. 
 
If the redo log is not completely filled and that log is archived (by an
ALTER SYSTEM SWITCH LOGFILE), Oracle will just archive the part of the log that
has been written to. Therefore, you may have archive logs of different sizes.

Tuning your redo logs
 
One efficient way to monitor how effectively your redo logs are performing is
to look for messages about switching logs, archiving logs and checkpointing in
your alert log. Every time Oracle does a log switch, it is recorded in the
alert log. Here is an example:
 
Mon Aug 21 13:53:31 1995
Thread 1 advanced to log sequence 68
  Current log# 2 seq# 68 mem# 0: /u05/ylien/dbs/log2ween.dbf
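 
Besides the alert log, the current redo log configuration can be checked from
SQL*Plus; a minimal sketch using V$LOG:
 
  SELECT group#, sequence#, bytes, members, status
    FROM v$log;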
 
The rate at which the redo logs switch reflects the activity on the
database. If you feel that the redo logs are switching too frequently, examine
the activity on the database. For example, hot backups and conventional-path
loads generate more redo, so it is normal to see more redo log switches while
they run.
 
The following messages indicate a situation that may cause performance degradation:
 
Mon Aug 21 13:40:56 1995
Thread 1 cannot allocate new log, sequence 68
Checkpoint not complete
  Current log# 1 seq# 67 mem# 0: /u05/ylien/dbs/log1ween.dbf
 
These messages indicate that Oracle is ready to recycle the redo logs but a
checkpoint has not yet completed on the redo log Oracle needs. The checkpoint
process is absolutely necessary for Oracle to establish data consistency.
There are three ways to resolve this problem:
   1- add more redo log groups
   2- create bigger redo logs and remove the smaller ones
   3- set the CHECKPOINT_PROCESS = TRUE parameter in the init.ora
 
Options 1 and 2 solve the problem because Oracle then has more time before it
must recycle through the redo logs.
Option 3 enables another background process (CKPT) that assists with the
checkpoint work.
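 
Sketches of options 1 and 2 follow; the group numbers, file names, and sizes
are illustrative only:
 
  -- Option 1: add another redo log group
  ALTER DATABASE ADD LOGFILE GROUP 4
    ('/u05/ylien/dbs/log4ween.dbf') SIZE 10M;
 
  -- Option 2: replace a small group with a bigger one
  -- (the group must be INACTIVE and, in ARCHIVELOG mode, already archived)
  ALTER DATABASE DROP LOGFILE GROUP 1;
  ALTER DATABASE ADD LOGFILE GROUP 1
    ('/u05/ylien/dbs/log1ween.dbf') SIZE 10M;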
 
 
H - MAINTENANCE 
================

Once the database is up and running, it is always good practice to monitor the
growth of the system.  If you can anticipate how much more space a particular
database object will need, then you can plan where to acquire the additional 
space. 
 
See Note 162994.1 SCRIPT TO REPORT EXTENTS AND CONTIGUOUS FREE SPACE
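 
A simple starting point is to watch free space per tablespace; a minimal
sketch using DBA_FREE_SPACE (MAX(bytes) is the largest contiguous chunk,
which bounds the next extent that can be allocated):
 
  SELECT tablespace_name,
         SUM(bytes) AS total_free,
         MAX(bytes) AS largest_piece
    FROM dba_free_space
   GROUP BY tablespace_name;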
 
 
I - CONCLUSION
===============

A well planned and designed database lends itself to easy maintenance and
maximum utilization of your system's resources.  Accurate space calculations
of the objects in the database allow the DBA to acquire the disk space needed
and to plan an efficient layout of the physical datafiles.  After the database
has gone into production, it is also essential to monitor the growth of the
database, since it will vary over time and eventually level out.  Using the
various resources provided by Oracle, you can determine whether there is
currently enough space for any fast growing objects in the database to extend.
If the database is badly fragmented, then you can plan a convenient time to
resolve the fragmentation.
 
References:
===========
 
Oracle7 RDBMS Database Administrator's Guide
Oracle7 RDBMS Performance Tuning Guide
Oracle7 SQL Language Reference Guide
Oracle 8i Administrator's Guide
 
Note 105120.1 Advantages of Using Locally Managed vs Dictionary Managed
                Tablespaces
Note 135090.1 Managing Rollback/Undo Segments in AUM (Automatic Undo Management)
Note 175434.1 Create/Upgrade/Migrate a Database to Have a Locally Managed
                SYSTEM Tablespace
 
 

