Unix virtual memory, paging, and swapping explained: an advanced, in-depth article for the new Oracle DBA




Below is an in-depth, advanced article on UNIX virtual memory, paging, and swapping. Get some coffee and enjoy the lengthy read. If you don't understand it the first time, don't worry: attack it again after a few days. This article has all the information a new Oracle DBA needs in order to understand what happens on a busy UNIX server.

TECH: Unix Virtual Memory, Paging & Swapping explained [ID 17094.1]  

 
  Modified 25-OCT-2000     Type BULLETIN     Status PUBLISHED  
       
 
====================================================================
Understanding and measuring memory usage on UNIX operating systems.
====================================================================
 
When planning an Oracle installation, it is often necessary to plan for
memory requirements.  To do this, it is necessary to understand how the
UNIX operating system allocates and manages physical and virtual memory
among the processes on the system.
 
------------------------------
I.  Virtual memory and paging
------------------------------
 
Modern UNIX operating systems all support virtual memory.  Virtual
memory is a technique developed around 1961 which allows the size of a
process to exceed the amount of physical memory available for it.  (A
process is an instance of a running program.)  Virtual memory also
allows the sum of the sizes of all processes on the system to exceed
the amount of physical memory available on the machine.  (Contrast this
with a system running MS-DOS or the Apple Macintosh, in which the amount of
physical memory limits both the size of a single process and the total
number of simultaneous processes.)
 
A full discussion of virtual memory is beyond the scope of this
article.  The basic idea behind virtual memory is that only part of a
particular process is in main memory (RAM), and the rest of the process
is stored on disk.  In a virtual memory system, the memory addresses
used by programs do not refer directly to physical memory.  Instead,
programs use virtual addresses, which are translated by the operating
system and the memory management unit (MMU) into the physical memory
(RAM) addresses.  This scheme works because most programs only use a
portion of their address space at any one time.
 
Modern UNIX systems use a paging-based virtual memory system.  In a
paging-based system, the virtual address space is divided up into
equal-sized chunks called pages.  The actual size of a single page is
dependent on the particular hardware platform and operating system
being used: page sizes of 4k and 8k are common.  The translation of
virtual addresses to physical addresses is done by mapping virtual
pages to physical pages.  When a process references a virtual address,
the MMU figures out which virtual page contains that address, and then
looks up the physical page which corresponds to that virtual page.
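 
To make the mechanics concrete, the small sketch below (written in C, and
only a sketch) asks the operating system for its page size via the POSIX
sysconf() call and splits an ordinary virtual address into its virtual page
number and page offset.
 
  /* Sketch: query the page size and split a virtual address into its
   * virtual page number and page offset.  Assumes a POSIX system where
   * sysconf(_SC_PAGESIZE) is available. */
  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
      long pagesize = sysconf(_SC_PAGESIZE);      /* e.g. 4096 or 8192 */
      int x = 42;                                 /* any object supplies a virtual address */
      unsigned long addr = (unsigned long)&x;

      unsigned long vpn    = addr / (unsigned long)pagesize;  /* virtual page number */
      unsigned long offset = addr % (unsigned long)pagesize;  /* offset within page  */

      printf("page size: %ld bytes\n", pagesize);
      printf("address 0x%lx -> virtual page %lu, offset %lu\n", addr, vpn, offset);
      return 0;
  }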
 
One of two things is possible at this point: either the page is
resident in RAM, or its contents are on disk.  If the page is resident,
the process simply uses it.  If the page is on disk, the MMU
generates a page fault.  At this point the operating system locates the
page on disk, finds a free physical page in RAM, copies the page from
disk into RAM, tells the MMU about the new mapping, and restarts the
instruction that generated the page fault.
 
Note that the virtual-to-physical page translation is invisible to the
process.  The process "sees" the entire virtual address space as its
own: whenever it refers to an address, it finds memory at that
address.  All translation of virtual to physical addresses and all
handling of page faults is performed on behalf of the process by the
MMU and the operating system.  This does not mean that taking a page
fault has no effect.  Since handling a page fault requires reading the
page in from disk, a process that takes a lot of page faults will run
much slower than one that does not.
 
In a virtual memory system, only a portion of a process's virtual
address space is mapped into RAM at any particular time.  In a
paging-based system, this notion is formalized as the working set of a
process.  The working set of a process is simply the set of pages that
the process is using at a particular point in time.  The working set of
a process will change over time.  This means that some page faulting
will occur, and is normal.  Also, since the working set changes over
time, the size of the working set changes over time as well.  The
operating system's paging subsystem tries to keep all the pages in the
process's working set in RAM, thus minimizing the number of page faults
and keeping performance high.  By the same token, the operating system
tries to keep the pages not in the working set on disk, so as to leave
the maximum amount of RAM available for other processes.
 
Recall from above that when a process generates a page fault, the
operating system must read the absent page into RAM from disk.  This
means that the operating system must choose which page of RAM to
use for this purpose.  In the general case, there may not be a free
page of physical RAM, and the operating system will have to read the
data for the new page into a physical page that is already in use.  The
choice of which in-use page to replace with the new data is called the
page replacement policy.
 
Entire books have been written on various page replacement policies and
algorithms, so a full discussion of them is beyond the scope of this
article.  It is important to note, however, that there are two general
classes of page replacement policy: local and global.  In a local page
replacement policy, a process is assigned a certain number of physical
pages, and when a page fault occurs the operating system finds a free
page within the set of pages assigned to that process.  In a global
page replacement policy, when a page fault occurs the operating system
looks at all processes in the system to find a free page for the
process.
 
There are a number of key points to understand about paging.
 
(1) Typically, only a relatively small fraction (often 10% - 50%) of a
single process's pages are in its working set (and therefore in
physical memory) at any one time.
 
(2) The location of physical pages in RAM bears no relation whatever to
the location of pages in any process's virtual address space.
 
(3) Most implementations of paging allow for a single physical page to
be shared among multiple processes.  In other words, if the operating
system can determine that the contents of two (or more) virtual pages
are identical, only a single physical page of RAM is needed for those
virtual pages.
 
(4) Since working set sizes change over time, the amount of physical
memory that a process needs changes over time as well.  An idle process
requires no RAM; if the same process starts manipulating a large data
structure (possibly in response to some user input) its RAM requirement
will soar.
 
(5) There exists a formal proof that it is impossible to determine
working set sizes from a static analysis of a program.  You must run a
program to determine its working set.  If the working set of the
program varies according to its input (which is almost always the case)
the working sets of two processes will be different if the processes
have different inputs.
 
---------------------------
II. Virtual memory on Unix
---------------------------
 
The discussion above of virtual memory and paging is a very general
one, and all of the statements in it apply to any system that
implements virtual memory and paging.  A full discussion of paging and
virtual memory implementation on UNIX is beyond the scope of this
article.  In addition, different UNIX vendors have implemented
different paging subsystems, so you need to contact your UNIX vendor
for precise information about the paging algorithms on your UNIX
machine.  However, there are certain key features of the UNIX paging
system which are consistent among UNIX ports.
 
Processes run in a virtual address space, and the UNIX kernel
transparently manages the paging of physical memory for all processes
on the system.  Because UNIX uses virtual memory and paging, typically
only a portion of the process is in RAM, while the remainder of the
process is on disk.
 
1) The System Memory Map
 
The physical memory on a UNIX system is divided among three uses.  Some
portion of the memory is dedicated for use by the operating system
kernel.  Of the remaining memory, some is dedicated for use by the I/O
subsystem (this is called the buffer cache) and the remainder goes into
the page pool. 
 
Some versions of UNIX statically assign the sizes of system memory, the
buffer cache, and the page pool, at system boot time; while other
versions will dynamically move RAM between these three at run time,
depending on system load.  (Consult your UNIX system vendor for details
on your particular version of UNIX.)
 
The physical memory used by processes comes out of the page pool.  In
addition, the UNIX kernel allocates a certain amount of system memory
for each process for data structures that allow it to keep track of
that process.  This memory is typically not more than a few pages.  If
your system memory size is fixed at boot time you can completely ignore
this usage, as it does not come out of the page pool.   If your system
memory size is adjusted dynamically at run-time, you can also typically
ignore this usage, as it is dwarfed by the page pool requirements of
Oracle software.
 
2)  Global Paging Strategy
 
UNIX systems implement a global paging strategy.  This means that the
operating system will look at all processes on the system when it is
searching for a page of physical memory on behalf of a process.  This
strategy has a number of advantages, and one key disadvantage.
 
The advantages of a global paging strategy are:  (1) An idle process
can be completely paged out so it does not hold memory pages that can
be better used by another process.  (2) A global strategy allows for a
better utilization of system memory; each process's page allocations
will be closer to their actual working set size.  (3) The administrative
overhead of managing process or user page quotas is completely
absent.  (4) The implementation is smaller and faster.
 
The disadvantage of a global strategy is that it is possible for a
single ill-behaved process to affect the performance of all processes
on the system, simply by allocating and using a large number of pages.
 
3)  Text and Data Pages
 
A UNIX process can be conceptually divided into two portions: text and
data.  The text portion contains the machine instructions that the
process executes; the data portion contains everything else.  These two
portions occupy different areas of the process's virtual address
space.  Both text and data pages are managed by the paging subsystem.
This means that at any point in time, only some of the text pages and
only some of the data pages of any given process are in RAM.
 
UNIX treats text pages and data pages differently.  Since text pages
are typically not modified by a process while it executes, text pages
are marked read-only.  This means that the operating system will
generate an error if a process attempts to write to a text page.  (Some
UNIX systems provide the ability to compile a program which does not
have read-only text: consult the man pages on 'ld' and 'a.out' for
details.) 
 
The fact that text pages are read-only allows the UNIX kernel to
perform two important optimizations:  text pages are shared between all
processes running the same program, and text pages are paged from the
filesystem instead of from the paging area.  Sharing text pages between
processes reduces the amount of RAM required to run multiple instances
of the same program.  For example, if five processes are running Oracle
Forms, only one set of text pages is required for all five processes.
The same is true if there are fifty or five hundred processes running
Oracle Forms.  Paging from the filesystem means that no paging space
needs to be allocated for any text pages.  When a text page is paged
out it is simply over-written in RAM;  if it is paged in at a later
time the original text page is available in the program image in the
file system.
 
On the other hand, data pages must be read/write, and therefore cannot
(in general) be shared between processes.  This means that each process
must have its own copy of every data page.  Also, since a process can
modify its data pages, when a data page is paged out it must be written
to disk before it is over-written in RAM.  Data pages are written to
specially reserved sections of the disk.  For historical reasons, this
paging space is called "swap space" on UNIX.  Don't let this name
confuse you: the swap space is used for paging.
 
4) Swap Space Usage
 
The UNIX kernel is in charge of managing which data pages are in RAM
and which are in the swap space.  The swap space is divided into swap
pages, which are the same size as the RAM pages.  For example, if a
particular system has a page size of 4K, and 40M devoted to swap space,
this swap space will be divided up into 10240 swap pages.
 
A page of swap can be in one of three states: it can be free, allocated,
or used.  A "free" page of swap is available to be allocated as a disk
page.  An "allocated" page of swap has been allocated to be the disk
page for a particular virtual page in a particular process, but no data
has been written to the disk page yet -- that is, the corresponding
memory page has not yet been paged out.  A "used" page of swap is one
where the swap page contains the data which has been paged out from RAM.
A swap page is not freed until the process which "owns" it frees the
corresponding virtual page.
 
On most UNIX systems, swap pages are allocated when virtual memory is
allocated.  If a process requests an additional 1M of (virtual) memory,
the UNIX kernel finds 1M of pages in the swap space, and marks those
pages as allocated to a particular process.  If at some future time a
particular page of RAM must be paged out, swap space is already
allocated for it.  In other words, every virtual data page is "backed
with" a page of swap space.
 
An important consequence of this strategy is that if all the swap space is
allocated, no more virtual memory can be allocated.  In other words,
the amount of swap space on a system limits the maximum amount of
virtual memory on the system.  If there is no swap space available, and
a process makes a request for more virtual memory, then the request
will fail.  The request will also fail if there is some swap space
available, but the amount available is less than the amount requested.
 
There are four system calls which allocate virtual memory: these are
fork(), exec(), sbrk(), and shmget().  When one of these system calls
fails, the system error code is set to EAGAIN.  The text message
associated with EAGAIN is often "No more processes".  (This is because
EAGAIN is also used to indicate that the per-user or system-wide
process limit has been reached.)  If you ever run into a situation
where processes are failing because of EAGAIN errors, be sure to check
the amount of available swap as well as the number of processes.
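 
The fragment below is a minimal illustration of the kind of check described
above: it calls fork() and inspects errno when the call fails.  Remember
that EAGAIN by itself does not tell you whether the problem is the process
limit or a shortage of swap-backed virtual memory.
 
  /* Sketch: a failed fork() reported with EAGAIN may mean either that a
   * process limit has been reached OR that there is not enough swap space
   * to back the child's data pages. */
  #include <errno.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void)
  {
      pid_t pid = fork();

      if (pid == -1) {
          if (errno == EAGAIN)
              fprintf(stderr, "fork failed: EAGAIN -- check process limits "
                              "and available swap space\n");
          else
              fprintf(stderr, "fork failed: %s\n", strerror(errno));
          return 1;
      }

      if (pid == 0)
          _exit(0);                 /* child exits immediately */
      waitpid(pid, NULL, 0);        /* parent reaps the child  */
      return 0;
  }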
 
If a system has run out of swap space, there are only two ways to fix
the problem: you can either terminate some processes (preferably ones
that are using a lot of virtual memory) or you can add swap space to
your system.  The method for adding swap space to a system varies
between UNIX variants: consult your operating system documentation or
vendor for details.
 
5) Shared Memory
 
UNIX systems implement, and the Oracle server uses, shared memory.  In
the UNIX shared memory implementation, processes can create and attach
shared memory segments.  Shared memory segments are attached to a
process at a particular virtual address.  Once a shared memory segment
is attached to a process, memory at that address can be read from and
written to, just like any other memory in the process's address space.
Unlike "normal" virtual memory, changes written to an address in the
shared memory segment are visible to every process that has attached to
that segment.
 
Shared memory is made up of data pages, just like "conventional"
memory.  Other than the fact that multiple processes are using the same
data pages, the paging subsystem does not treat shared memory pages any
differently than conventional memory.  Swap space is reserved for
a shared memory segment at the time it is allocated, and the pages of
memory in RAM are subject to being paged out if they are not in use,
just like regular data pages.  The only difference between the
treatment of regular data pages and shared data pages is that shared
pages are allocated only once, no matter how many processes are using
the shared memory segment.
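 
The sketch below shows the System V shared memory calls in their simplest
form: shmget() creates a segment (and swap space is reserved for it at that
point), shmat() maps it into the process's address space, and shmdt() and
shmctl(IPC_RMID) detach and remove it.  The key and the 1 MB size are
arbitrary values chosen purely for illustration.
 
  /* Sketch: create, attach, use, and remove a System V shared memory
   * segment.  IPC_PRIVATE and the 1 MB size are illustrative only. */
  #include <stdio.h>
  #include <string.h>
  #include <sys/ipc.h>
  #include <sys/shm.h>

  int main(void)
  {
      size_t size = 1024 * 1024;                        /* 1 MB segment */
      int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
      if (shmid == -1) { perror("shmget"); return 1; }

      char *base = (char *)shmat(shmid, NULL, 0);       /* kernel picks the address */
      if (base == (char *)-1) { perror("shmat"); return 1; }

      strcpy(base, "visible to every process attached to this segment");
      printf("segment %d attached at %p: %s\n", shmid, (void *)base, base);

      shmdt(base);                                      /* detach from this process */
      shmctl(shmid, IPC_RMID, NULL);                    /* mark segment for removal */
      return 0;
  }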
 
6) Memory Usage of a Process
 
When discussing the memory usage of a process, there are really two
types of memory usage to consider: the virtual memory usage and the
physical memory usage. 
 
The virtual memory usage of a process is the sum of the virtual text
pages allocated to the process, plus the sum of the virtual data pages
allocated to the process.  Each non-shared virtual data page has a
corresponding page allocated for it in the swap space.  There is no
system-wide limit on the number of virtual text pages, and the number
of virtual data pages on the system is limited by the size of the swap
space.  Shared memory segments are allocated on a system-wide basis
rather than on a per-process basis, but are allocated swap pages and
are paged from the swap device in exactly the same way as non-shared
data.
 
The physical memory usage of a process is the sum of the physical text
pages of that process, plus the sum of the physical data pages of that
process.  Physical text pages are shared among all processes running
the same executable image, and physical data pages used for shared
memory are shared among all processes attached to the same shared
memory segment.  Because UNIX implements virtual memory, the physical
memory usage of a process will be lower than the virtual memory usage.
 
The actual amount of physical memory used by a process depends on the
behavior of the operating system paging subsystem.  Unlike the virtual
memory usage of a process, which will be the same every time a
particular program runs with a particular input, the physical memory
usage of a process depends on a number of other factors. 
 
First: since the working set of a process changes over time, the amount
of physical memory needed by the process will change over time.
Second: if the process is waiting for user input, the amount of
physical memory it needs will drop dramatically.  (This is a special
case of the working set size changing.)  Third: the amount of physical
memory actually allocated to a process depends on the overall system
load.   If a process is being run on a heavily loaded system, then the
global page allocation policy will tend to keep the number of physical
memory pages allocated to that process very close to the size of
the working set.  If the same program is run with the same input on a
lightly loaded system, the number of physical memory pages allocated to
that process will tend to be much larger than the size of the working
set:  the operating system has no need to reclaim physical pages from
that process, and will not do so.
 
The net effect of this is that any measure of physical memory usage
will be inaccurate unless you are simulating both the input and the
system load of the final system you will be testing.  For example, the
physical memory usage of an Oracle Forms process will be very different
if a user is rapidly moving between 3 large windows, infrequently
moving between the same three windows, rapidly typing into a single
window, slowly typing into the same window, or if they are reading data
off of the screen and the process is sitting idle -- even though the
virtual memory usage of the process will remain the same.  By the same
token, the physical memory usage of an Oracle Forms process will be
different if it is the only active process on a system, or if it is one
of fifty active Oracle Forms processes on the same system.
 
7) Key Points
 
There are a number of key points to understand about the UNIX virtual
memory implementation.
 
(1)  Every data page in every process is "backed" by a page in the swap
space.  The size of the swap space limits the amount of virtual data
space on the system;  processes are not able to allocate memory if
there is not enough swap space available to back it up, regardless of
how much physical memory is available on the system.
 
(2)  UNIX implements a global paging strategy.  This means that the
amount of physical memory allocated to a process varies greatly over
time, depending on the size of the process's working set and the
overall system load.  Idle processes may be paged out completely on a
busy system.  On a lightly loaded system processes may be allocated
much more physical memory than they require for their working sets.
 
(3)  The amount of virtual memory available on a system is determined
by the amount of swap space configured for that system.  The amount of
swap space needed is equal to the sum of the virtual data allocated by
all processes on the system at the time of maximum load.
 
(4)  Physical memory is allocated for processes out of the page pool,
which is the memory not allocated to the operating system kernel and
the buffer cache.  The amount of physical memory needed for the page
pool is equal to the sum of the physical pages in the working sets of
all processes on the system at the time of maximum load.
 
----------------------------------
III. Process Memory Layout on UNIX
----------------------------------
 
1) The Segments of a Process
 
The discussion above speaks of a UNIX process as being divided up into
two regions: text and data.  This division is accurate for discussions
of the paging subsystem, since the paging subsystem treats every
non-text page as a data page.  In fact, a UNIX process is divided into
six segments: text, stack, heap, BSS, initialized data, and shared
memory.  Each of these segments contains a different type of information
and is used for a different purpose.
 
The text segment is used to store the machine instructions that the
process executes.  The pages that make up the text segment are marked
read-only and are shared between processes that are running the same
executable image.  Pages from the text segment are paged from the
executable image in the filesystem.  The size of the text segment is
fixed at the time that the program is invoked: it does not grow or
shrink during program execution.
 
The stack segment is used to store the run-time execution stack.  The
run-time program stack contains function and procedure activation
records, function and procedure parameters, and the data for local
variables.  The pages that make up the stack segment are marked
read/write and are private to the process.   Pages from the stack
segment are paged into the swap device.  The initial size of the stack
segment is typically one page;  if the process references an address
beyond the end of the stack the operating system will transparently
allocate another page to the stack segment. 
 
The BSS segment is used to store statically allocated uninitialized
data.  The pages that make up the BSS segment are marked read/write,
are private to the process, and are initialized to all-bits-zero at
the time the program is invoked.  Pages from the BSS segment are paged
into the swap device.   The size of the BSS segment is fixed at the
time the program is invoked: it does not grow or shrink during program
execution.
 
The initialized data segment is used to store statically allocated
initialized data.  The pages that make up the initialized data segment
are marked read/write, and are private to the process.  Pages from the
initialized data segment are initially read in from the initialized
data in the filesystem; if they have been modified they are paged into
the swap device from then on.   The size of the initialized data
segment is fixed at the time the program is invoked: it does not grow
or shrink during program execution.
 
The dynamically allocated data segment (or "heap") contains data pages
which have been allocated by the process as it runs, using the brk() or
sbrk() system call.  The pages that make up the heap are marked
read/write, are private to the process, and are initialized to
all-bits-zero at the time the page is allocated to the process.  Pages
from the heap are paged into the swap device.  At program startup the
heap has zero size: it can grow arbitrarily large during program
execution.
 
Most processes do not have a shared data segment.  In those that do,
the shared data segment contains data pages which have been attached to
this process using the shmat() system call.  Shared memory segments are
created using the shmget() system call.  The pages that make up the
shared data segment are marked read/write, are shared between all
processes attached to the shared memory segment, and are initialized to
all-bits-zero at the time the segment is allocated using shmget().
Pages from the shared data segment are paged into the swap device.
Shared memory segments are dynamically allocated by processes on the
system:  the size of a shared memory segment is fixed at the time it is
allocated, but processes can allocate arbitrarily large shared memory
segments.
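 
To make the segment layout concrete, the sketch below declares one object of
each kind and prints its address.  Which segment each object lands in follows
the rules just described; the actual addresses, of course, depend entirely on
the platform.
 
  /* Sketch: one object of each kind, to see which segment it lives in.
   * The printed addresses are entirely platform-dependent. */
  #include <stdio.h>
  #include <stdlib.h>

  int bss_var;                        /* uninitialized static data -> BSS segment  */
  int data_var = 42;                  /* initialized static data   -> data segment */

  int main(void)                      /* machine instructions      -> text segment */
  {
      int  stack_var = 0;                          /* local variable     -> stack */
      int *heap_var  = malloc(sizeof *heap_var);   /* dynamic allocation -> heap  */

      printf("text  (main)      at %p\n", (void *)main);   /* common POSIX-ism */
      printf("data  (data_var)  at %p\n", (void *)&data_var);
      printf("bss   (bss_var)   at %p\n", (void *)&bss_var);
      printf("heap  (heap_var)  at %p\n", (void *)heap_var);
      printf("stack (stack_var) at %p\n", (void *)&stack_var);

      free(heap_var);
      return 0;
  }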
 
2)  Per-Process Memory Map
 
The six segments that comprise a process can be laid out in memory in
any arbitrary way.  The exact details of the memory layout depend on
the architecture of the CPU and the design of the particular UNIX
implementation.  Typically, a UNIX process uses the entire virtual
address space of the processor.  Within this address space, certain
addresses are legal, and are used for particular segments.  Addresses
outside of any segment are illegal, and any attempt to read or write to
them will generate a 'Segmentation Violation' signal. 
 
The diagram below shows a typical UNIX per-process virtual memory map
for a 32-bit processor.  Note that this memory map covers the entire
virtual address space of the machine.  In this diagram, regions marked
with a 't' are the text segment, 's' indicates the stack segment, 'S'
the shared memory segment, 'h' the heap, 'd' the initialized data, and
'b' the BSS.  Blank spaces indicate illegal addresses.
 
+--------+-----+--------+----+---------------------+-------+----+----+
|tttttttt|sssss|        |SSSS|                     |hhhhhhh|dddd|bbbb|
|tttttttt|sssss| ->>    |SSSS|                 <<- |hhhhhhh|dddd|bbbb|
|tttttttt|sssss|        |SSSS|                     |hhhhhhh|dddd|bbbb|
+--------+-----+--------+----+---------------------+-------+----+----+
0                                                                   2G
 
In this particular implementation, the text segment occupies the lowest
virtual addresses, and the BSS occupies the highest.  Note that memory
is laid out in such a way as to allow the stack segment and the heap
to grow.  The stack grows "up", toward higher virtual addresses, while
the heap grows "down", toward lower virtual addresses.  Also note that
the placement of the shared memory segment is critical: if it is
attached at too low an address it will prevent the stack from
growing, and if it is attached at too high an address it will
prevent the heap from growing.
 
3) Process size limits
 
All UNIX systems provide some method for limiting the virtual size of a
process.  Note that these limits are only on virtual memory usage:
there is no way to limit the amount of physical memory used by a
process or group of processes. 
 
On systems that are based on SVR3, there is a system-wide limit on the
virtual size of the data segment.  Changing this limit typically
requires you to change a UNIX kernel configuration parameter and relink
the kernel: check your operating system documentation for details.
 
On systems that are based on BSD or SVR4, there is a default limit on
the size of the stack segment and the data segment.  It is possible to
change these limits on a per-process basis; consult the man pages on
getrlimit() and setrlimit() for details.  If you are using the C-shell
as your login shell the 'limit' command provides a command-line
interface to these system calls.  Changing the system-wide default
typically requires that you change a UNIX kernel configuration
parameter and relink the kernel: check your operating system
documentation for details.
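 
The sketch below shows the programmatic side of these limits: it reads the
current (soft) and maximum (hard) limits on the data and stack segments with
getrlimit().  Raising a soft limit, up to the hard limit, is done the same
way with setrlimit().
 
  /* Sketch: read the per-process soft and hard limits on the data and
   * stack segments.  Very large values usually mean "unlimited". */
  #include <stdio.h>
  #include <sys/resource.h>

  static void show(const char *name, int resource)
  {
      struct rlimit rl;
      if (getrlimit(resource, &rl) == 0)
          printf("%-6s soft=%llu hard=%llu\n", name,
                 (unsigned long long)rl.rlim_cur,
                 (unsigned long long)rl.rlim_max);
      else
          perror(name);
  }

  int main(void)
  {
      show("data",  RLIMIT_DATA);     /* data segment: heap, BSS, initialized data */
      show("stack", RLIMIT_STACK);    /* stack segment                             */
      return 0;
  }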
 
Most systems also provide a way to control the maximum size and number
of shared memory segments: this typically involves changing the UNIX
kernel parameters SHMMAX, SHMSEG and SHMMNI.  Again, consult your
operating system documentation for details.
 
4) The High-Water-Mark Effect
 
Recall from above that the size of the data segment can only be changed
by using the brk() and sbrk() system calls.  These system calls allow
you to either increase or decrease the size of the data segment.
However, most programs, including Oracle programs, do not use brk() or
sbrk() directly.  Instead, they use a pair of library functions
provided by the operating system vendor, called malloc() and free().
 
These two functions are used together to manage dynamic memory
allocation.  The two functions maintain a pool of free memory (called
the arena) for use by the process.  They do this by maintaining a data
structure that describes which portions of the heap are in use and which
are available.  When the process calls malloc(), a chunk of memory of
the requested size is obtained from the arena and returned to the
calling function.  When the process calls free(), the
previously-allocated chunk is returned to the arena making it available
for use by a later call to malloc().
 
If a process calls malloc() with a request that is larger than the
largest free chunk currently in the arena, malloc() will call sbrk() to
enlarge the size of the arena by enlarging the heap.  However, most
vendors' implementations of free() will not shrink the size of the arena
by returning memory to the operating system via sbrk().   Instead, they
simply place the free()d memory in the arena for later use.
 
The result of this implementation is that processes which use the
malloc() library exhibit a high-water-mark effect:  the virtual sizes
of the processes grow, but do not shrink.  Once a process has allocated
virtual memory from the operating system using malloc(), that memory
will remain part of the process until it terminates.  Fortunately, this
effect only applies to virtual memory;  memory returned to the arena is
quickly paged out and is not paged in until it is re-allocated via
malloc(). 
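 
One rough way to observe this effect is to watch the program break (the top
of the heap) with sbrk(0) before and after a large malloc()/free() pair, as
in the sketch below.  Treat it only as an illustration: many modern malloc()
implementations satisfy large requests with mmap() rather than sbrk(), in
which case the break will not move at all.
 
  /* Sketch: watch the program break across a malloc()/free() pair.
   * Whether the break grows at all depends on the malloc() implementation;
   * many allocators use mmap() rather than sbrk() for large requests. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
      void *before = sbrk(0);                  /* current top of the heap      */

      char *p = malloc(256 * 1024);            /* may enlarge the arena        */
      if (p != NULL)
          memset(p, 1, 256 * 1024);            /* touch the pages              */
      void *after_malloc = sbrk(0);

      free(p);                                 /* returned to the arena ...    */
      void *after_free = sbrk(0);              /* ... but rarely to the kernel */

      printf("break before malloc: %p\n", before);
      printf("break after  malloc: %p\n", after_malloc);
      printf("break after  free  : %p\n", after_free);
      return 0;
  }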
 
-------------------------
IV. Monitoring Memory Use
-------------------------
 
In the final analysis, there are only two things to be concerned with
when sizing memory for a UNIX system: do you have enough RAM, and do
you have enough swap space?  In order to answer these questions, it is
necessary to know how much virtual memory and how much physical memory
each process on the system is using.  Unfortunately, the standard UNIX
process monitoring tools do not provide a way to reliably determine
these figures.  The standard tools for examining memory usage on a UNIX
system are 'size', 'ipcs', 'ps', 'vmstat' and 'pstat'.  Most
SYSV-derived systems will also have the 'crash' utility: most
BSD-derived systems will allow you to run 'dbx' against the UNIX
kernel.
 
The 'size' utility works by performing a static analysis of the program
image.  It prints out the virtual memory size of the text, BSS and
initialized data segments.  It does not attempt to determine the size
of the stack and the heap, since both of these sizes can vary greatly
depending on the input to the program.  Since the combined size of the
stack and the heap is typically several hundred times larger than
the combined size of the BSS and the initialized data, this method is
the single most unreliable method of determining the runtime virtual
memory requirement of a program.  It is also the method used in the ICG
to determine memory requirements for Oracle programs.  The one useful
piece of information you can obtain from 'size' is the virtual size of
the text segment.  Since the text segment is paged from the filesystem,
knowing the virtual size of the text segment will not help you size
either swap space or RAM.
 
The 'ipcs' utility will print out the virtual memory size of all the
shared memory segments on the system.  Use the '-mb' flags to have it
print the size of the segments under the SEGSZ column.
 
The 'ps' utility will print out information about any process currently
active on the system.  On SYSV-based systems, using 'ps' with the '-l'
will cause 'ps' to print out the SZ field, which contains the virtual
size of the process's non-text segments, measured in pages.  On
BSD-based systems, using 'ps' with the '-u' flag will also cause the SZ
field to be printed.  While this figure is an accurate measure of the
virtual memory being used by this process, it is not accurate if the
process has attached a shared memory segment.  This means that when
sizing memory, you must subtract the size of the SGA (obtained via
'ipcs', above) from the virtual memory used by all of the Oracle
background and shadow processes.
 
On SVR4-based and BSD-based systems, using the BSD-style 'ps' command
with the '-u' flag will also cause the RSS field to be printed.  This
field contains the physical memory usage for the process.
Unfortunately, this value is the combined physical memory usage for all
the segments of the process, and does not distinguish between pages
private to the process and pages shared between processes.  Since text
and shared data pages are shared between processes, this means that
adding up the RSS sizes of all processes on the system will
over-estimate the amount of physical memory being used by the system.
This also means that if you add up the RSS fields for all the processes
on the system you may very well come up with a number larger than the
amount of RAM on your system!  While the RSS field is a good indicator
of how much RAM is required when there is only one process running a
program image, it does not tell you how much additional RAM is required
when a second process runs that same image.
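 
One per-process figure that is available programmatically is the maximum
resident set size reported by getrusage(), as in the sketch below.  It
suffers from the same limitation as the RSS column: private and shared pages
are lumped together, and the units of ru_maxrss (commonly kilobytes) vary
between UNIX variants.
 
  /* Sketch: a process asking for its own maximum resident set size.
   * Like the RSS field of 'ps', ru_maxrss mixes private and shared pages,
   * and its units (commonly kilobytes) vary between UNIX variants. */
  #include <stdio.h>
  #include <sys/resource.h>

  int main(void)
  {
      struct rusage ru;
      if (getrusage(RUSAGE_SELF, &ru) == 0)
          printf("max resident set size: %ld (units are system-dependent)\n",
                 ru.ru_maxrss);
      else
          perror("getrusage");
      return 0;
  }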
 
The 'pstat' utility is also used to print per-process information.  If
it has a SZ or RSS field, the same limitations that apply to 'ps'
output also apply to 'pstat' output.  On some versions of UNIX, 'pstat'
invoked with a flag (typically '-s' or '-T') will give you information
about swap space usage.  Be careful!  Some UNIX versions will only
print out information about how much swap space is used, and not
about how much has been allocated.  On those machines you can run out
of swap, and 'pstat' will still tell you that you have plenty of swap
available.
 
The 'vmstat' utility is used to print out system-wide information on
the performance of the paging subsystem.  Its major limitation is that
it does not print out per-process information.  The format of 'vmstat'
output varies between UNIX ports: the key fields to look at are the
ones that measure the number of page-in and page-out events per
second.  Remember that some paging activity is normal, so you will have
to decide for yourself what number of pages-in or pages-out per second
means that your page pool is too small.
 
On SYSV-based systems, the 'sar' utility is used to print out
system-wide information on the performance of a wide variety of kernel
subsystems.  Like 'vmstat', its major limitation is that it does not
print out per-process information.  The '-r', '-g', and '-p' options
are the most useful for examining the behavior of the paging subsystem.
 
On SYSV-based systems, the 'crash' utility lets you directly examine
the contents of the operating system kernel data structures.  On
BSD-based systems, it is usually possible to use a kernel debugger to
examine these same data structures.  These data structures are always
hardware- and operating system-specific, so you will not only need a
general knowledge of UNIX internals, but you will also need knowledge of
the internals of that particular system.  However, if you have this
information (and a lot of patience) it is possible to get 'crash' to
give you precise information about virtual and physical memory usage on
a per-process basis.
 
Finally, there are a variety of public domain and vendor-specific tools
for monitoring memory usage.  Remember: you are looking for a utility
that lets you measure the physical memory usage of a process, and which
gives you separate values for the number of pages used by the text
segment, the shared memory segment, and the remainder of the process. 
Consult your operating system vendor for details.
 
----------------------------
V. Sizing Swap Space and RAM
----------------------------
 
The bottom line is that, while it is possible to estimate virtual and
physical memory usage on a UNIX machine, doing so is more of an art
than a science. 
 
First:  you must measure your actual application.  An Oracle Forms
application running in bitmapped mode, using 256 colors, 16 full-screen
windows, and retrieving thousands of records with a single query may
well use two orders of magnitude more stack and heap than an Oracle
Forms application running in character mode, using one window and only
retrieving a few dozen rows in any single query.  Similarly, a
server-only system with five hundred users logged into the database but
only fifty of them performing queries at any one time will have a far
lower RAM requirement than a server-only system which has only two
hundred users logged into the database all of which are continually
performing queries and updates.
 
Second: when measuring physical memory usage, make sure that your
system is as heavily loaded as it will be in a production situation.
It does no good to measure physical memory usage with 255 processes
running Oracle Forms if all 255 of them are sitting idle waiting for
input -- idle processes are largely paged out.
 
Sizing swap space is relatively easy.  Recall that every page of
virtual data must be backed with a page of swap.  This means that if
you can estimate the maximum virtual memory usage on your machine, you
have determined how much swap space you need.  Use the SZ column from
the 'ps' command to determine the virtual memory usage for the
processes running on the system.  The high-water mark can be your ally
in this measurement: take one process, run it as hard as you can, and
see how high you can drive the value of the SZ column. 
Add together the virtual memory used by the system processes to form
a baseline, then calculate the maximum amount of virtual memory used
by each incremental process (don't forget to count all processes that
get created when a user logs on, such as the shell and any dedicated
shadow processes).  The swap space requirement is simply the sum of the
SZ columns of all processes at the time of maximum load.  The careful
system administrator will add 10% to the swap space size for overhead
and emergencies.
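 
The sketch below just spells out that arithmetic; every number in it is a
hypothetical placeholder, to be replaced with the SZ totals you actually
measure on your own system at maximum load.
 
  /* Sketch of the swap-sizing arithmetic.  Every figure below is a
   * hypothetical placeholder; substitute the SZ totals measured with
   * 'ps' on your own system at the time of maximum load. */
  #include <stdio.h>

  int main(void)
  {
      double baseline_mb = 200.0;   /* system + Oracle background processes (hypothetical)      */
      double per_user_mb = 6.0;     /* shell + dedicated shadow process per user (hypothetical)  */
      int    peak_users  = 300;     /* concurrent sessions at maximum load (hypothetical)        */

      double swap_mb = (baseline_mb + per_user_mb * peak_users) * 1.10;   /* +10% headroom */
      printf("estimated swap requirement: about %.0f MB\n", swap_mb);
      return 0;
  }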
 
Sizing RAM is somewhat more difficult.  First, start by determining the
amount of RAM dedicated for system space (this is usually printed in a
message during startup).  Note that tuning the operating system kernel
may increase the amount of RAM needed for system space.
 
Next, determine the amount of RAM needed for the buffer cache. 
 
Finally, determine the amount of RAM needed for the page pool.  You
will want to have enough RAM on the system so that the working set of
every active process can remain paged in at all times. 
 
--------------
VI. References
--------------
 
`Operating Systems: Design and Implementation'
  Andrew S. Tanenbaum, Prentice-Hall, ISBN 0-13-637406-9
`The Design and Implementation of the 4.3BSD Unix Operating System',
  Samuel Leffler, Kirk McKusick, Michael Karels, John Quarterman,
  1989, Addison-Wesley, ISBN 0-201-06196-1
`The Design of the Unix Operating System', Maurice Bach, 1986,
  Prentice Hall, ISBN 0-13-201757-1
`The Magic Garden Explained: The Internals of Unix System V Release 4',
  Berny Goodheart, James Cox, 1994, Prentice Hall, ISBN
  0-13-098138-9.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

 
