Oracle Scheduler job coordinator, slaves, and related parameters – troubleshooting guide for the new Oracle DBA





The new Oracle DBA can find information here about:

1) The job_queue_interval parameter and related error messages.

2) A problem with Oracle Scheduler jobs when multiple Oracle instances run on the same Solaris server (host): very high CPU consumption.

3) How Oracle internally checks which jobs should run at a specified time, and which background process does that.

4) What the job queue coordinator process does.

Read on to find answers to the above questions.

ID 418755.1

Parameter name and recommendation: _job_queue_interval = 1
Description: Scan rate interval (seconds) of the job queue. Default is 5.
Considerations: This improves the scan rate for propagation jobs to every second, rather than every 5 seconds.

 

Kkjcre1p: Unable To Spawn Jobq Slave Process, Error 1089 [ID 344275.1]

 

 

Oracle Server – Enterprise Edition – Version: 10.1.0.4 and later   [Release: 10.1 and later ]
Information in this document applies to any platform.

During an RMAN cold backup, when the database goes down, the following error may appear in the alert log:
      
         
  kkjcre1p: unable to spawn jobq slave process, error 1089

ORA-01089 is just a warning that the DB is being shut down.

If a job is about to be spawned while a database shutdown is in progress, these errors appear in the alert log file; this is expected behavior.

No harm is caused by this warning being logged to the alert.log. The error can be safely ignored: the job coordinator process simply tried to spawn a job slave while the shutdown was in progress.

One suggested workaround is to set an underscore parameter:

_JOB_QUEUE_INTERVAL=120 or a greater value

This note gives the default value as 60; with a value of 120 the above warnings are less likely to appear in the alert log file.

Bug 8531434 – Solaris: Excessive CPU by MMNL/CJQ0 when running multiple instances and cpus [ID 8531434.8]  

 
  Modified 09-JUL-2010     Type PATCH     Status PUBLISHED  
       

Oracle Server – Enterprise Edition
Information in this document applies to any platform.


 This note gives a brief overview of bug 8531434.
 The content was last updated on: 14-DEC-2009

Affects:

Product (Component): Oracle Server (Rdbms)
Range of versions believed to be affected: Versions >= 10.2.0.1 but < 11.2
Versions confirmed as being affected:
Platforms affected: Generic (all / most platforms affected)

Fixed:

This issue is fixed in

Symptoms:

  • (None Specified)

Related To:

  • _JOB_QUEUE_INTERVAL
         

Description

High system CPU usage occurs when running many instances on a single Solaris host with many processors.
The processes make excessive calls to kstat.

The fix for bug 8777336 addresses other kstat calls that are made only by 11g.

Workaround
Increase job_queue_interval (e.g. from 5 to 30).
Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. Always consult with Oracle Support for advice.
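
As a rough illustration of why the workaround helps, the sketch below counts how many coordinator wake-ups per minute a host would see for a given number of instances and scan interval. It assumes, purely for illustration, that each instance wakes exactly once per interval and that every wake-up costs about the same amount of kstat work. The function name and the instance count of 20 are hypothetical and are not taken from the note.

```python
def wakeups_per_minute(instances: int, interval_seconds: int) -> float:
    """Total coordinator wake-ups per minute across all instances on one host."""
    if instances < 0 or interval_seconds < 1:
        raise ValueError("instances must be >= 0 and interval_seconds >= 1")
    return instances * 60 / interval_seconds


if __name__ == "__main__":
    instances = 20  # hypothetical number of instances on the host
    before = wakeups_per_minute(instances, 5)   # scan interval of 5 seconds
    after = wakeups_per_minute(instances, 30)   # scan interval raised to 30 seconds
    print(f"interval 5s:  {before:.0f} wake-ups/minute")
    print(f"interval 30s: {after:.0f} wake-ups/minute")
    print(f"reduction factor: {before / after:.0f}x")
```

Under these assumptions, moving from 5 to 30 seconds cuts the number of wake-ups by a factor of six regardless of how many instances share the host.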

References

Bug 8531434
Note 245840.1: Information on the sections in this article

How To Control The Frequency That The Server Checks For New Scheduled Jobs? [ID 197220.1]  

 
  Modified 20-JUN-2007     Type HOWTO     Status PUBLISHED  
       

Checked for relevance on 20-Jun-2007

 
 
goal: How to control the frequency that the server checks for new scheduled jobs?

fact: Oracle Server - Enterprise Edition 9.0
 
 
fix:
 
Oracle Server 8.1 uses the initialization parameter JOB_QUEUE_INTERVAL to specify
how frequently each SNPn background process wakes up. The parameter
JOB_QUEUE_INTERVAL is obsolete as of Oracle Server 9.0.
 
For each Oracle Server 9.0 instance, job queue processes are dynamically
spawned by a coordinator job queue (CJQ0) background process. The coordinator
periodically selects jobs that are ready to run from the jobs shown in the
DBA_JOBS view. 
 
In Oracle Server 9.0, the job queue coordinator wakes up by default every
5 seconds to see whether there are any jobs to run.
If 5 seconds is an inappropriate interval, you can set the hidden parameter
_job_queue_interval to a value other than 5 seconds. The lowest value it will
accept is 1 second.
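
The effect of the interval on job start times can be sketched as follows. This is a simplified model, assuming the coordinator wakes at exact multiples of the interval and picks up any job that is due at or before that wake-up. The function name is hypothetical, and the coordinator's real timing is not described in the note.

```python
import math


def next_wakeup(due_at: float, interval: int = 5) -> float:
    """Return the first wake-up time (in seconds) at or after due_at.

    Assumes wake-ups occur at 0, interval, 2*interval, ... seconds.
    The note states that 1 second is the lowest accepted interval.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1 second")
    if due_at < 0:
        raise ValueError("due_at must be non-negative")
    return math.ceil(due_at / interval) * interval


if __name__ == "__main__":
    for interval in (1, 5, 30):
        start = next_wakeup(12.0, interval)
        print(f"interval={interval:>2}s: job due at 12s is picked up at {start}s")
```

In this model a job can wait up to one full interval before it is noticed, which is the trade-off behind choosing a small value for responsiveness or a large value for fewer wake-ups.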

What Does CJQ0 Process Do? [ID 222180.1]

 
  Modified 27-APR-2010     Type HOWTO     Status PUBLISHED  
       
 
Checked for relevance on 27-Apr-2010

goal: What does the CJQ0 process do?

fact: Oracle Server - Enterprise Edition 9

fact: Job Queue
 
 
fix:
 
CJQ0 is the coordinator of the job queue slave processes.
 
The job queue processes run user jobs as they are assigned by the CJQ process.
Here's what happens:
 
1.  The coordinator process, named CJQ0, periodically selects jobs that need
to be run from the system JOB$ table. New jobs selected are ordered by time.
 
2.  The CJQ0 process dynamically spawns job queue slave processes (J000…J999)
to run the jobs.
 
3.  The job queue process runs one of the jobs that was selected by the CJQ
process for execution. The processes run one job at a time.
 
4.  After the process finishes execution of a single job, it polls for more jobs.
If no jobs are scheduled for execution, then it enters a sleep state, from which
it wakes up at periodic intervals and polls for more jobs. If the process does
not find any new jobs, then it aborts after a preset interval.
 
The initialization parameter JOB_QUEUE_PROCESSES represents the maximum number
of job queue processes that can concurrently run on an instance.
However, clients should not assume that all job queue processes are available
for job execution.
 
Note:
The coordinator process is not started if the initialization parameter
JOB_QUEUE_PROCESSES is set to 0.
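
The steps above can be sketched as a small simulation. This is only a toy model of the selection logic, not Oracle's implementation: the Job class, the select_due_jobs function and the in-memory list standing in for JOB$ are hypothetical names introduced here for illustration, and the model assumes every slave is idle at the moment the coordinator wakes up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    next_run: float  # seconds since an arbitrary start; stands in for the job's next run time


def select_due_jobs(jobs: List[Job], now: float, job_queue_processes: int) -> List[Job]:
    """Pick the jobs that one coordinator wake-up would hand to slaves.

    Due jobs are ordered by time (step 1) and at most
    job_queue_processes of them are dispatched, one job per slave
    (steps 2 and 3). With job_queue_processes == 0 nothing is
    selected, mirroring the note that the coordinator is not started.
    """
    if job_queue_processes <= 0:
        return []
    due = sorted((j for j in jobs if j.next_run <= now), key=lambda j: j.next_run)
    return due[:job_queue_processes]


if __name__ == "__main__":
    queue = [Job("stats", 4.0), Job("purge", 1.0), Job("report", 9.0), Job("refresh", 3.0)]
    for slave_no, job in enumerate(select_due_jobs(queue, now=5.0, job_queue_processes=2)):
        print(f"J{slave_no:03d} runs {job.name}")
```

In this sketch the job named "stats" is due but not dispatched, because only two slaves are allowed; it would be picked up on a later wake-up once a slave is free.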

 

http://blogs.sun.com/hippy/entry/multiple_oracle_instances_performance_issue

Finally closed a VOS case that had been open for over a year. It related to high system CPU consumption caused by running multiple Oracle RDBMS instances on a single system. The observation was 80-90% system CPU consumption from mpstat 1 and the following from lockstat profiling:

Profiling interrupt: 67240 events in 2.168 seconds (31017 events/sec)

Count genr cuml rcnt     nsec Hottest CPU+PIL        Caller
-------------------------------------------------------------------------------
40920  61% ---- 0.00      987 cpu[7]                 fop_ioctl
40920  61% ---- 0.00      987 cpu[7]                 ioctl
40880  61% ---- 0.00      986 cpu[7]                 read_kstat_data
40248  60% ---- 0.00     1077 cpu[7]                 syscall_trap
38780  58% ---- 0.00      947 cpu[2]                 mutex_vector_enter
32478  48% ---- 0.00      947 cpu[5]                 kstat_hold_bykid
32477  48% ---- 0.00      947 cpu[5]                 kstat_hold
13516  20% ---- 0.00     1845 cpu[102]               (usermode)
 6466  10% ---- 0.00     1904 cpu[423]               syscall_trap32
 6169   9% ---- 0.00      926 cpu[3]                 kstat_rele
 4738   7% ---- 0.00     1626 cpu[96]                thread_start
 2420   4% ---- 0.00     1359 cpu[135]+11            idle
 2317   3% ---- 0.00     1464 cpu[135]+11            disp_getwork
 2122   3% ---- 0.00     2764 cpu[101]               fop_read
 1388   2% ---- 0.00     2510 cpu[101]               vx_read
 1379   2% ---- 0.00     2509 cpu[101]               vx_read1
 1352   2% ---- 0.00     2503 cpu[101]               vx_cache_read
 1267   2% ---- 0.00     2459 cpu[418]               trap
 1215   2% ---- 0.00     3059 cpu[128]               fop_write
 1082   2% ---- 0.00     2339 cpu[418]               utl0
-------------------------------------------------------------------------------

Originally I raised CR 6734910 – "kstat_hold doesn’t scale well on large systems" to track this, but it seemed that Oracle could make better use of the kstat interface. That resulted in Oracle bug and Patch 8531434 – "KSTAT CALLS BY MMNL/CJQ0 INCUR HIGH SYSTEM CPU WHEN RUNNING NUMEROUS INSTANCES" being logged and fixed for 10.2.0.4.0. 11g doesn’t appear to be affected by this issue. So, to avoid a performance hit when running multiple Oracle instances on a single host, you can either use the workaround of increasing the Oracle parameter _job_queue_interval (e.g. from 5 to 30), potentially losing granularity of performance statistics, or apply the patch.

Author: admin