oracle sqlnet issues – tracing and troubleshooting-for the new oracle dba





you are a new oracle dba and you would like to know how to trace the client sqlnet and server sqlnet components since your application had issues connecting to your database server. Below metalink article explains in detail.

Subject:

Troubleshooting Oracle Net
 

Doc ID:

779226.1

Type:

TROUBLESHOOTING
 

Modified Date :

29-MAY-2009

Status:

PUBLISHED

In this Document
Purpose
Last Review Date
Instructions for the Reader
Troubleshooting Details
Oracle Net Troubleshooting
Diagnostic Tools
Generic Oracle Net Tracing
Client Connection Diagnostics:
Server Diagnostics:
TNS Listener diagnostics
References


Applies to:

Oracle Net Services – Version: 9.2.0.1.0
Information in this document applies to any platform.

Purpose

The aim of this document is to provide a Troubleshooting Guide for Oracle Net, including :

Client Connection Issues

Server Connection Issues

TNS listener

With Referances to best practices for

Oracle Net tracing

Diagnosing Oracle Net Traces

Last Review Date

April 30, 2009

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

Oracle Net Troubleshooting
Instructions for the Reader
The Oracle Net Troubleshooting Guide is provided to assist you in solving Oracle Net issues in a structured manner. The method is based on Oracle Diagnostic Methodology and helps you to avoid trial and error approach. It guides you through a step-by-step method. Diagnostic tools are included in the document to assist in the different troubleshooting steps.

Step by step approach

ISSUE CLARIFICATION

In the issue clarification section it is important to describe the problem as best as possible. What is the problem you need to solve? At the end of the process you should be able to come back to this section to verify if the root cause was found and the solution was provided.

WHAT IS the problem we are looking at?

Examples of problem descriptions:

Connections from 10.2.0.1 Windows clients to standalone 11.1.0.7 RDBMS on Solaris, are not possible via F&A application, erroring with the message TNS-12154. Connection from same client, using SQL*plus is successful from the same ORACLE_HOME.

Connections to application using Oracle client 10.2.0.4 on Linux, become extremely slow in the afternoon and eventually hang. At the time of the hang, connections directly on the Linux server to the RAC 10.2.0.4 RDBMS also hang. Database has to be re-started to resolve the problem.

Some users can connect to the 11.1.0.6 database, some not. ORA-12545 received for the 11.1.0.6 Windows SQL*plus clients.
Note: The problems can be caused due to client connection issues, server connection issues, listener issues or other problems. If this is known already at this stage, it is good to mention this. Otherwise next steps will clarify where the problem exactly resides.

ISSUE VERIFICATION

When you have a starting problem description it is time to collect facts in a structured way. First step would be to get an overview of all the facts we have. What information do we have readily available?
The answers to the following Oracle Net specific questions can potentially help you to solve the problem:

Please note that for Oracle Net, questions can be more relevant to certain areas, than others. For these reasons the following Key / guidelines can be followed:

(C) Questions related to client connection issues.
Issue from a physical client platform, middle tier or an application running, located on server where the database is installed. Where the connection is unable to communicate with the database server, via TCP/IP, has naming resolution or connection handshake failure is started, configuration / setup conflict with RDBMS settings.
(S) Questions related to server issues.
Issues establishing session to database, due to backend process failure, setup and / or configuration failures.
(L) Questions related to TNS Listener issues.
Issue with the TNS listener connection handshake, process crash, memory leak or CPU spin with the TNSLSNR process.

What is the problem / what is not a problem?

What are the symptoms?
What are the different errors generated?
What Client and Server platforms reproduce the problem?
What Client and Server platforms work?
What 5-digit version of Oracle software reproduces the problem?
What 5-digit version of Oracle software works?
(For Windows platform confirm patch number bundle used)
What products reproduce the issue?
What products do not reproduce the issue?
What is the result of TNSping with the same connect string?
Is this standalone database?
Is this RAC?

When is the problem seen / when is the problem not seen?

Is the problem constant?
Is the problem reported intermittently?
Is there any pattern to the failure?
What resolves the problem?
How long does the problem last?
Has this ever worked or is this a fresh installation / setup?
When did the problem start to fail?

Where is the problem / Where is the problem not?

What physical network topology affects the problem?
What settings, if any allow the connection(s) to work?
Problem reproduces on another node / server?
Problem reproduces against another RDBMS?
(C)Is there other, working TCP/IP communications from effected machines?
(C)What is the impact on the problem when physically moving client on the network?
(C)Which clients report the problem?
(C)Which clients do not report a problem?
(C)What do the clients have in common?
Is the Connection model used Dedicated or Shared Server?
What happens if the opposite connection model is used?
(S)Is the problem limited to one database in the Oracle home?
(S)Does the problem reproduce on other servers?
(S)Is the connection from the database Oracle home via Bequeath working?
(S)Is the Local connection from the database Oracle home working?
(S)Is Valid Node checking in use?
(L)Single or multiple TNS listener affected on server?
(L)Symptoms / Problem reproduces for TNS listener on another node?
(L)Is there specific software installed on the non-working node compared to the working node?

CAUSE DETERMINATION

The facts listed in the ISSUE VERIFICATION are the starting point for the CAUSE DETERMINATION:
There are 3 main approaches to take here:

  1. Use your experience to list possible causes. List the assumptions which needs to be checked in this case
  2. Start searching for possible causes in My Oracle Support or other Oracle knowledge bases. Use the facts collected above to refine your search criteria.

Analyze the facts on differences between the working situation and non working situation: Depending on the answers from the questioning above and further investigation from the troubleshooting guides you should be able to list what is different, special, unique between the IS and the IS NOT and also see what is changed and when.

Examples
What changes happened around time the first failure / problem was reported?

New software installed?

Upgrade done?

New hardware?

Network configuration changes?

What is different between the clients who can connect and those that cannot?

What changed and could have an impact between the working situation and the current situation?

What is different to the working Alias and the non-working Alias?

The output will be a list of potential reasons causing the symptoms: can be a bug, a configuration setting, a conflict with other software, …

CAUSE JUSTIFICATION

Evaluate the causes: check the causes against the facts (the IS and the IS NOT observations). This also includes checking the symptoms of the problem against any bug rediscoveries identifying a problem. List potential assumptions you have made. Determine the most probable cause (often the one with the least assumptions or with the most reasonable assumptions).

For the most probable cause, verify the assumptions and turn them into facts (document them in the Issue Verification part). Some examples:

If bug XXX then we expect an upgrade from version x to y happened before the symptoms started

If the configuration file is wrong on client x, and right on client y, we expect it to work if we copy the file

This looks like same issue as described in note XXX but then we expect also a virus checker installed on the Windows server

If this cause would be true then another sequence of actions would result in a different outcome: let’s test this via an internal test case

If the verification fails, go to the next probable cause and repeat the verification.

If no potential cause stands the justification process, go back to the issue verification and collect more detailed facts. Further diagnostics can be verified. See below in the section Diagnostic Tools)

POTENTIAL SOLUTIONS

A brief description of the corrective actions that will remove the cause of the problem: in some cases there is only 1 solution linked to the cause. But in many cases, there are more. Example: install a patch to remove the bug from the system, avoid the problem by working differently, or avoid the problem by setting some parameters.

POTENTIAL SOLUTION JUSTIFICATION

Explain why the proposed solution solves the problem.

Diagnostic Tools

Each area requires a specific diagnostic path to be followed, which always requires Oracle Net tracing to be enabled. The following notes contain all the parameters available for Oracle Net tracing:

Generic Oracle Net Tracing

Note 219968.1SQL*Net, Net8, Oracle Net Services – Tracing and Logging at a Glance
Note 395525.1How to Enable Oracle SQLNet Client, Server, Listener, Kerberos and External procedure Tracing from Net Manager

Please note, that enabling Oracle Net Server tracing can produce large amounts of trace, in a very short time frame. Even with cyclic tracing, each process will have the TRACE_FILENO_SERVER value amount of traces produced. Optimal tracing workflow should be to enable tracing, reproduce problem and then disable tracing. Thus limiting amount of time tracing is enabled. If unable to enable trace due to high connection volume, then the following article can assist.

Note 401134.1How To Limit Connections Generating Trace when Oracle Net Server Enabled

Other options are to enable tracing when the system is quieter or out of normal business hours.

Ensure only matching client and server traces are uploaded. Following article explains how to achieve this. Note 374116.1 How to Match Oracle Net Client and Server Trace Files

Diagnosing Oracle Net traces

When analyzing Oracle Net traces, key features / steps in the trace need to be reviewed, to narrow down the point of failure / issue. Ensure to review the trace to understand how and what is used for the connection. Points to look for are:

SQLNET.ORA parameters.
TNSNAMES.ORA file used.
Naming resolution method (local, onames, ldap, ezconnect, host)
Connection model. (Dedicated or shared server)
Connection handshake completion (Connect, Resend or Redirect, Connect and Accept packets)
Cross-reference client and server packet flow, checking packet sizes send, received / sent information, timings between client and server processes.
Ensure completion codes are not mistaken for errors.

Full details on how to diagnose Oracle Net traces can be found in Note 156485.1 Examining Oracle Net, Net8, SQL*Net Trace Files.

Please note that multithread application will show thread id before the time stamp column of a net trace. Match the thread id’s to follow the correct flow of information in the trace. Ensure completion codes reported in traces are not mistaken for error codes.

Client Connection Diagnostics:

  1. If the problem is specific to client area, Oracle Net Level 16 Client trace capturing the failure / issue.
    If the problem is specific to client and TNS listener handshake failure, Oracle Net Level 16 Client trace capturing the failure / issue with matching TNS listener level 16 trace.
  2. Sqlnet.log from the client. Default location is ORACLE_HOME/network/log

Capture environment settings and all net admin files, etc via RDA.

Please note that JDBC thin client connections cannot be Oracle Net Client traced, but the Oracle Net packet flow can now be JavaNet traced. Note 793415.1 How to Perform the Equivalent of Sqlnet Client Tracing with Oracle JDBC Thin Driver

Common Client Error Codes:

Articles to be followed when diagnosing the common error codes:

Note 114085.1TROUBLESHOOTING GUIDE TNS-12154 TNS could not resolve service name:
Note 77640.1SERVICE_NAME Parameter – Resolving The ORA-12514 Error
Note 553328.1Troubleshooting ORA-12545 / TNS-12545 Connect failed because target host or object does not exist

Server Diagnostics:

  1. Oracle Net level 16 Client and Server traces.
  2. Sqlnet.log from the server and client. Default location ORACLE_HOME/Network/log
  3. Capture environment settings, status, RDBMS information, net admin files, etc via RDA
  4. Shared server diagnostics follow Note 1005259.6 Shared Server (MTS)
    Diagnostics

Important note:

The SQLNET.ORA file is only read once on creation of a process. RDBMS Background process and shared server dispatchers will need to be restarted for parameter changes in the SQLNET.ORA to be picked up. Once a process has started to be traced, tracing will not stop until that the process stops.

TNS Listener diagnostics

  1. Oracle Net level 16 Listener tracing. The workflow should be to enable tracing, capture the failure and disable tracing, as quickly as possible.
  2. RDA output from the ORACLE_HOME where TNS listener is running.
  3. Listener.log. Covering time period of the problem and a few hours before. Default location is ORACLE_HOME/network/log/<listener-name.log>.

Further diagnostics maybe required, see following guides:

For Crashing TNS Listener follow:
Note 793844.1Troubleshooting a TNS listener crash

For Slow TNS listener connections follow:
Note 557416.1How to Diagnose Slow TNS listener / Connection Performance

For Hanging TNS listener situations follow:
Note 230156.1TNSListener Hanging Information to Get For Resolving or Troubleshooting.

If issues arise from TNS listener tracing at busy / peak times, To further restrict the amount of time the TNS listener is traced, follow method described in Note 751432.1 Further TNS Listener Tracing can be used.

References

 

Author: admin