Tuesday, February 3, 2009

Reproduce error, the best way to learn !!

There are serveral Oracle errors which are intermittent. Learning to reproduce this error teaches hows to prevent them too..

Recently I ecountered the following error in a critical production system.

ORA-28547: connection to server failed, probable Oracle Net admin error

The error was very intermittent and disappeared by its own after an hour. Till that time all connection attempts waited for a sometime and gave the above error.



The oerr description of error was not helpful as it talked about persistent issues.


Following observations were made.
The database in shared server mode.
All shared servers were busy in executing SQLs when the error happened.

I wanted to reproduce the problem to make sure that it has something to do with Shared server.

Did the following.
Set my test database to shared server mode.

Set max_shared_servers to 1

made a connection to the database via listener.
Verified the connection is made via listener.

SQL> select server from v$session where userNAME='SYSTEM' 2 ;
SERVER
---------
NONE

(inactive shared server connections shows "none" for v$session.server).


Next plan was to make the available shared server busy.

Just ran a small PL/SQL block like the following.

SQL> begin 2 for i in 1..1000000 loop 3 insert into t values(i); 4 commit; 5 end loop; 6 end; 7 /

Verified the session is actually taking the help of shared server.

SQL> select server from v$session where userNAME='SYSTEM' 2 ;SERVER
---------
SHARED

Ok, in my system the only shared server I have is busy in serving above command.

Attempted another connection via listener..

Wow !! session waited for 60 sec. and then gave
ORA-28547: connection to server failed, probable Oracle Net admin error.


A possible explanation:
Once the connection request reaches listener , it will be handed off. For a shared server connection , shared server must be available to do the authentication. If , all available shared servers are busy, then the connection request can time out with this error.


ORA-28547: connection to server failed, probable Oracle Net admin error.

Once the shared servers are free , the situation will be over..
A quick test to make sure that we are hitting on shared server issue, can attempt a connection request with (SERVER=DEDICATED) or attempt a local connection. This should work even when the shared server connections are failing.