One ORA-12520 issue in Oracle RAC

  • 1,557 views
  • Read

Several days ago I helped to fix one ORA-12520 issue in RAC database, and from this issue I learned that special care should be taken when the listener.ora is modified in RAC database.

The issue was turned out to be not so complicated finally so I will not mention too many details of it.

The error messages were like this:

SQL> conn test/test1234ch;
Connected.
SQL> conn test/test1234ch@prd01;
ERROR:
ORA-12520: TNS:listener could not find available handler for requested type of server

And it seemed the listener did not support any service:

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                25-JAN-2018 15:17:30
Uptime                    0 days 0 hr. 1 min. 49 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/11.2.0.4/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/prd01a/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
The listener supports no services
The command completed successfully

And we found the the status of the listener was not online:

ora.LISTENER.lsnr
               ONLINE  INTERMEDIATE prd01a            Not All Endpoints Registered
               ONLINE  INTERMEDIATE prd01b            Not All Endpoints Registered

And my friend told me the RAC database worked well in the past and they did not change anything of the database before and after the server reboot.

Yes, for some reason the server crashed and was restarted automatically but after that the database could not be connected.

We checked the alert file of the listener, and only got some information as below while not other clues:

25-JAN-2018 15:17:30 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=prd01a)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER)(VERSION=186647552)) * status * 0
25-JAN-2018 15:17:30 * version * 0
25-JAN-2018 15:17:30 * service_register_NSGR * 1194
TNS-01194: The listener command did not arrive in a secure transport

From the 1194 error, I guessed the TCPS maybe was used for the connections so I asked to get the content of the listener.ora file:

LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER))))            # line added by Agent
LISTENER_SCAN1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1))))                # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON                # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON              # line added by Agent

VALID_NODE_CHECKING_REGISTRATION_LISTENER=1

VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=1
REGISTRATION_INVITED_NODES_LISTENER_SCAN1=(10.58.167.19,10.58.167.20)
SECURE_REGISTER_LISTENER = (TCP)

And I was told the file worked for a long time.

From the Oracle support, I found several possible reasons like wrong permissions on the /tmp/.oracle or /var/tmp/.oracle, infiniband compatibility issue, more than one started listener or wrong configurations of the parameter local_listener. We checked all of them and they were not matched.

I wanted to figure out the detail purpose of the option SECURE_REGISTER_LISTENER, and got below two notes:

Scan Listener TCPS Service Handlers are Blocked after Implementing COST on an SSL Cluster (Doc ID 1537743.1)

Using Class of Secure Transport (COST) to Restrict Instance Registration in Oracle RAC (Doc ID 1340831.1)

One key point was: In 11.2 RAC the grid agent uses the IPC protocol to create and manage scan listeners so both IPC and TCPS must be enabled.

And the setting of this option: SECURE_REGISTER_LISTENER_SCAN1 = (IPC,TCPS)

So it was clear the IPC must be included.

My friend has a good habit and he showed his previous backup of this file, while I found the last line was not there!

We commented it and restarted the listener, then it became online and could be connected.

Then he recalled such change was made long time ago to fix a security issue and same change was implemented on both UAT and PROD environments and they did not get any issue ... until the reboot.

Why? They just restarted the listener during the changes while listener restarting will not affect all the existing sessions, so the RAC database worked well until this abnormal reboot, and no one would know this change caused this incident.

So when we change the listener.ora file in the RAC database, we'd better restart the cluster service one node by one node, right?

  • by Published on 28/01/2018 04:22:28
  • Repost please keep this link: https://www.dbcloudsvc.com/blogs/oracle/one-ora-12520-issue-in-oracle-rac/
匿名

Comment

Anonymous Write

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

Decide