Solved

RMAN jobs hang with status "running"

  • 12 September 2022
  • 10 replies
  • 848 views

Badge +2

Hello,

 

Very strange problem with Commvault RMAN (archivelog) backups.

 

Sometimes an RMAN archivelog backup job hangs and stays in “running” status indefinitely.

It's not tied to a specific Oracle instance; the problem occurs with all instances.

The behavior is random: sometimes the jobs run and finish OK in a couple of minutes, and sometimes they stay running forever.

 

Host (Windows Server 2016) memory and CPU performance are OK.

ClOraAgent.log logging stops at: OraObject::GetOraMode() - oraMode = READ WRITE

So the hang occurs before the RMAN script is called.

There is no entry in the database alert.log at that timestamp; it is simply skipped.

 

It looks like communication with the ClOraAgent.exe process is lost, although it is still running on the server…

Hope you have an idea how to solve this problem...
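Since the symptom here is a process that keeps running while its log goes quiet, one way to notice the hang early is to check how long ago the agent log was last written. The following is only a minimal Python sketch of that idea: the function names and the 10-minute threshold are hypothetical, and the log path is left to the caller because it depends on the installation.

```python
import os
import time


def seconds_since_last_write(log_path, now=None):
    """Return how many seconds have passed since the file was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(log_path)


def looks_stalled(log_path, threshold_seconds=600, now=None):
    """Return True if the log has not been written for longer than the threshold.

    threshold_seconds is an assumed value; pick one longer than the quietest
    phase of a normal, successful job.
    """
    return seconds_since_last_write(log_path, now) > threshold_seconds
```

A scheduled task could call `looks_stalled()` with the path of the agent log while a job is in "running" status and raise an alert when it returns True. A quiet log is only a hint of a hang, not proof of one.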

 


Best answer by Chris Hollis 7 March 2024, 00:06



Badge

Hello,

 

Something that can help troubleshoot the issue is to increase the debug level on the ClOraAgent process. I would recommend setting the debug level for the process to 8. I would also recommend increasing File Versions to 10, since the log will be more verbose. Logging can be increased at the client level via Process Manager or through the GUI.

Setting Logging Parameters in the Process Manager:

https://documentation.commvault.com/11.24/expert/5554_setting_logging_parameters_in_process_manager_01.html

Configuring Log File Settings in the CommCell Console:

https://documentation.commvault.com/11.24/expert/114289_configuring_log_file_settings_in_commcell_console.html 


If there isn't a clear reason why it is hanging, I would recommend creating a case with support so the team can review the logging at the higher debug levels.

Thanks,
Tim

Badge +2

Hi Tim,

Thank you for your suggestion to increase the log level.

 

Before my post I had already increased the debug level to 99, the file size to 50 MB, and the file versions to 5.

This morning at 08:10 a hang occurred, so I tried to analyse the ClOraAgent.log and compare it with the log of a successful run.

Unfortunately I found no error or failure messages; it just stopped logging after 27 seconds.

 

Guess I will create a support case...
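The comparison of a hung run against a successful one can also be done mechanically. The sketch below is a rough illustration in Python, not a Commvault tool: it assumes the caller has already stripped timestamps, process IDs, and other per-run values from each log line (the log layout is not assumed here), and apart from the GetOraMode line quoted earlier in the thread, the messages are hypothetical placeholders.

```python
def compare_runs(hung, good):
    """Compare message sequences from a hung run and a successful run.

    Both arguments are lists of log messages that the caller has already
    normalized (timestamps, PIDs and similar per-run values removed).

    Returns (status, index, next_good_message):
      "stopped"  - the hung run matches the good run and then simply ends;
                   next_good_message is the step it never reached (or None).
      "diverged" - the runs differ at `index` before the hung run ends.
    """
    i = 0
    while i < len(hung) and i < len(good) and hung[i] == good[i]:
        i += 1
    next_good = good[i] if i < len(good) else None
    status = "stopped" if i == len(hung) else "diverged"
    return status, i, next_good


# Hypothetical message lists; only the GetOraMode line comes from the thread.
good_run = [
    "step A (placeholder)",
    "OraObject::GetOraMode() - oraMode = READ WRITE",
    "step C (placeholder)",
]
hung_run = good_run[:2]

print(compare_runs(hung_run, good_run))
```

A "stopped" result points at the first step the hung job never logged, which is usually the most useful detail to include in a support case.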

Userlevel 7
Badge +23

@HvS, please add the case number to this thread so we can track it.

Thanks!

Badge +2

Hi Mike,

No case yet, will let you know!

Userlevel 4
Badge +12

@HvS Typically ClOraAgent appears to be hung when it is waiting for output from a SQL query that we run before the actual RMAN backup starts.

The debug level needs to be between 1 and 10. Could you reduce it to 6?

 

Rerun the job, and when it hangs again, could you share the ClOraAgent.log?

Badge +2

@Gowri Shankar Thanks, I reduced it to 6. The last SQL in the log is:

SELECT DATABASE_ROLE from V$DATABASE;

DATABASE_ROLE
----------------
PRIMARY

 

I shared only the last part of the ClOraAgent.log to avoid sharing hostnames and credentials…
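An alternative to trimming the log is to mask sensitive values before posting it. The following Python sketch shows one possible approach under stated assumptions: the function name, the placeholder tokens, and the list of key names treated as secrets are all hypothetical, and the real log may contain credentials in forms this pattern does not catch (for example Oracle connect strings), so the output should still be reviewed by hand.

```python
import re


def redact(text, hostnames, secret_keys=("password", "passwd", "pwd")):
    """Mask known hostnames and key=value style secrets in a block of log text."""
    # Replace longer hostnames first so a short name cannot clip a longer one.
    for host in sorted(hostnames, key=len, reverse=True):
        text = re.sub(re.escape(host), "<host>", text, flags=re.IGNORECASE)

    # Mask the value that follows keys such as "password=" or "pwd: ".
    keys = "|".join(re.escape(k) for k in secret_keys)
    pattern = re.compile(rf"\b({keys})(\s*[=:]\s*)(\S+)", re.IGNORECASE)
    return pattern.sub(r"\1\2<redacted>", text)


# Hypothetical sample line, not taken from a real agent log.
sample = "connect to DBHOST01 password=secret1"
print(redact(sample, ["dbhost01"]))
```

Passing the full list of server and database hostnames keeps the masking explicit, which is safer than trying to guess what a hostname looks like.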

 

Badge +2

CIOraagent.log

Userlevel 4
Badge +12

The part of the log that was shared is not conclusive; please open a support incident and share the ticket number.

 

Regards,

Gowri Shankar

Badge +2

Done. Ticket number: 220922-450

Userlevel 6
Badge +15

The case notes show no real RCA; however, reconfiguring the clients (including an update from FR26 to FR28) appeared to resolve the issue, with @HvS confirming that it was no longer occurring.

Marking thread as closed.
