Solved

RMAN jobs hang with status "running"


  • Bit
  • 5 replies

Hello,

 

Very strange problem with CommVault RMAN (archive) backups.

 

Sometimes an RMAN archivelog backup job hangs and stays in “running” status indefinitely.

It's not related to a specific Oracle instance; the problem occurs with all instances.

It appears to be random: sometimes the jobs run and finish OK in a couple of minutes, and sometimes they stay running forever.

 

Host (Windows Server 2016) memory and CPU performance are OK.

ClOraAgent.log logging stops at: OraObject::GetOraMode() - oraMode = READ WRITE

So the hang occurs before the RMAN script is called.

There is no entry in the database alert.log at that timestamp; it is simply skipped.
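
One way to catch the hang early is to watch how long ago the agent log was last written. Below is a minimal Python sketch of that idea, not a Commvault feature. The function names, the 10-minute threshold, and the log path in the usage comment are all assumptions for illustration; use the real log location of your installation.

```python
import os
import time


def seconds_since_last_write(log_path, now=None):
    """Return how many seconds ago the file was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(log_path)


def looks_stalled(log_path, threshold_seconds=600, now=None):
    """Return True if the log has been silent for longer than the threshold.

    The default threshold of 600 seconds is an assumption; pick a value
    well above the runtime of a normal archivelog backup job.
    """
    return seconds_since_last_write(log_path, now) > threshold_seconds


# Hypothetical usage while a job shows "running" (the path is a placeholder):
#     if looks_stalled(r"C:\path\to\Log Files\ClOraAgent.log"):
#         print("ClOraAgent.log has gone quiet, the job may be hung")
```

A silent log only suggests a hang while a job is actually in "running" status, so the check is meant to be combined with the job state, not used on its own.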

 

It looks like communication with the ClOraAgent.exe process is lost, although it is still running on the server…

Hope you have an idea how to solve this problem...

 

Best answer by Chris Hollis

Case notes show no real RCA, however reconfiguring of the clients (including update to FR28 from FR26) appeared to resolve the issue with @HvS  confirming the issue was no longer occurring.

Marking thread as closed.


10 replies

  • Vaulter
  • 4 replies
  • September 12, 2022

Hello,

 

Something that can help troubleshoot the issue would be to increase the debug level on the ClOraAgent process. I would recommend setting the debug level for the process to 8. I would also recommend increasing the File Versions to 10, since the log will be more verbose. The logging can be increased at the client level via Process Manager or through the GUI.

Setting Logging Parameters in the Process Manager:

https://documentation.commvault.com/11.24/expert/5554_setting_logging_parameters_in_process_manager_01.html 

Configuring Log File Settings in the CommCell Console:

https://documentation.commvault.com/11.24/expert/114289_configuring_log_file_settings_in_commcell_console.html 


If there isn't a clear reason why it is hanging, I would recommend creating a case with support so the team can review the logging at the higher debug level.

Thanks,
Tim


  • Author
  • Bit
  • 5 replies
  • September 13, 2022

Hi Tim,

Thank you for your suggestion to increase the log level.

 

Before my post I had already increased the debug level to 99, the file size to 50 MB and the file versions to 5.

This morning at 08:10 a hang occurred, so I tried to analyse the ClOraAgent.log and compare it with the log of a successful run.

Unfortunately I found no error or failure messages; it just stopped logging after 27 seconds.
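
The comparison with a successful run can be sketched in Python: take the last message of the hung log, find the same message in the good log, and list what the good run logged next. This is a rough sketch under stated assumptions. I assume every log line starts with five whitespace-separated prefix fields (process ID, thread ID, date, time, job ID) before the message, so adjust `prefix_fields` to the real layout. The function names are made up for illustration.

```python
def message_of(line, prefix_fields=5):
    """Return the message part of a log line, without its prefix fields.

    Assumption: the first `prefix_fields` whitespace-separated fields
    (PID, thread, date, time, job ID) differ between runs and can be dropped.
    """
    parts = line.strip().split(None, prefix_fields)
    return parts[prefix_fields] if len(parts) > prefix_fields else ""


def next_steps_after_hang(hung_lines, good_lines, prefix_fields=5, count=3):
    """List the messages a successful run logged after the hung run's last one."""
    hung = [m for m in (message_of(l, prefix_fields) for l in hung_lines) if m]
    good = [m for m in (message_of(l, prefix_fields) for l in good_lines) if m]
    if not hung:
        return []
    last_message = hung[-1]
    for index, message in enumerate(good):
        if message == last_message:
            # Everything after this point never happened in the hung run.
            return good[index + 1:index + 1 + count]
    return []  # the last hung message does not appear verbatim in the good log
```

The match is exact, so it only works when the last message contains no run-specific values. For the line quoted earlier in this thread (`OraObject::GetOraMode() - oraMode = READ WRITE`) that should hold.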

 

Guess I will create a support case...


Mike Struening
Vaulter

@HvS , please add the case number to this thread so we can track it.

Thanks!


  • Author
  • Bit
  • 5 replies
  • September 13, 2022

Hi Mike,

No case yet, will let you know!


Forum|alt.badge.img+13

@HvS, typically ClOraAgent appears to be hung when it is waiting for the output of a SQL query that we run before the actual RMAN backup starts.

The debug level needs to be between 1 and 10. Could you reduce it to 6?

 

Rerun the job, and when it hangs again, could you share the ClOraAgent.log?


  • Author
  • Bit
  • 5 replies
  • September 20, 2022

@Gowri Shankar Thanks, reduced it to 6. Last SQL in the log is :

SELECT DATABASE_ROLE from V$DATABASE;

DATABASE_ROLE
----------------
PRIMARY

 

I shared only the last part of the ClOraAgent.log to avoid exposing hostnames and credentials…
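
To check outside Commvault whether a simple query like this one answers promptly, it could be run through SQL*Plus under a timeout. The Python sketch below is only an illustration. It assumes `sqlplus` is on the PATH, that `ORACLE_SID` is set, and that OS authentication (`/ as sysdba`) works for the current user. The function names and the 30-second limit are my own choices, and nothing here says which query the agent was actually waiting on.

```python
import subprocess

# The query is taken from the log excerpt above; EXIT makes sqlplus terminate.
ROLE_QUERY = "SELECT DATABASE_ROLE FROM V$DATABASE;\nEXIT;\n"

# Assumption: sqlplus is on the PATH and OS authentication is allowed.
SQLPLUS_COMMAND = ["sqlplus", "-S", "-L", "/", "as", "sysdba"]


def run_with_timeout(command, stdin_text, timeout_seconds):
    """Run a command, feed it stdin_text, and give up after timeout_seconds.

    Returns ("finished", stdout) or ("timeout", "").
    """
    try:
        result = subprocess.run(
            command,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return ("timeout", "")
    return ("finished", result.stdout)


def check_database_role(timeout_seconds=30):
    """Replay the role query; a "timeout" status means the database did not answer."""
    return run_with_timeout(SQLPLUS_COMMAND, ROLE_QUERY, timeout_seconds)
```

If this returns `"timeout"` around the time a job hangs, the problem is more likely on the database side than in the agent. If it always finishes quickly, that points back to the agent or its connection.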

 


  • Author
  • Bit
  • 5 replies
  • September 20, 2022

CIOraagent.log


Forum|alt.badge.img+13

This part of the log is not conclusive. Please open a support incident and share the ticket number.

 

Regards,

Gowri Shankar


  • Author
  • Bit
  • 5 replies
  • September 27, 2022

Done. Ticket nr 220922-450


Chris Hollis
Vaulter
  • Vaulter
  • 333 replies
  • Answer
  • March 6, 2024

Case notes show no real RCA, however reconfiguring of the clients (including update to FR28 from FR26) appeared to resolve the issue with @HvS  confirming the issue was no longer occurring.

Marking thread as closed.

