
This is a follow-up to my initial post about FR22.3.

This is more of a findings topic/conversation.

I had three older 2k8r2 media agents (now replaced) that experienced widespread issues after going to FR22.3.

NOTE: None of these issues are/were recorded with Commvault as actual issues. The decision to replace/migrate the OS was made at the 11th hour, after weeks of working on these issues.

The basic application appears to work just fine on 2k8r2 with FR22.3: readiness checks pass, services are running, jobs can run, etc.

The issue we were running into was consistent across all three. They were also the only 2k8r2 media agents in our environment, so I suspected the OS was the issue. It seemed too coincidental not to be.

Within 4 hours of the FR22.3 update, our jobs started experiencing some or all of the following errors:

Pipeline errors

Media mount services 

Device not ready

Library full

Even when attempting to select new snap mount hosts for jobs, I was getting connection refused messages in the GXTail event logs.

The most consistent issue I could see across all systems was the flapping of the CVD services, most commonly the CVD.EXE service. I went through full software installs, updates, patches, everything. The issue would disappear for about 45-90 minutes, then start again.

We had multiple tickets open with MA, but the issue could not be pinpointed. We sent countless logs and had various Zoom calls with support, only to be told they would go to Dev. I am not complaining; it's a hard thing to pinpoint.

On a whim, I decided to do an in-place upgrade to 2012r2 on one of the media agents that was low priority but was having the issue. After running all the updates and drivers, the media agent was up and running, and in the 24 hours since it has not had one issue. We didn't reinstall any Commvault software; we just updated the OS, and that fixed the issues.

We replaced all 3 systems with 2k12 and 2k16 (no 2k19 licenses currently) and everything is at 100%. SLA is returning and all jobs are completing without having to restart services or jobs.

So, if you have a 2k8r2 media agent and are having random, inconsistent service issues and failures, here is your problem.

 

 

I assume dev was provided dumps of CVD? I wonder if it was related to .NET or the C++ redistributable. I have a 2k8 SQL machine hanging around; I'll see what happens in 11.23.


Thanks, Matthew - Windows 2008 R2 was deprecated for Commvault infrastructure some time ago. That did not necessarily mean it stopped working, or that we would not try to resolve issues on it, but it did mean that we no longer tested that OS, so it was likely that at some point it would start to have unforeseen problems.

I recall that one of the primary reasons was that the network subsystems could get overwhelmed in larger environments, and Microsoft made large improvements in 2012+ that increased scale immensely (here is one example)

 

Thank you for sharing this!



Yeah, we are aware of the deprecation, but you know how things go. It kept running, so we kept using it, with plans to replace hardware etc. Logs were provided over and over. They did their best, but in the end the OS was the culprit. The help we got was top notch as always, but it was time to just rip off the band-aid.



The endpoints for backups worked fine. I didn't have any issues except with the machines running as media agents.

