Solved

VSA V1 indexing - Index not found when primary proxy offline

  • 8 April 2021
  • 5 replies
  • 459 views

Userlevel 3
Badge +9

I’m deploying a two-node grid pair (local storage shared using dataserver IP). Both servers also have the Virtual Server Agent deployed and are defined in the pseudo client.

When I take node 1 offline (the top node in the VSA proxy order) to simulate a failure, the VSA backups go to a waiting state with “14:201” Index from previous job was not found on MediaAgent. When I take node 2 offline, the MediaAgent resources fail over as you’d expect.

The current VMware pseudo client uses V1 indexing. If I create a new pseudo client and upgrade it to V2 indexing, will this resolve the issue and allow jobs to continue to run if the primary VSA proxy/MediaAgent is offline?

Also, I know that you need to disable the existing pseudo client, create a new one, and run the workflows to update / import configs. Can I get away with just creating a new pseudo client and running the upgrade to V2 workflow (I’ll happily re-create the configuration), leaving the old pseudo client running the legacy environment until I cut it over?

Best answer by NunoS 8 April 2021, 15:04

5 replies

Userlevel 7
Badge +23

Migrating to v2 indexing would not solve this challenge. The v2 index still needs to be housed on one of the MAs, which will be known as the Index Cache Server for this client. If that MA is down, the same thing would need to happen: the new MA would have an index restore kicked off to bring back the index from backup media.

Nuno 

Thanks @NunoS,

As we have local disk on the MediaAgents shared via dataserver IP, a MediaAgent failure would result in the disk paths being unavailable. I’m still in the process of putting in the offsite MediaAgent, which will take synchronous DASH copies, so the index should be able to be restored from there. I’ll re-run the test when that is complete.

Thanks again.

Hey Michael,

It's been a while, but I’m of the opinion that with V2 the job should not go pending to perform an index restore in the way it does with V1. With V1, the job had to copy the complete index from the cache, modify it locally, and then copy that index back to the MediaAgent when the job was done. With V2, indexes use ‘action logs’, which do not rely on the index database being available for the backupset/subclient. The index database only lives on one MediaAgent and is not transferred to the client - instead, the action logs, which only apply to that particular job, are shipped and applied to the index DB. The job may delay a bit at the archive index phase as it tries to contact the MediaAgent hosting the index, but it should start and complete without issue.
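To make the action-log idea concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour described above, not Commvault code: `IndexServer`, `BackupJob`, `finish_v2_job` and `catch_up` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IndexServer:
    """Toy stand-in for the MediaAgent that hosts the V2 index database."""
    online: bool = True
    applied: list = field(default_factory=list)  # action logs already played into the index DB


@dataclass
class BackupJob:
    job_id: int
    items: list


def finish_v2_job(job, index_server, pending_logs):
    """Finish a job under the V2-style model described in the post.

    The job only produces an action log for its own items. If the index
    host is reachable, the log is applied straight away; otherwise it is
    parked in pending_logs and the job still completes.
    """
    action_log = {"job_id": job.job_id, "items": list(job.items)}
    if index_server.online:
        index_server.applied.append(action_log)
    else:
        pending_logs.append(action_log)
    return "completed"


def catch_up(index_server, pending_logs):
    """Play parked action logs into the index DB once its host is back online."""
    if not index_server.online:
        return 0
    count = len(pending_logs)
    index_server.applied.extend(pending_logs)
    pending_logs.clear()
    return count
```

In this model a job finishing while the index host is offline still returns "completed", and its log is picked up later by `catch_up`, which is the contrast with V1 that the post is drawing.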

There is an allowed threshold for processing the index against the active database. When the MediaAgent comes online again, it should start filtering through and collecting action logs from the other MediaAgents where they ended up. If the outage passes the threshold (which I think is over a week), then I believe it will start an index reconstruction from backup and move the index to a new home on an available MediaAgent.
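The threshold logic can be sketched as a simple decision function. This is an illustration only: the post gives the threshold only as "I think is over a week", so the 7-day value below is a placeholder assumption, and the function name and return strings are hypothetical.

```python
from datetime import timedelta

# Assumption: the post only says the threshold is "over a week"; 7 days is a placeholder.
ASSUMED_THRESHOLD = timedelta(days=7)


def index_recovery_action(offline_for, threshold=ASSUMED_THRESHOLD):
    """Pick the recovery path for a V2 index whose hosting MediaAgent was offline.

    Within the threshold, the index stays where it is and the parked action
    logs are replayed when the host returns. Past the threshold, the index is
    rebuilt from backup on another available MediaAgent.
    """
    if offline_for <= threshold:
        return "wait_and_replay_action_logs"
    return "reconstruct_index_on_another_media_agent"
```

For example, `index_recovery_action(timedelta(hours=6))` takes the replay path, while `index_recovery_action(timedelta(days=10))` takes the reconstruction path under the assumed threshold.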

The other advantages of V2 are the ability to individually control VMs in a job, single VM backup and multi-stream synthetic full amongst others. V2 works with tape too but you must make sure your index backups are kept and reside in the same location as your backup tapes - as DR scenarios can be tricky if not (but don’t worry, Commvault has rules to ensure this is the case automatically). With V1, complete indexes were stored with the backup data so it was a smoother recovery. But this, of course, was a waste of space and inefficient vs the V2 model.

Userlevel 2
Badge +2

@Michael Woodward,

The VSA v1 backup job that goes into waiting should then kick off an index restore from the backup media to restore the index to the MA that is online, and proceed with the backup from there, as long as the second MA has access to the data path that the backup jobs were written to. If this is not working, it is best to open a support case to investigate.
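The condition in the paragraph above can be written out as a small decision function. This is a hedged sketch of the described behaviour, not product logic: the function name, parameters and return strings are hypothetical.

```python
def v1_job_next_step(primary_ma_online, secondary_has_data_path):
    """Return what a V1 VSA job is expected to do, per the description above.

    primary_ma_online: the MA holding the current index cache is reachable.
    secondary_has_data_path: the surviving MA can read the library or disk
    path that earlier backups (and their index copies) were written to.
    """
    if primary_ma_online:
        return "run_on_primary"
    if secondary_has_data_path:
        # Index is restored from backup media to the surviving MA first.
        return "restore_index_then_run_on_secondary"
    # No reachable copy of the index: the job stays in waiting.
    return "waiting"
```

With local disk that is only reachable through the failed node, the second argument is effectively false, which matches the waiting state reported in the original question.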

Migrating to v2 indexing would not solve this challenge. The v2 index still needs to be housed on one of the MAs, which will be known as the Index Cache Server for this client. If that MA is down, the same thing would need to happen: the new MA would have an index restore kicked off to bring back the index from backup media.

For more information on VSA with Indexing v2, we have a comparison chart, as there are some differences in current capabilities between v1 and v2: Comparison of VSA Features with Indexing Version 1 and Indexing Version 2 - https://documentation.commvault.com/commvault/v11_sp20/article?p=114107.htm

 

Nuno 

Userlevel 3
Badge +9

Hi @Michael Woodward 

Thank you for the question, I’ve got a couple more questions:

  1. What Commvault version is the environment you’re running, and how far back is the job history for those VMs?
  2. If there are pre-SP16 VSA jobs for the VMs, these won’t be VM-centric, so will the V2 Index workflow need to be run against the current VSA pseudo client?

I’ll check with the team internally in support to confirm my understanding and get some more info for you.

Thanks,

Stuart

Hi Stuart,

Thanks for the response,

  1. The CommCell is running 11.20.9. The current VSA pseudo client has significant job history that would be needed for restore only (moving to a new pseudo client would be a good opportunity to tidy up, as we are implementing IntelliSnap and will be going to a datastore affinity based model).
  2. If moving to a V2 indexing structure for the new pseudo client will help the backups fail over to the other proxy, then it’s my understanding we can run the workflow prior to any jobs being in the new pseudo client. 
