Solved

HyperV backup did not delete/merge checkpoint disk after completion.


Hi,

My customer is having an issue with Hyper-V backups. A lot of VMs are left with a snapshot/checkpoint after the backup completes, so there are many AVHDX disks and the storage is filling up.

Microsoft said that the backup software should do the merge/delete after the backup completes. From my understanding, Hyper-V should do the merge/delete since it is a Hyper-V checkpoint; Commvault only instructs Hyper-V to delete/merge it.

Is there any way we can run the backup without a checkpoint?

Best answer by Mike Struening RETIRED 7 July 2022, 16:27


19 replies

Hi @Fauzi SDS,

I would expect Hyper-V to merge the checkpoints here. Do you see any errors in the Hyper-V VMMS Admin log on the Hyper-V hosts?

Does this occur on one host or all hosts, and for one VM or multiple VMs?

What checkpoint types are you using: Standard or Production?

 

Unfortunately, with Hyper-V VMs we do require a checkpoint in order to protect the VM.

 

Since the checkpoints are not clearing down and are causing issues, you could consider temporarily using FileSystem (and maybe Application) Agents inside the affected guest VMs to protect the OS/data.

 

Best Regards,
Michael

Hi Michael,

I’m asking the customer to check the Event Viewer on the hosts for any errors.

The issue occurs on several VMs across different hosts.

Mostly production checkpoints.

They are doing manual merges, but some of the VMs failed to merge manually, with a message saying that some process is holding the disk. Microsoft support said that the backup might be holding the disk. Is that possible?

The problem is that the checkpoint is deleted in Hyper-V Manager, but the checkpoint disk inside the VM folder is not deleted/merged. Even within one day, several checkpoint disks are created, which is odd if the backup only runs once. I attached a picture for your reference.

regards,

Fauzi
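
To put numbers on the leftover checkpoint disks described above, a minimal Python sketch could walk a VM folder and group the `.avhdx` files by day. This is only an illustration, not a Commvault or Hyper-V tool: the function names are made up for this example, and the last-modified time is used as a rough stand-in for when each differencing disk was last written.

```python
from collections import defaultdict
from datetime import datetime
from pathlib import Path


def summarize_avhdx(vm_folder):
    """Group .avhdx files under vm_folder by last-modified date.

    Returns {date: [(path, size_in_bytes), ...]}, ordered by date.
    """
    by_day = defaultdict(list)
    for path in Path(vm_folder).rglob("*"):
        # Match the extension case-insensitively (.avhdx, .AVHDX, ...).
        if path.is_file() and path.suffix.lower() == ".avhdx":
            stat = path.stat()
            day = datetime.fromtimestamp(stat.st_mtime).date()
            by_day[day].append((path, stat.st_size))
    return dict(sorted(by_day.items()))


def format_report(by_day):
    """Return one summary line per day: file count and total size in GiB."""
    lines = []
    for day, files in by_day.items():
        total_gib = sum(size for _, size in files) / 2**30
        lines.append(f"{day}: {len(files)} AVHDX file(s), {total_gib:.2f} GiB")
    return lines
```

Calling `format_report(summarize_avhdx(r"D:\Hyper-V\ExampleVM"))` (a hypothetical path) on a host would give one line per day, which makes it easier to see whether several differencing disks really appear on days with a single backup run.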

Hi @Fauzi SDS ,

It could be possible if a backup was running at that time. We attach and detach disks to the OS when performing a backup, so it may be worth checking whether any VHDX files are attached to the Hyper-V hosts when no backups are running.

Alternatively, something else could be holding a lock here. I saw some AV software do this a few years back, so perhaps check that the exclusions include the recommended Hyper-V and Commvault ones.

 

Regarding the multiple checkpoints per day, are you performing any replication or regularly scheduled checkpoints?
- Also, can you check the “Attempts” tab of the Hyper-V backup jobs to see whether the Backup phase is running multiple times?

 

Best Regards,

Michael

Hi @MichaelCapon ,

They disabled backups during the merge. Nothing other than the backup was configured at that time.

I’m asking them to capture every step after the manual merge, and then to make sure that no checkpoint disks or CBT disks remain. After that we will run a backup and see if it happens again. If it does, I will ask them to open tickets with support (Commvault & Microsoft).

Thanks.

Best regards,

Fauzi

@Fauzi SDS , hope all is well.

Any update on this one?

Thanks!

Hi Mike,

I have already logged a ticket. Support said that the issue is on the Hyper-V side, since the Commvault logs and the VMMS Admin events show that Hyper-V cannot merge/delete the checkpoint. The problem is that Microsoft support previously said that, since the backup uses the checkpoint, the backup needs to do the merge. So the server team pushed back to the backup team.

So now we are pushing back to the server team to go back to Microsoft with the logs provided by Commvault support. Most probably we will need to schedule a session for Microsoft and Commvault support to sit down and look into this together.

regards,

Fauzi

Appreciate the details!

Can you give me the ticket number so I can track accordingly?

Thanks!!

Hi Mike,

This is the ticket number 220623-114.

 

Thanks & Best Regards,

Fauzi

Good afternoon, is there any news regarding this matter? I have the same problem.

Thank you!

Sharing the case closure with any identifiable names removed (@Fauzi SDS , noting that the case is currently closed):

Finding Details:

The customer checked with MS, who referred them back to CV; however, the MS team has not provided any findings confirming that the issue originates on the CV side.
We already shared our findings in the previous case. Based on our review and log findings, we confirmed the issue is on the Hyper-V side.
Unfortunately, CV relies heavily on Microsoft's technology for checkpoint operations and the background disk merge process. This is a Microsoft issue unless they can prove otherwise.
We suggest sharing our findings with the MS team.

Findings from previous case:
------------------------------------------------------------------------------------------------------------------------
Hyper-V VM backup failed with the error "Virtual machine has snapshots older than configured expiry threshold hours.
The backup encountered errors while removing expired snapshots for the virtual machine. Please try removing them manually and retry backup."
The log findings confirm that the snapshot delete operation fails on the Hyper-V host.
Commvault cannot create or delete snapshots itself; CV only sends the requests to create and delete snapshots to the Hyper-V host.
From the logs, it’s clear that Commvault sent the request to delete the old snapshots, but the task failed on the Hyper-V side.

vsbkp.log
24716 ca50 05/29 21:40:08 697469 CHyperVInfo2::RemoveExpiredSnapshots() - Checking and removing any expired snapshots for VM[] Guid[73A7F85E-DA09-4671-AEA3-A73E936D8380]
24716 ca50 05/29 21:40:08 697469 Removing expired snapshot [__GX_BACKUP__688164_409_1650547153] Path [] for VM []. Sanpshot created time [22/04/2022 5:19:17 PG] reference time [28/05/2022 9:40:08 PTG]
24716 ca50 05/29 21:40:51 697469 HyperV task failed: [32768] [Job is completed with error] ['' failed to remove checkpoint. (Virtual machine ID 73A7F85E-DA09-4671-AEA3-A73E936D8380)

This finding matches with the Hyper V host machine event logs
Hyper-V-VMMS logs
Host machine () event viewer logs (Microsoft-Windows-Hyper-V-VMMS-Admin)
[TYPE] Error [TIME] 29/05/2022 9:40:50 PTG [SOURCE] Microsoft-Windows-Hyper-V-VMMS [COMPUTER]  [DESCRIPTION] '' failed to remove checkpoint. (Virtual machine ID 73A7F85E-DA09-4671-AEA3-A73E936D8380)
[TYPE] Error [TIME] 29/05/2022 9:40:50 PTG [SOURCE] Microsoft-Windows-Hyper-V-VMMS [COMPUTER]  [DESCRIPTION] Cannot delete checkpoint: Catastrophic failure (0x8000FFFF). Checkpoint ID AC400C54-FBD6-4183-98E8-76AE55B35690.

Since the customer mentioned that deleting the snapshot manually corrupts the virtual machine, we have not tried that step.
The issue occurs outside of CV; we suggest checking with Hyper-V / Microsoft support to fix the snapshot issue.
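
The cross-check between vsbkp.log and the VMMS Admin event log described in the findings above can be sketched in Python. This is a rough illustration only: the regular expressions are assumptions based solely on the few lines quoted in this thread, not on any documented log format, and `extract_failures` is a made-up helper name.

```python
import re

GUID = r"[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}"
VM_ID_RE = re.compile(rf"Virtual machine ID ({GUID})")
CHECKPOINT_ID_RE = re.compile(rf"Checkpoint ID ({GUID})")
HRESULT_RE = re.compile(r"\((0x[0-9A-Fa-f]{8})\)")

# Sample lines copied from the findings quoted in this thread.
VSBKP_LINES = [
    "24716 ca50 05/29 21:40:51 697469 HyperV task failed: [32768] "
    "[Job is completed with error] ['' failed to remove checkpoint. "
    "(Virtual machine ID 73A7F85E-DA09-4671-AEA3-A73E936D8380)",
]
VMMS_LINES = [
    "[TYPE] Error [TIME] 29/05/2022 9:40:50 PTG [SOURCE] Microsoft-Windows-Hyper-V-VMMS "
    "[COMPUTER]  [DESCRIPTION] '' failed to remove checkpoint. "
    "(Virtual machine ID 73A7F85E-DA09-4671-AEA3-A73E936D8380)",
    "[TYPE] Error [TIME] 29/05/2022 9:40:50 PTG [SOURCE] Microsoft-Windows-Hyper-V-VMMS "
    "[COMPUTER]  [DESCRIPTION] Cannot delete checkpoint: Catastrophic failure (0x8000FFFF). "
    "Checkpoint ID AC400C54-FBD6-4183-98E8-76AE55B35690.",
]


def extract_failures(lines):
    """Collect VM IDs, checkpoint IDs and error codes from checkpoint-removal failures."""
    found = {"vm_ids": set(), "checkpoint_ids": set(), "hresults": set()}
    for line in lines:
        lowered = line.lower()
        # Only look at lines that report a failed checkpoint removal.
        if ("failed to remove checkpoint" not in lowered
                and "cannot delete checkpoint" not in lowered):
            continue
        found["vm_ids"].update(g.upper() for g in VM_ID_RE.findall(line))
        found["checkpoint_ids"].update(g.upper() for g in CHECKPOINT_ID_RE.findall(line))
        found["hresults"].update(HRESULT_RE.findall(line))
    return found


vsbkp = extract_failures(VSBKP_LINES)
vmms = extract_failures(VMMS_LINES)

# VM IDs that fail in both logs point at the host-side removal as the failing step.
print("VMs failing in both logs:", sorted(vsbkp["vm_ids"] & vmms["vm_ids"]))
print("Checkpoint IDs from VMMS:", sorted(vmms["checkpoint_ids"]))
print("Error codes from VMMS:", sorted(vmms["hresults"]))
```

For the lines quoted above, the same VM ID shows up in both logs, and the VMMS side adds the checkpoint ID and the 0x8000FFFF code, which is the kind of evidence that can be handed to Microsoft support.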

Hi Mike,

Support said they cannot leave the ticket open, so it was closed after one week. I will open another ticket referring to this one once there is an update from the server team.

Thanks,

Fauzi

Ok, thanks @Fauzi SDS !

Hi Mike,

This seems to be a never ending story :).

I have already created ticket no. 220805-124 for your reference.

 

regards,

Fauzi

Following, thanks!!

Hi, any update on this? In our case there’s nothing in the logs from successful jobs, but new jobs start failing after a few days because of additional, unmerged AVHDX files for the disks.

 

 

@LukasM , this is the closure for @Fauzi SDS’s case, though it’s not definitive.

If this helps you out, awesome.  If not, likely worth opening a case on your side as well.

Finding Details:

The customer was having issues with a few Hyper-V VM backups that were failing with a consolidation error.
The Microsoft team has clarified that they consolidated the disks manually and the issue is resolved.
We have also confirmed with you that the issue is no longer occurring and no errors have been raised since.

However, the customer would like an RCA for the issue.
Based on that request, we reviewed the shared screenshot and found 3 left-behind snapshots. The initial snapshot was created by a different application or manually; the follow-up snapshots that were left behind were created by Commvault on 3rd June '22.
We cannot identify who or which software created the initial snapshot, and there are no entries or logs that point to the issue.
We also cannot re-create the issue, since we have no information about the snapshot created on 1st June '22.

Since we cannot re-create the issue, we cannot perform the RCA either.
 

Solution:

After manual consolidation, the issue with the backups was resolved.

Hmm, in our case it’s something different, but still on the Hyper-V side: the software itself cannot consolidate/merge the AVHDX files in the background, only when the VM is offline, and that’s causing backups to fail. The workaround is a VM reboot.

If you can, open a support case, share the incident number here, and I’ll split this off into its own thread for better tracking 🤓

Sure, I will open a new one when necessary, probably with MS support first.
