Question

Restore speed vmware

  • 17 September 2021
  • 14 replies
  • 173 views

Userlevel 1
Badge +4

Hi All.

 

I am having issues with restore speed when restoring a VMware server.

 

When a restore is done with “vCenter Client” set to the vCenter, restore speeds are slow.

When a restore is done with “vCenter Client” set directly to the ESXi host, bypassing the vCenter, restores are about 4 times faster.

 

Can anyone explain this behavior? I thought the vCenter was only used for control data and not for data movement.

 

Regards

-Anders


14 replies

Userlevel 5
Badge +12

Hi @ApK ,

The vCenter should only be used for control data here, such as create VM, create VM Snap, etc.

What transport method was used for both jobs here? Was the same disk provisioning used also?

 

In the vsrst.log on the VSA Proxy used for restore, you should see counters under Stat-. These should give a good indication of the media read and disk write speeds here.

I’d suggest reviewing the logs for both jobs and comparing them. There may have been an operation that took longer, or a difference in speeds for some reason. Hopefully the logs will give more insight into this!
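As a rough aid for that comparison, here is a minimal Python sketch that pulls the stat- counters out of vsrst.log lines and groups them per job. It assumes the line layout shown in the excerpts posted later in this thread, and it assumes that the number just before `stat-` is the job ID. The names `STAT_RE` and `parse_stats` are made up for this illustration and are not part of any Commvault tool.

```python
import re
from collections import defaultdict

# Assumed layout (taken from the excerpts in this thread):
# MM/DD HH:MM:SS <number> stat- ID [<counter>], Bytes [<n>], Time [<s>] Sec(s), Average Speed [<x>] MB/Sec
STAT_RE = re.compile(
    r"^\s*(?P<date>\d{2}/\d{2}) (?P<clock>\d{2}:\d{2}:\d{2}) (?P<job>\d+) stat- "
    r"ID \[(?P<counter>.+)\], Bytes \[(?P<bytes>\d+)\], "
    r"Time \[(?P<secs>[\d.]+)\] Sec\(s\), Average Speed \[(?P<speed>[\d.]+)\] MB/Sec"
)

def parse_stats(lines):
    """Group stat- counters by the number preceding 'stat-' (assumed to be the job ID)."""
    jobs = defaultdict(dict)
    for line in lines:
        m = STAT_RE.match(line)
        if m is None:
            continue  # not a stat- line, skip it
        jobs[m["job"]][m["counter"]] = {
            "bytes": int(m["bytes"]),
            "seconds": float(m["secs"]),
            "mb_per_sec": float(m["speed"]),
        }
    return dict(jobs)

# Sample lines copied from the excerpts posted in this thread.
SAMPLE = [
    "09/17 10:31:57 10456669 stat- ID [readmedia], Bytes [83501234239], Time [43.402569] Sec(s), Average Speed [1834.752742] MB/Sec",
    "09/17 10:31:58 10456669 stat- ID [Datastore Write [SN771-D2250-L0009]], Bytes [91067777024], Time [708.095798] Sec(s), Average Speed [122.651483] MB/Sec",
    "09/20 12:05:19 10481089 stat- ID [readmedia], Bytes [152791654482], Time [5328.833140] Sec(s), Average Speed [27.344350] MB/Sec",
    "09/20 12:05:21 10481089 stat- ID [writedisk], Bytes [162756820992], Time [1126.369482] Sec(s), Average Speed [137.802917] MB/Sec",
]

if __name__ == "__main__":
    for job, counters in parse_stats(SAMPLE).items():
        for name, c in counters.items():
            print(f"{job}  {name:<40} {c['seconds']:>10.1f} s  {c['mb_per_sec']:>10.1f} MB/Sec")
```

To use it on a real log, pass the open file instead of `SAMPLE`, for example `parse_stats(open(path))` where `path` points at a copy of vsrst.log. Printing the counters for both jobs side by side should show whether the read side or the write side accounts for the difference.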

 

Best Regards,

Michael

Userlevel 1
Badge +4

Hi Michael.

Thanks for your reply.

 

That was my own thought too, that the vCenter was only used for control data. That’s why I’m wondering what is happening here.

 

I’m using NBD transport for the restores, with thin-provisioned disks.

 

I ran 10 tests this morning, and all restores done directly via the ESXi host are 3-4 times faster.

 

I checked the vsrst.log file, and the MediaAgent read speeds are fast, so that is definitely not the issue. The issue is that the vCenter is somehow involved in the restore.

 

Regards

-Anders

Userlevel 5
Badge +12

Thanks @ApK ,

Would you be able to share the vsrst.log and the Job IDs of one vCenter restore and one ESX restore?

 

Best Regards,

Michael

Userlevel 1
Badge +4

Hi Michael.

 

Would it be better to raise a case for this issue, to further investigate?

 

Thanks

-Anders

Userlevel 5
Badge +12

Hi @ApK ,

 

Yes, you can raise a case for this. We’ll need the logs and the Job IDs to check it further.

Once raised, let us know the case number and we can monitor it internally.

 

Best Regards,

Michael

Userlevel 7
Badge +15

Hey folks,

This sounds like a textbook case of “clear lazy zero” if you are doing SAN restores - article here:

https://documentation.commvault.com/11.24/expert/32721_vmw0074_san_mode_restores_slow_down_and_display_clear_lazy_zero_or_allocate_blocks_vmware.html

 

I was writing the description but the KB article sums it up well

Userlevel 1
Badge +4

Hi Michael.

I have run some more restore tests, and the vsrst.log for those restores shows some crazy differences in readmedia speed, so I might have a different issue than I thought in the beginning, when I assumed it was a vCenter issue.

 

From the vsrst.log file:

Same VMware server restore, same HyperScale server, two different dates, very different read speeds.

 
09/17 10:31:52 10456669 stat- ID [writedisk], Bytes [90393542656], Time [702.082991] Sec(s), Average Speed [122.786054] MB/Sec
09/17 10:31:57 10456669 stat- ID [readmedia], Bytes [83501234239], Time [43.402569] Sec(s), Average Speed [1834.752742] MB/Sec
09/17 10:31:58 10456669 stat- ID [Datastore Write [SN771-D2250-L0009]], Bytes [91067777024], Time [708.095798] Sec(s), Average Speed [122.651483] MB/Sec
 
 
09/20 12:05:19 10481089 stat- ID [readmedia], Bytes [152791654482], Time [5328.833140] Sec(s), Average Speed [27.344350] MB/Sec
09/20 12:05:21 10481089 stat- ID [Datastore Write [SN771-D224E-L0008]], Bytes [162756820992], Time [1126.319442] Sec(s), Average Speed [137.809039] MB/Sec
09/20 12:05:21 10481089 stat- ID [writedisk], Bytes [162756820992], Time [1126.369482] Sec(s), Average Speed [137.802917] MB/Sec
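
The counters above can be cross-checked with a little arithmetic. The sketch below recomputes the readmedia speed from the logged byte counts and times. The helper name `mb_per_sec` is made up for this illustration, and the use of 1024 * 1024 bytes per "MB" is an assumption, although it is the divisor that reproduces the logged figures.

```python
MIB = 1024 * 1024  # assumption: the log's "MB" means 2**20 bytes

def mb_per_sec(nbytes, seconds):
    """Average throughput in MB/Sec for a byte count and a duration in seconds."""
    return nbytes / seconds / MIB

# (bytes, seconds, logged MB/Sec) for the readmedia counter, copied from the log lines above
readmedia = {
    "09/17 job 10456669": (83501234239, 43.402569, 1834.752742),
    "09/20 job 10481089": (152791654482, 5328.833140, 27.344350),
}

if __name__ == "__main__":
    for label, (nbytes, seconds, logged) in readmedia.items():
        print(f"{label}: computed {mb_per_sec(nbytes, seconds):.2f} MB/Sec, logged {logged:.2f} MB/Sec")
    fast = mb_per_sec(*readmedia["09/17 job 10456669"][:2])
    slow = mb_per_sec(*readmedia["09/20 job 10481089"][:2])
    print(f"readmedia slowdown factor: {fast / slow:.1f}")
```

The recomputed values match the logged ones, and the readmedia rate on 09/20 comes out roughly 67 times lower than on 09/17. In the 09/20 job, readmedia also took about 5329 seconds against about 1126 seconds for writedisk, while the write speeds of the two jobs are close (about 123 and 138 MB/Sec), so the difference is on the read side.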

 

I will create a case to have this investigated.

 

@Damian Andre, the restores were done via NBD, but thanks for your suggestion :-)

 

Regards

-Anders

Userlevel 7
Badge +15

I hate it when my hunch is wrong :joy:

Were both restore tests from the same source job or from different jobs? It would be interesting to run it again to see whether the results are consistent with the last run and, if so, what the difference between the jobs is.

Userlevel 1
Badge +4

Hi Damian.

:grinning:

Same VMware server restored, same HyperScale server as proxy. The last couple of restores have been really slow; I just ran a new one, and it was also really slow.

 

I have created a case now.

Userlevel 7
Badge +15

Sounds good. Be sure to let us know the outcome!

Userlevel 7
Badge +21

@ApK , can you share the case number with me so I can track it properly?

Userlevel 1
Badge +4

Hi Mike.

Case number is: 210920-319

 

Regards

-Anders

Userlevel 7
Badge +21

Thanks! I see you are working with Alexis. You’re in great hands!

Userlevel 7
Badge +21

Updating the thread as per the case notes: Development and Alexis discovered that the majority of the job duration was spent opening SFiles, so you have sealed the store and are monitoring.
