Solved

HyperScale X performance when it comes to auxcopy to tape?

  • 15 February 2021
  • 18 replies
  • 797 views

  • Anonymous
  • 0 replies

Hi guys,

I have a customer who is currently using standard MediaAgents with Commvault: copy 1 goes to disk (PureFlash with NVMe drives), copy 2 goes to another PureFlash, and copy 3 goes to tape. The weekly fulls send around 350 TB per week to tape (LTO-7) across 4 tape drives, at a sustained throughput of 700 GB/hour per drive (4 drives in parallel).

 

We are looking to replace both MediaAgents with two HyperScale X clusters.

 

The questions are: how do we need to configure HyperScale X (reference architecture) to sustain the weekly tape creation of 350 TB per week at the same throughput, knowing that we are going to use NearLine SAS drives in the HyperScale X cluster? Or can we use SSDs for the storage pool drives in a HyperScale X?
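To make the target concrete, here is the rough back-of-envelope math I am working from (just a sketch, assuming the same 4 drives at ~700 GB/hour each and the 350 TB weekly full mentioned above):

```python
# Back-of-envelope check of the weekly tape window, using the figures above.
# Assumptions: 350 TB per weekly full, 4 x LTO-7 drives, ~700 GB/hour per drive.

weekly_full_tb = 350          # TB written to tape per week
drives = 4
per_drive_gb_per_hour = 700   # sustained throughput per LTO-7 drive

aggregate_tb_per_hour = drives * per_drive_gb_per_hour / 1000    # 2.8 TB/hour
hours_needed = weekly_full_tb / aggregate_tb_per_hour            # ~125 hours

print(f"Aggregate tape throughput: {aggregate_tb_per_hour:.1f} TB/hour")
print(f"Hours needed for 350 TB:   {hours_needed:.0f} of the 168 hours in a week")
```

So whatever we propose has to keep all four drives streaming for most of the week.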


Best answer by Marco Lachance 5 April 2021, 17:54


18 replies

Userlevel 3
Badge +5

@Marco Please correct me if I have misunderstood your requirement/question. Are you looking to replace copy 1 and copy 2 with HyperScale clusters and retain tape for copy 3? Assuming this is correct, your follow-up question is whether HyperScale can copy 350 TB per week to tape (at 700 GB/hour throughput per drive). Since you are copying to tape, I am also assuming the above throughput is without deduplication. Feel free to correct me if my assumptions are not valid.

Now, allow me to state a few observations/facts from internal benchmarking tests we have performed with HyperScale X and auxcopies. In a setup with both the primary and secondary copy written to two HyperScale X clusters (HS 4300 appliances, which are 12-drive nodes with 14 TB drives), we observed auxcopy speeds of 12 TB/hour (baseline, without deduplication), which far exceeds your target of 700 GB/hour per drive. In your scenario you are writing to tape, but that should not affect HyperScale performance. In the worst case, writing to tape could act as the bottleneck and limit your overall auxcopy throughput to the 700 GB/hour per drive that is your baseline expectation.

While I do not want to make too generalized a statement, our internal tests with HyperScale X across various operations (backups, restores, auxcopies, etc.) indicate that the bottleneck is most often either the source application (reads or writes during backups/restores), network speeds, or the auxcopy target's ingest (tape or cloud). HyperScale X is usually not the bottleneck.

@Pavan Bedadala Hi, just to correct your first explanation: you are almost right. Copies #1 and #2 will indeed be replaced with HyperScale X, but WITH dedupe. So for copy #3 the data will have to be un-deduped to send it to tape. That is where my concern is.

Because yes, I agree with you that the aux copy between copies #1 and #2 will have a high throughput. I already have a lot of customers with two HyperScale clusters, and that is the kind of speed I am seeing across sites (around 12 TB/hour).

Thank you again.

Userlevel 3
Badge +5

@Marco Got it: the first two copies on HyperScale X have dedupe enabled and the third copy to tape does not, hence your concern about how fast HyperScale X can un-dedupe copy data to tape. Even in our internal tests with HyperScale X and auxcopy, it is the baseline auxcopy job that showed the 12 TB/hour performance. In the baseline case the auxcopy destination was also HyperScale, but remember that since the primary and secondary copies were set up with different dedupe databases, the first job has to read raw data (after un-dedupe) and write it to the destination. Whether the destination dedupes or not does not change the fact that the source has to read raw data, and since the overall process ran at 12 TB/hour, it implies that HyperScale X can read raw data from the source at that pace. In our tests the destination was also able to write at the same 12 TB/hour; when you replace it with tape, which can only write at 700 GB/hour per drive, that becomes your system throughput for the auxcopy. So I do not see a concern on the HyperScale side with meeting your auxcopy SLA objectives using tapes.
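To put that reasoning in concrete terms, here is a minimal sketch (assuming the 12 TB/hour figure from our internal baseline and your 4 drives at ~700 GB/hour each):

```python
# The auxcopy runs at the pace of the slower side: HyperScale X reading and
# un-deduping vs. the tape drives writing.

hyperscale_read_tb_per_hour = 12.0        # internal baseline auxcopy figure
tape_write_tb_per_hour = 4 * 700 / 1000   # 4 drives x ~700 GB/hour = 2.8 TB/hour

system_tb_per_hour = min(hyperscale_read_tb_per_hour, tape_write_tb_per_hour)
print(f"Expected auxcopy throughput to tape: {system_tb_per_hour:.1f} TB/hour")
# -> 2.8 TB/hour, i.e. the tape drives, not HyperScale X, set the pace.
```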

By the way, thank you for validating our benchmark claim of 12 TB/hour; confirmation from real users is always satisfying.

@Pavan Bedadala Thank you, but the customer is currently getting a throughput of 2.8 TB/hour, since he is writing copy #3 with 4 tape drives in parallel from the PureFlash system. Is the Hedvig file system more efficient than NTFS at un-deduping the data and sending it to tape?

 

Since the customer is going with the reference architecture (we cannot do a try-and-buy), I need to be sure that the configuration we propose will achieve the needed throughput.

 

Do you have any benchmarks for tape copies from HyperScale? For example, what is the maximum number of drives per node, and what kind of throughput do we get?

 

Do you know if we can use SSD drives for the storage pool in a reference architecture configuration? Or would that be overkill?

Userlevel 3
Badge +5

@Marco Unfortunately, we do not have performance data for copying to tape from HyperScale X. All our tests used disk libraries, with dedupe, as the secondary copy. If it helps, we did a restore test on the HyperScale X Reference Architecture (24 drives) and observed a restore performance of 11 TB/hour. Since a restore un-dedupes the data and writes it to the application (file system), this is the closest metric I can offer. Assuming that writing to a file system is comparable to tape (at the 2.8 TB/hour aggregate you mention above), all I can conclude is that HyperScale X can un-dedupe and push data out at 11 TB/hour.
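For rough context only, assuming that 11 TB/hour egress figure and the ~700 GB/hour per drive you quoted (and ignoring FC/SAN and library limits), this is the kind of headroom estimate I would make:

```python
# Rough headroom estimate: how many ~700 GB/hour tape drives the observed
# 11 TB/hour un-dedupe/egress rate could feed in parallel.

egress_tb_per_hour = 11.0
per_drive_tb_per_hour = 0.7

drives_supported = egress_tb_per_hour / per_drive_tb_per_hour
print(f"~{drives_supported:.0f} drives' worth of headroom")   # roughly 16 drives
```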

Reference architecture designs do not support SSD as data disks yet. 

@Pavan Bedadala Hi, another question for you: we may be able to do a try-and-buy of the reference architecture after all, with two four-node clusters of 24 drives each. If we attach 4 LTO-8 drives to one cluster, what would be the best connectivity? One drive on each node, or two drives on one node and two on a second node? Or something else?

Thank you again.

Userlevel 7
Badge +19

@Marco Are the drives not part of a library? Why would you want to create a 1-on-1 relationship between a HyperScale cluster node and a tape drive? Why not add two FC switches and connect all HyperScale nodes to the fabric over FC? That would also add resilience if one of the nodes is down.

Userlevel 3
Badge +5

@Marco I would recommend distributing the drives across the nodes so that you not only leverage multiple MediaAgents writing to them in parallel, but also build resilience to node failures, so that the auxcopy can continue even if a node goes down.
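Just to illustrate what I mean by distributing the drives, a trivial round-robin sketch; the node and drive names below are placeholders, not actual Commvault configuration:

```python
# Placeholder round-robin assignment of tape drives to HyperScale nodes, so each
# MediaAgent writes in parallel and losing one node only takes out its own drive.

nodes = ["node1", "node2", "node3", "node4"]            # hypothetical node names
tape_drives = ["lto8-1", "lto8-2", "lto8-3", "lto8-4"]  # hypothetical drive names

assignment = {drive: nodes[i % len(nodes)] for i, drive in enumerate(tape_drives)}
for drive, node in assignment.items():
    print(f"{drive} -> {node}")
```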

Userlevel 3
Badge +5

Ok, my message crossed with Onno's response. I also like his idea of designing fabric connectivity between the HyperScale nodes and the drives.

Good, that is what I was thinking: presenting all drives to all nodes in the cluster, in the case where I have an FC switch.

But if that is not the case and I have to direct-attach four drives to four nodes, what would the connectivity recommendation be?

 

Also, what amount of restore data produced that 11 TB/hour figure?

Userlevel 3
Badge +5

82 clients with 4 streams per client and 16 TB of total application data. The tests were run multiple times to arrive at the 11 TB/hour average. Hope this helps.
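As a quick sanity check on those figures (purely arithmetic, using only the numbers above):

```python
# Quick arithmetic on the restore benchmark figures quoted above.

clients = 82
streams_per_client = 4
total_data_tb = 16
avg_throughput_tb_per_hour = 11

total_streams = clients * streams_per_client                        # 328 streams
restore_minutes = total_data_tb / avg_throughput_tb_per_hour * 60   # ~87 minutes
per_stream_gb_per_hour = avg_throughput_tb_per_hour * 1000 / total_streams  # ~34 GB/hour

print(f"{total_streams} streams, ~{restore_minutes:.0f} minute restore, "
      f"~{per_stream_gb_per_hour:.0f} GB/hour per stream")
```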

@Pavan Bedadala Thank you again, very good info. What kind of clients were the 82: file system, VMs, databases?

Userlevel 3
Badge +5

@Marco They are all file system clients. We intend to test other agents in the future, but our starting point was the file system.

Userlevel 2
Badge +7

@Pavan Bedadala Hi again, I have another question for you.

Maybe it's not supported, but it's only for test purposes.

I'm trying to deploy HyperScale X inside VMware. I have deployed the ISO on three VMs without any issue, but when I use the HyperScale browser to set up my storage pool network IPs and CommServe registration IPs, I get the error: "The node Vmware-422…… does not have /hedvig/d2 mounted. Please verify." Any clue?

Thank you

By the way, my Commvault email has been cancelled since I am no longer working as an official contractor; I am now in the Commvault Partner class. That's why my previous replies show as Anonymous. So now it's marco.lachance@iti.ca, also known as marco.lachance@procontact.ca (ITI is the new branding).

 

Userlevel 2
Badge +7

I finally found the solution to run it inside a virtual environment. Sorry for bothering you.

Userlevel 7
Badge +23

I finally found the solution to run it inside a virtual environment. Sorry for bothering you.

Awesome - what did you need to do to get it working @Marco Lachance ?

Userlevel 2
Badge +7

Hi,

@Damian Andre, three VMs with these minimum requirements.

 
