Question

HyperScale X performance when it comes to auxcopy to tape?

  • 15 February 2021
  • 14 replies
  • 144 views

Badge +2

Hi guys, 

 

I have a customer who is currently using standard MediaAgents with Commvault: copy 1 goes to disk (PureFlash with NVMe drives), copy 2 to another PureFlash, and copy 3 to tape. The weekly fulls send around 350 TB per week to LTO-7 tapes across 4 tape drives, at a sustained throughput of 700 GB/hour per drive (4 drives in parallel).

 

We are looking to replace both MAs with two HyperScale X clusters.

 

The questions are: how do we need to configure the HyperScale X (reference architecture) to sustain the weekly tape creation of 350 TB per week at the same throughput, knowing that we are going to use NearLine SAS drives in the HyperScale X cluster? Or can we use SSDs for the storage pool drives in a HyperScale X?
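For context, a rough sketch of the arithmetic behind the requirement, using only the figures stated above (the Python below is just illustrative):

```python
# Rough feasibility check for the weekly tape copy (figures from the post above).
WEEKLY_DATA_TB = 350       # weekly full to send to tape
DRIVE_TB_PER_HOUR = 0.7    # sustained LTO-7 throughput per drive (700 GB/hour)
DRIVES = 4                 # tape drives writing in parallel

aggregate_tb_per_hour = DRIVES * DRIVE_TB_PER_HOUR      # combined write rate
hours_needed = WEEKLY_DATA_TB / aggregate_tb_per_hour   # time for one weekly full
print(f"aggregate: {aggregate_tb_per_hour:.1f} TB/h, "
      f"350 TB takes {hours_needed:.0f} h of a 168 h week")
```

So the 4-drive setup needs roughly 125 of the 168 hours in a week, which is why sustaining 700 GB/hour per drive matters so much here.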


14 replies

Userlevel 1
Badge +2

@Marco Please correct me if I have misunderstood your requirement/question. Are you looking to replace copy 1 and copy 2 with HyperScale clusters and retain tape for copy 3? Assuming this is correct, your follow-up question is whether HyperScale can copy 350 TB per week to tape (700 GB/hour throughput). Since you are copying to tape, I am also assuming the above throughput is without deduplication. Feel free to correct me if my assumptions are not valid.

Now, allow me to share a few observations/facts from internal benchmarking tests we have performed with HyperScale X and Auxcopies. In a setup with both the primary and secondary copy written to two HyperScale X clusters (HS 4300 appliances, i.e. 12-drive nodes with 14 TB drives), we observed Auxcopy speeds of 12 TB/hour (baseline, without deduplication), which far exceeds your expectation of 700 GB/hour. Now, in your scenario you are writing to tape, but that should not affect HyperScale performance. In the worst case, writing to tape could act as a bottleneck and limit your overall Auxcopy throughput to 700 GB/hour (which is your baseline expectation).

While I do not want to make an overly general statement, our internal tests with HyperScale X across various operations (backups, restores, Auxcopies, etc.) indicate that the bottleneck is most often the source application (reads or writes during backups/restores), network speeds, or the Auxcopy target ingest (tape or cloud). HyperScale X is usually not the bottleneck.

Badge +2

@Pavan Bedadala Hi, just to correct your first explanation: you are almost right, but copy #1 and #2 will be replaced with HyperScale X WITH dedupe! So for copy #3 the data will have to be un-deduped before it is sent to tape. This is where my concern is!

Because yes, I agree with you that the aux copy between #1 and #2 will have high throughput. I already have a lot of customers with two HyperScale clusters, and this is the kind of speed I am seeing across sites (around 12 TB/hour).

Thank you again.

Userlevel 1
Badge +2

@Marco Got it: the first two copies on HyperScale X are with dedupe enabled, and the third copy to tape is without dedupe. Hence your concern is how fast HyperScale X can un-dedupe copy data to tape. Even in the internal tests with HyperScale X and Auxcopy, it was the baseline Auxcopy job that showed 12 TB/hour performance. In the baseline case the Auxcopy destination was also HyperScale, but remember that since the primary and secondary copies were set up with different dedupe databases, the first job has to read raw data (after un-dedupe) and write it to the destination. Whether the destination dedupes or not does not change the fact that the source has to read raw data, and since the overall process ran at 12 TB/hour, it implies that HyperScale X is able to read raw data from the source at that pace. In our tests the destination was also able to write at the same 12 TB/hour, but when you replace it with tape, which can only write at 700 GB/hour, that becomes your system throughput for the Auxcopy. So I do not see a concern from the HyperScale side in meeting your SLA objectives for Auxcopy to tape.
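In other words, the aux copy runs at the slower of the two sides of the pipeline. A minimal sketch of that reasoning, taking the 12 TB/hour rehydrated-read benchmark and the 700 GB/hour-per-drive tape figure from this thread (four drives assumed in parallel):

```python
# Auxcopy throughput is bounded by the slower side of the pipeline:
# system throughput = min(source read rate, destination write rate).
hyperscale_read_tb_h = 12.0   # rehydrated (un-deduped) read rate from the benchmark
tape_write_tb_h = 4 * 0.7     # 4 LTO-7 drives at 700 GB/hour each

effective = min(hyperscale_read_tb_h, tape_write_tb_h)
print(f"effective auxcopy throughput: {effective:.1f} TB/hour (tape side is the limit)")
```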

Btw, thank you for validating our benchmark claim of 12 TB/hour; a stamp of approval from real users always feels satisfying.

Badge +2

@Pavan Bedadala Thank you, but the customer is actually getting a throughput of 2.8 TB/hour, since he is writing copy #3 with 4 tape drives in parallel from the PureFlash system! The Hedvig file system is probably more efficient than NTFS at un-deduping the data and sending it to tape!?

 

Since the customer is going to go with the reference architecture (we cannot do a try-and-buy), I need to be sure that the configuration we propose will achieve the needed throughput!

 

Do you have any benchmarks of tape copies from HyperScale? For example, how many drives per node maximum, and what kind of throughput do we get?

 

Do you know if, in a reference architecture configuration, we can use SSD drives for the storage pool? Or would that be overkill?

Userlevel 1
Badge +2

@Marco Unfortunately, we do not have performance data for copying to tape from HyperScale X. All our tests used disk libraries, with dedupe, as the secondary copy. If it helps, we did a restore test on the HyperScale X Reference Architecture (24-drive) and observed a restore performance of 11 TB/hour. Since a restore un-dedupes the data and writes it to the application (file system), this is the closest metric I can offer. Assuming writing to a file system is comparable to writing to tape (with the 2.8 TB/hour throughput you mention above), all I can conclude is that HyperScale X can un-dedupe and push data out at 11 TB/hour.

Reference architecture designs do not support SSD as data disks yet. 

Badge +2

@Pavan Bedadala Hi, another question for you: we may be able to do a try-and-buy for the reference architecture with two four-node clusters of 24 drives each. So if we attach 4 LTO-8 drives to one cluster, what would be the best connectivity? One drive on each node, or two drives on one node and two on a second node? Or...?

Thank you again.

Userlevel 3
Badge +6

@Marco Are the drives not part of a library? Why would you want to create a 1-on-1 relationship between a HyperScale cluster node and a tape drive? Why not add 2 FC switches and connect all HyperScale nodes to the fabric via FC? That would also add resilience if one of the nodes is down.

Userlevel 1
Badge +2

@Marco I would recommend distributing the drives across the nodes, so that you not only leverage multiple MediaAgents writing to them in parallel, but also build resilience to node failures, such that the aux copy can continue even if a node goes down.
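To illustrate the resilience argument with hypothetical numbers (using the 700 GB/hour-per-drive figure from earlier in the thread): with 4 drives spread one per node, losing a node costs you one drive; with drives paired on two nodes, losing one of those nodes costs two.

```python
# Remaining aggregate tape throughput after a single node failure,
# for two ways of attaching 4 drives across a 4-node cluster.
DRIVE_TB_H = 0.7  # per-drive sustained throughput (700 GB/hour)

def surviving_throughput(drives_per_node, failed_node):
    """Aggregate TB/hour from drives attached to nodes that are still up."""
    return sum(DRIVE_TB_H * n
               for i, n in enumerate(drives_per_node) if i != failed_node)

spread = [1, 1, 1, 1]   # one drive per node
paired = [2, 2, 0, 0]   # two drives each on two nodes

print(surviving_throughput(spread, failed_node=0))  # 3 drives still writing
print(surviving_throughput(paired, failed_node=0))  # only 2 drives left
```

With a fabric (FC switch) in between, any surviving node can reach any drive, which is why the switch option below is even better.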

Userlevel 1
Badge +2

Ok, my message crossed with Onno's response. I also like his idea of designing fabric connectivity between the HyperScale nodes and the drives.

Badge +2

Good, this is what I was thinking! Present all drives to all nodes in the cluster, in the case where I have an FC switch!

But if that is not the case and I have to direct-attach four drives to four nodes, what would be the connectivity recommendation?

 

Badge +2

Also, what was the amount of restore data that gave you the 11 TB/hour?

Userlevel 1
Badge +2

82 clients with 4 streams per client and 16 TB of total application data. The tests were run multiple times, averaging 11 TB/hour. Hope this helps.

Badge +2

@Pavan Bedadala Thank you again, very good info. What kind of clients were the 82? FS, VMs, databases?

Userlevel 1
Badge +2

@Marco They are all file system clients. We intend to test other agents in the future, but our starting point was file system.

Reply