What type of storage array do you use?

  • 28 January 2021
  • 31 replies
  • 3988 views

Userlevel 3
Badge +5

We currently use IBM V5000 arrays as our Commvault backup target to land our deduped backups. We are starting to review alternatives to see what other fast, cost-effective options are out there. I do prefer Fibre Channel connections, but I'm open to options. Since Commvault is really the brains in our scenario, the storage array does not need any special features, just good speed.

 

What vendor storage arrays do you use? Are you happy with them?


31 replies

Badge +5

We used EMC CLARiiONs and VNXs in the past, but have moved to Dell EMC Unity All-Flash and HPE Nimble Hybrid storage systems over the last couple of years.

 

Performance-wise, the difference is night and day. Very happy with the new storage ecosystem.

Hi,

Is it attached via a SAN or NAS protocol?

 

Thanks

Userlevel 7
Badge +19

We have decided to migrate the primary landing zone for backup data to a Pure Storage FlashBlade//S array located in one of our DCs. To make sure we always have a copy available, we will create a near-synchronous copy to our large archive tier, which writes the data using erasure coding replicated across 3 DCs. Really looking forward to being able to deliver a fast recovery tier to our customers! We'll definitely share our experiences with you all!

Userlevel 7
Badge +19

@Laurent @JamesS can you share your experiences with FlashBlade so far? We're strongly considering buying the new FlashBlade//S version that was released earlier this month.

Are you leveraging the S3 protocol? 

Badge +3

Pure Storage - FlashBlade (NFS) for Rapid recovery

I’m curious what OS you run on your media agents.

Userlevel 1
Badge +6

Nice topic, found it just now.

IBM V7000s are our bottleneck in backup speeds.

NL-SAS spinning disk is a logical option for a backup target, but when you have databases of 50-150 TB in size that need to be backed up over the weekend, problems arise.
And it does not matter whether you have multiple mdisks/pools or one pool with more spinning capacity to get generally better I/O rates.

We manage 1-2 TB/hr for SQL and 1-3 TB/hr for Oracle database backups, but still.

 

We have also tried Hitachi's object storage - what a bad option for backups! It has spent more time in read-only mode than being available once Commvault starts sending requests during backups.


Would be interesting to find a better solution.

 

Userlevel 6
Badge +15

We have a customer that has an Isilon for disk storage.

Backup speeds are OK, but DASH copy to DR and copy-to-tape speeds are terrible. "Last night's" backups copy to DR at no more than 500 GB/hr if we're lucky, and copy-to-tape speeds do not exceed 200 GB/hr.

Fallen Behind Copies are literally years behind.

Both Commvault and Isilon have checked it out and cannot do anything about it.

We did spec HyperScale before implementation, but we were overruled, and now I sit with this issue. Very frustrating.

Open another thread in the proper area and we can discuss it there, to avoid being off-topic here.

This is the perfect example of 'intelligent storage' that delivers performance while writing, but when reading data from outside the 'buffer'/'landing zone' (or whatever they call it), performance degrades, as the data has to be rehydrated by the array before it can be handed to Commvault for processing.

Always try to have only one deduplication layer: Commvault's (best) or the hardware's (worst, as you become hardware-dependent), and never both.

Userlevel 6
Badge +13

I'm quite happy with Isilon, tbh. The performance issues we are seeing are most of the time DDB-related, not Isilon.


Badge

We used and loved HPE StoreVirtual, but it was sadly discontinued, so we have tried some MSA 2050/2052 arrays as VMware and Commvault destinations, and they have not been nearly as reliable or capable.

Probably will get something else for our next separate SAN expansion.

Our dedupe ratios are more like 100:1 across multiple Windows servers, with daily fulls stored for one year.
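A ratio like that is roughly what the retention math predicts when mostly-unchanged fulls are retained daily. A back-of-the-envelope sketch, assuming hypothetical figures (10 TB front end, ~1% daily change rate):

```python
# Rough dedupe-ratio estimate for daily fulls retained one year.
fulls = 365          # daily full backups retained
front_end_tb = 10.0  # size of one full backup (hypothetical)
change_rate = 0.01   # fraction of blocks new each day (hypothetical)

logical_tb = fulls * front_end_tb  # what Commvault "sees" over the year
stored_tb = front_end_tb + (fulls - 1) * front_end_tb * change_rate  # unique blocks kept
print(f"{logical_tb:.0f} TB logical / {stored_tb:.1f} TB stored "
      f"= ~{logical_tb / stored_tb:.0f}:1")   # -> 3650 TB / 46.4 TB = ~79:1
```

The real ratio depends entirely on the daily change rate, which is why largely static Windows servers can reach 100:1 while mixed datasets sit nearer 10:1.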

 

Userlevel 1
Badge +2

We looked at HPE StoreOnce too but wanted to let Commvault handle the de-dupe. We’re getting 10:1 with our datasets.

Badge

I really like the HPE StoreOnce solution, it hit my price point really well, and Commvault can talk natively to the HPE solution. That means I can offload all my de-dupe to HPE, leaving Commvault focused on just data protection. Across all my datasets, I’m getting about 7:1 dedupe ratio, with some stuff getting as high as 11:1.

 

Note, I'm in the medium-business market, so there are faster solutions out there, but this hit a perfect price-vs-performance balance for me.

Userlevel 1
Badge +2

Seems like NetApp is popular here! :wink: We use the NetApp E5600 Series and absolutely love it. We previously used (and still do in other places) Quantum QXS, but it could not deliver the performance we needed. We're able to hammer this thing with backups/aux copies/DDB verification jobs at the same time and it keeps humming all day long. We will replace the legacy storage with this going forward. Highly recommended!

Badge +3

We use Synology RS3614xs+ arrays, one in each of our two data centers. We have a 22 TB volume in each site for backups and a 24 TB volume in each site for aux copies. With almost 6 TB of data being backed up, everything is working great. Performance is great.


Badge +1

Hi all,

When it comes to a backup-to-disk Commvault media library, I would look at any entry-level, simple, cost-efficient block-based storage array, with all drives being near-line SAS 7.2K rpm. Depending on the back-end capacity you require, you need to size the number of disks based on the IOPS acceptable in your environment and your backup policy.

Let's say you have 100 TB of back-end storage and you need to take a monthly full copy to tape, so you need enough read throughput from your SAN media library to feed the LTO drives.

If you need to read from the backup-to-disk array at 200 MB/s, that works out to around 3,200 IOPS (at a 64 KB I/O size).

You then need enough 7.2K rpm NL-SAS spindles to deliver those 3,200 IOPS (around 48 drives).

So the number of 7.2K rpm HDDs is more important than the capacity they provide.
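To make that arithmetic reusable, here is a minimal sizing sketch in Python (the ~67 IOPS per 7.2K rpm NL-SAS drive is an assumed rule of thumb, not a vendor figure):

```python
import math

def spindles_needed(read_mb_per_s: float, io_size_kb: int = 64,
                    iops_per_drive: float = 67.0) -> int:
    """Drives required to sustain a given read rate from a disk library.

    iops_per_drive ~67 is an assumed rule of thumb for 7.2K rpm NL-SAS.
    """
    required_iops = read_mb_per_s * 1024 / io_size_kb  # MB/s -> IOPS at 64 KB
    return math.ceil(required_iops / iops_per_drive)

print(spindles_needed(200))  # -> 48 drives for 200 MB/s at 64 KB I/O
```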

 

 

Badge

We have a smaller operation and use a DataOn JBOD enclosure with a MegaRAID controller and a 12G SAS connection. It handles everything we need. We aux copy to LTO-8 for an air-gapped copy and to the Azure Archive tier for offsite.

Badge

We moved from Hitachi midrange (HUS150/HUS130) to NetApp AFF A300.  They are fantastic arrays in every sense.  Worth every penny!

Hi David,

Thanks for the feedback. I love Hitachi arrays but I’ll keep this info in my pocket for the future. : )


Badge

Hi everyone,

We back up data from the following storage arrays/servers:

  • Hitachi VSP F900
  • Hitachi VSP G700
  • Hitachi VSP G600
  • Hitachi HNAS

We have great performance reading from all the Hitachi arrays, and writing to our maglibs which are on the G700 and G600.

Thanks,

Lavanya

Userlevel 6
Badge +15

Very good question.

I'll probably surprise some: we now use a Pure Storage FlashBlade array with a 4+1 GridStor, configured in S3 mode, using multiple 10 GbE attachments for the MAs and multiple 40 GbE for the array.

This array replaced an old NetApp FAS2554 10G NAS that became more and more overloaded as time (and stored data) went by. Whenever we needed to perform huge parallel restores, the FAS2554 was brought to its knees. That is no longer the case with the FlashBlade.

Of course, the price is far different, but our ambitions (and budget :grin:) were revisited after that performance issue.

This primary was, and still is, the source for aux copies to a SAN spinning-disk array (less performance/more capacity) for a secondary copy, and also the source for a cloud copy.

Top performance now; with the previous FAS2554 as primary, aux copies were delayed and piling up because of the concurrent writes (primary) and reads (copy to SAN, copy to cloud).

Userlevel 5
Badge +10

@Onno van den Berg Good points, although I have found it ideal for the primary target to be file storage, for the sparse-attribute support, and the secondary to be cloud (for all the reasons described).

Userlevel 7
Badge +19

Interesting discussion to follow for sure! I still see quite a few people talking about block storage arrays, which is really interesting to me, because choosing block storage also introduces a challenge if you want your MAs to be HA leveraging partitioned DDBs, just as @Christian Kubik mentions.

My personal order would be to pick something that can deliver:
1) Cloud storage
2) File
3) Block

If I had to design something new and the customer was not already leveraging on-premises cloud storage, then I would definitely try to convince them to purchase cloud-based object storage like Cloudian/MinIO/StorageGRID/FlashBlade. It offers many advantages: it is less sensitive to ransomware attacks, supports WORM, and allows copies in multiple locations without the need to aux copy; an additional selling point is that it can also be used to drive software/business development.
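To illustrate the WORM point: S3-compatible object stores support Object Lock, which lets an object be made immutable until its retention date passes. A minimal boto3 sketch, assuming a hypothetical endpoint, credentials, and a bucket created with Object Lock enabled:

```python
from datetime import datetime, timedelta, timezone
import boto3

# Hypothetical S3-compatible endpoint (e.g. MinIO/Cloudian/StorageGRID).
s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.com",
    aws_access_key_id="<access-key>",
    aws_secret_access_key="<secret-key>",
)

# Write an object that cannot be deleted or overwritten until the
# retention date passes; the bucket must have Object Lock enabled.
s3.put_object(
    Bucket="backup-copies",
    Key="sp-primary/chunk-000001",
    Body=b"<backup chunk payload>",
    ObjectLockMode="COMPLIANCE",
    ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=30),
)
```

Even an attacker with the storage credentials cannot shorten a COMPLIANCE-mode lock, which is what makes this attractive as a ransomware safeguard.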

As for file storage, you can look at the big names, NetApp/Dell EMC, or have a look at the portfolio of Tintri. We have good experience with their IntelliFlash arrays. Very interesting from a price point of view, and they offer multi-protocol capabilities.

 

Userlevel 3
Badge +3

Interesting discussion - and I am not going to put any vendor or array name into this post, because I believe that for backup storage on disk you usually don't need any features beyond the "standard stuff" that all vendors of a given class of storage should be able to provide - and performance in the end boils down to the actual physical devices used and the RAID/pooling technology on top.

So when it comes to choosing a disk based storage solution for Commvault Backup there are a few questions one should ask themselves:

  • Do I need automated failover for Media Agents on backup as well as restore (Commvault GridStor)?
    • This requires all involved Media Agents to be able to read and write to the same storage library at the same time. This can mainly be achieved using NAS or, in some cases, clustered/scale-out filesystems (e.g. Hyperscale).
  • Do I want to use partitioned deduplication in Commvault?
    • For this, at least all Media Agents need to be able to read all the data - writes could go to dedicated mount paths per MA. Data Server IP with read-only sharing could be a solution here if DAS/SAN is used for storage. Keep in mind that if the sharing MA is offline, you won't be able to guarantee restores/copies - backups will continue to run.
  • Do I need to do regular "full" copies (e.g. tape-out) or complete data recoveries?
    • This requires reading all "logical" data blocks - meaning reads will be quite random (dedupe references could be all over the library) and the same physical data will be read many times (a block can be referenced multiple times).

What I have encountered in the past is that especially the last bullet point has not always been thought of when sizing back-end storage for Commvault. The classic way of thinking has been "it's backup, so let's just put a few really, really large drives into RAID 6 and we should be fine," which is definitely true for a non-deduplicated storage pool. Once deduplication comes into play, the randomness of reads grows - writes will still be sequential, as they only write new data plus some references and metadata - so writing is always "additive".
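A quick way to see how much that read randomness costs on a given mount path is to compare sequential and random reads directly. A rough sketch, assuming a hypothetical multi-GB test file on the disk library (Unix-only; drop the OS page cache between runs for honest numbers):

```python
import os, random, time

PATH = "/mnt/disklib/testfile.bin"  # hypothetical file on the disk library
BLOCK = 64 * 1024                   # 64 KB reads, a typical dedupe block size
COUNT = 2048                        # reads per pass

size = os.path.getsize(PATH)        # file should be much larger than COUNT * BLOCK
sequential = [i * BLOCK for i in range(COUNT)]
scattered = [random.randrange(0, size - BLOCK) for _ in range(COUNT)]

def mb_per_s(offsets):
    """Time COUNT positional reads at the given offsets."""
    fd = os.open(PATH, os.O_RDONLY)
    start = time.perf_counter()
    for off in offsets:
        os.pread(fd, BLOCK, off)
    elapsed = time.perf_counter() - start
    os.close(fd)
    return COUNT * BLOCK / elapsed / 1e6

print(f"sequential: {mb_per_s(sequential):.0f} MB/s")
print(f"random:     {mb_per_s(scattered):.0f} MB/s")
```

On spinning disk the random pass is typically an order of magnitude slower, which is exactly the profile of a tape-out or full restore from a deduplicated library.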

Commvault provides a tool that lets you test disk performance of a disk library to find out if it is sufficient for your needs: https://documentation.commvault.com/commvault/v11/article?p=8855.htm

My personal preference for non-Hyperscale backup storage is high-performance NAS, as this gives you read/write sharing of the library across many Media Agents, so you can do failovers, run aux copies using just the target Media Agent, restore through any MA, etc.

It could also make sense to take a peek at Commvault Hyperscale technology, which runs on many different server hardware platforms - it is really easy to buy, set up, and run, and it gives you all the same benefits plus simple scale-out.

Userlevel 3
Badge +9

We generally use NetApp E-Series for short/mid-term dedupe storage and StorageGRID for long-term.

The alternative is an HPE MSA-class array direct-attached to the MediaAgent. Something that's scalable is key.

Userlevel 5
Badge +11

For simple storage usage without any "smarts," I've noticed many customers are choosing Isilon, especially for large-scale deployments that are PBs in size.

Other than that, NetApps have traditionally offered a great balance between performance and capacity.

 

Lastly, I would caution against cheap/low-end NAS devices like QNAP or Synology, as most of their products have terrible performance when doing reads and writes concurrently, as well as weaker support and product reliability.
