Hi,
I have a customer who would like us to detail the hardware and architecture required to use Activate (Sensitive Data Governance) on the existing backup job data within their Commvault solution.
The solution needs to be able to index and analyse 4 PB of file system application data stored within the archive and backup deduplicated storage pool. The solution would also need to scale to support a live crawl of the file system data on the servers in the future.
The current architecture guidance has a limit of 160 TB of source data per node. Following this guideline for 4 PB would result in a large index server hardware footprint (25 servers for indexing alone).
Is there a different architecture guideline to follow for data sets of this size?
Specifications for Dedicated Servers for File Data
| Component | Large | Medium | Small |
|---|---|---|---|
| Source data size per node* | 160 TB | 80 TB | 40 TB |
| Objects per node (estimated) | 80 million | 40 million | 20 million |
| CPU or vCPU | 32 cores | 16 cores | 8 cores |
| RAM | 64 GB | 32 GB | 16 GB |
| Index disk space (SSD-class disk recommended) | 12 TB | 6 TB | 3 TB |
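For reference, here is a quick back-of-the-envelope sketch (my own illustration, assuming 1 PB = 1000 TB and the per-node source data limits from the table above) showing how the 25-node figure falls out of the Large-tier guideline, and how the footprint grows under the smaller tiers:

```python
# Rough node-count estimate for indexing 4 PB of file system data,
# using the per-node source data limits from the sizing table above.
# This is an illustration only, not official Commvault guidance.

SOURCE_DATA_TB = 4 * 1000  # 4 PB of source data, assuming 1 PB = 1000 TB

TIERS = {
    "Large": 160,   # TB of source data per node
    "Medium": 80,
    "Small": 40,
}

for tier, tb_per_node in TIERS.items():
    nodes = -(-SOURCE_DATA_TB // tb_per_node)  # ceiling division
    print(f"{tier:6s}: {nodes} index server nodes for {SOURCE_DATA_TB} TB")

# Output:
# Large : 25 index server nodes for 4000 TB
# Medium: 50 index server nodes for 4000 TB
# Small : 100 index server nodes for 4000 TB
```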