I ran into a bit of an issue… Yesterday, one of the Disk Libraries filled up and the backups went into a waiting status. After having a look at the utilization, it indeed turned out to be 99.7% full.
The main culprits were SQL Server backups:
There were some backup jobs with extended retention, so I deleted those, along with some more of the old backup jobs, to make space. I also ran Data Aging and could clearly see data chunks being deleted in SIDBPhysicalDeletes.log, and after a while I got this:
So, I assume quite a bit of data was deleted. The Primary copy (blue) went from 52.95TB down to 19.81 TB.
However, when I checked the Free Space on the Library, I saw very little:
So I checked the Mountpaths Space Usage for that DL:
Data Written corresponds to the amount of space used by the Primary copy: 19.8 TB.
However, Size on Disk, which accounts for Data Written plus aged jobs that are still referenced by valid jobs, is still very high. Almost unchanged.
I am quite confused by this. It seems the data got aged (deleted) but is still being referenced and therefore… not physically deleted? Here is what I have tried so far:
- Storage Resources → Storage Policy Copy → Databases → Run Data Verification
- Storage Resources → Storage Policy Copy → Databases → Validate and Prune Aged Data (successfully validated and resynced)
How can I reduce Size on Disk? Is there a way to force a physical purge of aged referenced jobs so that I can finally free up some space?
I am out of ideas and need to figure this out quickly, so any help would be appreciated. Thanks in advance!
Best answer by Igor
Thanks for the question.
For deduplicated jobs, this means that an original job which protected all of the application data on a client was written to disk. Subsequent jobs do not write this data to disk again, which is the principle of deduplication: write data only once.
So, even though that original job may have qualified for aging, it is still retained because subsequent jobs reference its data. Removing this data would invalidate all the jobs that ran later.
With that in mind, deleting these jobs would not be a sensible course of action.
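To make the retention behaviour above concrete, here is a minimal toy model of reference-counted deduplication: a block is physically deleted only when no remaining job references it. This is purely illustrative and not Commvault's actual on-disk format or pruning logic.

```python
# Toy model of deduplicated retention: each unique data block keeps a
# reference count, and a block is physically removed only once no job
# references it any more. (Illustrative sketch, not Commvault internals.)

class DedupStore:
    def __init__(self):
        self.refs = {}  # block id -> reference count

    def write_job(self, blocks):
        """Back up a job; already-known blocks only bump their refcount."""
        for b in blocks:
            self.refs[b] = self.refs.get(b, 0) + 1

    def age_job(self, blocks):
        """Age a job; a block is physically deleted only at refcount 0."""
        physically_deleted = []
        for b in blocks:
            self.refs[b] -= 1
            if self.refs[b] == 0:
                del self.refs[b]
                physically_deleted.append(b)
        return physically_deleted

store = DedupStore()
store.write_job(["A", "B", "C"])   # original full backup
store.write_job(["A", "B", "D"])   # later job re-references A and B

# Aging the original job frees only block C: A and B are still
# referenced by the newer job, so they stay on disk.
print(store.age_job(["A", "B", "C"]))   # ['C']
```

This is why an aged job's data can remain in "Size on Disk" long after the job itself disappears from the Primary copy's "Data Written".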
You need to identify where the disk space is used and by which clients/subclients; then you can make an informed decision on what steps to take.
Here are some reports to check where the space is consumed and whether this is expected for your environment:
- Client Storage Utilization by Storage Policy Copy Report shows the largest clients by Data Written size.
- Growth and Trends Report may show clients/subclients that have exhibited significant or unexpected growth.
- License Summary Report shows the Front End Application Size consumed by clients; while this doesn't translate directly to disk usage, it can highlight clients whose size is significantly higher than expected.
- Disk Library Utilization Report shows libraries low on space, but I assume this is already known; added for reference.
- Storage Resources Summary Report gives a different view of library space; again, already known, added for reference.
Thanks for the follow-up, I assure you it’s appreciated!
I already know which clients are the largest, and there was no significant increase in data. Just the regular scheduled backups for those 4 MSSQL clients:
Retention is 7 days.
I checked daily space utilization for those clients:
I then compared it to the weekly space utilization:
These numbers made sense to me.
So, if Size On Media takes up 12.59 TB per week and the retention is 7 days, how can space utilization on that Disk Library balloon to 59 TB?
At this point, these are detailed questions that are specific to your environment and configuration, so I would need either a remote session or a Commserve database to check further.
With that in mind, as you are very low on space and need to have this checked, may I suggest you raise a support ticket so that an engineer can work with you directly on this?
If you wouldn’t mind sharing the ticket number via PM, I can track progress of the case and potentially share the resolution with the community.
Just thinking… even with retention set to 7 days + 1 cycle, plus the full size of the original jobs kept for deduplication purposes (24 TB), that comes to 12.6 + 12.6 + 24 = 49.2 TB, which is under 50 TB. That still doesn't account for the remaining ~9 TB on the Disk Library.
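The back-of-the-envelope arithmetic above can be checked in a few lines (all sizes in TB, taken from the numbers posted earlier in the thread):

```python
# Rough space accounting for the disk library (all sizes in TB).
weekly_size_on_media = 12.6   # one 7-day retention window
baseline_fulls = 24.0         # original jobs retained as the dedup baseline
library_used = 59.0           # observed utilization on the Disk Library

# Worst case: two retention windows in flight (7 days + 1 cycle)
# on top of the baseline data.
expected = 2 * weekly_size_on_media + baseline_fulls
unexplained = library_used - expected
print(round(expected, 1), round(unexplained, 1))   # 49.2 9.8
```

So roughly 9.8 TB of usage is unexplained by retention plus the dedup baseline, which is the gap in question.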
Unless one of these pans out, I would suggest opening a support case. There are so many factors that depend on each other.
Any chance this is a hole-drilling issue?
Someone internally brought that up as well… it definitely could be.
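For readers unfamiliar with the term: "hole drilling" here refers to punching holes in existing container files so the filesystem can deallocate the blocks of pruned data without rewriting the whole file. A minimal Linux sketch of the underlying filesystem primitive, `fallocate(2)` with `FALLOC_FL_PUNCH_HOLE`, via `ctypes` (Commvault's actual mechanism is internal; this only illustrates the concept, and the call can fail on filesystems that don't support hole punching):

```python
# Punch a hole in a file: deallocate a byte range while keeping the
# file's logical size unchanged. Linux-specific (fallocate(2)).
import ctypes
import ctypes.util
import os
import tempfile

FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

def punch_hole(path, offset, length):
    """Deallocate [offset, offset+length) in path, keeping the file size."""
    fd = os.open(path, os.O_WRONLY)
    try:
        ret = libc.fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                             ctypes.c_long(offset), ctypes.c_long(length))
        if ret != 0:
            # e.g. EOPNOTSUPP: this filesystem can't punch holes
            print("fallocate failed:", os.strerror(ctypes.get_errno()))
        return ret == 0
    finally:
        os.close(fd)

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 1024 * 1024)      # 1 MiB of real data
    path = f.name

before = os.path.getsize(path)
punch_hole(path, 0, 512 * 1024)      # drill a 512 KiB hole
after = os.path.getsize(path)
print(before, after)                 # logical size is unchanged either way
os.unlink(path)
```

The key property is that the file keeps its logical size while the allocated blocks (what `du` reports) shrink; if the filesystem or driver handles such sparse regions poorly, "deleted" data may not actually free space, which is exactly the symptom in this thread.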
In the end, what I did was to start the Reclaim Idle Space process on the Storage Pool → Deduplication engines → Primary_Global → _Databases_ (in this case) → Data Verification.
I set the Reclamation Level to the lowest setting (1). That immediately triggered new deletions, and these activities started being logged in SIDBPhysicalDeletes.log. That fixed it.
@Igor! I marked your answer as correct. What is the vendor of your library? Do they support drilling holes? You have a good point that it was not a problem previously, though!
The library is a Dell R740 with a local RAID controller and large capacity SATA drives.
Indeed, this was not a problem previously. I must say this is rather puzzling. Space Reclamation should take place automatically, no?
Now I am wondering whether it does run but doesn't work correctly, or whether it is in fact not scheduled at all?
Take a look at this section and see what your thresholds are set to. It's entirely possible this would have self-resolved in time, depending on what you have set: