Explain like my manager is 5 years old: Data Written, Application Size, and Size on Disk?

  • 28 February 2022


We are low on space, so we are looking through old jobs and anything else that is still being held past our retention period. We have plenty of infinite and long-term retained items mixed in with our regular data in our primary DDB.

 

I’m claiming that the infinite-retention or long-term jobs could be holding reference blocks. That would explain why the data size on disk looks like a reasonable number while the library is full. For example, 800 TB size on disk, but the 1.5 PB library is full.

As we go through jobs, some are not seen as “big fish” because Data Written may show, for example, 85 GB written for a server with a 1.5 TB Application Size. We skip these and don’t worry about them because I’m told that 85 GB is all we would get back in space. Instead, we look for jobs with 1 TB or more written.

 

I’m thinking that even as we size our future library, we should plan for a pool of space that will always sit there holding these reference blocks and is effectively “unusable” space.

 

Hopefully this rant makes sense? 



1 reply


@DaBackups, I’m a big fan of r/ELI5, so I love your title!

We have a thread here with the same question, though I’ll copy the answer here as well:

  • Size of Application is the scanned size of what we need to back up (before any dedupe, compression, etc.). Basically, Front End Terabytes (FET).
  • Size of Backup is what we actually need to back up, including metadata.
  • Data Written is how much is actually on media (so dedupe and compression are factored in).
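
To make the gap between Size of Application and Data Written concrete, here is a minimal Python sketch. It is a toy model, not how Commvault actually implements deduplication: the 4-byte block size, the `run_backup` function, and the in-memory `store` are all hypothetical, and compression and metadata are ignored.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, hypothetical block size so the example is easy to follow


def run_backup(store, data):
    """Back up `data` into `store`, a dict of block hash -> block bytes.

    Returns (application_size, data_written) in bytes.
    Assumption: fixed-size blocks, no compression, no metadata.
    """
    application_size = len(data)  # scanned size, before any dedupe
    data_written = 0
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        key = hashlib.sha256(block).hexdigest()
        if key not in store:  # only new, unique blocks reach the media
            store[key] = block
            data_written += len(block)
    return application_size, data_written


store = {}
first = run_backup(store, b"AAAABBBBCCCCDDDD")   # everything is new
second = run_backup(store, b"AAAABBBBCCCCEEEE")  # only the last block is new
print("first job: ", first)
print("second job:", second)
```

Both jobs scan 16 bytes, but the second job writes only 4, because three of its four blocks were already in the store. That is the same shape as the "85 GB written for a 1.5 TB server" case from the question.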

With that said, you should absolutely always assume you will have a baseline of data, regardless of what the unaged jobs show. The Job Details for a given JobID will show what was written for that exact job, but not the implied baseline files.

I always think of Dedupe as a big, gelatinous organism.  I add little pieces to it to make it bigger, but each job is also part of the whole thing.  You can add a few blocks from a new job, but only because the necessary blocks were already there.
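
If it helps to see the idea in code, here is a rough sketch of why the baseline sticks around. Again, this is only an illustration under my own assumptions: the job names, the block IDs, and the `reclaimable_blocks` helper are made up, and real pruning in a DDB is more involved than a set difference.

```python
def reclaimable_blocks(jobs, jobs_to_delete):
    """Return the block IDs that would be freed by deleting some jobs.

    `jobs` maps a job name to the set of block IDs that job references.
    A block is freed only when no remaining job still references it.
    """
    still_referenced = set()
    for name, blocks in jobs.items():
        if name not in jobs_to_delete:
            still_referenced |= blocks

    referenced_by_deleted = set()
    for name in jobs_to_delete:
        referenced_by_deleted |= jobs[name]

    return referenced_by_deleted - still_referenced


# Hypothetical jobs: b1..b3 are the shared baseline blocks.
jobs = {
    "infinite_retention_full": {"b1", "b2", "b3", "b4"},
    "daily_monday": {"b1", "b2", "b3", "b5"},
    "daily_tuesday": {"b1", "b2", "b5", "b6"},
}

freed_one = reclaimable_blocks(jobs, {"daily_monday"})
freed_both = reclaimable_blocks(jobs, {"daily_monday", "daily_tuesday"})
print("delete Monday only:", sorted(freed_one))
print("delete both dailies:", sorted(freed_both))
```

Deleting Monday alone frees nothing, because Tuesday still references `b5`. Deleting both dailies frees only `b5` and `b6`; the baseline blocks `b1` to `b3` stay on disk for as long as the infinite-retention job references them.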