Question

Data Written x Size on Media

1 year ago
August 23, 2023
3 replies
1132 views

PedroRocha
Byte
80 replies

I've seen several discussions regarding this topic, but I still don't get it.

From the docs:

Application Size:
The size of the data that needs to be backed up from a client computer. Application size is not always identical to the size reported by file or database management tools.

Data Written:
New data written on media by each backup job.

Size on Media:
Size of deduplicated data written on the media.

Consider the following real job:

Job 1 (deduped disk lib is the destination):

App Size: 33.4TB

Data Written: 24.93

Size on Media: 7.30

1st question: what is the total size this job occupies in the dedup disk lib?

2nd question: is Data Written the size before dedup? Does it account for software compression only, or for both compression and dedup?

3rd question: is Size on Media just the size of the unique blocks for the job (disregarding the baseline)?

regards,

Pedro

Nutan Pawar G
Vaulter
92 replies
1 year ago
August 24, 2023

Hi @PedroRocha

Application Size is the size of the protected data on the client.

Data Written is the volume of this data written to storage, after compression, deduplication, etc.

Size on Media is the size of the backup job (application data and index), or the total size occupied on media. With deduplication, this includes the data written AND the size occupied by aged jobs that are still referenced by other valid jobs.

Note: Data Written is the total data written by the current active jobs, and Size on Media is the total size occupied on disk by the current active jobs plus their dependent baseline of aged jobs. Hence Size on Media tends to be higher than Data Written.

Example1:

=========

You run a full backup of 100 GB. Later, another job with an application size of 110 GB runs, but only 10 GB of data is written; the rest is deduplicated. The first job of 100 GB ages off over time.

Here, Data Written for the storage policy = 10 GB (the size of the active job)

And Size on Media = 110 GB (active job + baseline of 100 GB)
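
The arithmetic in Example 1 can be sketched in a few lines of Python. This is only a toy model of the explanation above, not how the product computes its report columns: the `Job` class, its fields, and both helper functions are hypothetical names made up for illustration, and it assumes all 100 GB of the aged job is still referenced by the active job.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    written_gb: float                 # new data this job wrote to the library
    aged: bool = False                # True once the job has met retention
    still_referenced_gb: float = 0.0  # part of an aged job's data that active jobs still point to


def data_written(jobs):
    """Total written by active (non-aged) jobs only."""
    return sum(j.written_gb for j in jobs if not j.aged)


def size_on_media(jobs):
    """Active jobs' data plus the aged baseline they still depend on."""
    active = sum(j.written_gb for j in jobs if not j.aged)
    baseline = sum(j.still_referenced_gb for j in jobs if j.aged)
    return active + baseline


# Example 1: the 100 GB full has aged, the second job wrote only 10 GB.
jobs = [
    Job("first_full", written_gb=100, aged=True, still_referenced_gb=100),
    Job("second_job", written_gb=10),
]

print("Data Written (GB):", data_written(jobs))
print("Size on Media (GB):", size_on_media(jobs))
```

With these inputs the two helpers return 10 and 110, matching the figures quoted in the example.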

Example2:

=======

The following letters represent data on the client:

A B C D E F G

You run your first backup and this becomes your baseline. Let's say this is a Data Written of 7MB, 1MB for each letter. After that backup completes, some of the data changes:

A B C D E F G H I

When you run your next backup, deduplication ensures that only the new blocks H and I are written; we already have the other data. The data written for this job is only 2MB.

This goes on for a few weeks, but only the changed data is written after the initial baseline was created, and eventually the original job meets retention and ages off. The original data that never changed accounts for 5MB written.

That data is still associated with your newer backups; it just wasn't written again when they ran because of dedupe. You can't remove it and still have a good backup, but because the job that originally wrote it has aged, it's not reflected in your data written totals.
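
A block-level view of Example 2 can be sketched the same way. This is a simplified illustration that assumes 1MB per letter and a dedup store modeled as a plain Python set; `run_backup` and the variable names are hypothetical, and it only covers the first two backups shown above, not the later weeks of changes.

```python
BLOCK_MB = 1  # assumption from the example: each letter is 1MB


def run_backup(client_blocks, store):
    """Write only blocks the store does not already hold; return MB written."""
    new_blocks = [b for b in client_blocks if b not in store]
    store.update(new_blocks)
    return len(new_blocks) * BLOCK_MB


store = set()

job1_written = run_backup("ABCDEFG", store)    # baseline: every block is new
job2_written = run_backup("ABCDEFGHI", store)  # only H and I are new

# Job 1 ages off, but every block job 2 references must stay on disk.
referenced_by_active = set("ABCDEFGHI")
store &= referenced_by_active

reported_data_written = job2_written      # aged job 1 no longer counts
occupied_on_disk = len(store) * BLOCK_MB  # the baseline blocks are still there

print("Job 1 wrote (MB):", job1_written)
print("Job 2 wrote (MB):", job2_written)
print("Data written by active jobs (MB):", reported_data_written)
print("Occupied on disk (MB):", occupied_on_disk)
```

The gap between the last two numbers is the baseline that the aged job wrote and the active job still needs.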

PedroRocha
Author
Byte
80 replies
1 year ago
August 25, 2023

Hi! Thanks for the complete answer.

Still, there's something wrong here… Size on Media is smaller than Data Written for all the jobs that we have. It seems that Size on Media is listing only the unique data blocks. Does that make sense?

Pedro

Nutan Pawar G
Vaulter
92 replies
1 year ago
August 28, 2023

@PedroRocha,

We can review this for you. Please log a support case and upload the CommServe DB along with additional details such as library/mount paths, storage policy, etc.
