Solved

Azure MA using blob aux copy to on-prem - continual recall jobs

  • 23 October 2021
  • 10 replies
  • 140 views

Badge +1

I have OneDrive and Exchange Online backing up to a MA in Azure with blob storage, and this all works fine - but I want a second on-prem aux copy. I’m happy with the network costs, traffic and on-prem disk capacity.

 

The storage plan uses the Cool tier in the blob; when the aux copy runs, it stops/starts with recalls for each chunk.

 

Idea 1 - Ideally the aux copy would trigger a workflow, like the restore job does, to recall all the chunks in one go. Can this be enabled, or is something like it in the dev pipeline?

 

Idea 2 - create a new extra ‘Primary’ copy on Hot storage with 7 days’ retention, copy this to the existing 6-month Primary (Cool) and copy it to the on-prem MA

 

Idea 3 - change the plan to Hot, but I would prefer to manage costs

 

Idea 4 - ?

 

Cheers

 


Best answer by chrism4444 8 November 2021, 10:03


10 replies

Userlevel 7
Badge +15

Hi @chrism4444 

Thanks for the question!

Idea 1

If we recall all data from Cool storage with a workflow, where does this get staged/cached?

If we recall data from Cool storage for a restore, it’s likely to be limited in size, whereas the aux copy currently recalls chunks sequentially, so we don’t stage/cache ALL the data in one go. This might seem inefficient, but it does at least avoid recalling everything in one go locally to the MA, which could require significant local storage. Which leads nicely into Idea 2…
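To put rough numbers on why the per-chunk pattern hurts at scale, here’s a purely illustrative back-of-the-envelope sketch (the chunk count and timings are assumptions, not Commvault internals): a per-chunk cycle pays the recall wait once per chunk, while a single batched recall would pay it roughly once.

```powershell
# Illustrative only - all figures are assumptions, not measured values.
$chunks        = 1000   # assumed chunk count for the aux copy
$recallWaitMin = 15     # assumed wait per recall from cold storage
$copyMin       = 1      # assumed copy time per chunk once recalled

# Per-chunk cycle: every chunk pays the recall wait
$sequentialHours = $chunks * ($recallWaitMin + $copyMin) / 60

# Hypothetical batched recall: the wait is paid roughly once
$batchedHours = ($recallWaitMin + $chunks * $copyMin) / 60

"Sequential: {0:N0} h  Batched: {1:N0} h" -f $sequentialHours, $batchedHours
# -> Sequential: 267 h  Batched: 17 h
```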

 

Idea 2

A new hot primary copy with short retention sounds like a great solution here.

Adding a new Primary copy and demoting the current Cool copy to a Secondary achieves what you have now, but also keeps the data in Hot storage so an additional on-prem copy can be sourced from it.

Create another copy targeted at on-prem storage, but do ensure its source is the new Hot Primary copy (which it should be by default) and not the existing Cool storage copy, otherwise you’ll be back in the same position as now.

Creating Synchronous Secondary Copy

 

Idea 3

A simple conversion to Hot storage doesn’t stack up against Idea 2.

 

Idea 4

I still can’t beat Idea 2, but let’s see what the Community has to say…

 

Thanks,

Stuart

Badge +1

 

With Idea 1, I was assuming the recall could leave the data ‘hot’ while it was required, so I see the flaw - but how does the workflow for a restore manage to recall all the chunks in one go and perform the restore?

 

Idea 2 is certainly easier to explain and implement, which is always important.

Userlevel 7
Badge +23

@chrism4444 , can you clarify your question a bit?  When the restore recalls the chunks, it is able to recall all of the files needed for the restore since the CS knows the associated archive files.

Badge +1


Restore is fine; the workflow appears to submit a single job to recall all the Cool chunks at the same time, so you don’t notice any delay. It’s the aux copy that goes through repeated cycles per chunk of fail > recall > wait > copy > repeat - with 5 TB that’s probably never going to finish in my lifetime.

 

Does that make sense?

Userlevel 7
Badge +23

Is the source (Primary) Hot or Cool? Or is it a managed blob (where Azure/CV recalls files from Cool to Hot as needed)? Yeah, I imagine that would take a lifetime…

I see that you and @Stuart Painter mapped out several options, so let me know the exact setup you have running and we can figure it out.

Where is the Primary located? Where is the Secondary going? Is there any managed-tier storage involved on either copy?

Thanks!

Badge +1

The Primary is stored in a blob mount point / storage account connected to an Azure/Windows MA. The CV storage policy is configured to write as Cool. There are no lifecycle policies on the blob.
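As a side note, the “no lifecycle policies” point can be verified with the Az PowerShell module; a minimal sketch with placeholder names (Get-AzStorageAccountManagementPolicy throws an error when no policy is configured on the account):

```powershell
# Placeholder names; assumes the Az.Storage module and Connect-AzAccount.
try {
    Get-AzStorageAccountManagementPolicy -ResourceGroupName "rg-backup" `
        -StorageAccountName "mystorageacct"
} catch {
    # No lifecycle management policy exists on the account
    Write-Host "No lifecycle policy configured"
}
```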

 

Secondary is on-prem over a VPN (network level) - this is a physical MA with DAS - no managed tier storage on either level

Userlevel 7
Badge +23

So essentially we are doing reverse seeding (whereas normally you would send a snowball to the cloud vendor to seed for you).

Let me see if there’s any method to make this more efficient.

Userlevel 7
Badge +23

I spoke to a colleague who advised that there are definitely a lot of factors involved, so it’s hard to know the exact cause initially, though he did mention that Cool storage WILL be slower… is there any chance you can move to Hot?
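For reference, the account-level default tier can be switched with the Az module; a hedged sketch with placeholder names. Note that blobs written with an explicitly set tier (as the CV storage policy does here) keep that tier and would need a per-blob change:

```powershell
# Placeholder names; this changes only the account's *default* access tier.
# Blobs explicitly written as Cool are unaffected and must be re-tiered
# individually.
Set-AzStorageAccount -ResourceGroupName "rg-backup" `
    -Name "mystorageacct" -AccessTier Hot
```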

Badge +1

 

 

FYI - ignore all the waffle above: it turns out we had a number of blobs set to Archive in Azure. A few lines of PowerShell found the offending items and reset them to Cool, and 14 hours later the job was running at full speed with no errors or pauses. I suspect the root cause was that when the storage plan was set up, someone ran the first few jobs with Archive (as it allows you to); this was corrected for future jobs, but the initial data was still in Archive.
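The exact script isn’t shared above, but a minimal sketch of the idea, assuming the Az.Storage module and placeholder account/container names, could look like this (moving a blob out of Archive is an asynchronous rehydration, which fits the 14-hour wait):

```powershell
# Minimal sketch - placeholder names; assumes Connect-AzAccount already ran.
$ctx = New-AzStorageContext -StorageAccountName "mystorageacct" -UseConnectedAccount

Get-AzStorageBlob -Container "commvault-data" -Context $ctx |
    Where-Object { $_.AccessTier -eq "Archive" } |
    ForEach-Object {
        # Rehydration out of Archive is asynchronous and can take hours
        $_.BlobClient.SetAccessTier("Cool")
        Write-Host "Requested Cool tier for $($_.Name)"
    }
```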

Userlevel 7
Badge +23

That makes absolute sense. It explains everything, and why we were constantly digging in that area :nerd:

Thanks for sharing the solution!

Reply