Solved

Backup M365 into existing CommVault on premises environment.


Good Morning

We are in the investigation stage for backing up M365/O365 into our existing Commvault on premises environment.

I’ve read through CommVault Doc (Applications -> Office 365 -> Exchange Online) but still have a lot of questions.

 

First, is there a diagram out there that shows the architecture of M365 backup to on-prem? A picture is worth 1,000 words. This is my basic understanding, but a real diagram would be helpful: M365 -> Access Nodes -> Media Agents -> Storage

 

Second, is there a particular CommVault version that we should be on? “Get on the latest version” isn’t necessarily a good answer. Which version has the features, improvements, bug fixes, and performance gains that I should be aware of? We are targeting LTS releases only (i.e., 20, 24, or 28).

 

Third, how integral is Content Indexing? We don’t currently do any Content Indexing for our environment. We found the amount of space and time it would take to index traditional server/filesystem backups doesn’t come close to the benefits we’d get from it.

 

Fourth, for an enterprise with ~7,000 users, help me understand the hardware requirements:

  • How many Access Nodes (streams) do I need, what are their requirements?
  • How many Index Servers do I need, what are their requirements?
  • How many Media Agents do I need, what are their requirements?
  • Can some roles be combined using the same servers, or should the roles be dedicated to specific servers?
  • What is the network/bandwidth load?
  • With only 10 (or 20?) streams per Access Node, it seems like I might need a farm of at least 20 servers just dedicated for this purpose. So combining this role with Index and Media Agent doesn’t seem feasible.

 

Fifth, my understanding is the licensing is a 1-time cost per user. This license includes everything associated with the user in M365 (Mailbox, SharePoint, OneDrive, Teams, etc.). Is this correct? Are these user licenses recyclable, or do they get burned up once a user is assigned a license? In other words, is it based on the number of active users being backed up, or do we need to buy a new license when someone retires and we re-hire the position?

Best answer by Jos Meijer 18 August 2022, 23:06

12 replies

Userlevel 7
Badge +23

Hi @Greg , thanks for the post!

I’ll likely need to reach out to some of our Products team about many of these questions, though I can answer a few now.

  1. I’ll check with our products team.  Your Account Rep likely has all sorts of diagram-style information they can share.
  2. You’re already thinking in the right way here.  I would suggest going to 11.28 (also known as 2022E).  This will ensure you’re within a supported release for the longest time.  Now, whether this is best for you is a lengthier discussion that your Sales Engineer can assist with.  It’s entirely possible you have a legacy component that can only be backed up in an older release.  A great place to start is the What’s New and What’s Changed section of 11.28: https://documentation.commvault.com/2022e/expert/148756_commvault_platform_release_2022e.html
  3. Depends on your overall needs.  If you don’t think it’s worth the effort, then maybe not necessary.  I’ll confirm with a few folks and reply back if I missed anything it’s needed for.
  4. This can get complex, though we do have a few docs that outline ALL of the requirements, grouped by size (and even covering setups with or without CI!): https://documentation.commvault.com/2022e/essential/114521_guidelines_for_exchange_online_access_nodes.html  Start there and see if you have any questions afterwards (though for the best architecture advice, definitely ask your SE).
  5. You are correct on the 1 license per user.  Paid licenses are used, not consumed.  You use it as you are doing backups for that entity, and if you deconfigure that, you can reuse the license since it gets released.  Now, if you get any temp/trial licenses, those are consumed/burned once applied.  

I’m tagging in @Onno van den Berg and @Jos Meijer since these guys have a TON of experience and wisdom to add

Userlevel 7
Badge +23

I checked with the Products Manager.  She agrees for the specific questions, the best path would be your Account Team.

Userlevel 6
Badge +14

This page shows a diagram:

https://documentation.commvault.com/11.26/expert/93783_exchange_mailbox_agent_architecture_user_mailbox_and_journal_mailbox.html

Note that access nodes can be added easily at any time; the most important sizing decision at the start is the index server, as noted here:

https://documentation.commvault.com/11.26/expert/28819_hardware_recommendations_for_index_server.html

Access node requirements:

https://documentation.commvault.com/11.26/expert/28821_user_mailbox_and_journal_mailbox_access_nodes.html

 

Badge +1

Thanks Scott. These links seem to be specific to On-Prem Exchange, but helpful nonetheless.

 

I guess my biggest question still is regarding the Access Nodes: https://documentation.commvault.com/11.24/essential/125313_best_practices_for_exchange_online.html

20 streams, 20 service accounts (Basic Authentication) or 10 Azure Apps (Modern Authentication)

We will likely be using Modern Auth. What are these “Azure Apps” the Doc is referencing? Does Exchange Online count as 1 Azure App? And Teams as 1 Azure App?

If we go with Basic Auth, and we’ve completed the initial full backup, about how long does each account/mailbox take on average (assuming no network bottleneck)? I understand Microsoft itself does the throttling, so this should be a common average across all customers.

 

I don’t feel like I should have to pay for Professional Services to get a basic understanding of a product I may or may not even use… These are general questions, nothing specific to our organization. I’m sure PS will be involved when it comes time to actually implement, but again, just investigating right now.

 

Userlevel 6
Badge +14

Index sizing and access node hardware recommendations are the same for on-prem and O365.

When protecting O365, the backup will leverage an Azure app, which is basically the connection to O365 without having to leverage a user account. Each stream for the backup will leverage a single app.

Once the full backup of all mailboxes is complete, each mailbox backup is very quick, as it’s just the changes. The exact time really depends on the size of the changes, the size of the mailboxes, etc. MS does not document the throttling mechanism, and it can differ by region, customer, etc.

This chart covers both online and on-prem; the two are combined.

https://documentation.commvault.com/11.26/expert/28821_user_mailbox_and_journal_mailbox_access_nodes.html

So for 10k mailboxes: 2 access nodes and 10 Azure apps. If you leverage the Command Center to create and configure the client, it will create the Azure apps for you during one of the steps. However, if you choose to do it manually in Azure, these are the steps:

https://documentation.commvault.com/11.26/expert/124585_register_application.html
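
To make the sizing arithmetic concrete, here is a minimal sketch in Python. It assumes the ratio from the example above (10k mailboxes on 2 access nodes, so roughly 5,000 mailboxes per node). That ratio is only an extrapolation from this thread, not an official Commvault formula, and the function name is hypothetical; check the linked sizing table for real figures.

```python
import math


def estimate_access_nodes(mailboxes: int, mailboxes_per_node: int = 5_000) -> int:
    """Rough access node count.

    mailboxes_per_node is an ASSUMPTION derived from the thread's example
    (10,000 mailboxes -> 2 access nodes); it is not a documented limit.
    """
    if mailboxes <= 0 or mailboxes_per_node <= 0:
        raise ValueError("mailboxes and mailboxes_per_node must be positive")
    # Round up: a partially filled node is still a whole node.
    return math.ceil(mailboxes / mailboxes_per_node)
```

Under that assumption, `estimate_access_nodes(7_000)` returns 2, which is far from the farm of 20 servers feared in the original question.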

Userlevel 7
Badge +16

Hi @Greg 

Sorry for the delay, busy days currently.

Looking at the diagram it would look something like this for O365:

Don’t forget to configure the firewall/internet proxy to allow both 80 and 443, otherwise you will have issues connecting to certain aspects of the O365 service. Here is the url with info for ports and whitelisting per region:

https://documentation.commvault.com/2022e/expert/103534_exchange_mailbox_agent_user_mailbox_exchange_online_through_azure_active_directory_environment_port_requirements.html
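
A quick way to see whether an access node can reach a given endpoint on both ports is a plain TCP connect test. This is only a sketch using Python's standard library: the function name is hypothetical, the host you test should come from the port requirements page above, and a direct TCP check says nothing about traffic that has to go through an internet proxy.

```python
import socket


def check_ports(host, ports=(80, 443), timeout=5.0, connect=socket.create_connection):
    """Return {port: True/False} for outbound TCP reachability to host.

    The connect parameter exists only so the function can be exercised
    without real network access; by default it opens a real TCP connection.
    """
    results = {}
    for port in ports:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[port] = True
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            results[port] = False
    return results
```

For example, `check_ports("outlook.office365.com")` run from the access node returns a dictionary showing which of the two ports could be reached directly.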

The Azure apps can be defined manually, and then you could probably combine Azure AD roles on apps to facilitate different backup types. But I would make it easy for yourself and use the wizard to create the apps for you based on modern authentication; I believe this will be a separate configuration per service type (mail, Teams, etc.).

Regarding the version, I agree with @Mike Struening: 11.28 would be best, as it contains all the new developments.

Looking at content indexing, it depends on what you want to accomplish.
For basic backup and recovery it is not necessary, though basic indexing is required to capture all the necessary metadata.

Requirement wise:

  • How many Access Nodes (streams) do I need, what are their requirements?
    I would start with one VM with 8 vCPU and 16 GB RAM, configure 10 Azure apps to begin with, and add more apps/CPU/memory if performance is not as expected.
  • How many Index Servers do I need, what are their requirements?
    One should suffice. If you only use backup and recovery, then it’s only for metadata.
    If you want content indexing then we are talking a whole different story.
    Looking at specs, I am not really sure the 16 vCPU and 32 GB is necessary, let me check my production config tomorrow to verify what the actual index server impact is on our O365 config.
  • How many Media Agents do I need, what are their requirements?
    Media Agent configurations depend on so much more: what else are you going to back up besides O365?
    To put it differently, what do you want to achieve with your backup solution, looking at the bigger picture?
    This could need a proper design in order to answer your question. The disk library sizing alone needs calculation based on different variables such as data types, daily change rate, yearly growth rate, and retention.
  • Can some roles be combined using the same servers, or should the roles be dedicated to specific servers?
    You can, but I would split the roles across different machines where possible; resource usage can climb quickly on a machine if the wrong combinations are made.
  • What is the network/bandwidth load?
    Not sure… will have to ask network management to see our workload, but currently we don’t have much time for additional non client related activities as our agenda is packed.
    Anyone else with an answer on this?
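
To illustrate how the disk library variables mentioned above interact, here is a very rough back-of-the-envelope sketch in Python. The formula is a simplification made up for illustration (one full copy plus daily incrementals kept for the retention period, scaled by yearly growth). It ignores deduplication, compression and differences between data types, so it is no substitute for a proper design; the function name and parameters are hypothetical.

```python
def estimate_backup_storage_tb(front_end_tb, daily_change_rate, retention_days,
                               yearly_growth_rate, years=1):
    """Naive estimate of stored backup data in TB.

    ASSUMPTIONS (not a Commvault formula): one full copy plus one
    incremental per day for the retention period, no deduplication or
    compression, and uniform growth per year.
    """
    # Source data size after the planning horizon.
    grown_tb = front_end_tb * (1 + yearly_growth_rate) ** years
    # Changed data retained across the retention window.
    incrementals_tb = grown_tb * daily_change_rate * retention_days
    return grown_tb + incrementals_tb
```

With hypothetical inputs of 50 TB of source data, a 1% daily change rate, 30 days of retention and 10% yearly growth, this gives roughly 71.5 TB after one year, before any deduplication savings.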

License-wise, I have an addition to what @Mike Struening said: the license is per user, for each active user mailbox you back up.
Shared mailboxes are backed up as well, but are not counted in licensing.
Be aware, though, that if you decide to back up, for instance, SharePoint later on as well, the license count changes.
Then not only the active mailboxes in the backup are counted, but all active users.
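
The counting rule described above can be written down as a small sketch in Python. The class and field names are hypothetical, and the logic only mirrors what is stated in this post; the actual license terms should be confirmed with your account team.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    active_users: int                 # all active users in the tenant
    backed_up_user_mailboxes: int     # active user mailboxes in the backup
    backed_up_shared_mailboxes: int   # backed up, but not licensed
    sharepoint_protected: bool = False


def licenses_needed(tenant: Tenant) -> int:
    """License count as described in this thread (an assumption, not a contract)."""
    if tenant.sharepoint_protected:
        # Once SharePoint is protected, every active user counts.
        return tenant.active_users
    # Mail only: count active user mailboxes; shared mailboxes are free.
    return tenant.backed_up_user_mailboxes
```

For a hypothetical tenant with 7,000 active users, 6,500 protected user mailboxes and 300 shared mailboxes, this returns 6,500 for mail only and 7,000 once SharePoint is added.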

Hope this helps 🙂

Userlevel 7
Badge +16

How many Index Servers do I need, what are their requirements?
One should suffice. If you only use backup and recovery, then it’s only for metadata.
If you want content indexing then we are talking a whole different story.
Looking at specs, I am not really sure the 16 vCPU and 32 GB is necessary, let me check my production config tomorrow to verify what the actual index server impact is on our O365 config.

 

Checked just now, and we are using the advised specifications of 16 vCPU and 32 GB.

Userlevel 7
Badge +23

 

I don’t feel like I should have to pay for Professional Services to get a basic understanding of a product I may or may not even use… These are general questions, nothing specific to our organization. I’m sure PS will be involved when it comes time to actually implement, but again, just investigating right now.

 

Totally agree, @Greg.  I wouldn’t expect you to pay 

I was referring to getting advice from the SE, not from PS.  They often have the tools and knowledge to advise on what you should get.  PS would implement based on the SE’s design.

That said, @Jos Meijer is an absolute legend around these parts and has offered some sage wisdom!

Badge +1

Are there any differences when backing up GCC (compared to standard Commercial/Worldwide)? Are they set up the same way? Does the Wizard ask you which tenant level you’re using? I think there are some fundamental differences that it would need to account for, like different URLs.

Userlevel 6
Badge +14

@Greg

Yes, there is a check box for those leveraging GCC/GCC High, both of which are supported.

When you get to the Exchange connection settings, you can select the required option:

https://documentation.commvault.com/2022e/essential/114421_use_express_configuration_option_in_office_365_guided_setup_for_exchange_online.html

Other than the connection settings and the URLs leveraged, the backup is performed and behaves the same way.

Hello Everyone,

I am in the same situation Greg described (everything in place and accessible), and I have 2 issues:

  1. When running a backup of SharePoint Online, I get “Invalid Storage Policy ID. Please verify Client[APP-SHAREPOINT], Agent[Sharepoint Server], Instance[defaultInstance], Backupset[Sharepoint Online], Subclient[SharepointOnline] is associated to a valid Storage Policy.”. How can I fix it?
  2. How can I configure autodiscovery of content (Exchange, SharePoint, OneDrive and Teams) before adding content?

Agent is not visible (from the console) in this context? What is missing?

Thanks 👍

Badge +1

@Fred - With Exchange auto discovery, it should run at the beginning of each job. If you want to trigger discovery without running a job, you can go to Protect > Office 365 > Exchange client > Mailboxes > Add.   When you hit Add, it should trigger a discovery based on what you have set under Content.

 

 

I also have a follow-up question about licensing. We intend to get the 1-time purchase of the O365 license (1 per user). Does the capacity we back up with O365 count towards our Commvault Complete capacity? For example, if our EXO client is 50TB, does that 50TB count against our Commvault Complete capacity, or does the 1-time O365 license exclude it from our capacity limit?

 

Thanks

 

 
