Solved

Kubernetes: “Error backing up application” during etcd backup

  • 22 November 2022
  • 14 replies
  • 115 views

Badge +4

Hi Team,

I have been trying to back up the etcd database of an OpenShift 4.10 cluster through Kubernetes, but I keep getting the error: “Error backing up application”.

Below is the vsbkp.log excerpt:

K8sApp::CreateWorker() - Worker pod [data-dir-etcd-rwdrmstrpd01.rwdrocppd01.kcbad.com-cv-412129] created successfully for PVC [data-dir]
296908 48818 11/21 17:37:20 412129 K8sApp::IsETCDSnapDBFilePresent() - Checking if etcd db snapshot file [/mnt/data-dir/cv_etcdsnapshotdb_21_11_2022__17_25_54] is created.
296908 48818 11/21 17:37:20 412129 K8sApp::IsETCDSnapDBFilePresent() - Checking ETCD DB snap file presence from worker pod. namespace:[openshift-etcd] volume:[data-dir] snapid:[etcd-rwdrmstrpd01.rwdrocppd01.kcbad.com-cv-412129] container:[cvcontainer] podname:[data-dir-etcd-rwdrmstrpd01.rwdrocppd01.kcbad.com-cv-412129]
CK8sInfo::MountVM() - Backup failed for app [etcd-rwdrmstrpd01.rwdrocppd01.kcbad.com]. Error [-1:0xFFFFFFFF:{CK8sInfo::Backup(399)} + {K8sCluster::CreateAppFromSnapshot(679)} + {K8sApp::IsETCDSnapDBFilePresent(1347)} + {K8sWS::FilePresentInfo::GetFilePresentStatus(185)/ErrNo.-1.(Unknown error -1)-Exec failed with unhandled exception: set_fail_handler: 20: Invalid HTTP status.}]

I have raised a case for this under customer ticket 221118-275.

It is taking a long time to resolve, though, so I would appreciate any ideas from the community. Thanks.
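For anyone reading a similar failure, here is a minimal Python sketch that splits the bracketed error from the log excerpt into its call chain and final message. The helper name parse_error_chain and the regular expressions are illustrative assumptions based only on the one log line quoted in this post, not a documented Commvault log format.

```python
import re

# Error text copied from the vsbkp.log excerpt in this post.
ERROR = (
    "Error [-1:0xFFFFFFFF:{CK8sInfo::Backup(399)} + "
    "{K8sCluster::CreateAppFromSnapshot(679)} + "
    "{K8sApp::IsETCDSnapDBFilePresent(1347)} + "
    "{K8sWS::FilePresentInfo::GetFilePresentStatus(185)/ErrNo.-1.(Unknown error -1)"
    "-Exec failed with unhandled exception: set_fail_handler: 20: Invalid HTTP status.}]"
)


def parse_error_chain(text):
    """Return (frames, message) for one bracketed error string.

    Hypothetical helper: the pattern is inferred from the single line above.
    """
    # Each frame looks like {Class::Method(line)...}; capture name and line number.
    frames = re.findall(r"\{([\w:]+)\((\d+)\)", text)
    # The last frame carries "ErrNo.<n>.(<text>)-<message>" before the closing "}]".
    match = re.search(r"ErrNo\.(-?\d+)\.\(([^)]*)\)-(.*?)\}\]", text)
    message = match.group(3) if match else None
    return [(name, int(line)) for name, line in frames], message


if __name__ == "__main__":
    frames, message = parse_error_chain(ERROR)
    for name, line in frames:
        print(f"{name} (line {line})")
    print("message:", message)
```

The last frame in the chain is the file-presence check that runs inside the worker pod, so the state of the worker pod itself is a reasonable first thing to look at.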

Best answer by MOGFREY 6 December 2022, 11:41

Userlevel 6
Badge +14

Hello @MOGFREY,

What Commvault Maintenance Release are you using here?

MR 11.28.32 contains HotFix 2819 for etcd backup failures on OpenShift clusters, which could resolve this: https://documentation.commvault.com/2022e/assets/service_pack/updates/11_28_32.htm

Best Regards,

Michael

Badge +4

Hi Michael,

Thanks for the suggestion. I had already tried upgrading to version 11.28.32, but it did not solve the issue.

Userlevel 5
Badge +11

Hi @MOGFREY ,

Another customer is having similar issues.

Development has created a new Diag, and that customer confirmed it is now working.

I spoke to the Engineer working on your case and will provide the Diag.

Best Regards,

Sebastien

Badge +4

Hi Sebastien,

Thanks for the response and the input. I installed the hotfix on the proxy, but it did not solve the issue in my environment. The ticket has been escalated back to the dev team.

Userlevel 5
Badge +11

Hi @MOGFREY ,

I checked the latest logs on Comvaultproxy and didn't see that you installed the Diag 3312.

Can you install it again and check updateinfo.log? You should see 3312 listed as installed.

Otherwise that's fine, I will ask the Engineer to check with you.

Best Regards,

Sebastien

Badge +4

Hi Sebastien,

Quick update:

Thanks for your suggestion. You were right: I had done the installation through the CommServe, and although the update reported success, the hotfix was not actually installed when I checked again. I therefore installed it manually on the server, which worked, and tried the backup again. Unfortunately, I got the same backup error and the job still fails. Support provided another hotfix, which I installed as well (note that this second hotfix uninstalls the 3312 hotfix). The job still fails with the same error, so we are now waiting on word from dev.

Userlevel 5
Badge +11

@MOGFREY ,

Yeah I had checked the logs and saw that.

We have now provided the official Hotfix. Can you try that, please? Meanwhile, Development will check the latest logs.

Best Regards,

Sebastien

Userlevel 5
Badge +11

@MOGFREY ,

I can see you installed the official Hotfix but Backups are still failing.

Development will do a remote session.

I hope we will identify and fix the issue asap.

Best Regards,

Sebastien

Userlevel 5
Badge +11

@MOGFREY 

I am glad Dev found the issue.

He said that you downloaded the wrong CentOS image:
https://documentation.commvault.com/v11/essential/149267_validating_your_kubernetes_environment.html

Best Regards

Sebastien

Badge +4

Hi Sebastien,

This does not seem to be the issue. We downloaded the specified image with docker pull centos:8 and uploaded it, but the issue still persists.

Userlevel 5
Badge +11

@MOGFREY ,

Ok, I have also told Development.

Best Regards,

Sebastien

Badge +4

Hi Sebastien,

Could you please inform the dev team that the base architecture of the system is LinuxONE (s390x), while the Commvault CentOS image from Docker is x86_64? This is why we are getting CrashLoopBackOff on the worker pods. Do they have an image compatible with s390x?

Userlevel 5
Badge +11

@MOGFREY ,

It is in the TR; my colleague will check. The plan is to do a session with them today.

Best Regards,

Sebastien

Badge +4

Hi Sebastien,

Just an update on this: the issue was the image base architecture. We used a RHEL s390x image for the worker pods and the issue was resolved. Thanks for your assistance.
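Since the root cause was a mismatch between the node architecture and the worker image architecture, a small pre-check can catch it before a backup runs. The Python sketch below only compares values you supply. It assumes you have already collected the node architectures (for example from the kubernetes.io/arch node label) and the architectures the image is built for (for example from docker image inspect). The function names and the node name master-0 are hypothetical.

```python
# Map `uname -m` style names to the names used in the kubernetes.io/arch
# node label and in container image metadata.
ALIASES = {"x86_64": "amd64", "aarch64": "arm64"}


def normalize(arch):
    """Lower-case an architecture name and fold common aliases together."""
    arch = arch.strip().lower()
    return ALIASES.get(arch, arch)


def incompatible_nodes(node_archs, image_archs):
    """Return the sorted names of nodes whose architecture the image lacks.

    node_archs:  dict of node name -> architecture string
    image_archs: iterable of architectures the worker image is built for
    """
    offered = {normalize(a) for a in image_archs}
    return sorted(
        name for name, arch in node_archs.items() if normalize(arch) not in offered
    )


if __name__ == "__main__":
    nodes = {"master-0": "s390x"}  # hypothetical node name
    # Image built only for x86_64, as described in this thread.
    print(incompatible_nodes(nodes, ["x86_64"]))
    # Image built for s390x, as in the fix.
    print(incompatible_nodes(nodes, ["s390x"]))
```

A non-empty result means the worker pod image cannot run on those nodes, which matches the CrashLoopBackOff reported above.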
