
Hi Team,

 

We have an issue with one of our Aux Copies which, for reasons unknown, has poor throughput compared to near-identical Aux Copies.

I would like to see if anyone has an understanding of the values in the CvPerfMgr logfile so we can further troubleshoot.


We are writing to a target Cloud copy. Other SPs write there with very good rates of throughput (10 GB/Hr).
My “problem” SP Copy is hovering around 300 GB/Hr, and we have a large backlog of copy data.

I have checked the basics, such as Q&I times and local server and DDB performance, and I can’t see anything obvious.

 

Q&I times on the Primary are ok (roughly 1800).

Q&I times on the Secondary are around 630.

 

CPU utilisation on both servers (we are using partitioned dedupe for both primary and secondary) is excellent (maybe 30 or 40% whilst Aux Copies and backups are running).

So, I then took a look at the CvPerfMgr logfile, but I could do with some further information.

I can see a reference to the reader speed possibly being 1488.11 GBPH:-

 

Replicator DashCopy
|_Buffer allocation....      1248        [Samples - 156996] [Avg - 0.007949]
|_Media Open...........         3        [Samples - 6] [Avg - 0.500000]
|_Chunk Recv...........         -        [Samples - 6] [Avg - 0.000000]
|_Reader...............        17    [7545376828] [7.03 GB]    [1488.11 GBPH]
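
If I assume the first number on each line is the cumulative seconds spent in that stage, and the bracketed values are bytes, GB, and GBPH, then the arithmetic seems to line up (that is my assumption, the columns aren’t labelled). A quick Python sanity check:

# Assumption: Reader line = <cumulative seconds> [<bytes>] [<GB>] [<GBPH>],
# so GBPH is just GB / seconds * 3600.
seconds = 17
byte_count = 7545376828

gib = byte_count / 2**30       # ~7.03, matches the [7.03 GB] column
gbph = gib / seconds * 3600    # ~1488, matches the [1488.11 GBPH] column
print(f"{gib:.2f} GB in {seconds} s -> {gbph:.2f} GBPH")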

 

Further down the logfile, I can see what looks like the Network transfer rate being only 19.69 GBPH:-

 

Reader Pipeline Modules[Client]
|_CVA Wait to received data from reader..      1547
|_CVA Buffer allocation..................         -
|_SDT: Receive Data......................        36    [7552876220] [7.03 GB]    [Samples - 157014] [Avg - 0.000229] [703.42 GBPH]
|_SDT-Head: CRC32 update.................         8    [7552818684] [7.03 GB]    [Samples - 157013] [Avg - 0.000051] [3165.35 GBPH]
|_SDT-Head: Network transfer.............      1286    [7552818684] [7.03 GB]    [Samples - 157013] [Avg - 0.008190] [19.69 GBPH]
|_SDT:Stats..............................         -
|_[Compression : ZIP]....................         -
|_[Buf size : 65536].....................         -
|_[Buf count : 180]......................         -
|_[SDT threads : 16].....................         -
|_[Processor count : 16].................         -
|_[Thread per Connection : 8]............         -
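
The [Avg] column also appears to be just seconds divided by samples, i.e. time per buffer, which makes the Network transfer stage stand out against the other head-side stages. Again an assumption on my part, but it reproduces the logged values:

# Assumption: [Avg] = cumulative seconds / samples (time per buffer).
stages = {
    "SDT: Receive Data": (36, 157014),
    "SDT-Head: CRC32 update": (8, 157013),
    "SDT-Head: Network transfer": (1286, 157013),
}
for name, (seconds, samples) in stages.items():
    print(f"{name}: {seconds / samples:.6f} s per buffer")
# Network transfer works out at 0.008190 s per buffer, matching the log,
# and is roughly 160x slower than the CRC32 stage in the same pipeline.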

 

Further down the logfile, I can see what looks like the write speed being only 19.98 GBPH:-

Writer Pipeline Modules[MediaAgent]
|_[Stream target: MediaAgent1]......................         -
|_SDT-Tail: Wait to receive data from source........      330    [7552876220] [7.03 GB]    [Samples - 157014] [Avg - 0.002102] [76.74 GBPH]
|_SDT-Tail: Writer Tasks............................      1562    [7552818684] [7.03 GB]    [Samples - 157013] [Avg - 0.009948] [16.21 GBPH]
|_DSBackup: Update Restart Info.....................         -
|_DSBackup: Media Write.............................     1266    [7545896404] [7.03 GB]    [19.98 GBPH]
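
Putting the three blocks side by side under the same assumptions, the reader is nowhere near the bottleneck; the network transfer, the tail-side writer tasks and the media write all sit in the same ~16-20 GBPH band, which would be what gates the job:

# Assumption: per-stage GBPH = GB / cumulative seconds * 3600, as above.
gb = 7.03
stage_seconds = {
    "Reader": 17,
    "SDT-Head: Network transfer": 1286,
    "SDT-Tail: Writer Tasks": 1562,
    "DSBackup: Media Write": 1266,
}
for name, secs in sorted(stage_seconds.items(), key=lambda kv: kv[1]):
    print(f"{name:30} {gb / secs * 3600:8.2f} GBPH")
# Reader ~1488 GBPH; everything downstream of the network sits at 16-20 GBPH.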

 

However, as I don’t fully understand this logfile, I can’t be certain about what it is stating.

So if anyone has had dealings with these logfiles, perhaps through a support discussion along the way, it would be good to hear from you.

Thanks

Hi @MountainGoat

Greetings! You have pointed it out correctly. From the CvPerfMgr logs it is evident that the issue is with the network throughput and the write throughput on the MA, and you are looking at the right logs to understand the performance issues.

You did mention that other SPs do not have this issue; just to clarify, is the target for the other SPs the same Cloud Library?

In order to confirm the performance while writing to cloud, you can test this by using CloudTestTool from the MA’s Base folder: pick the Cloud vendor from the list and try the Upload/Download test, which will give you a fair understanding of the actual throughput and some further insight into the issue.
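
If CloudTestTool proves fiddly, a rough independent timing of an upload from the MA can also help separate Commvault from the storage itself. A minimal sketch, assuming an S3-compatible endpoint reachable with boto3 (the endpoint, bucket and key below are placeholders and credentials come from the environment; adapt for your actual vendor):

import time
import boto3  # assumption: S3-compatible object storage; adjust for your vendor

# Placeholders -- substitute your real endpoint and bucket.
s3 = boto3.client("s3", endpoint_url="https://objectstorage.example.com")
payload = b"\x00" * (64 * 1024 * 1024)  # 64 MB test object

start = time.monotonic()
s3.put_object(Bucket="example-test-bucket", Key="throughput-test", Body=payload)
elapsed = time.monotonic() - start

gbph = (len(payload) / 2**30) / elapsed * 3600
print(f"Uploaded {len(payload) / 2**20:.0f} MB in {elapsed:.1f} s -> {gbph:.1f} GB/hr")

Running it once on its own and again while the other Aux Copies are active would also hint at whether the slowdown is contention at the storage end.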

 

Thanks & Regards 

Das


Thanks Das,

 

I don’t think this issue is as straightforward as it appears.

To clarify, this is not writing to the exact same library as the other SPs.
It’s the same storage layer, but a unique set of Mount Points, carved into a library for exclusive use by this particular SP.

The same principles apply everywhere.

So we have access to Containers within Object Storage.

We simply define one set of containers and map them into Cloud Library “A”.
The next set of containers is defined and mapped into Cloud Library “B”. And so on …

I have run CloudTestTool before. It’s a bit fiddly, and I’ve only really used it to prove the network was accessible, not so much for a throughput test.

One other thing I have noticed is that when I suspended all other Cloud-target Aux Copies, throughput on my problem SP picked up to reasonable levels. But once the other Aux Copies are resumed, it’s back to poor levels. It’s almost like a prioritisation issue with this SP. It’s difficult to assess.

I will take a look at CloudTestTool.
I’ll run it with all Aux Copies suspended, and across different Cloud libraries, and compare the results.

Updates to follow ...

