Solved

DB2 backup on AIX very slow



We run DB2 on AIX for our SAP AFS application. The database is approximately 4 TB, split over 4 logical volumes in an IBM Virtual I/O environment on IBM Power. A backup takes 20 to 26 hours. Below are the statistics written to db2diag after the backup. I'm not sure how to read them, but I think the WaitQ values are quite high.
I hope someone can shed some light on this.

 

2021-03-13-01.02.01.804357+060 E411673A4491 LEVEL: Info
PID : 4718886 TID : 61753 PROC : db2sysc 0
INSTANCE: db2prd NODE : 000 DB : PRD
APPHDL : 0-12866 APPID: *LOCAL.db2prd.210311220047
AUTHID : DB2PRD HOSTNAME: lpdd001
EDUID : 61753 EDUNAME: db2agent (PRD) 0
FUNCTION: DB2 UDB, database utilities, sqluxLogDataStats, probe:395
MESSAGE : Performance statistics
DATA #1 : String, 3994 bytes

Parallelism = 40
Number of buffers = 40
Buffer size = 2494464 (609 4kB pages)

BM# Total I/O MsgQ WaitQ Buffers MBytes
--- -------- -------- -------- -------- -------- --------
000 93654.87 47851.71 43854.53 455.97 1032152 2450793
001 93638.40 5314.21 28579.66 58494.24 309031 733532
002 93627.64 912.52 23500.10 68044.89 58527 138565
003 93503.11 1678.18 28495.65 62110.43 221485 525641
004 93644.75 622.59 23105.48 68747.61 54465 128969
005 93165.99 346.84 12501.31 79156.65 28099 66315
006 93654.74 283.14 14433.68 77780.60 14188 33310
007 93244.06 36.63 2850.77 89202.64 2564 5703
008 93531.91 446.75 17849.08 74075.73 24954 58870
009 93545.34 101.93 3432.56 88856.77 2574 5715
010 93521.20 206.49 12001.90 80156.74 10726 25088
011 93240.49 87.94 4393.93 87603.56 6192 14320
012 93651.29 3.49 1221.64 91272.79 234 170
013 93124.41 113.58 5153.03 86702.00 9173 21398
014 92908.78 70.68 4522.99 87160.84 4633 10934
015 93645.90 0.69 1099.65 91392.15 173 24
016 93597.10 39.78 2567.56 89835.86 1630 3485
017 93239.47 46.48 3816.44 88221.91 4700 10776
018 93503.39 27.49 2128.51 90193.68 1068 2150
019 93653.45 1.24 1100.73 91398.14 176 31
020 92254.68 0.71 16.21 91395.10 7 14
021 92275.78 16.97 701.34 90693.58 401 947
022 92348.10 18.72 1493.37 89899.50 1333 3161
023 93546.50 14.29 1454.45 90924.33 481 756
024 92418.47 0.56 13.68 91397.77 5 8
025 93625.70 39.40 2874.08 89558.39 1570 3342
026 93492.63 0.48 941.87 91396.94 166 8
027 93614.09 3.70 1198.32 91258.53 281 280
028 93176.60 23.49 1422.08 90576.69 734 1357
029 93611.89 0.95 1057.11 91400.47 170 16
030 93613.95 2.82 1189.21 91268.55 238 178
031 93436.97 1.16 883.56 91398.95 173 24
032 93618.84 0.81 1068.24 91396.41 174 27
033 93236.32 0.52 686.91 91395.71 166 8
034 92442.61 1.19 20.41 91390.43 10 21
035 93626.35 0.46 1069.34 91403.11 166 9
036 93573.80 1.55 1020.01 91398.75 173 24
037 92463.75 0.85 9.93 91401.25 5 8
038 92487.93 1.04 15.74 91395.23 8 16
039 92564.58 12.59 1011.56 90387.66 717 1698
--- -------- -------- -------- -------- -------- --------
TOT 3730726.03 58334.80 254756.87 3372200.75 1793722 4247710

MC# Total I/O MsgQ WaitQ Buffers MBytes
--- -------- -------- -------- -------- -------- --------
000 91412.17 62751.83 28626.82 0.00 178543 424734
001 91407.44 64251.43 27114.16 6.90 202906 482691
002 91407.65 64279.45 27074.99 6.90 205911 489840
003 91408.45 64356.74 27015.30 6.89 206245 490635
004 93677.88 65274.64 28331.31 22.06 205430 488691
005 91407.98 62846.17 28522.14 6.88 154640 367871
006 91410.40 62892.63 28471.70 6.88 148478 353212
007 91397.05 62804.50 28565.73 6.90 162703 387052
008 91411.33 62795.12 28563.84 6.90 173397 412492
009 91407.81 62879.88 28481.98 6.90 155480 369869
--- -------- -------- -------- -------- -------- --------
TOT 916348.20 635132.41 280768.01 77.27 1793733 4267094

Thanks in advance.
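
One rough way to read the table above is to express each thread's I/O, MsgQ and WaitQ as a share of its Total time. The sketch below assumes the column meanings from IBM's description of these backup statistics: Total is the thread's lifetime in seconds, I/O is time spent reading or writing, MsgQ is time spent waiting for a buffer, and WaitQ is time spent waiting for directives from the agent controlling the backup. BM rows are buffer manipulators (readers) and MC rows are media controllers (writers). The function names are illustrative only, and the rows are copied from the output in this post.

```python
# Rows copied from the db2diag output above, as
# (name, total_s, io_s, msgq_s, waitq_s, buffers, mbytes).
ROWS = [
    ("BM 000", 93654.87, 47851.71, 43854.53, 455.97, 1032152, 2450793),
    ("BM 001", 93638.40, 5314.21, 28579.66, 58494.24, 309031, 733532),
    ("BM 020", 92254.68, 0.71, 16.21, 91395.10, 7, 14),
    ("MC TOT", 916348.20, 635132.41, 280768.01, 77.27, 1793733, 4267094),
]
BM_TOTAL_MBYTES = 4247710  # MBytes column of the BM "TOT" row
MC_THREADS = 10            # MC 000 to MC 009


def time_shares(row):
    """Return I/O, MsgQ and WaitQ as percentages of the row's Total time."""
    _, total, io, msgq, waitq, _, _ = row
    return tuple(100.0 * part / total for part in (io, msgq, waitq))


def overall_mb_per_s(mc_total_row, mc_threads):
    """Approximate end-to-end rate: MBytes written / average MC lifetime."""
    _, total, _, _, _, _, mbytes = mc_total_row
    elapsed_s = total / mc_threads  # the TOT row sums the per-thread Total times
    return mbytes / elapsed_s


def data_share(row, all_mbytes):
    """Percentage of all MBytes read that went through one BM thread."""
    return 100.0 * row[6] / all_mbytes


if __name__ == "__main__":
    for row in ROWS:
        io, msgq, waitq = time_shares(row)
        print(f"{row[0]}: I/O {io:.1f}%  MsgQ {msgq:.1f}%  WaitQ {waitq:.1f}%")
    print(f"Overall rate: {overall_mb_per_s(ROWS[-1], MC_THREADS):.1f} MB/s")
    print(f"BM 000 share of data: {data_share(ROWS[0], BM_TOTAL_MBYTES):.1f}%")
```

Read this way, BM 000 is busy for almost the whole run (roughly 51% I/O and 47% MsgQ) and handles about 58% of the data, while most of the other BM threads spend nearly all their time in WaitQ. Under the assumptions above, that high WaitQ would mostly mean those threads finished early and sat idle, which points at unevenly sized tablespaces rather than at WaitQ itself. The media controllers spend roughly 69% of their time in I/O, and the overall rate works out to about 47 MB/s.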

 


Best answer by Marcel Huisman 12 July 2021, 15:18


29 replies


OK, I appreciate the details! I'll keep an eye out for your update.


@Marcel Huisman, following up before I close this thread off. Were there any other database changes that helped with this issue?


Tuning the database parameters helped a lot in speeding up the backup. We also rebalanced the volume groups on this AIX server and created separate log volumes for each logical volume. In the end, a defective drive library turned out to be the greatest bottleneck. I still think we can gain extra speed by rebalancing the tablespaces, but we will probably do that at the next refresh.
Thanks for all of your help.


That's great news, @Marcel Huisman! I'll go ahead and mark this as the Best Answer, though feel free to add more info after your next refresh!
