@large environments: How do you proactively monitor CommVault: CommServe, MediaAgents, xAgents, jobs, logs, dedup engine/performance, alerts, reports, and so on?
I've used Nagios with checkmk, but what is the current "state of the art"? The buzzword is AI. For example, I like how Azure handles all the metrics: it can solve some problems by itself, give you useful information, trigger events/processes, and "learn", and it can, for example, create trouble tickets or something similar.
Which tool makes your life easier? Grafana?
Looking forward to your feedback :).
Cheers!
Best answer by Sean Crifasi
Hi @Fusi
In addition to Damian’s suggestion, I’d like to mention some worthwhile alerts. In my experience these tend to get overlooked, but they can add peace of mind and early detection of potential problems. I recommend browsing the store for various Alerts/Reports, as new ones are added from time to time that can be helpful in day-to-day operations.
Check out this thread; it covers some of what you mentioned here:
In terms of reports, my suggestion is to ensure that SLA is kept and that jobs are completing within the required timeframes (if not, then it's time to look at performance, such as the DDB), and to manage by exception (anomaly reports).
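To illustrate what "managing by exception" can look like once job data has been exported, here is a minimal Python sketch. It does not use any Commvault API: the `JobRun` record, its field names, and the 3-sigma threshold are all assumptions made for the example, and the durations would have to come from whatever report or export you already have.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List, Tuple


@dataclass
class JobRun:
    """Hypothetical record for one finished job (not a Commvault type)."""
    client: str
    duration_min: float        # how long the latest run took
    window_min: float          # required completion window
    history_min: List[float]   # durations of earlier runs


def find_exceptions(runs: List[JobRun], sigma: float = 3.0) -> List[Tuple[str, List[str]]]:
    """Return only the jobs that need attention, with the reasons."""
    flagged = []
    for run in runs:
        reasons = []
        # Rule 1: the job did not finish inside its required window.
        if run.duration_min > run.window_min:
            reasons.append("missed window")
        # Rule 2: the job ran much longer than its own history suggests.
        if len(run.history_min) >= 2:
            mu = mean(run.history_min)
            sd = stdev(run.history_min)
            if sd > 0 and run.duration_min > mu + sigma * sd:
                reasons.append("duration anomaly")
        if reasons:
            flagged.append((run.client, reasons))
    return flagged


runs = [
    JobRun("client-a", 50, 60, [48, 52, 50, 49]),
    JobRun("client-b", 90, 60, [50, 52]),
    JobRun("client-c", 58, 60, [30, 31, 29, 30]),
]
for client, reasons in find_exceptions(runs):
    print(client, reasons)
```

The point of the second rule is that a job like `client-c` still meets its window but has nearly doubled in duration, which is often the early sign that it is time to look at performance.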
I want to monitor the nodes with checkmk/Grafana. But that requires installing an agent on the nodes, which is not recommended by CV. How can I do that? What I can do is collect the sar files, but it takes a lot of work to display the data in a useful way. Are there any useful solutions?
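One possible way to make the sar data easier to display, sketched in Python: sysstat's `sadf -d` converts a binary sar file into semicolon-separated text with a `#` header line, which is much simpler to parse than the raw report. The sketch below assumes the files have already been copied off the node and converted with something like `sadf -d <file> -- -u`; the exact column names (such as `timestamp` and `%idle`) depend on the sysstat version and options, and the sample text is made up for illustration.

```python
from typing import Dict, List, Tuple


def parse_sadf(text: str) -> List[Dict[str, str]]:
    """Parse semicolon-separated `sadf -d` style output into dict rows."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line, e.g. "# hostname;interval;timestamp;CPU;..."
            header = [col.strip() for col in line.lstrip("#").split(";")]
            continue
        if header is None:
            continue  # data before any header cannot be labeled
        values = line.split(";")
        if len(values) != len(header):
            continue  # skip malformed or truncated lines
        rows.append(dict(zip(header, values)))
    return rows


def cpu_busy(rows: List[Dict[str, str]]) -> List[Tuple[str, float]]:
    """Derive CPU busy percent per sample as 100 minus the idle column."""
    return [(r["timestamp"], round(100.0 - float(r["%idle"]), 2)) for r in rows]


# Made-up sample in the assumed format.
sample = """\
# hostname;interval;timestamp;CPU;%user;%nice;%system;%iowait;%steal;%idle
node1;600;2024-01-10 00:10:01 UTC;-1;1.23;0.00;0.45;0.10;0.00;98.22
node1;600;2024-01-10 00:20:01 UTC;-1;6.00;0.00;3.00;0.50;0.00;90.50
"""
for timestamp, busy in cpu_busy(parse_sadf(sample)):
    print(timestamp, busy)
```

The resulting rows can then be written to whichever data source your Grafana or checkmk setup already reads, so nothing has to be installed on the nodes beyond the sysstat tooling that produces the sar files.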