Monitoring
Monitoring systems collect and display metrics from monitored devices, such as servers or network devices, and generate alarms when problems are detected.
Tools
Linux includes several tools for collecting, observing and monitoring system metrics or performance, such as sysstat, atop, dstat and vmstat. Some of them include recording and alerting capabilities, while others offer only visualization.
Monitoring solutions
Zabbix and Nagios have historically been used to monitor multiple servers or devices. Other tools include Icinga (a Nagios fork), Prometheus and Netdata.
Alerting capabilities
Netdata supports email alerts, and support for Slack is planned.
Prometheus Alertmanager supports several notification methods.
vmstat
vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff    cache   si   so    bi     bo    in    cs us sy id wa st
10  2 837532 459608  64052 23976592    1    2   992   1714     1     1 23  5 70  2  0
 1  0 837532 421692  64052 24014288    0    0 33444   2280 20629 33647 38  5 55  2  0
 2  1 837532 496892  64012 23937976    0   32 56464   2224 19104 35788 24  4 70  3  0
 7  0 837532 435928  64020 23999028    0    0 55584   2272 22717 37604 32  5 60  2  0
10  8 837532 411988  64020 24021820    0    0 21532 270348 25256 33189 41  6 38 16  0
 8  3 837532 447948  63984 23986276    0    0 28788  20560 27664 42733 39  7 41 13  0
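The vmstat output above is column-oriented, so it can be turned into structured records for recording or alerting. The following is a minimal Python sketch, not part of vmstat itself: the function name `parse_vmstat` and the 10% I/O-wait threshold are assumptions made for illustration, and the sample text reuses two rows from the output above.

```python
from typing import Dict, List


def parse_vmstat(output: str) -> List[Dict[str, int]]:
    """Turn `vmstat` text output into one dict per sample row."""
    columns: List[str] = []
    samples: List[Dict[str, int]] = []
    for line in output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "procs":
            continue  # blank line or the group header ("procs ---memory--- ...")
        if fields[0] == "r":
            columns = fields  # column-name header: r b swpd free ...
            continue
        if columns and len(fields) == len(columns):
            samples.append(dict(zip(columns, map(int, fields))))
    return samples


# Two rows copied from the vmstat output shown in this section.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
10  8 837532 411988  64020 24021820    0    0 21532 270348 25256 33189 41  6 38 16  0
 8  3 837532 447948  63984 23986276    0    0 28788 20560 27664 42733 39  7 41 13  0
"""

rows = parse_vmstat(SAMPLE)
# Hypothetical rule: flag samples where the CPU spent more than 10% waiting on I/O.
high_io_wait = [row for row in rows if row["wa"] > 10]
print(f"{len(high_io_wait)} of {len(rows)} samples exceed the I/O-wait threshold")
```

Because the column names are read from the header line rather than hard-coded, the sketch should tolerate vmstat versions that print a different set of columns, although that has not been verified against every version.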
Activities
- Review the Wikipedia list of software monitors: https://en.wikipedia.org/wiki/System_monitor#List_of_software_monitors
- Review the Wikiversity articles covering system, software or network monitors: monit, Nagios, Prometheus (software), netdata, sar and Zabbix
- Identify key differences between network monitoring, system monitoring[1] and application performance monitoring (APM)[2].
- Implement a solution to detect disk array failures: System administration/ProLiant/Remove a disk from your redundant storage array and review OS logs
- Monitoring disk space:
- Monitor your RAID devices:
  - Software RAIDs: mdadm
  - Hardware RAIDs: HPE Array Controllers
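As a starting point for the disk space and software RAID activities above, here is a minimal Python sketch. It assumes a Linux host where mdadm arrays are reported in /proc/mdstat. The function names and the sample mdstat text are hypothetical and only illustrate the usual layout, in which a status such as [UU] means all members are up and an underscore marks a missing member.

```python
import re
import shutil
from typing import List


def disk_usage_percent(path: str = "/") -> float:
    """Return the used space of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo filesystems can report a zero size
    return 100.0 * usage.used / usage.total


def degraded_md_arrays(mdstat_text: str) -> List[str]:
    """Return the names of md arrays whose member status contains '_'."""
    degraded: List[str] = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)  # e.g. "md0 : active raid1 ..."
            continue
        status = re.search(r"\[([U_]+)\]", line)  # e.g. "[UU]" or "[U_]"
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None  # one status line per array
    return degraded


# Illustrative sample only; on a real host, read the text from /proc/mdstat.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdc1[0]
      2096128 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

print(f"/ is {disk_usage_percent('/'):.1f}% full")
print("Degraded arrays:", degraded_md_arrays(SAMPLE_MDSTAT))
```

A script like this could be run periodically (for example from cron) and wired to a notification method. Hardware RAID controllers are not covered here, since they generally need the vendor's own tooling.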