Monit: lightweight monitoring solution
Submitted by vladimir on Thu, 11/27/2008 - 16:33
Monit is a simple, lightweight, yet useful and sufficiently powerful monitoring solution for your servers.
Monit can monitor:
- OS processes (presence, resources)
- files, directories and file systems (mtime, size and checksum changes)
- network hosts (ping, tcp connections)
Monit can notify the administrator via configurable e-mail messages. It can also automatically restart a failed service.
Monit has an embedded web server that lets you view the state of monitored objects and disable or enable them.
Of course, enterprise-class monitoring systems have many more features, but sometimes they are too complex and unstable.
BTW, there is a product named M/Monit. It can control multiple monit instances. Unfortunately, M/Monit is only available under a commercial license.
Let's try to install and configure monit:
emerge -av monit
And here are some config examples:
/etc/monitrc
set daemon 120                  # check every 2 minutes
set logfile syslog facility log_daemon
set mailserver localhost
set eventqueue                  # use event queue in case mail server is unreachable
    basedir /var/monit
    slots 10
set mail-format { from: monit@myserver.com }
set alert admin1 admin2         # list of alert receivers

# internal httpd configuration
set httpd port 2812 and
    use address 0.0.0.0
    allow 1.2.3.4
    allow admin:password

include /etc/monit.d/*
/etc/monit.d/system
# overall OS resources checking
check system myserver
    if loadavg (1min) > 30 then alert
    if loadavg (5min) > 20 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert

# apache2:
check process apache with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    if totalmem > 500.0 MB for 5 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 30 for 8 cycles then stop
    if failed host myserver.com port 80 protocol http
        and request "/index.html"
        then restart
    if failed port 443 type tcpssl protocol http
        with timeout 15 seconds
        then restart
    if 3 restarts within 5 cycles then timeout

# file system:
check device data with path /dev/sdb1
    start program = "/bin/mount /data"
    stop program = "/bin/umount /data"
    if space usage > 80% for 5 times within 15 cycles then alert
    if inode usage > 80% then alert
    group server
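
Before starting the daemon it is worth checking the configuration. Here is a minimal sketch, assuming the control file lives at /etc/monitrc as in this post; check_monitrc is a hypothetical helper name, not something monit provides. It relies on two things: monit refuses a control file that is accessible to group or others, and `monit -t` only parses the control file (including everything pulled in via include) without starting anything.

```shell
#!/bin/sh
# check_monitrc is a hypothetical helper, not part of monit itself.
# Usage: check_monitrc [path-to-control-file]   (defaults to /etc/monitrc)
check_monitrc() {
    rc="${1:-/etc/monitrc}"

    # Bail out early if the control file is not there.
    if [ ! -f "$rc" ]; then
        echo "control file not found: $rc" >&2
        return 1
    fi

    # monit rejects control files accessible to group/others;
    # the httpd password is stored there in clear text anyway.
    chmod 600 "$rc" || return 1

    # Without the monit binary only the permission fix can be done.
    if ! command -v monit >/dev/null 2>&1; then
        echo "monit not found in PATH, syntax check skipped" >&2
        return 2
    fi

    # -t: syntax check only, -c: use this control file.
    monit -t -c "$rc"
}
```

If check_monitrc returns 0, start the daemon (on Gentoo presumably via /etc/init.d/monit start, which is an assumption about the init script name) and run `monit summary` to see the state of each configured check.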