Thursday, June 23, 2011

Monitor systems using Munin

by GKONTOS on FEBRUARY 16, 2011

You will always need to monitor your systems' performance. System statistics help you fine-tune a system and can warn you of issues that could lead to misbehavior. Munin is a tool that graphs system statistics using rrdtool. It is based on a client-server model: the Munin master collects information from Munin nodes. Fortunately, Munin has been ported to FreeBSD.

We will need to install two ports:
/usr/ports/sysutils/munin-master // Munin Server port.
/usr/ports/sysutils/munin-node // Munin Client port.

First we will install the master node which will collect all the information and create our graphs.

On the Munin Server, install the munin-master port:
# cd /usr/ports/sysutils/munin-master
# make install

Note: this installs the Munin master and pulls in all the Perl 5 modules it needs. At the end it also creates a user named munin and a crontab for that user, which runs munin-cron every five minutes.

On the Munin Server, view the crontab setting by running this command:
# cat /var/cron/tabs/munin
*/5 * * * * /usr/local/bin/munin-cron

On the Munin Server, let's install the Munin node port on the same system since we want to monitor it as well:
# cd /usr/ports/sysutils/munin-node
# make install

Note: install the munin-node port on every machine you wish to monitor.

Now we have to configure our web server. In this case I assume that Apache is being used and that Munin has been installed in /usr/local/www/munin:
# vi /usr/local/etc/apache22/httpd.conf
### [START] Munin
Alias /munin "/usr/local/www/munin/"

<Directory /usr/local/www/munin>
Options none
AllowOverride All
Order Allow,Deny
Allow from all
</Directory>
### [END] Munin

On the Munin Server, the .htaccess file in the Munin web directory refers to a password file through AuthUserFile:
# cat /usr/local/www/munin/.htaccess
AuthUserFile /usr/local/etc/munin/munin-htpasswd

On the Munin Server, create a Munin Admin called "Munin":
# htpasswd -c /usr/local/etc/munin/munin-htpasswd Munin

On the Munin Server, restart Apache:
# apachectl restart

Now it is time to configure Munin. munin-node-configure is a nice script for setting up your plugins.

On each Munin client, run the following command:
# /usr/local/sbin/munin-node-configure --shell

Note: the above command prints a list of commands that create symbolic links. Munin uses these symbolic links to enable plugins. Pick the ones you want and run those commands.

Alternatively, run the following command to create all the suggested symbolic links automatically:
# /usr/local/sbin/munin-node-configure --shell | sh -x
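
If you only want some of the suggested plugins, you can filter the generated commands before running them. The following is a minimal sketch, not part of Munin: pick_plugins is a hypothetical helper, and it assumes each suggested line ends with the path of the symbolic link to create (with or without single quotes), which may differ between Munin versions.

```sh
# Hypothetical helper: keep only the suggested commands whose link name
# (the final path component of the last field) is in the wanted list.
pick_plugins() {
    # $1: space-separated plugin names to keep, e.g. "cpu load df"
    awk -v want="$1" -v q="'" '
        BEGIN {
            n = split(want, names, " ")
            for (i = 1; i <= n; i++) keep[names[i]] = 1
        }
        {
            link = $NF              # last field: path of the symlink to create
            gsub(q, "", link)       # drop single quotes if the output has them
            sub(/.*\//, "", link)   # keep only the final path component
            if (link in keep) print
        }'
}

# Usage (review the output first, then append "| sh -x" to run it):
#   /usr/local/sbin/munin-node-configure --shell | pick_plugins "cpu load df"
```

Printing the filtered commands before piping them to sh lets you check that the assumption about the output format holds on your system.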

On each Munin client, enable munin-node by adding this line to /etc/rc.conf:
# vi /etc/rc.conf
munin_node_enable="YES"

On each Munin client, start munin-node:
# /usr/local/etc/rc.d/munin-node start

On each Munin client, make sure the munin-node daemon is running:
# ps ax | grep munin

On each Munin client, make sure munin-node is listening on TCP port 4949 (the -n flag keeps the port numeric so that grep can match it):
# netstat -an | grep 4949

On each Munin client, edit /usr/local/etc/munin/munin-node.conf and allow the IP address of your munin-master (for example 10.10.10.4) to connect:
# vi /usr/local/etc/munin/munin-node.conf
allow ^127\.0\.0\.1$
allow ^10\.10\.10\.4$
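
The allow values are regular expressions, so every dot in the address must be escaped and the pattern anchored at both ends. As a small sketch of that transformation, the hypothetical helper ip_to_allow below (not part of Munin) prints the line for a given IPv4 address:

```sh
# Hypothetical helper: print a munin-node.conf "allow" line for one IPv4 address.
ip_to_allow() {
    # Escape each dot so it matches a literal "." rather than any character.
    escaped=$(printf '%s\n' "$1" | sed 's/\./\\./g')
    # Anchor with ^ and $ so that longer addresses do not match by accident.
    printf 'allow ^%s$\n' "$escaped"
}

# Usage:
#   ip_to_allow 10.10.10.4 >> /usr/local/etc/munin/munin-node.conf
```

Without the anchors and escapes, a pattern such as 10.10.10.4 would also match addresses like 10.10.10.40.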

On each Munin client, restart munin-node so that the new allow lines take effect:
# /usr/local/etc/rc.d/munin-node restart

On the Munin Server, edit /usr/local/etc/munin/munin.conf and list the nodes that the master should collect information from:
# vi /usr/local/etc/munin/munin.conf
### a simple host tree
[master-node.example.com]
address 127.0.0.1
use_node_name yes

[target-node.example.com]
address 10.10.10.1 #This is the IP address of my target host
use_node_name yes

Note: Munin wants the host names to match between its configuration and what the munin-node calls itself (see the Troubleshooting section below).
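
With more than a couple of nodes, it can be convenient to generate these stanzas instead of typing them. This is only a sketch: host_stanza is a hypothetical helper, and it emits exactly the three settings used in the example above.

```sh
# Hypothetical helper: print a munin.conf host stanza.
host_stanza() {
    # $1: fully qualified node name (must match what munin-node calls itself)
    # $2: IP address the master connects to
    printf '[%s]\n    address %s\n    use_node_name yes\n\n' "$1" "$2"
}

# Usage:
#   host_stanza target-node.example.com 10.10.10.1 >> /usr/local/etc/munin/munin.conf
```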

Troubleshooting Section

Check log files
# tail /var/log/munin/munin-update.log
# tail /var/log/munin/munin-node.log

On one host there are no graphs at all!

No plugins installed on the munin node

The plugins that munin-node uses are usually found in /etc/munin/plugins (or /etc/opt/munin/plugins); on FreeBSD, where the configuration lives under /usr/local/etc/munin, look in /usr/local/etc/munin/plugins. If the directory is empty you will need to fill it. It should have been filled by the package installation script, or by you when you followed the INSTALL instructions.

You can fill it manually by symlinking to files in /usr/share/munin/plugins (or /opt/munin/lib/plugins; /usr/local/share/munin/plugins on FreeBSD), or automatically by running munin-node-configure --shell | sh -x. This determines which plugins are suitable for your system and makes the symlinks.

After making all the symlinks restart munin-node.

Then wait 5 to 10 minutes before re-loading the munin web pages to see graphs.
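
A quick way to tell whether the plugin directory is empty is to count the symbolic links in it. The sketch below uses a hypothetical helper, plugin_count; the directory you pass is your own plugin directory, which depends on the installation.

```sh
# Hypothetical helper: print the number of symlinks directly inside a directory.
plugin_count() {
    # $1: plugin directory, e.g. /usr/local/etc/munin/plugins
    find "$1" -maxdepth 1 -type l | wc -l | tr -d ' '
}

# Usage:
#   plugin_count /usr/local/etc/munin/plugins
```

A result of 0 means munin-node has nothing to run, which matches the "no graphs" symptom described here.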

Did you restart munin-node?

Restarting munin-node is a rather heavy operation, because all the plugins are run as part of the startup. Therefore munin-node does not restart itself when the contents of the plugin directory change. After making a change in the plugin directory you need to restart munin-node yourself.

A good number of versions of the Debian (and Ubuntu) munin package have a bug: they do not restart munin-node after running munin-node-configure. A manual restart is needed in this case.

Then wait 5 to 10 minutes before re-loading the munin web pages to see graphs.

Inconsistent names for the node on the master and on the node

If your node has plugins and has been restarted, the next most likely cause is that the master and the node have inconsistent name information for the node. Munin wants the host name in its configuration to match what munin-node calls itself.

If you telnet to the node you'll be told the node name:

$ telnet lorbanery 4949
Trying 10.1.0.2
Connected to lorbanery.
Escape character is '^]'.
# munin node at lorbanery.langfeldt.net
quit
Connection closed by foreign host.

This means that this machine knows itself to be named lorbanery.langfeldt.net. If the name shown is not what you expected, you need to configure the correct name in munin-node.conf:

host_name lorbanery.langfeldt.net

On the master you configure lorbanery like this:

[lorbanery.langfeldt.net]
address 10.1.0.2

This makes the names identical. If you had put lorbanery in the square brackets, the result would be no graphs, because Munin expects the whole name to be the same, and the whole name isn't lorbanery.

Restart munin-node, then wait 5 to 10 minutes before re-loading the munin web pages to see graphs.
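
The name comparison can also be scripted. The following is a rough sketch with two hypothetical helpers: banner_name extracts the name from the node's greeting line, and conf_has_host checks that munin.conf contains a section with exactly that name. It assumes plain [name] section headers without group prefixes or trailing whitespace.

```sh
# Hypothetical helper: extract NAME from a "# munin node at NAME" greeting on stdin.
banner_name() {
    sed -n 's/^# munin node at \(.*\)$/\1/p' | head -n 1
}

# Hypothetical helper: succeed if munin.conf has a section named exactly [NAME].
conf_has_host() {
    # $1: node name as reported by the node, $2: path to munin.conf
    grep -Fqx "[$1]" "$2"
}

# Usage, with the greeting saved from a telnet session in banner.txt:
#   name=$(banner_name < banner.txt)
#   conf_has_host "$name" /usr/local/etc/munin/munin.conf || echo "name mismatch: $name"
```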

I just read the above answer and there still aren't any graphs

Then it's time to get more advanced. Consider the following protocol exchange for lorbanery, also known as lorbanery.langfeldt.net. You should of course use the name of your own host, as configured in munin.conf. Make very sure that you use the whole host name, exactly as configured in munin.conf.

$ > telnet lorbanery.langfeldt.net 4949
Trying 10.1.0.2...
Connected to lorbanery.langfeldt.net.
Escape character is '^]'.
# munin node at lorbanery.langfeldt.net
> nodes
lorbanery.langfeldt.net
wifi.langfeldt.net
.
> list lorbanery.langfeldt.net
open_inodes http_loadtime irqstats apache_accesses df swap uptime load ntp_offset cpu df_inode open_files ntp_kernel_err forks iostat memory vmstat apache_processes entropy ntp_kernel_pll_freq postfix_mailqueue processes apache_volume users interrupts netstat iostat_ios if_err_eth1 if_eth1 postfix_mailvolume proc_pri surfboard threads ntp_kernel_pll_off
> fetch df
_dev_hda1.value 66
tmpfs.value 0
udev.value 1
tmpfs.value 1
.
> quit
Connection closed by foreign host.

User input is marked with ">".

This is the actual exchange the master uses with munin-nodes that understand the nodes command. The master first asks the munin-node which hosts it has information for (nodes), then asks it to list the plugins available for lorbanery.langfeldt.net (list). Lastly, it fetches the df results (fetch df).
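
The reply to fetch is a series of "field.value number" lines terminated by a line containing a single dot. As an illustration of that format, the hypothetical helper parse_fetch below turns such a reply into field=number pairs; it is a sketch based on the session shown above, not on the full protocol.

```sh
# Hypothetical helper: read a munin-node "fetch" reply on stdin and
# print one field=number pair per line, stopping at the terminating ".".
parse_fetch() {
    awk '
        $0 == "." { exit }              # a lone dot ends the reply
        $1 ~ /\.value$/ {
            name = $1
            sub(/\.value$/, "", name)   # strip the ".value" suffix
            print name "=" $2
        }'
}

# Usage, with the reply saved from a telnet session in fetch.txt:
#   parse_fetch < fetch.txt
```

The greeting line is skipped automatically because its first field is "#", which does not end in ".value".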

If the output of the list command followed by the host name is empty, there are no plugins installed for that host, and that is the reason there are no graphs.

If the list command does list plugins, then you have some other problem. Please contact the Munin users mailing list.

Node does not "allow" master to telnet

You have not added an allow line for the master to the node's munin-node.conf file. Make sure you use the regular expression syntax prescribed there, for example allow ^10\.10\.10\.4$.
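
To check an existing munin-node.conf against the master's address, you can test the address against each allow pattern. The sketch below uses a hypothetical helper, ip_allowed, and approximates Munin's Perl regular expressions with grep -E, which is adequate for simple anchored patterns like the one above but not for every Perl construct.

```sh
# Hypothetical helper: succeed if the given IP matches any "allow" pattern.
ip_allowed() {
    # $1: master IP address, $2: path to munin-node.conf
    sed -n 's/^allow[[:space:]]\{1,\}//p' "$2" | while read -r pattern; do
        # Try each pattern in turn; report the first match and stop.
        if printf '%s\n' "$1" | grep -Eq -- "$pattern"; then
            echo yes
            break
        fi
    done | grep -q yes
}

# Usage:
#   ip_allowed 10.10.10.4 /usr/local/etc/munin/munin-node.conf || echo "master not allowed"
```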

See also the Debugging Munin page on the Munin wiki.

Reference:
http://www.aisecure.net/?p=50
http://munin-monitoring.org/wiki/FAQ_no_graphs
