Munin is a monitoring tool. It can survey a number of computers, keep statistics and generate pretty graphs (static images + html) that you can publish using any kind of a web server. Its emphasis is on plug and play capabilities and ease of use.
I've stumbled on a small, but quite annoying bug in the
Multips plugin of
munin-node.
If your server has the
pgrep command available, then Multips uses the following command to count the number of processes with a given name and a given regular expression:
pgrep -f -l "$name" | grep "$REGEX" | wc -l
If
pgrep is not available, then Multips uses only the regexp to filter the process list:
ps auxwww | grep "$REGEX" | grep -v grep | wc -l
It's quite obvious that the two variants work quite differently. The
pgrep version filters on the field name too, while the
ps version does not (and imho the latter is the correct approach). Eg. if you want to count the number of Oracle processes on a server running Oracle Express edition, you might want to have a label like "oracle" and a regexp like
"^[0-9]* xe_[a-zA-Z0-9]*_XE". The current version of Multips will filter the processes based on the label (
oracle in our example) too and since we are looking for processes with names not containing the string that we selected as field label, the command will return zero.
Actually the filtering on the field label does no good at all, so removing it will solve all our problems. Change this:
$PGREP -f -l "$name" | grep "$REGEX" | wc -l
to this:
$PGREP -f -l . | grep "$REGEX" | wc -l
One more hint while I'm at it: the field labels in the
munin-node plugin configuration file's Multips section (in the value of
env.names and in the variable names
env.regex_*) must contain only word characters (where "word" is meant as in Perl regular expressions ... thus word characters are letters (any case), numbers and the underscore). Otherwise you'll most probably get "nan" as the counter of the given process in the generated graphs.
PS: if you're using Debian, then you'll find the Multips plugin at
/usr/share/munin/plugins/multips in the
munin-node package. If you patch the plugin, you might want to set the package to "hold" (
echo "munin-node hold" | dpkg --set-selections) so it won't get automatically updated during an
apt-get upgrade.
Comments
I don't agree with your solution
On Linux at least (didn't check on Solaris, but it claims to be fully compatible with it), pgrep is checking its pattern as an Extended Regular Expression.
So, I would not replace the original line:
<code>pgrep -f -l "$name" | grep "$REGEX" | wc -l</code>
with yours, but rather with this one:
<code>pgrep -f -l "$REGEX" | wc -l</code>
BTW, it also fixes the fact that original code gets the line of the grep at the same time.
Re: I don't agree with your solution
pgrepexecutes faster than apgrep+grepcombo. However using just apgrepwill break already existingmultipsconfigurations since (as you noticed in your comment)pgrepwould interpret the$REGEXas an extended regular expression, whilegrepinterprets it as a basic regular expression. Thus regexps written for the originalmultipscode would have a fair chance to fail with your modification. My fix is backward compatible with the originalmultipsplugin (as far as configuration and regexps are concerned). You could argue that in this case we could replace the call topgrepalltogether with a call tops. You'd be probably right. But this would mean that you've to look into the source ofpgrep, check the exact format of it's output with the-l -foptions and figure out how to recreate the same output withps. I admit I was too lazy for that.