[Xymon] Incorrect Uptime Calculation for Solaris Clients Causes Regular NaNs in la.rrd and users.rrd [SEC=UNCLASSIFIED]

McAvoy, Jason MR jason.mcavoy at defence.gov.au
Tue Sep 25 02:24:45 CEST 2012


UNCLASSIFIED

I have found an issue in xymond_client.c where it fails to detect the correct uptime from Solaris 10 clients (and it may occur with other versions).
The symptom I had was that the charts for load and users & processes had regular NaN updates, at uniform spacing every hour.

The cause was the following code:

else if (strncmp(hourmark, "1 hr", 4) ==0) {
    uptimesecs = 3600;

This test only matches uptimes whose hour field starts with "1 hr". My Solaris 10 systems report uptime like this:

11:05pm up 830 days(s), 21:55, 0 users, load average: 0.18, 0.20, 0.20
11:10pm up 830 days(s), 22 hrs, 0 users, load average: 0.19, 0.20, 0.20
11:15pm up 830 days(s), 22:05, 0 users, load average: 0.16, 0.20, 0.20

Note that at 11:10pm it reported "22 hrs", instead of the expected "22:00", and that "22 hrs" does not match "1 hr".
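
To illustrate the mismatch, here is a minimal sketch of the same comparison in isolation. The helper name matches_one_hr is hypothetical and does not exist in xymond_client.c; it only wraps the strncmp test quoted above.

```c
#include <string.h>

/* Hypothetical helper (not part of Xymon): mirrors the original test,
 * which compares only the first four characters against "1 hr". */
static int matches_one_hr(const char *hourmark)
{
    return strncmp(hourmark, "1 hr", 4) == 0;
}
```

Called with "1 hr, 0 users" this returns 1, but with "22 hrs, 0 users" it returns 0, because the first four characters are "22 h".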

The uptimesecs calculation code did not match this, so the load and user graphs were not passed values (which I verified using the --processor switch to rrdstatus xymond_channel), and a NaN would appear in the la.rrd and users.rrd files. I also confirmed it with a clientlog "section=uptime" query every 5 minutes, which returned an empty set for uptime whenever the client sent uptime as "xx hrs" instead of "xx:xx". Nobody would normally notice this, because the cpu status doesn't go purple until it has missed several polls, and this only misses one poll every hour.

A patch to fix xymond_client.c is:
373,374c373,374
<        else if(strncmp(hourmark, "1 hr", 4) == 0) {
<            uptimesecs = 3600;
---
>        else if (sscanf(hourmark, "%ld hr", &uphour) == 1) {
>            uptimesecs = 3600*uphour;

This stops the interruptions to the load and user graphs, and ensures the cpu status is updated every time the client sends an update.
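
For anyone who wants to try the idea outside Xymon, here is a standalone sketch of parsing the hour field. The function name hour_field_to_secs and its -1 return convention are my own assumptions, not Xymon code. Note that sscanf returns 1 as soon as the %ld conversion succeeds, whether or not the trailing " hr" text matches, so this sketch checks the "hh:mm" form first and uses %n to confirm that "hr" was really seen.

```c
#include <stdio.h>

/* Hypothetical helper (not part of Xymon): converts the hour field of an
 * uptime string to seconds. Returns -1 if the field is not recognised. */
static long hour_field_to_secs(const char *hourmark)
{
    long uphour, upmin;
    int consumed = 0;

    /* "21:55" style: hours and minutes */
    if (sscanf(hourmark, "%ld:%ld", &uphour, &upmin) == 2)
        return 60 * (60 * uphour + upmin);

    /* "1 hr" / "22 hrs" style: %n is only stored if " hr" matched */
    if (sscanf(hourmark, "%ld hr%n", &uphour, &consumed) == 1 && consumed > 0)
        return 3600 * uphour;

    return -1;
}
```

With this sketch, "22 hrs, 0 users" gives 79200 and "21:55, 0 users" gives 78900, while a field such as "5 min(s), 0 users" is rejected instead of being read as 5 hours.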

--
Note: I am only at Defence Mondays and Tuesdays, so please cc: any response to jason.mcavoy at saltbushgroup.com.


IMPORTANT: This email remains the property of the Department of Defence and is subject to the jurisdiction of section 70 of the Crimes Act 1914. If you have received this email in error, you are requested to contact the sender and delete the email.