<--planning ^--Soigan--^ even more planning-->

Soigan - a Multicast XML monitoring system - planning, continued

Planning, continued

Protocol - XML (Internal)

This section deals with the way that Workers communicate internally -- that is, with their plugins.

In with the new, in with the old

As mentioned in the design section, we want to maintain compatibility with Nagios plugins, both because they're numerous and already exist, and because it helps plugin writers by letting them write for Nagios as well as Soigan. A bit hopeful, am I, that Soigan will go anywhere? We can dream...

Before we look into how Nagios plugins work, and how we might design our own plugin protocol, we need to consider that the Worker has to be able to support them both, and know which is which. We could, I suppose, design the Soigan method to be an extension of the Nagios method, but that might restrict us unnecessarily. Instead, we'll require the configuration file to denote whether a plugin is a Nagios plugin so the Worker knows to expect that type of output. Let's keep that in mind for the next section.

Nagios

The following is based on what I read.

Of the above, the only things we're concerned about are the return codes and the performance data. The return code lets us determine whether or not the plugin was successful. We'll likely ignore the Warning value (which should never appear), because we aren't using Nagios plugins to make that determination for us -- Soigan collects data, and other tools determine whether the results are problematic. We'll be using most Nagios plugins in a boolean way, determining whether a service is up or not; a service that is down is when we should see the Critical return code.
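As a minimal sketch of that boolean reading, assuming the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), a Worker might collapse the return code like this. The class and method names are hypothetical, and treating WARNING as "up" is my assumption based on the paragraph above, not a settled design decision.

```java
/** Hypothetical helper: collapses a Nagios plugin exit code into up/down. */
final class NagiosStatus {
    // Standard Nagios plugin exit codes.
    static final int OK = 0;
    static final int WARNING = 1;
    static final int CRITICAL = 2;
    static final int UNKNOWN = 3;

    private NagiosStatus() {}

    /**
     * Assumption: OK and WARNING both mean the service answered, since
     * Soigan leaves threshold decisions to other tools. CRITICAL, UNKNOWN
     * and any unexpected code count as "not up".
     */
    static boolean isUp(int exitCode) {
        return exitCode == OK || exitCode == WARNING;
    }

    public static void main(String[] args) {
        System.out.println("exit 0 -> up? " + isUp(OK));
        System.out.println("exit 2 -> up? " + isUp(CRITICAL));
    }
}
```

A Worker would pass in whatever exit value the plugin process returned; the raw code can still be reported as the status member alongside the boolean.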

The Performance Data, or the line of output in general, might be parsed by Soigan to provide further information. For instance,

% ./check_ntp -H ntp
NTP OK: Offset 0.000375 secs, jitter 0.020 msec, peer is stratum 1
% ./check_users -w 100 -c 100
USERS OK - 2 users currently logged in
% ./check_ssh -H localhost
SSH OK - OpenSSH_3.6.1p2 (protocol 1.99)
None of these plugins support the Performance data extension, so their single lines of text aren't generally parseable. They can, however, be grabbed and returned as a generic result item.

Additionally, two of the above plugins, check_ntp and check_ssh, can also return more information by using the -v verbose option. Again, this isn't in a fixed format, so it isn't really parseable unless we either write a parsing language, included in the configuration file, that details how to pull useful information out, or write a wrapper around each (or all) of the Nagios plugins that knows the format of each and can restructure the output accordingly.

I think I'm starting to regret deciding to support Nagios plugins, especially because they really do work differently than Soigan wants them to. Still, there are many out there, and a good number of them do simple checks on remote machines that we still need to do.

For now, then, a plugin labelled as "nagios" will be processed by a Worker by

For instance,
% ./check_imap -H imap
IMAP OK -   0.007 second response time on port 143 [* OK [CAPABILITY IMAP4REV1 LITERAL+ SASL-IR LOGIN-REFERRALS STARTTLS AUTH=LOGIN] imap IMAP4rev1 2004.350 at Thu, 3 Jun 2004 13:32:33 -0600 (MDT)]|time=  0.007
would create this XML-RPC:
<?xml version="1.0" encoding="ISO-8859-1"?>
  <methodCall>
    <methodName>plugin.results</methodName>
    <params>
      <param>
        <value>localhost</value>
      </param>
      <param>
        <value><dateTime.iso8601>20040602T11:46:00</dateTime.iso8601></value>
      </param>
      <param>
        <value>imap</value>
      </param>
      <param>
        <value><struct>
          <member>
            <name>host</name><value>imap</value>
          </member><member>
            <name>status</name><value><int>0</int></value>
          </member><member>
            <name>text</name><value> -   0.007 second response time on port 143 [* OK [CAPABILITY IMAP4REV1 LITERAL+ SASL-IR LOGIN-REFERRALS STARTTLS AUTH=LOGIN] imap IMAP4rev1 2004.350 at Thu, 3 Jun 2004 13:32:33 -0600 (MDT)]</value>
          </member><member>
            <name>time</name><value><double>0.007</double></value>
          </member>
        </struct></value>
      </param>
    </params>
  </methodCall>
The biggest amount of work for the Worker (or rather, the Nagios-support portion of the Worker) is taking the Performance data and converting it into the structure members. There's also a slight chance that the Nagios plugins will use the names host, status and text, which would collide with the pre-existing structure members. Do we rename the known ones to nagios-host, etc., or rename the Performance data ones (perf-data)? We'll figure that out as we go.
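To get a feel for that conversion, here is a rough sketch of pulling the Performance data out of a plugin's output line. It assumes the data follows the first '|' as label=value pairs, tolerates padding after the '=' (as in the check_imap output above), and ignores any unit or threshold fields after the number. The class name and the choice to keep only numeric values are mine, and the sketch does nothing about the name-collision question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: pulls label=value pairs out of a Nagios output line. */
final class PerfDataParser {
    // A label, '=', optional padding, then a number. Anything after the
    // number (units, thresholds) is skipped by this sketch.
    private static final Pattern PAIR =
        Pattern.compile("([^\\s=]+)=\\s*([-+]?\\d+(?:\\.\\d+)?)");

    private PerfDataParser() {}

    /** Returns the numeric performance data found after the first '|'. */
    static Map<String, Double> parse(String outputLine) {
        Map<String, Double> members = new LinkedHashMap<>();
        int bar = outputLine.indexOf('|');
        if (bar < 0) {
            return members; // plugin emitted no performance data
        }
        Matcher m = PAIR.matcher(outputLine.substring(bar + 1));
        while (m.find()) {
            members.put(m.group(1), Double.valueOf(m.group(2)));
        }
        return members;
    }

    public static void main(String[] args) {
        String line = "IMAP OK -   0.007 second response time on port 143|time=  0.007";
        System.out.println(parse(line));
    }
}
```

Each entry in the returned map would become one double member of the struct, next to host, status and text.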

Soigan

Of course, Soigan is going to use XML. A "native" Soigan plugin will output XML by default, but might support output in Nagios format, or plain text, given the right flags. And what should this XML be?

I think it makes perfect sense to have the output from the plugin closely match the XML-RPC code that will make up the Response. But how much of the XML-RPC call should be output? Why not all of it?

<?xml version="1.0" encoding="ISO-8859-1"?>
  <methodCall>
    <methodName>plugin.results</methodName>
    <params>
      <param>
        <value>mailhost</value>
      </param>
      <param>
        <value><dateTime.iso8601>20040601T09:46:00</dateTime.iso8601></value>
      </param>
      <param>
        <value>ping</value>
      </param>
      <param>
        <value><struct>
          <member>
            <name>packetloss</name><value><int>0</int></value>
          </member><member>
            <name>rta</name><value><double>0.90</double></value>
          </member>
        </struct></value>
      </param>
    </params>
  </methodCall>
A Worker could then just open up a connection to the Server and spit back the exact output from the plugin. The Worker then becomes a simple pipe between the network and the machine. Great!
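A rough sketch of such a pipe, assuming the Server accepts XML-RPC as an ordinary HTTP POST with a text/xml body and that Java 11's java.net.http client is available. The class name, the argument order and the idea of passing the server URL and plugin path on the command line are all hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical "pipe" Worker: runs a plugin and forwards its XML untouched. */
final class PipeWorker {
    private PipeWorker() {}

    /** Wraps the plugin's XML, unmodified, in an HTTP POST. */
    static HttpRequest buildRequest(String serverUrl, String pluginXml) {
        return HttpRequest.newBuilder(URI.create(serverUrl))
            .header("Content-Type", "text/xml")
            .POST(HttpRequest.BodyPublishers.ofString(pluginXml, StandardCharsets.ISO_8859_1))
            .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: PipeWorker <server-url> <plugin-path>");
            return;
        }
        // Run the plugin and capture everything it writes to stdout.
        Process plugin = new ProcessBuilder(args[1]).start();
        String xml = new String(plugin.getInputStream().readAllBytes(),
                                StandardCharsets.ISO_8859_1);

        // Send the captured XML straight to the Server.
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(args[0], xml), HttpResponse.BodyHandlers.ofString());
        System.out.println("Server answered with HTTP " + response.statusCode());
    }
}
```

The Worker never looks inside the XML here, which is the whole attraction of having the plugin emit the complete call.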

What if, however, we don't want to write our Worker from scratch, where we open up network sockets and shove bytes down them? We are, after all, supporting XML-RPC, and most Workers are likely to be written using XML-RPC libraries. For instance, in Java, we might do something like the following:

XmlRpcClient client=new XmlRpcClient(server);
Vector params=new Vector();

params.addElement(new String("mailhost"));
params.addElement(Calendar.getInstance().getTime());
params.addElement(new String("ping"));

Hashtable hashtable=new Hashtable();
hashtable.put("packetloss",0);
hashtable.put("rta",0.90);

params.addElement(hashtable);

client.execute("plugin.results",params);
If we want to allow a Worker author to support using XML-RPC directly like this, we need a way to supply all of the information above from the plugin. Sure, they could take the XML that we returned above and walk through it, but is there an easier format we could send that would allow a programmer to iteratively go through the steps above?
string
mailhost
date
2301230123102
string
ping
struct
int
packetloss
0
double
rta
0.90
end
The above is just off the top of my head, so may not be the best, but it's easily parsed, I should think. The struct marker assumes that everything after it belongs to the structure, which matches the calls we've built so far, where the struct is always the last parameter. Note that the date field is in seconds-since-the-epoch format, which every language supports.
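From the plugin's side, producing this format is about as simple as it gets. The following is only a sketch: the class and method names are hypothetical, and mapping Java's Integer and Double to the int and double markers (with everything else written as string) is my assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical plugin-side helper that writes the line-oriented format. */
final class LineFormatWriter {
    private LineFormatWriter() {}

    /** Renders host, time, plugin name and the result struct, one item per line. */
    static String render(String host, long epochSeconds, String plugin,
                         Map<String, Object> members) {
        StringBuilder out = new StringBuilder();
        out.append("string\n").append(host).append('\n');
        out.append("date\n").append(epochSeconds).append('\n');
        out.append("string\n").append(plugin).append('\n');
        out.append("struct\n");
        // Inside the struct each member is three lines: type, name, value.
        for (Map.Entry<String, Object> e : members.entrySet()) {
            out.append(typeOf(e.getValue())).append('\n')
               .append(e.getKey()).append('\n')
               .append(e.getValue()).append('\n');
        }
        return out.append("end\n").toString();
    }

    private static String typeOf(Object value) {
        if (value instanceof Integer) return "int";
        if (value instanceof Double) return "double";
        return "string"; // assumption: anything else goes out as text
    }

    public static void main(String[] args) {
        Map<String, Object> members = new LinkedHashMap<>();
        members.put("packetloss", 0);
        members.put("rta", 0.90);
        long now = System.currentTimeMillis() / 1000L;
        System.out.print(render("mailhost", now, "ping", members));
    }
}
```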

If I have an epiphany and think of a better way to do the above, I'll change it, but I'll quickly write some Java to see how easy it would be to use.

XmlRpcClient client=new XmlRpcClient(server);
Vector params=new Vector();
Hashtable hashtable=new Hashtable();

String line;
boolean instruct=false;

do {
  line=plugin.readLine();
  if (line==null || line.equals("end")) {  // stop at "end" or when the plugin's output runs out
    break;
  }

  if (instruct) {
    String line2=plugin.readLine();
    String line3=plugin.readLine();

    if (line.equals("string")) {
      hashtable.put(line2,line3);
    } else if (line.equals("int")) {
      hashtable.put(line2,Integer.parseInt(line3));
    } else if (line.equals("double")) {
      hashtable.put(line2,new Double(line3));
    } else if (line.equals("date")) {
      hashtable.put(line2,new Date(Long.parseLong(line3)*1000L));  // seconds since the epoch -> Date
    }
  } else {
    if (line.equals("struct")) {
      instruct=true;
    } else {
      String line2=plugin.readLine();

      if (line.equals("string")) {
        params.addElement(new String(line2));
      } else if (line.equals("int")) {
        params.addElement(new Integer(line2));
      } else if (line.equals("double")) {
        params.addElement(new Double(line2));
      } else if (line.equals("date")) {
        params.addElement(new Date(Long.parseLong(line2)*1000L));  // seconds since the epoch -> Date
      }
    }
  }
} while (true);

params.addElement(hashtable);
client.execute("plugin.results",params);
That doesn't seem so bad; I'm sure I could pretty it up even more if I wanted to. Note that this code is generic, regardless of the plugin. One thing it doesn't handle is arrays. We can do this
string
mailhost
date
230123001230
string
users
struct
int
count
3
array
names
string
ben
string
crwth
string
crwth
end
end
to represent the data from the users plugin. In theory, a Worker should be able to handle arrays of arrays, or structures with structures in them. We could require that plugins don't get that "nested", I suppose. We'll leave that on the backburner for now.
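If we did decide to allow nesting, a small recursive reader would cope with it. This is only a sketch under my own reading of the format: at the top level and inside an array, each item is a type line followed by its value; inside a struct, each item is a type line, a name line, then the value; struct and array bodies run until an end line (or until the output runs out). The class name is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical recursive reader for the line-oriented plugin format. */
final class NestedLineParser {
    private final BufferedReader in;

    private NestedLineParser(BufferedReader in) { this.in = in; }

    /** Parses the plugin output into a list of XML-RPC parameters. */
    static List<Object> parse(String text) {
        try {
            return new NestedLineParser(new BufferedReader(new StringReader(text))).readArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Unnamed items (top level or array body): type, then value.
    private List<Object> readArray() throws IOException {
        List<Object> items = new ArrayList<>();
        String type;
        while ((type = in.readLine()) != null && !type.equals("end")) {
            items.add(readValue(type));
        }
        return items;
    }

    // Named items (struct body): type, name, then value.
    private Map<String, Object> readStruct() throws IOException {
        Map<String, Object> members = new LinkedHashMap<>();
        String type;
        while ((type = in.readLine()) != null && !type.equals("end")) {
            String name = requireLine();
            members.put(name, readValue(type));
        }
        return members;
    }

    private Object readValue(String type) throws IOException {
        switch (type) {
            case "struct": return readStruct();
            case "array":  return readArray();
            case "string": return requireLine();
            case "int":    return Integer.valueOf(requireLine());
            case "double": return Double.valueOf(requireLine());
            case "date":   return new Date(Long.parseLong(requireLine()) * 1000L);
            default: throw new IOException("unknown type marker: " + type);
        }
    }

    private String requireLine() throws IOException {
        String line = in.readLine();
        if (line == null) throw new IOException("plugin output ended early");
        return line;
    }

    public static void main(String[] args) {
        String users = String.join("\n", "string", "mailhost", "string", "users",
            "struct", "int", "count", "1", "array", "names", "string", "ben", "end", "end");
        System.out.println(parse(users));
    }
}
```

Because structs and arrays call back into readValue, arrays of arrays and structures within structures fall out for free; whether plugins should be allowed to use them is a separate question.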

Now we have to ask ourselves, "which format should be the default?" Should we return the XML, or the above XML-RPC-ready form? I hate to say it, but I think the XML should be secondary, perhaps with a -X flag. The fact is that most Workers will be using XML-RPC APIs, and won't be making direct network connections to which the XML can be passed, so we shouldn't force people to parse through XML just to make more.

Schemas

Previously we came up with a way for Servers to ask Workers about the schema that a given plugin would return. How do the Workers know this? They have to ask the plugins, of course!

In the case of Nagios plugins, we have a problem with retrieving the schema, since that's not something Nagios plugins support. Additionally, Nagios plugins report their status reliably (OK, WARNING, CRITICAL), but the text afterwards is free-flowing. The Worker could figure it out after it ran the plugin once, I suppose. Instead, I think we'll just report that a Nagios plugin's schema has the known values -- host, status and text -- and leave the rest for those who know about them (instead of discovery).

This leaves the Soigan plugins. How should they return their schemas? Should this be part of every call to them -- providing a schema as well as results? Or should there be a flag that can be passed to a plugin to get it to spit out its schema? I like the second, myself, to reduce the output of plugins in general. One can hope that a Server might only ask a Worker about its plugin schemas once in a very long while.

For the optional XML-RPC output, we mentioned using a -X flag, so let's use a -S flag to ask a plugin for its schema. If both are used together, the plugin should spit out an XML-RPC schema:

<?xml version="1.0" encoding="ISO-8859-1"?>
  <methodCall>
    <methodName>schema.results</methodName>
    <params>
      <param>
        <value>mailhost</value>
      </param>
      <param>
        <value><dateTime.iso8601>20040601T10:10:20</dateTime.iso8601></value>
      </param>
      <param>
        <value>ping</value>
      </param>
      <param>
        <value><struct>
          <member>
            <name>packetloss</name><value>int</value>
          </member><member>
            <name>rta</name><value>double</value>
          </member>
        </struct></value>
      </param>
    </params>
  </methodCall>
As mentioned before, this is something fully ready to be passed onto the network socket. Without the -X option, we'd expect something like this:
string
mailhost
date
2301230123102
string
ping
struct
string
packetloss
int
string
rta
double
end
It's very similar to the output we had before when it was a plugin result. This makes sense, since the XML-RPC looks very similar between results and schemas. This is all the information needed by an XML-RPC program to create the call needed to give the Server the Response it's looking for.
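Putting the two flags together, a plugin has four possible outputs. A small sketch of how a plugin might pick between them follows; the class and enum names are hypothetical, and only the -X and -S flags come from the discussion above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical option handling for a native Soigan plugin. */
final class PluginOptions {
    /** The four outputs a plugin can produce. */
    enum Output { RESULT_LINES, RESULT_XML, SCHEMA_LINES, SCHEMA_XML }

    private PluginOptions() {}

    /** -S asks for the schema instead of results; -X asks for full XML-RPC. */
    static Output select(String... args) {
        List<String> flags = Arrays.asList(args);
        boolean xml = flags.contains("-X");
        boolean schema = flags.contains("-S");
        if (schema) {
            return xml ? Output.SCHEMA_XML : Output.SCHEMA_LINES;
        }
        return xml ? Output.RESULT_XML : Output.RESULT_LINES;
    }

    public static void main(String[] args) {
        System.out.println(select(args));
    }
}
```

With no flags the plugin falls through to the line-oriented results, which matches the decision to make the XML secondary.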
©2002-2017 Wayne Pearson