
Soigan - a Multicast XML monitoring system - design, continued

Design - Answers

I'm going to address these questions in a different order from the one I asked them in, partly because the answers to some decide the answers to others.

Protocol - XML and XML-RPC

I really like XML. I know that some people don't, but I'm hooked. It's human-readable, which makes it great for debugging. That also lets you bypass most clients and just retrieve data with a web browser. There are numerous libraries out there for reading and writing it, and nearly every programming language has support for it. It has strict well-formedness rules. It's portable. It allows for easy processing with XSLT. It's extensible. It's trendy.

Cons? Of course. It's bulky, in that all data must be wrapped in tags. Binary data, if needed, has to be encoded (as Base64, for example). It's not human-readable, according to some people, because the tags make a mess of it. It requires restructuring the results into a different form (though this would likely have to happen regardless of the protocol).

Building on that protocol choice, I plan to use XML-RPC on top of it. It's better to use an existing standard for data transfer, with existing clients and libraries, than to implement my own XML schema for this data. This will allow others to extend this system as they see fit.
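To make the XML-RPC idea concrete, here is a minimal sketch of how one request and one response might be encoded. It is written in Python only because its standard library ships an XML-RPC codec; the method name `soigan.usersLoggedIn` and the host name are made up for illustration and are not part of any settled API.

```python
import xmlrpc.client


# Hypothetical method name and parameters; Soigan's real API is not defined yet.
def build_request(method, *params):
    """Serialise a method call into an XML-RPC request document."""
    return xmlrpc.client.dumps(params, methodname=method)


def parse_request(xml_text):
    """Recover (method, params) from an XML-RPC request document."""
    params, method = xmlrpc.client.loads(xml_text)
    return method, params


def build_response(value):
    """Wrap a single return value in an XML-RPC response document."""
    return xmlrpc.client.dumps((value,), methodresponse=True)


if __name__ == "__main__":
    request = build_request("soigan.usersLoggedIn", "lab-host-01")
    print(request)                          # the XML that would go on the wire
    print(parse_request(request))           # what the receiving side recovers
    print(build_response(["alice", "bob"])) # a list answer, as XML
```

Because the payload is plain XML-RPC, any existing XML-RPC client or library should be able to read it without knowing anything else about the system.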

Storage - none (see text)

Initially, I have no plans to keep the results after they've been viewed, displayed or computed upon. I do, however, plan on having the data contain all of the information necessary to re-create the session (time, host, request, response), so that anyone who wished to store all of the XML responses somewhere could later re-read that data and have the system treat it as if it had just come in. In this way, you could get a semblance of event replay if you choose.
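A rough sketch of that replay idea, assuming each exchange is stored as one XML element carrying the four pieces of information named above. The element and attribute names (`session`, `record`, `time`, `host`) are invented here, and the request and response are kept as plain text to keep the example short.

```python
import xml.etree.ElementTree as ET


# Element and attribute names here are hypothetical, not a settled schema.
def make_record(time, host, request, response):
    """Bundle everything needed to re-create one exchange into an element."""
    record = ET.Element("record", {"time": time, "host": host})
    ET.SubElement(record, "request").text = request
    ET.SubElement(record, "response").text = response
    return record


def save_records(records):
    """Serialise records into one XML document suitable for writing to disk."""
    root = ET.Element("session")
    root.extend(records)
    return ET.tostring(root, encoding="unicode")


def replay(xml_text, handler):
    """Feed stored records back to the handler in time order.

    Assumes timestamps share one ISO 8601 format and time zone, so that
    sorting them as strings matches chronological order.
    """
    records = ET.fromstring(xml_text).findall("record")
    for rec in sorted(records, key=lambda r: r.get("time")):
        handler(rec.get("time"), rec.get("host"),
                rec.findtext("request"), rec.findtext("response"))
```

The handler passed to `replay` would be the same code that processes live responses, which is what lets stored data be treated as if it had just come in.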

Configuration - XML

Yep, XML again, for the same reasons as before. It's very easy to process, and you could use any XML editor or text editor to edit the configuration. You could even write a utility to do so if you wanted to. It's also easy, again with XSLT, to take an XML configuration file and turn it into a different kind of file format, if the need arose.
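As an illustration of how little code it takes to read such a file, here is a small sketch. The layout of the sample configuration (a `multicast` element and a list of `host` elements) is entirely made up and only stands in for whatever structure is eventually chosen.

```python
import xml.etree.ElementTree as ET

# A made-up configuration layout, used only for illustration.
SAMPLE_CONFIG = """
<soigan>
  <multicast group="239.1.2.3" port="9999"/>
  <host name="hostA" platform="linux"/>
  <host name="hostB" platform="solaris"/>
  <host name="hostC" platform="windows"/>
</soigan>
"""


def load_config(xml_text):
    """Turn the XML configuration into a plain dictionary."""
    root = ET.fromstring(xml_text)
    mcast = root.find("multicast")
    return {
        "group": mcast.get("group"),
        "port": int(mcast.get("port")),
        # Map each host name to the platform it runs.
        "hosts": {h.get("name"): h.get("platform")
                  for h in root.findall("host")},
    }


if __name__ == "__main__":
    print(load_config(SAMPLE_CONFIG))
```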

Because we use an XML system to manage our hosts here, it's likely that the configuration information for Soigan will "fit" into the structure already used there, but will not require the extra host information that we currently keep there (for sites apart from ours).

Language - Java and C/C++

My top four languages, in order, are C, Java, C++ and Perl. Java, however, beats them all for a wide-encompassing and consistent set of libraries and APIs. C is nice and low-level, which is why it will always be my first love. But for something as object-oriented as this, and as XML-heavy, I'm not sure it's suitable. The C++ STL sucks, but that's only my opinion. Perl is my hacking language, so I admit I've never done much more than use it to bundle together multiple shell commands, or to sneak in easier regular expression processing in a hurry.

The first implementation will be in Java, hopefully with a companion version in C/C++ at the same time. This is a personal choice, really, not based on something like performance or suitability. Java, for me, is my prototyping language, and one so good at it that the prototype can become the final product (to hell with the Waterfall Model, I say!). I can get this system written in Java a lot quicker than in any other language. I do know, though, that when it becomes a serious package, there will be a need for a C/C++ implementation, for portability and performance. That's right, portability. Not that Java won't run everywhere, but that people won't run Java everywhere. Might as well cater to others!

Platforms - Linux, Solaris, Windows

These are the three main platforms in our department. The hope is that the implementations for Linux and Solaris will be kept as similar as possible, following the POSIX definitions throughout, so the code can be ported to other Unix platforms. In the case of Java, this should be simple enough, and doing the same under C/C++ should also be pretty easy.

Windows should be interesting. If we had the requirement that everyone ran Cygwin on their Windows boxes, then we'd have access to a nice Unix-like interface to port code to. As this isn't likely, we've probably got a little extra work ahead of us to support Windows, both as a client and server. How do you find out who's logged into a machine in Windows? Or what processes they're running? Or who was logged in before them? Or how much memory is being used? This is all information that can be culled from simple command line utilities (or /proc filesystems) under Unix, but there aren't any default tools under Windows for this. This means system calls to find this information, and you can be sure that these calls aren't going to map to any Unix equivalent!
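For comparison, here is how cheaply the memory question can be answered on Linux from the /proc filesystem. This is only a sketch: it parses text in the format of Linux's `/proc/meminfo`, and "used" is computed crudely as total minus free, ignoring buffers and cache.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_used_kb(text):
    """Crude estimate of used memory: MemTotal minus MemFree, in kB."""
    info = parse_meminfo(text)
    return info["MemTotal"] - info["MemFree"]


if __name__ == "__main__":
    # Only works where /proc/meminfo exists, i.e. on Linux.
    with open("/proc/meminfo") as f:
        print(memory_used_kb(f.read()), "kB in use")
```

Nothing this simple carries over to Windows, which is the extra work described above.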

Still, I think doing this for Windows will be important. I admit I don't know much about Active Directory, and perhaps all of this information is easily accessible from there (I know you can find out who's logged in to a machine with it). Perhaps an Active Directory hook will be the answer? All I know is that right now, I have no way of knowing if a specific user is logged into a Windows machine when I'm sitting on a Unix box. And that needs to change.

Plugins - Nagios, and custom

Since Nagios has so many plugins that can fetch a lot of information, we might as well support them. I'll look at the plugin-writing page for Nagios and follow that method. The Nagios plugins, from what I saw, only return single-line bits of information, such as ping times or summaries of their attempts to connect to a specific service. We need to support more than that, such as lists of users, lists of processes, or lists of packages. Lots of lists. I'll also have to work out how to reconcile the output of the Nagios plugins with an XML schema -- I'll probably have a nagios_wrapper plugin that can take Nagios plugin output and turn it into the XML output I expect.
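A first guess at what such a wrapper might do, as a sketch only. It relies on two conventions of Nagios plugins: the exit code encodes the status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and anything after a `|` on the output line is performance data. The XML element names are invented, and quoted performance labels containing spaces are not handled.

```python
import xml.etree.ElementTree as ET

# Exit codes as defined by the Nagios plugin conventions.
STATUS_BY_EXIT_CODE = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def wrap_nagios_output(plugin, exit_code, stdout):
    """Convert one plugin run into an XML string (hypothetical schema)."""
    lines = stdout.strip().splitlines()
    first_line = lines[0] if lines else ""
    # Text after the first '|' is performance data, e.g. "rta=0.5ms;100;500".
    message, _, perfdata = first_line.partition("|")
    result = ET.Element("result", {
        "plugin": plugin,
        "status": STATUS_BY_EXIT_CODE.get(exit_code, "UNKNOWN"),
    })
    ET.SubElement(result, "message").text = message.strip()
    for item in perfdata.split():
        label, _, value = item.partition("=")
        # Keep only the measured value; thresholds follow after ';'.
        ET.SubElement(result, "perf", {"label": label}).text = value.split(";")[0]
    return ET.tostring(result, encoding="unicode")


if __name__ == "__main__":
    print(wrap_nagios_output(
        "check_ping", 0,
        "PING OK - Packet loss = 0%, RTA = 0.50 ms|rta=0.500000ms;100;500;0 pl=0%;20;60;0"))
```

This covers the single-line case only; the list-style results mentioned above would need a richer schema than one `message` element.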

Networking - Multicast, unicast and remote probing

I think support for all three is required. Multicast, as described before, will cut down on the traffic when asking the "where is this happening?" questions. Unicast allows for specific questions to individual machines, and for unsolicited information from a machine. Remote probing gathers information that a client running on the host machine could otherwise report itself, but it is still desirable for two reasons. First, we have self-administered machines in this department on which we do not have administrative privileges, so we can't force our client onto these machines to give up useful information. Second, if we want to support external plugins such as those from Nagios, this is the way to do it.
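The multicast question and unicast answer could be sketched with plain UDP sockets as below. The group address and port are placeholders picked from the administratively scoped 239.0.0.0/8 range, and the message format is left as an opaque string; none of this is a decided part of the design.

```python
import socket
import struct

# Hypothetical group and port; 239.0.0.0/8 is the administratively scoped range.
GROUP, PORT = "239.1.2.3", 9999


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure passed to IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def make_sender(ttl=1):
    """UDP socket for asking a question of every listener at once."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # A TTL of 1 keeps the datagram on the local subnet.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("B", ttl))
    return sock


def make_receiver(group=GROUP, port=PORT):
    """UDP socket that joins the group so it hears multicast questions."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def ask(sock, question, group=GROUP, port=PORT):
    """Send one question to the whole group in a single datagram."""
    sock.sendto(question.encode("utf-8"), (group, port))


def answer(sock, reply, asker):
    """Reply by unicast, straight back to the asker's address."""
    sock.sendto(reply.encode("utf-8"), asker)
```

A listening host would loop on `data, asker = sock.recvfrom(65535)` and call `answer(sock, reply, asker)` only when it has something to report, so the asker hears back just from the machines where "this is happening".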


To sum up, then, we're looking to write a system that
©2002-2018 Wayne Pearson