Machine Learning

The PDF format is used to allow ease of reuse of figures and text, quoted with citation.

Publication citations are given where a version of the report has been published. Often material has been edited in publication and the cited version differs to some extent.

A learning machine in the context of the general control problem, Brian R Gaines and John H. Andreae, Proceedings of the 3rd Congress of the International Federation of Automatic Control (IFAC), London, pp.342-9, 1966. PDF.

In this paper the STeLLA learning scheme, which was described at the Second IFAC Congress (Andreae, 1963) in the context of games-playing and pattern manipulation environments, is examined both for the nature of its alternative simplification of the general control problem and for the relationship of its control strategies to those proposed in adaptive control. These strategies are exemplified by its behaviour in controlling a second-order, sampled-data, stochastic plant with bounded phase-space and transport-lag. <PDF has discussion appended)

An Introductory Description of the STeLLA Learning Machine, John H. Andreae and Brian R Gaines, Standard Telecommunication Laboratories Technical Memorandum No.556, August 1966. PDF.

The Standard Telecommunication Laboratories Learning Automaton has been described in a previous Technical Memorandum, No. 523, from the point of view of automatic control theory (Gaines and Andreae, 1966). This account is a relatively non-technical introduction to the problem and strategies of the learning machine.

Adaptive pattern classifiers in control systems, B.R. Gaines and D.J.Quarmby, Institution of Electrical Engineers, Conference on Pattern Recognition, National Physical Laboratory, July 1968. PDF.

Whilst the convergence of adaptive-threshold-logic elements may be dealt with quite simply when they are used for pattern recognition, the behaviour of such devices is far more complex and less amenable to analysis when they form the variable element of an adaptive control system. This is because they have the dual control problem of identifying their environment by classification whilst controlling it, and performance feedback is both relative and global. This paper considers five problem areas in the utilization of pattern classifying control systems: the derivation of performance feedback, appropriate coding of input and response patterns, the state-space of adaption and training, linguistic communication with the controller, and hardware realization of the controller.

Approximate identification of automata, Brian R Gaines, Electronics Letters, 11(18), 444-445, 1975. PDF.

A technique is described for the identification of probabilistic and other non-deterministic automata from sequences of their input/output behaviour. For a given number of states the models obtained are optimal in well defined senses, one related to least-mean-square approximation and the other to Shannon entropy. Practical and theoretical investigations of the technique are outlined.

On the complexity of causal models, Brian R Gaines, IEEE Transactions on Systems, Man & Cybernetics, SMC-6(1), 56-59, 1976. PDF.

It is argued that principle of causality is fundamental to human thinking, and it has been observed experimentally that this assumption leads to complex hypothesis formation by human subjects attempting to solve comparatively simple problems involving acausal randomly generated events. This correspondence provides an automata-theoretic explanation of this phenomenon by analyzing the performance of an optimal modeler observing the behaviour of a system and forming a minimal state model of it.

Stability and admissibility of adaptive threshold logic convergence, Brian R Gaines & Ian H Witten, IEEE Transactions on Computers C-26, 88-91, 1977. PDF.

It is shown that an adaptive threshold logic element (ATLE) whose weight vector is perturbed reconverges without error provided its threshold is large compared with the perturbation. This result is generalized to a criterion of convergence, based on admissibility, for the nonseparable case. It is shown that conventional ATLE's cannot cope with nonseparability even according to this very weak criterion.

System identification, approximation and complexity, Brian R Gaines, International Journal of General Systems, 3(3), 145-174, 1977. PDF.

This paper is concerned with establishing broadly-based system-theoretic foundations and practical techniques for the problem of system identification that are rigorous, intuitively clear and conceptually powerful. A general formulation is first given in which two order relations are postulated on a class of models: a constant one of complexity; and a variable one of approximation induced by an observed behaviour. An admissible model is such that any less complex model is a worse approximation. The general problem of identification is that of finding the admissible subspace of models induced by a given behaviour. It is proved under very general assumptions that, if deterministic models are required then nearly all behaviours require models of nearly maximum complexity. A general theory of approximation between models and behaviour is then developed based on subjective probability concepts and semantic information theory The role of structural constraints such as causality, locality, finite memory, etc., are then discussed as rules of the game. These concepts and results are applied to the specific problem or stochastic automaton, or grammar, inference. Computational results are given to demonstrate that the theory is complete and fully operational. Finally the formulation of identification proposed in this paper is analysed in terms of Klir's epistemological hierarchy and both are discussed in terms of the rich philosophical literature on the acquisition of knowledge.

General system identification, Brian R Gaines, In Klir, G.J., Ed. Applied General Systems Research. pp. 91-104. New York, USA: Plenum Press, 1978. PDF.

This paper is a brief survey of the current state-of-the-art of system identification, concentrating on the general problem rather than the massive amount of work on special cases. Eykhoff's book, the I.F.A.C. symposia, and I.E.E.E. transactions on Control, Information Theory, Circuits and Systems, System Man and Cybernetics, etc., give accounts of specific states-of-the-art. I have previously published a comprehensive formulation of general systems identification giving over 200 related references (Gaines, 1977a). This present note abstracts the key features of that formulation and brings the earlier report up-to-date with notes on recent developments.

Sequential fuzzy system identification, Brian R Gaines, Fuzzy Sets and Systems 2(1), 15-24, 1979. PDF.

The problem of deriving the structure of a non-deterministic system from its behaviour is a difficult one even when that behaviour is itself well-defined. When the behaviour can be described only in fuzzy terms structural inference may appear virtually impossible. However, a rigorous formulation and solution of the problem for stochastic automata has recently been given and, in this paper, the results are extended to fuzzy stochastic automata and grammars. The results obtained are of interest on a number of counts: they are a further step towards an integrated theory of uncertainty; they give new insights into problems of inductive reasoning and processes of precisiation; they are algorithmic and have been embodied in a computer program that can be applied to the modelling of sequential fuzzy data; they demonstrate that sequential fuzzy data may be modelled naturally in terms of possibility vectors.

An ounce of knowledge is worth a ton of data: quantitative studies of the trade-off between expertise and data based on statistically well-founded empirical induction, Brian R Gaines, Proceedings of the Sixth International Workshop on Machine Learning. 156-159. San Mateo, California: Morgan Kaufmann, 1989. PDF.

There is currently a division in knowledge acquisition research and practice between techniques for the transfer of existing knowledge from human experts and those for the creation of new expertise through machine learning. This paper reports studies of the spectrum of trade-offs between these two extremes, measuring the amount of data required to attain knowledge through empirical induction given different forms and levels of expertise. This gives a principled economic evaluation of knowledge which can be used to guide knowledge acquisition theory and practice.

Induction of ripple-down rules applied to modeling large databases, Brian R Gaines and Paul Compton, Journal for Intelligent Information Systems, 5(3) 211-228, 1995. PDF.

A methodology for the modeling of large data sets is described which results in rule sets having minimal inter-rule interactions, and being simply maintained. An algorithm for developing such rule sets automatically is described and its efficacy shown with standard test data sets. Comparative studies of manual and automatic modeling of a data set of some nine thousand five hundred cases are reported. A study is reported in which ten years of patient data have been modeled on a month by month basis to determine how well a diagnostic system developed by automated induction would have performed had it been in use throughout the project.

Structured and unstructured induction with EDAGs, Brian R Gaines, Fayyad, U.M. & Uthurusamy, R., Eds. Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95). 124-129. Cambridge, Massachusetts: MIT Press, 1995. PDF.

Exception directed acyclic graphs (EDAGs) are knowledge structures that subsume trees and rules but can be substantially more compact. Manually constructed and induced EDAGs are compared by reconstructing Shapiro's "structured induction" of a chess end game. It is shown that the induced EDAG is very similar to that produced through consultation with experts, and that both are small, comprehensible solutions to a problem that is complex for people.

Transforming rules and trees into comprehensible knowledge structures, Brian R Gaines, Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P. & Uthurusamy, R., Eds. Advances in Knowledge Discovery and Data Mining. 205-226. Cambridge, Massachusetts: MIT Press, 1996. PDF.

The problem of transforming the knowledge bases of expert systems using induced rules or decision trees into comprehensible knowledge structures is addressed. A knowledge structure is developed that generalizes and subsumes production rules, decision trees, and rules with exceptions. It gives rise to a natural complexity measure that allows them to be understood, analyzed and compared on a uniform basis. The structure is a directed acyclic graph with the semantics that nodes are premises, some of which have attached conclusions, and the arcs are inheritance links with disjunctive multiple inheritance. A detailed example is given of the generation of a range of such structures of equivalent performance for a simple problem, and the complexity measure of a particular structure is shown to relate to its perceived complexity. The simplest structures are generated by an algorithm that factors common sub-premises from the premises of rules. A more complex example of a chess dataset is used to show the value of this technique in generating comprehensible knowledge structures.

CPCS 6-Apr-2012