The choice of system used to generate the individual theories is not critical in the context of this work. We require only that the system(s) be capable of generating theories that perform reasonably well on tests. As we had earlier reimplemented AQ- and ID3-like systems, we decided to use these as the basic inductive engines in our set-up.
The inductive rule learning system Irule1 is an incremental learning program that was partially inspired by CN2 [Clark and Niblett, 1989]. This system updates the existing rules incrementally, using operations of generalization and/or specialization.
The reimplementation of ID3, based on earlier work (e.g. Quinlan [1986], Cestnik et al. [1987], Clark and Niblett [1987; 1989]), will be referred to as Itree1 (Inductive Tree Learning System). So far this system does not incorporate pruning. The decision tree generated is automatically converted into rule form, which we find more amenable to further manipulation.
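As an illustration only, the conversion of a decision tree into rule form might be sketched as follows. This is not the Itree1 implementation: the `Leaf` and `Node` classes, the `tree_to_rules` function and the toy tree are hypothetical names and data introduced here. The sketch assumes that each path from the root to a leaf yields one rule, whose conditions are the attribute tests along that path.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    """Terminal node holding a class label (hypothetical structure)."""
    label: str


@dataclass
class Node:
    """Internal node testing one attribute (hypothetical structure)."""
    attribute: str
    branches: Dict[str, "Tree"] = field(default_factory=dict)


Tree = Union[Leaf, Node]
Condition = Tuple[str, str]          # (attribute, value)
Rule = Tuple[List[Condition], str]   # (conditions, predicted class)


def tree_to_rules(tree: Tree, path: Tuple[Condition, ...] = ()) -> List[Rule]:
    """Return one rule per root-to-leaf path of the tree."""
    if isinstance(tree, Leaf):
        return [(list(path), tree.label)]
    rules: List[Rule] = []
    for value, subtree in tree.branches.items():
        # Extend the current path with the test taken on this branch.
        rules.extend(tree_to_rules(subtree, path + ((tree.attribute, value),)))
    return rules


# Toy tree, invented purely for illustration.
toy_tree = Node("outlook", {
    "sunny": Node("humidity", {"high": Leaf("no"), "normal": Leaf("yes")}),
    "overcast": Leaf("yes"),
})
rules = tree_to_rules(toy_tree)
for conditions, label in rules:
    tests = " AND ".join(f"{a} = {v}" for a, v in conditions)
    print(f"IF {tests} THEN class = {label}")
```

Representing each rule as a flat list of conditions is one possible design; it makes the later operations on individual rules independent of the tree from which they were derived.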
The different theories required by the knowledge integration system (INTEG3.1) are generated by the inductive learning systems in a series of independent learning tasks. In each task the inductive learning system generates a theory (consisting of a set of rules) on the basis of its own data.
Following standard practice, 30% of the available data were separated out at random and reserved for the final tests of the integrated theory. The remaining data were used for the creation of alternative theories. In each experiment some cases were selected at random from the pool of remaining data. Let us call this set Di. Set Di was then supplied to the appropriate inductive learning system to generate theory Ti. This process was then repeated to generate the next theory. In most of our experiments we generated four theories. Some were generated by Itree1 and others by Irule1 (see Fig. 1).
As the sets D1, ..., Dn supplied to the inductive systems were selected at random from the same pool of examples, they could have some cases in common. We did not regard this as a problem, because in real life people may also encounter cases that are identical to those encountered by others.
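The data preparation described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used in the experiments: the function names `split_holdout` and `draw_training_sets`, the sample size of 20 and the integer stand-ins for cases are all hypothetical, since the text does not state how many cases each set Di contained. Each Di is drawn independently from the same pool, so two sets may share cases.

```python
import random


def split_holdout(cases, test_fraction=0.3, rng=None):
    """Shuffle the cases and reserve test_fraction of them for final testing.

    Returns (pool, test_set); the two parts are disjoint.
    """
    rng = rng or random.Random()
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def draw_training_sets(pool, n_theories=4, sample_size=20, rng=None):
    """Draw one random set Di per theory from the same pool.

    Cases are not repeated within a set, but the sets are drawn
    independently, so different sets may overlap.
    sample_size is an assumption; the source says only "some cases".
    """
    rng = rng or random.Random()
    return [rng.sample(pool, sample_size) for _ in range(n_theories)]


rng = random.Random(0)            # fixed seed only to make the sketch repeatable
cases = list(range(100))          # stand-ins for the real cases
pool, test_set = split_holdout(cases, 0.3, rng)
training_sets = draw_training_sets(pool, 4, 20, rng)

# Number of cases that D1 and D2 have in common.
shared = set(training_sets[0]) & set(training_sets[1])
print(len(pool), len(test_set), len(shared))
```

Each set in `training_sets` would then be passed to one of the inductive learning systems to produce the corresponding theory Ti, while `test_set` is kept aside for testing the integrated theory.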