In this paper we describe a method for merging several separate theories (knowledge bases)[1]. We assume that, in general, these will have been generated in different ways. Some theories can be obtained by querying an expert and transcribing the expert's knowledge in the form of rules. Others can be obtained from data, using inductive learning tools. The theories may differ from one another in various ways, and it is then not clear which theory is right. Our system (INTEG3.1) is capable of analysing the individual theories and their rules, and of selecting (or marking) some rules to be included in the integrated theory. These selected rules determine the decision of the overall system.
Knowledge integration is concerned with issues related to those in incremental learning systems, such as ID4 [Schlimmer and Fisher, 1988], ID5 [Utgoff, 1988] and AQ16 [Janikow, 1989]. Both knowledge integration and incremental learning attempt to construct a theory that best explains the given data. There are some important differences between the two approaches, however.
In incremental learning, it is usually assumed that a single system constructs theories by employing some incremental version of a given learning algorithm. In consequence, all the data is analyzed by this system at one time or another. Knowledge integration, on the other hand, involves several systems, each of which tries to construct its own theory on the basis of its own experience. Although knowledge integration may require common data in the process of constructing an integrated theory, it tends to capitalize on the results obtained by the different systems. For us, an important issue is how to make use of theories that have already been constructed.
If we let the systems communicate theories instead of data, some learning effort will be saved. On the other hand, we could expect some information to be lost in the process. It would seem to follow that the results obtained by knowledge integration should not surpass those obtained by incremental learning systems. Our experiments have, however, indicated the contrary: the integrated theory performed better than the incremental learning systems (see Section 3.1 for details).
Knowledge integration makes it possible to overcome one problem that incremental systems have, namely how to improve upon a theory provided by the user. Incremental learning systems need to keep various statistical measures in memory (e.g. the informativity of various attributes). These measures enable the system to update its theory when new data becomes available. If they are not supplied by the user, an incremental system can do little to improve the user's theory.
The rest of the paper is organized as follows. Section 2 describes the basic method of knowledge integration, including how we assess rule quality. Section 3 describes the results of our experiments with INTEG3.1, an enhanced version of INTEG.3, which was described in an earlier paper [Brazdil and Torgo, 1990]. Section 4 discusses some alternative methods of evaluating rule quality. It is followed by a general discussion and conclusions.