Several experiments with YAILS were performed on real-world domains. The three medical domains chosen were obtained from the Jozef Stefan Institute, Ljubljana. This choice enables comparisons with other systems, as these datasets are very often used to test learning algorithms. In addition, the datasets have different characteristics, which makes the test more thorough. Table 1 shows the main characteristics of the datasets:
Table 1. Main characteristics of the datasets.
                 Lymphography         Breast Cancer        Primary Tumour
================================================================================
Dimension        148 exs./18 attrs.   288 exs./10 attrs.   339 exs./17 attrs.
                 4 classes            2 classes            22 classes
Attributes       Symbolic             Symbolic+numeric     Symbolic
Noise            Low level            Noisy                Very noisy
Unknowns         No                   Yes                  Yes
The experiments had the following structure: in each run, 70% of the examples were randomly chosen for training and the remaining 30% were left for testing; each test was repeated 10 times and the averages were calculated.
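The protocol just described (random 70/30 split, 10 repetitions, averaged results) can be sketched as follows. This is only an illustration under stated assumptions: the names `repeated_holdout` and `majority_class_score` are hypothetical, the majority-class scorer is a trivial stand-in and not YAILS, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import random
import statistics
from collections import Counter


def majority_class_score(train, test):
    """Stand-in learner (NOT YAILS): predict the most frequent training class."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return sum(label == majority for _, label in test) / len(test)


def repeated_holdout(examples, train_and_score, repetitions=10,
                     train_fraction=0.7, seed=0):
    """Average a score over repeated random train/test splits.

    examples        -- list of (attributes, class_label) pairs
    train_and_score -- callable(train, test) returning one number per run
    """
    rng = random.Random(seed)
    n_train = round(train_fraction * len(examples))
    scores = []
    for _ in range(repetitions):
        shuffled = examples[:]          # copy, so the caller's list is untouched
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        scores.append(train_and_score(train, test))
    # Assumption: sample standard deviation over the repetitions.
    return statistics.mean(scores), statistics.stdev(scores)
```

Any learner can be plugged in through `train_and_score`, for example a function that returns test-set accuracy or the number of rules used, so the same loop yields each row of a results table.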
Table 2 presents a summary of the results obtained on the three datasets (standard deviations are given in brackets).
Table 2. Results of the experiments.
                          Lymphography   Breast Cancer   Primary Tumour
================================================================================
Accuracy                  85% (5%)       80% (3%)        34% (6%)
No. of Used Rules         14 (2.7)       13.9 (5.6)      37.2 (2.8)
Aver. Conditions / Rule   1.86 (0.2)     1.94 (0.13)     1.96 (0.22)
The results are very good on two of the datasets (Lymphography and Breast Cancer) and the theories are sufficiently simple (see Table 3 for a comparison with other systems). This gives a clear indication of the advantages of redundancy. We should take into account that YAILS is an incremental system, which means that all decisions are made in a step-wise fashion and not with a general overview of all the data, as in non-incremental systems. Because of this, a lower performance is generally accepted for incremental systems. This is not the case with YAILS (with the exception of Primary Tumour), as we can see from the following table:
Table 3. Comparative results.
              Lymphography           Breast Cancer          Primary Tumour
System        Accuracy  Complexity   Accuracy  Complexity   Accuracy  Complexity
================================================================================
YAILS         85%       14 cpxs.     80%       13.9 cpxs.   34%       37.2 cpxs.
Assistant     78%       21 leaves    77%       8 leaves     42%       27 leaves
AQ15          82%       4 cpxs.      68%       2 cpxs.      41%       42 cpxs.
CN2           82%       8 cpxs.      71%       4 cpxs.      37%       33 cpxs.
The results presented in Table 3 do not establish any ranking of the systems, as this would require tests of significance to be carried out. As no standard deviations are given in the papers describing the other systems, and the number of repetitions of the tests also differs, the table is merely informative. It should also be noted that AQ15 uses the VL-1 descriptive language, which allows internal disjunctions in each selector. This means that, for instance, the 4 complexes obtained with AQ15 are much more complex than 4 complexes in the language used by YAILS (which does not allow internal disjunction).
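To make the last point concrete, the sketch below expands one complex with internal disjunctions into the equivalent set of complexes without them. It is a simplified illustration, not code from AQ15 or YAILS: it assumes every selector is a test of the form attribute = value, and the function name, attribute names and values are hypothetical.

```python
from itertools import product


def expand_internal_disjunctions(complex_):
    """Rewrite one complex with internal disjunctions as plain complexes.

    complex_ maps each attribute to the set of values its selector allows,
    e.g. {"colour": {"red", "blue"}} stands for [colour = red v blue].
    Each returned dict allows exactly one value per attribute.
    """
    attributes = sorted(complex_)
    value_lists = [sorted(complex_[a]) for a in attributes]
    # One plain complex per combination of single values (Cartesian product).
    return [dict(zip(attributes, values)) for values in product(*value_lists)]


# Hypothetical complex: [colour = red v blue v green] & [size = small v large]
with_disjunctions = {"colour": {"red", "blue", "green"},
                     "size": {"small", "large"}}
plain = expand_internal_disjunctions(with_disjunctions)
print(len(plain))  # 6 plain complexes cover the same examples
```

The number of plain complexes is the product of the selector sizes, so a complexity count in complexes is not directly comparable between a language with internal disjunction and one without.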