YAILS Research Project [1991-1993]

by Luís Torgo

This was the basis of my Scientific Capabilities Examinations (something similar to a Master's thesis by Portuguese standards).

YAILS is an incremental rule learning system that deals with supervised classification problems. The system was a result of my initial studies on Machine Learning. It includes some ideas that are intuitively interesting (at least in my opinion!). Among these I would mention the following:

The system does not use a covering algorithm - unlike the systems that I knew at that time (and even now, with the exception of the RISE system by Pedro Domingos), YAILS does not use a covering cycle as its main algorithm, as most rule learning systems do (e.g. AQ, CN2, etc.). This type of algorithm finds a rule with the goal of covering the examples that are still uncovered. After obtaining the rule, it throws away the covered examples and proceeds with only the remaining ones. YAILS, on the contrary, does not throw away any example after creating a new rule. One consequence is that each rule is evaluated globally (i.e. taking the whole learning set into consideration). Another is that the resulting theory can be redundant, in the sense that rules may have overlapping coverage. The main advantage of a non-covering search algorithm is that rule quality estimates are obtained on all the training data and are thus more reliable. This means that YAILS does not suffer from the small subsets problem typical of covering algorithms (and of other methods that learn from progressively smaller subsets of the data, such as decision trees). The disadvantage of this search method is computational efficiency: the system always works with all training examples, while covering algorithms get faster as instances are thrown away.
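The difference between the two ways of estimating rule quality can be illustrated with a small Python sketch. It is not YAILS code (the system was written in Prolog); the rule representation, the accuracy measure and all function names are assumptions made only for this illustration.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], str]  # (attribute values, class label)
Rule = Tuple[Dict[str, str], str]     # (conditions, predicted class)


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an example when all of its conditions are satisfied."""
    conditions, _ = rule
    return all(attrs.get(a) == v for a, v in conditions.items())


def accuracy(rule: Rule, examples: List[Example]) -> Tuple[float, int]:
    """Return (fraction of covered examples with the rule's class, number covered)."""
    covered = [label for attrs, label in examples if covers(rule, attrs)]
    if not covered:
        return 0.0, 0
    correct = sum(label == rule[1] for label in covered)
    return correct / len(covered), len(covered)


def covering_estimates(rules: List[Rule], examples: List[Example]):
    """Covering style: each rule is evaluated only on the still uncovered examples."""
    remaining = list(examples)
    estimates = []
    for rule in rules:
        estimates.append(accuracy(rule, remaining))
        # Covered examples are thrown away before the next rule is considered.
        remaining = [(a, l) for a, l in remaining if not covers(rule, a)]
    return estimates


def global_estimates(rules: List[Rule], examples: List[Example]):
    """Non-covering style: every rule is evaluated on the whole learning set."""
    return [accuracy(rule, examples) for rule in rules]


# Hypothetical toy data, invented for this example.
examples = [
    ({"sky": "sunny", "wind": "weak"}, "play"),
    ({"sky": "sunny", "wind": "strong"}, "play"),
    ({"sky": "rainy", "wind": "weak"}, "play"),
    ({"sky": "rainy", "wind": "strong"}, "stay"),
]
rules = [({"sky": "sunny"}, "play"), ({"wind": "weak"}, "play")]

print(covering_estimates(rules, examples))
print(global_estimates(rules, examples))
```

In this toy data the second rule is estimated from a single example in the covering setting, but from two when all examples are kept. The two rules also overlap on the first example, which is the kind of redundancy described above.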

The system uses a bi-directional search methodology - unlike most systems, which use either a top-down or a bottom-up search algorithm, YAILS has both specialization and generalization operators to modify its rules during the learning process. This was mainly motivated by the goal of having an incremental algorithm that, given new examples, is able to modify its current theory (hence the need for both kinds of operators to revise previous search decisions).
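As a rough illustration of what the two kinds of operators do, the sketch below treats a rule as a set of attribute-value conditions. Dropping a condition generalizes the rule and adding one specializes it. The text does not describe the actual YAILS operators, so these two are only the simplest possible assumption, and all names are hypothetical.

```python
from typing import Dict

Conditions = Dict[str, str]  # attribute -> required value


def covers(conditions: Conditions, example: Dict[str, str]) -> bool:
    """True when the example satisfies every condition."""
    return all(example.get(a) == v for a, v in conditions.items())


def generalize(conditions: Conditions, attribute: str) -> Conditions:
    """Generalization (assumed form): drop one condition, so more examples are covered."""
    return {a: v for a, v in conditions.items() if a != attribute}


def specialize(conditions: Conditions, attribute: str, value: str) -> Conditions:
    """Specialization (assumed form): add one condition, so fewer examples are covered."""
    new_conditions = dict(conditions)
    new_conditions[attribute] = value
    return new_conditions


rule = {"sky": "sunny", "wind": "weak"}
new_example = {"sky": "sunny", "wind": "strong"}

# The current rule misses the new example; generalizing it makes it cover the example.
general = generalize(rule, "wind")
print(covers(rule, new_example), covers(general, new_example))

# A later example may show the rule is now too general; specialization undoes the step.
print(specialize(general, "wind", "weak") == rule)
```

Having both operators means a search decision taken on earlier examples can be reversed in either direction when later examples contradict it.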

YAILS enables a controlled use of redundancy - the system allows the user to control the level of redundancy in the resulting theory. Although redundancy was shown to be beneficial in terms of accuracy (Torgo, 1993), it is harmful in terms of theory comprehensibility, as the theories are larger. YAILS provides a flexible mechanism for controlling this level of redundancy by dividing the learned rules into two separate sets: a foreground set of rules (the result of learning) and a background set, which is only used when the former gives no answer to a user query.
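The way the two rule sets interact at classification time can be sketched as follows. This is a minimal illustration and not the YAILS implementation: the strict matching, the choice of the highest quality rule and all names are assumptions.

```python
from typing import Dict, List, NamedTuple, Optional


class Rule(NamedTuple):
    conditions: Dict[str, str]  # attribute -> required value
    label: str                  # predicted class
    quality: float              # assumed quality estimate in [0, 1]


def covers(rule: Rule, query: Dict[str, str]) -> bool:
    return all(query.get(a) == v for a, v in rule.conditions.items())


def classify(query: Dict[str, str],
             foreground: List[Rule],
             background: List[Rule]) -> Optional[str]:
    """Consult the background set only when no foreground rule answers."""
    for rule_set in (foreground, background):
        matching = [r for r in rule_set if covers(r, query)]
        if matching:
            # Assumption: among the matching rules, the best quality one answers.
            return max(matching, key=lambda r: r.quality).label
    return None  # neither set has an answer


# Hypothetical rule sets, invented for this example.
foreground = [Rule({"sky": "sunny"}, "play", 0.9)]
background = [
    Rule({"wind": "weak"}, "play", 0.7),
    Rule({"sky": "rainy"}, "stay", 0.6),
]

print(classify({"sky": "sunny", "wind": "strong"}, foreground, background))
print(classify({"sky": "rainy", "wind": "strong"}, foreground, background))
print(classify({"sky": "cloudy", "wind": "strong"}, foreground, background))
```

The user only has to read the small foreground set, while the redundant background rules remain available for queries that the foreground cannot answer.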

YAILS uses a flexible matching mechanism to perform classification - the system attaches a weight to each of the conditions of a learned rule. These weights are supposed to estimate the importance of each condition within the rule (using an entropy-based heuristic measure). The weights are then used in conjunction with an estimated rule quality to perform flexible matching classification. Each learned rule is matched against the given query, and a matching score that takes the conditions' weights into account is obtained for each rule. This matching score is used together with the rule's quality to obtain an opinion score for the rule (again through a heuristic formula). The rule with the highest opinion score is chosen to classify the example. Notice that this scheme allows rules whose conditions are not completely satisfied by the query example to still be candidates for classifying it.
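The scheme can be made concrete with a short sketch. The heuristic formulas that YAILS actually uses are not given here, so the two formulas below are stand-in assumptions chosen for simplicity: the matching score is the weighted fraction of satisfied conditions, and the opinion score is the product of matching score and rule quality. The weights are given by hand rather than computed from an entropy-based measure, and all names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class WeightedRule(NamedTuple):
    conditions: Dict[str, Tuple[str, float]]  # attribute -> (required value, weight)
    label: str                                # predicted class
    quality: float                            # assumed quality estimate in [0, 1]


def matching_score(rule: WeightedRule, query: Dict[str, str]) -> float:
    """Assumed formula: weight of the satisfied conditions over the total weight."""
    total = sum(weight for _, weight in rule.conditions.values())
    if total == 0:
        return 0.0
    satisfied = sum(weight
                    for attr, (value, weight) in rule.conditions.items()
                    if query.get(attr) == value)
    return satisfied / total


def opinion_score(rule: WeightedRule, query: Dict[str, str]) -> float:
    """Assumed formula: matching score scaled by the rule's quality."""
    return matching_score(rule, query) * rule.quality


def classify(rules: List[WeightedRule], query: Dict[str, str]) -> str:
    """The rule with the highest opinion score classifies the query."""
    return max(rules, key=lambda r: opinion_score(r, query)).label


# Hypothetical rules, invented for this example.
rule_a = WeightedRule({"sky": ("sunny", 0.8), "wind": ("weak", 0.2)}, "play", 0.9)
rule_b = WeightedRule({"wind": ("strong", 1.0)}, "stay", 0.6)
query = {"sky": "sunny", "wind": "strong"}

print(opinion_score(rule_a, query), opinion_score(rule_b, query))
print(classify([rule_a, rule_b], query))
```

Here rule_a is only partially satisfied by the query (its low-weight wind condition fails), yet it still wins over the fully satisfied rule_b because its important condition holds and its quality is higher.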

YAILS obtained interesting experimental results on several real-world domains. However, the system lacks theoretical justification for several of the implemented features. Another problem is that, due to its incremental nature and to the fact that it is implemented in Prolog, the system is quite slow.

Some references on this line of research:

Torgo, L. (1992): YAILS, an incremental learning program. LIACC, Machine Learning Group, Technical Report 92.1.
(69028 bytes in format ".ps.gz")(HTML version)

Torgo, L. (1992): YAILS, a brief user's manual.

Torgo, L. (1993): Controlled Redundancy in Incremental Rule Learning. In Proceedings of the European Conference on Machine Learning (ECML-93), Brazdil, P. (ed.), Lecture Notes in Artificial Intelligence 667, Springer Verlag.
(Abstract) (49308 bytes in format ".ps.gz")(HTML version)