
4.3 Efficiency problems

YAILS has gone through several versions (and will probably go through more), mainly to solve efficiency problems. The first versions of YAILS were rather slow at learning average-size problems. Several new heuristics were introduced, and the efficiency of the Prolog code was improved. At present YAILS learns several problems quickly but can still have difficulties with others. To give an idea of the times I am talking about, YAILS learns the Lymphography dataset (148 exs.; 18 attrs.; 4 classes; no numerical attrs.; no noise; no unknowns) in a few seconds, but takes a few minutes to learn Breast Cancer (288 exs.; 10 attrs.; 2 classes; some numerical attrs.; some unknowns; some noise), and can take even longer to learn Primary Tumor (339 exs.; 17 attrs.; 22 classes; no numerical attrs.; very noisy; lots of unknowns).

There are several causes for these learning times. First, YAILS is an incremental program, which means that it can make many "wrong" decisions during the first steps of learning. Second, handling problems such as noise and numerical attributes (invention of intervals, etc.) in an incremental program is very costly: these operations are computationally heavy, and incrementality increases the frequency with which they are performed. Another issue is that redundancy leads to more rules, and from the YAILS algorithms it is easy to see that the time needed to process an example depends heavily on the number of rules (this is the case for the Primary Tumor dataset, which has 22 classes and therefore a large number of rules). For this reason, the plan is to give YAILS the ability to learn from a batch of examples at once while using the same search strategy. Some prototypes have already been built, and the results were promising in terms of learning time.
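The two cost factors described above can be illustrated with a rough counting model. This is only a sketch under simplifying assumptions, not the YAILS algorithm: it assumes that every example is compared against every rule currently in the theory, and that one heavy revision step (for instance, recomputing intervals) runs once per batch of examples. The function names and the rule counts used in the example are hypothetical; only the example count (339, Primary Tumor) comes from the text.

```python
def match_checks(num_examples, rules_at_step):
    """Total rule/example comparisons, assuming each incoming example
    is tested against every rule present when it arrives.

    rules_at_step is a hypothetical function giving the number of rules
    in the theory when example i (0-based) is processed.
    """
    return sum(rules_at_step(i) for i in range(num_examples))


def heavy_operation_count(num_examples, batch_size):
    """Number of times a costly revision step runs, assuming it runs
    once per batch. batch_size == 1 models purely incremental learning."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Ceiling division: a final, partially filled batch still costs one run.
    return -(-num_examples // batch_size)


if __name__ == "__main__":
    n = 339  # number of examples in Primary Tumor, as quoted in the text

    # Rule counts below are made up purely to show the linear dependence.
    for rules in (50, 200):
        print(rules, "rules:", match_checks(n, lambda i: rules), "comparisons")

    # Purely incremental versus batches of 10 examples.
    for batch in (1, 10):
        print("batch size", batch, ":", heavy_operation_count(n, batch), "heavy steps")
```

Under these assumptions, the number of comparisons grows in proportion to the number of rules, and grouping examples into batches divides the number of heavy revision steps by roughly the batch size. The real savings in YAILS depend on its actual search strategy, which this model does not capture.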

The initialization of a classification task can also be slow, because the rule weights must be calculated and the learned theory must be separated.
