3.2 Generating candidate sets of discrete classes

In this section we address the problem of modifying the current discretization setting based on its estimated predictive accuracy. The search space consists of all possible partitions of a set of continuous values. Our system has two alternative ways of exploring this space, both applicable to the three splitting methods mentioned in Section 2.1:

· Varying the number of intervals (VNI)

This alternative consists of trying several values for the number of intervals with the current splitting strategy. We start with an initial number of intervals and, on each iteration of the search process, increment this number by a constant value. This is the most obvious way of improving the splitting methods presented in Section 2.1.
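The VNI search can be sketched as a simple loop. This is a minimal illustration, not the system's actual implementation: the function names are hypothetical, equal-width splitting stands in for whichever of the Section 2.1 splitting methods is current, and `evaluate` is a stand-in for the CV error estimation.

```python
def equal_width_split(values, k):
    """Split the range of `values` into k equal-width intervals
    (a stand-in for the current splitting method)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [(lo + i * width, lo + (i + 1) * width) for i in range(k)]

def vni_search(values, evaluate, start=2, step=2, trials=5):
    """Try several interval counts, incrementing by `step` on each
    iteration, and keep the partition with the best estimated error."""
    best = None
    for k in range(start, start + step * trials, step):
        partition = equal_width_split(values, k)
        err = evaluate(partition)  # e.g. a CV-based error estimate
        if best is None or err < best[0]:
            best = (err, partition)
    return best
```

The constants `start`, `step`, and `trials` are illustrative parameters; the paper only states that the interval count grows by a constant value per iteration.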

· Selective specialization of individual classes (SIC)

The second alternative is a bit more complex. The basic idea is to improve the previously tried set of intervals (classes). We start with any given number of intervals and, during the CV-evaluation, also calculate the error estimate of each individual discrete class. The next trial is built by looking at these individual error estimates: the median of the errors is calculated, and all classes whose error is above the median are specialized. The specialization consists of splitting each of these classes into two new classes by applying the current splitting method to the values within that class interval. All other classes remain unchanged in the next iteration.
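One SIC iteration can be sketched as follows. This is an illustrative sketch under assumptions: intervals are represented as `(low, high)` pairs, `errors` holds the per-class CV error estimates, and `split` is a stand-in for applying the current splitting method inside an interval (here simply halving it, so that each specialized class yields two classes, as in the text).

```python
import statistics

def specialize(intervals, errors, split):
    """One SIC iteration: split (specialize) every class whose
    estimated error is above the median; keep the others unchanged."""
    med = statistics.median(errors)
    new_intervals = []
    for (lo, hi), err in zip(intervals, errors):
        if err > med:
            # apply the current splitting method within this interval
            new_intervals.extend(split(lo, hi))
        else:
            new_intervals.append((lo, hi))
    return new_intervals

def halve(lo, hi):
    """Toy splitting method: divide an interval into two halves."""
    mid = (lo + hi) / 2
    return [(lo, mid), (mid, hi)]
```

In the real system, `split` would re-apply the current Section 2.1 splitting method to the observed values falling inside the interval rather than to the interval boundaries alone.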

The next section provides an illustrative example of these two search alternatives in a discretization task.
