Dynamic Discretization of Continuous Attributes

João Gama, Luís Torgo and Carlos Soares
1998


Abstract

Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to deal with continuous attributes, which considerably increases learning time. This paper presents a new discretization method whose main characteristic is that it takes into account interdependencies between attributes. As a result, our method performs attribute selection as a side effect of the discretization. Empirical evaluation on five benchmark datasets from the UCI repository, using C4.5 and a naive Bayes classifier, shows a consistent reduction in the number of features without loss of generalization accuracy.