RECLA (REgression through CLAssification) Research Project [1996]

by Luís Torgo and João Gama


This research project was mainly motivated by the work of Weiss and Indurkhya on rule-based regression (see, for instance, JAIR, vol. 3). Their work dealt with regression by mapping the problem to classification. Our goal was to extend this idea to any classification system. With the RECLA system we can use any classification algorithm on any regression problem, thus greatly extending the applicability of a wide range of ML systems.
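As an illustration only, the basic mapping can be sketched in a few lines of Python. This is not the RECLA implementation: the equal-frequency discretization, the use of each interval's median as the predicted value, the toy 1-nearest-neighbour classifier, and all function names are assumptions made for the example.

```python
from statistics import median

def make_discretizer(ys, k):
    """Split the target values into k equal-frequency intervals (an assumed choice)."""
    ordered = sorted(ys)
    n = len(ordered)
    groups = [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]
    groups = [g for g in groups if g]  # drop empty groups when k > n
    # Cut points sit halfway between neighbouring groups.
    cuts = [(a[-1] + b[0]) / 2 for a, b in zip(groups, groups[1:])]
    # Each class is represented by the median of its interval (assumption).
    representatives = [median(g) for g in groups]
    to_class = lambda y: sum(y > c for c in cuts)
    return to_class, representatives

def one_nn_fit(xs, labels):
    """Toy stand-in classifier: 1-nearest-neighbour on a single numeric input."""
    data = list(zip(xs, labels))
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def regress_by_classification(xs, ys, k, fit=one_nn_fit):
    """Discretize the target, train any classifier, map classes back to numbers."""
    to_class, representatives = make_discretizer(ys, k)
    classify = fit(xs, [to_class(y) for y in ys])
    return lambda x: representatives[classify(x)]

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.0, 1.2, 1.1, 5.0, 5.2, 5.1, 9.0, 9.3, 9.1]
predict = regress_by_classification(xs, ys, k=3)
print(predict(2.2), predict(5.4), predict(8.6))  # prints: 1.1 5.1 9.1
```

Any function with the same shape as `one_nn_fit` (training inputs and class labels in, a classifier out) can be passed as `fit`, which is the sense in which the classifier is interchangeable.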

We have made this mapping task more flexible by using a wrapper approach that tries to optimize the mapping taking into account three issues: the target classification system, the regression dataset, and the regression error statistic to use. Compared with the work cited above, we also provide several alternative search strategies and class discretization methods.
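A wrapper of this kind can be sketched as follows. The sketch is hypothetical and not the RECLA code: it assumes an exhaustive search over a small range of class counts, two discretization methods (equal-frequency and equal-width), a simple 3-fold cross-validation estimate, mean absolute error as the statistic, and a toy 1-nearest-neighbour classifier. The point it illustrates is that the classifier, the dataset, and the error statistic are all inputs to the search.

```python
from statistics import mean, median

def equal_frequency(ys, k):
    s, n = sorted(ys), len(ys)
    return [g for g in (s[i * n // k:(i + 1) * n // k] for i in range(k)) if g]

def equal_width(ys, k):
    lo, hi = min(ys), max(ys)
    width = (hi - lo) / k or 1.0  # guard against a constant target
    groups = [[] for _ in range(k)]
    for y in sorted(ys):
        groups[min(int((y - lo) / width), k - 1)].append(y)
    return [g for g in groups if g]

def fit_pipeline(xs, ys, k, split, fit_classifier):
    """Discretize ys with `split`, train the classifier, predict interval medians."""
    groups = split(ys, k)
    cuts = [(a[-1] + b[0]) / 2 for a, b in zip(groups, groups[1:])]
    reps = [median(g) for g in groups]
    classify = fit_classifier(xs, [sum(y > c for c in cuts) for y in ys])
    return lambda x: reps[classify(x)]

def cv_error(xs, ys, k, split, fit_classifier, error, folds=3):
    """Estimate the regression error of one candidate mapping by cross-validation."""
    scores = []
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        test = [i for i in range(len(xs)) if i % folds == f]
        model = fit_pipeline([xs[i] for i in train], [ys[i] for i in train],
                             k, split, fit_classifier)
        scores.append(error([ys[i] for i in test], [model(xs[i]) for i in test]))
    return mean(scores)

def wrapper_search(xs, ys, fit_classifier, error,
                   ks=range(2, 7), splits=(equal_frequency, equal_width)):
    """Return (estimated error, number of classes, method name) of the best mapping."""
    return min((cv_error(xs, ys, k, s, fit_classifier, error), k, s.__name__)
               for k in ks for s in splits)

def one_nn(xs, labels):  # toy classifier, stands in for any classification system
    data = list(zip(xs, labels))
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def mae(truth, preds):  # the regression error statistic being optimized
    return mean(abs(t - p) for t, p in zip(truth, preds))

xs = list(range(30))
ys = [0.5 * x * x for x in xs]
best_err, best_k, best_method = wrapper_search(xs, ys, one_nn, mae)
print(best_err, best_k, best_method)
```

Swapping `one_nn` for another classifier, or `mae` for another error statistic, can change which discretization wins, which is the reason for searching rather than fixing the mapping in advance.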

We have tried our RECLA system with two classification systems, namely CN2 [Clark & Niblett, 1988] and C4.5 [Quinlan, 1993]. More recently, we have also tried a linear discriminant. We have conducted several experiments on real-world data sets (see the references below).

We have since extended this idea further by providing a better theoretical justification for the transformation of regression into classification by means of misclassification costs (Torgo & Gama, 1997). In this work we have empirically justified the use of a search-based approach to class discretization and have verified the value of using misclassification costs.
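The role of misclassification costs can be illustrated with a small sketch. It assumes, purely for the example, that the cost of confusing two classes is the absolute difference between the values that represent them, and that the classifier outputs class probabilities; the function names are hypothetical. Under these assumptions, choosing the class with the lowest expected cost is not always the same as choosing the most probable class.

```python
def cost_matrix(representatives):
    """costs[i][j]: cost of predicting class j when the true class is i (assumed
    here to be the absolute difference between the two class representatives)."""
    return [[abs(a - b) for b in representatives] for a in representatives]

def min_cost_class(probs, costs):
    """Pick the class whose expected misclassification cost is lowest."""
    expected = [sum(p * costs[true][pred] for true, p in enumerate(probs))
                for pred in range(len(costs))]
    return min(range(len(expected)), key=expected.__getitem__)

representatives = [1.0, 5.0, 9.0]      # e.g. one value per target interval
costs = cost_matrix(representatives)
probs = [0.4, 0.3, 0.3]                # hypothetical classifier output

most_probable = max(range(len(probs)), key=probs.__getitem__)
cheapest = min_cost_class(probs, costs)
print(most_probable, cheapest)  # prints: 0 1
```

Here the most probable class is the lowest interval, but predicting the middle interval has a smaller expected absolute error, because mistakes between distant intervals are penalized more heavily than mistakes between neighbours.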

Some references on this line of research:

  • Torgo, L. and Gama, J. (1996): RECLA program. LIACC, Machine Learning Group, Technical Report 96.1.
  • Torgo, L. and Gama, J. (1996): Regression by Classification. In Proceedings of SBIA'96 (Brazilian AI Symposium), Lecture Notes in AI 1159, Springer-Verlag. Curitiba, Brazil.
  • Torgo, L. and Gama, J. (1997): Search-based Class Discretization. To appear in Proceedings of ECML-97, Prague, Czech Republic. Springer-Verlag.