
6. Conclusions

The method described in this paper enables the use of classification systems in regression domains. Previous work [20] provided evidence for the validity of transforming regression into classification, but that work was oriented towards a single learning algorithm. Ours enables a similar transformation strategy to be used with other classification systems, extending the applicability of a wide range of existing inductive systems.

Our algorithm chooses the best discretization method among a set of available strategies: we estimate the prediction error of each candidate method and select the one with the lowest estimate. The resulting discrete classes are obtained by an iterative search procedure using the chosen discretization method. This search is essentially a wrapper process based on an N-fold CV evaluation that estimates the predictive error resulting from using a given set of discrete classes. We have also introduced five novel methods for discrete class formation. A minimal sketch of this wrapper loop is given below.
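
As an illustration only, the following Python sketch shows the shape of such a wrapper selection loop under simplifying assumptions: the two candidate discretizers (equal-width and equal-frequency), the 1-nearest-neighbour stand-in classifier, and the MAD error measure are hypothetical choices for the sketch, not the paper's actual methods, systems, or search procedure (in particular, the iterative search over the number of classes is omitted).

    import numpy as np

    def equal_width(y, k):
        # Equal-width discretization: split the target range into k bins.
        edges = np.linspace(y.min(), y.max(), k + 1)
        return np.digitize(y, edges[1:-1])

    def equal_freq(y, k):
        # Equal-frequency discretization: bins with roughly equal counts.
        edges = np.quantile(y, np.linspace(0, 1, k + 1))
        return np.digitize(y, edges[1:-1])

    def cv_error(X, y, discretize, k, n_folds=10):
        # Wrapper evaluation: estimate the error of regressing via the
        # discrete classes produced by one candidate method, using N-fold CV.
        classes = discretize(y, k)
        # The median of each class serves as its numeric prediction.
        medians = np.array([np.median(y[classes == c]) if np.any(classes == c)
                            else np.median(y) for c in range(k)])
        folds = np.arange(len(y)) % n_folds
        fold_errors = []
        for f in range(n_folds):
            train, test = folds != f, folds == f
            # A real run would train the classification system (e.g. C4.5 or
            # CN2) on (X[train], classes[train]); a 1-nearest-neighbour
            # classifier stands in here to keep the sketch self-contained.
            preds = [classes[train][np.argmin(np.linalg.norm(X[train] - x, axis=1))]
                     for x in X[test]]
            # Mean absolute deviation between predicted class medians and y.
            fold_errors.append(np.mean(np.abs(medians[preds] - y[test])))
        return float(np.mean(fold_errors))

    def select_discretization(X, y, candidates, k=5):
        # Choose the candidate discretization method with the lowest
        # estimated prediction error.
        scores = {name: cv_error(X, y, fn, k) for name, fn in candidates.items()}
        return min(scores, key=scores.get), scores

    # Hypothetical usage on synthetic data:
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = 2 * X[:, 0] + rng.normal(size=200)
    best, scores = select_discretization(
        X, y, {"equal-width": equal_width, "equal-freq": equal_freq})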

We have shown the validity of our search-based approach by means of extensive experiments on four real-world domains. These experiments indicated that a search-based approach is necessary to handle different domain/learning system/error measure scenarios. The results also showed that some of our methods for class formation were among the best in most of the cases.

We have applied our methodology to two classification inductive systems (C4.5 and CN2), and it is easy to use with other learning algorithms. This generality makes our methodology a powerful tool for handling regression with existing ML classification inductive systems.

References

[1]. Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984): Classification and Regression Trees. Wadsworth Int. Group, Belmont, California, USA.

[2]. Clark, P. and Niblett, T. (1988): The CN2 induction algorithm. In Machine Learning, 3, 261-283.

[3]. Dillon, W. and Goldstein, M. (1984): Multivariate Analysis. John Wiley & Sons, Inc.

[4]. Fayyad, U.M. and Irani, K.B. (1993): Multi-interval Discretization of Continuous-valued Attributes for Classification Learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93). Morgan Kaufmann Publishers.

[5]. Friedman, J. (1991): Multivariate Adaptive Regression Splines. In Annals of Statistics, 19:1.

[6]. Ginsberg, M. (1993): Essentials of Artificial Intelligence. Morgan Kaufmann Publishers.

[7]. Holland, J. (1992): Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. MIT Press.

[8]. John, G.H., Kohavi, R. and Pfleger, K. (1994): Irrelevant features and the subset selection problem. In Machine Learning: Proceedings of the 11th International Conference. Morgan Kaufmann.

[9]. Kohavi, R. (1995): Wrappers for performance enhancement and oblivious decision graphs. PhD Thesis.

[10]. Langley, P. and Sage, S. (1994): Induction of selective Bayesian classifiers. In Proceedings of the 10th Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers.

[11]. Lee, C. and Shin, D. (1994): A Context-sensitive Discretization of Numeric Attributes for Classification Learning. In Proceedings of the 11th European Conference on Artificial Intelligence (ECAI-94), Cohn, A.G. (ed.). John Wiley & Sons.

[12]. Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994): Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence.

[13]. Mladenic, D. (1995): Automated model selection. In MLnet Workshop on Knowledge Level Modelling and Machine Learning. Heraklion, Crete, Greece.

[14]. Pazzani, M.J. (1995): Searching for dependencies in Bayesian classifiers. In Proceedings of the 5th International Workshop on Artificial Intelligence and Statistics. Ft. Lauderdale, FL.

[15]. Quinlan, J.R. (1993): C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.

[16]. Quinlan, J.R. (1992): Learning with Continuous Classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence. World Scientific, Singapore.

[17]. Torgo, L. (1995): Data Fitting with Rule-based Regression. In Proceedings of the 2nd International Workshop on Artificial Intelligence Techniques (AIT'95), Zizka, J. and Brazdil, P. (eds.). Brno, Czech Republic.

[18]. van Laarhoven, P. and Aarts, E. (1987): Simulated Annealing: Theory and Applications. Kluwer Academic Publishers.

[19]. Weiss, S. and Indurkhya, N. (1993): Rule-based Regression. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 1072-1078.

[20]. Weiss, S. and Indurkhya, N. (1995): Rule-based Machine Learning Methods for Functional Prediction. In Journal of Artificial Intelligence Research (JAIR), 3, 383-403.
