
4. Conclusions and future work

Although regression is an important problem in data analysis, it has not been dealt with extensively by the ML community. In this paper we presented a system capable of learning regression rules. The system integrates the task of developing a regression model from data with a search for logical conditions that reduce the fitting error of that model. Regression trees follow a similar strategy, but R2 uses a more powerful description language.

Piecewise regression models like the ones built by R2 can achieve better prediction accuracy than single-model approaches. This advantage comes at the cost of more specific models. R2 implements a flexible compromise between model generality and correctness.
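The trade-off between one global model and several more specific ones can be illustrated with a minimal sketch. It is not the R2 algorithm: the data are synthetic, the two rule conditions (`x < 0` and `x >= 0`) are fixed by hand rather than found by search, and the helper names `fit_line`, `mean_abs_error`, and `rule_predict` are hypothetical. It only shows why attaching a separate linear model to each logical condition can lower the fitting error relative to a single model.

```python
# Illustrative sketch only: conditions and data are assumptions, not R2 output.

def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mean_abs_error(predict, points):
    """Mean absolute deviation between predictions and observed values."""
    return sum(abs(predict(x) - y) for x, y in points) / len(points)


# Synthetic data with a different linear trend on each side of x = 0.
data = [(x, -2.0 * x + 1.0 if x < 0 else 3.0 * x + 1.0) for x in range(-5, 6)]

# One global linear model fitted to all cases.
a, b = fit_line(data)
global_error = mean_abs_error(lambda x: a * x + b, data)

# Two regression rules: each pairs a logical condition with a linear model
# fitted only to the cases that the condition covers.
conditions = [lambda x: x < 0, lambda x: x >= 0]
rules = []
for cond in conditions:
    covered = [(x, y) for x, y in data if cond(x)]
    rules.append((cond, fit_line(covered)))


def rule_predict(x):
    """Predict with the model of the first rule whose condition covers x."""
    for cond, (slope, intercept) in rules:
        if cond(x):
            return slope * x + intercept
    raise ValueError("no rule covers x")


rule_error = mean_abs_error(rule_predict, data)

print("single model error:", global_error)
print("rule-based error:  ", rule_error)
```

Each rule here is more specific than the global model, since it is valid only where its condition holds. That is the generality cost described above.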

Our system compares reasonably well with other ML regression algorithms and even outperforms them on some data sets. However, given the size of the data sets, many of the differences are not statistically significant.

In the future we plan to extend our comparisons and to weigh both accuracy and comprehensibility. These comparisons should include systems from non-symbolic fields. We believe that this will show the advantage of R2 over sub-symbolic methods.

We also intend to explore techniques that enable our system to deal with large-scale domains. We think that sampling methods and iterative regression modeling will help the system cope with such domains.

ACKNOWLEDGMENTS

I would like to thank Prof. Ross Quinlan for making the M5 system available. I would also like to thank my supervisor, Prof. Pavel Brazdil, for the interesting discussions we had throughout the development of the R2 system.

