We have presented an empirical study of hybrid regression trees, comparing
several alternative models for regression tree leaves. With respect to existing
work on regression, we have added the possibility of using kernel models in the
leaves. Our experiments on 11 data sets have shown the advantage of these
models over existing approaches.
The experiments revealed that hybrid models can overcome some limitations of
the individual models: they achieve high accuracy across all tested domains
and, moreover, provide a symbolic representation of the unknown regression
function that improves our understanding of the domain.
Local models carry a significant computational cost on large data sets. Hybrid
models alleviate this problem by applying the local models only to subsets of
the input domain, with no significant loss in accuracy and even some gains when
the unknown regression surface has strong discontinuities.
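The idea can be illustrated with a minimal sketch, not the system studied in
this paper: a regression tree with axis-parallel splits chosen by SSE
reduction, where each leaf predicts with a Nadaraya-Watson kernel model over
the training cases that reached it. All names (`HybridTree`, `bandwidth`,
`min_leaf`) are hypothetical illustration choices.

```python
import numpy as np

class HybridTree:
    """Regression tree whose leaves hold Nadaraya-Watson kernel models.

    A minimal sketch: splits minimize total within-node SSE; each leaf
    predicts with a Gaussian kernel over its local training cases.
    """

    def __init__(self, max_depth=2, min_leaf=5, bandwidth=0.5):
        self.max_depth, self.min_leaf, self.h = max_depth, min_leaf, bandwidth

    def fit(self, X, y, depth=0):
        self.X, self.y, self.split = X, y, None
        if depth < self.max_depth and len(y) >= 2 * self.min_leaf:
            best = (np.inf, None, None)
            for j in range(X.shape[1]):
                for t in np.unique(X[:, j])[:-1]:
                    m = X[:, j] <= t
                    if m.sum() < self.min_leaf or (~m).sum() < self.min_leaf:
                        continue
                    # total SSE of the two candidate children
                    sse = y[m].var() * m.sum() + y[~m].var() * (~m).sum()
                    if sse < best[0]:
                        best = (sse, j, t)
            if best[1] is not None:
                _, j, t = best
                self.split = (j, t)
                m = X[:, j] <= t
                self.left = HybridTree(self.max_depth, self.min_leaf, self.h)
                self.right = HybridTree(self.max_depth, self.min_leaf, self.h)
                self.left.fit(X[m], y[m], depth + 1)
                self.right.fit(X[~m], y[~m], depth + 1)
        return self

    def predict_one(self, q):
        if self.split is not None:
            j, t = self.split
            return (self.left if q[j] <= t else self.right).predict_one(q)
        # leaf: Gaussian-kernel weighted average of the local training cases
        w = np.exp(-((self.X - q) ** 2).sum(axis=1) / (2 * self.h ** 2))
        return self.y.mean() if w.sum() == 0 else float(w @ self.y) / w.sum()
```

Because each kernel model sees only the cases in its leaf, prediction cost
scales with leaf size rather than the full training set, and the tree's splits
can absorb sharp discontinuities that a single global kernel model smooths over.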