A Comparative Study of Reliable Error Estimators
for Pruning Regression Trees
Luís Torgo
LIACC/FEP - University of Porto
R. Campo Alegre, 823, 2º - 4150 PORTO - PORTUGAL
Phone : (+351) 2 607 8830 Fax : (+351) 2 600 3654
email : ltorgo@liacc.up.pt WWW : http://www.liacc.up.pt/~ltorgo
Abstract.
This paper presents a comparative study of several methods for estimating the true error of tree-structured regression models. We evaluate these methods in the context of regression tree pruning. Pruning is widely regarded as a key issue for obtaining reliable tree-structured models in real-world scenarios. The central step of a pruning process is obtaining accurate estimates of the error of alternative tree models. We experimentally evaluate four methods for obtaining these estimates in twelve domains. The goal of this evaluation is to characterise how well each method selects the best possible tree among the set of trees considered during pruning. The results of the comparison show that certain estimators lead to poor decisions in some domains. The Cross Validation variant that we propose achieved the best results in the set-ups we considered.
Keywords :
Machine Learning, Regression Trees, Pruning methods.
Contents
1 Introduction
2 Inducing Regression Trees
3 The Estimation Methods
4 The Experiments
5 Conclusions