A Comparative Study of Reliable Error Estimators for Pruning Regression Trees
Luís Torgo
1998
Abstract
This paper presents a comparative study of several methods for
estimating the true error of tree-structured regression models. We
evaluate these methods in the context of regression tree pruning.
Pruning is considered a key issue for obtaining reliable
tree-structured models in a real-world scenario. The major step of a
pruning process consists of obtaining accurate estimates of the error
of alternative tree models. We experimentally evaluate four methods
for obtaining these estimates in twelve domains. The goal of this
evaluation is to characterise the performance of the methods in the
task of selecting the best possible tree among the set of trees
considered during pruning. The results of the comparison show that
certain estimators lead to poor decisions in some domains. Our study
shows that the Cross Validation variant that we propose is the best
choice for the set-ups we have considered.