Partial Linear Trees

Luís Torgo
2000


Abstract

Partial linear regression is a successful method for regression tasks. It combines a linear polynomial with a kernel model so that non-linear problems can be handled. Kernel models are instance-based learners and as such do not produce an intelligible model of the data, whereas linear polynomials do. By integrating the two approaches, partial linear regression loses some of the intelligibility of linear polynomials, particularly in highly non-linear domains. In such cases the importance of the kernel component grows, and the linear polynomial hardly describes the true regression surface approximated by the partial linear model. In this paper we describe partial linear trees, which aim to increase the descriptive accuracy of partial linear models by decreasing the importance of the kernel corrections in non-linear domains. This is accomplished by identifying several regions of low variance, in each of which a different partial linear model is fitted. Within these regions linear polynomials are more accurate, and thus the kernel "adjustments" are smaller. Partial linear trees maintain the high predictive accuracy of partial linear regression while providing a more accurate description of the regression surface and being computationally more efficient.
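
The idea can be illustrated with a small sketch. It is not the algorithm of the paper, and it rests on several assumptions: the partial linear model is taken to be a least-squares linear polynomial plus a Gaussian Nadaraya-Watson smoothing of its training residuals, the bandwidth of 0.3 is arbitrary, and the two regions are split by hand at x = 0 rather than found by a tree-growing procedure. All function names are hypothetical. On the toy target y = |x|, a single global model needs large kernel corrections, while one model per region needs almost none.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear polynomial with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def kernel_correction(X_train, resid, X_query, bandwidth=0.3):
    """Gaussian Nadaraya-Watson smoothing of the training residuals
    (an assumed form of the kernel component)."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-0.5 * d2 / bandwidth**2)
    return (w @ resid) / w.sum(axis=1)

def fit_partial_linear(X, y):
    """Return the linear coefficients and the residuals they leave behind."""
    coef = fit_linear(X, y)
    resid = y - predict_linear(coef, X)
    return coef, resid

def predict_partial_linear(model, X_train, X_query, bandwidth=0.3):
    """Linear prediction plus the kernel 'adjustment'."""
    coef, resid = model
    return predict_linear(coef, X_query) + kernel_correction(
        X_train, resid, X_query, bandwidth
    )

# Toy data: non-linear globally, exactly linear within each half.
X = np.linspace(-2, 2, 81).reshape(-1, 1)
y = np.abs(X[:, 0])

# One global partial linear model.
global_model = fit_partial_linear(X, y)
global_corr = float(np.abs(kernel_correction(X, global_model[1], X)).mean())
linear_err = float(np.abs(y - predict_linear(global_model[0], X)).mean())
partial_err = float(np.abs(y - predict_partial_linear(global_model, X, X)).mean())

# One partial linear model per hand-picked region (stand-in for tree leaves).
left = X[:, 0] < 0
region_corr = []
for mask in (left, ~left):
    model = fit_partial_linear(X[mask], y[mask])
    region_corr.append(np.abs(kernel_correction(X[mask], model[1], X[mask])).mean())
local_corr = float(np.mean(region_corr))

print("mean |kernel correction|, global model:", global_corr)
print("mean |kernel correction|, per-region models:", local_corr)
```

In this sketch the global linear polynomial is nearly flat and says little about the surface, so the kernel term does most of the work. Within each region the linear polynomial fits the data exactly and the correction is close to zero, which is the effect the abstract attributes to fitting separate models in low-variance regions.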