Contents

ACKNOWLEDGEMENTS						        5
ABSTRACT								7
CONTENTS								9
LIST OF TABLES							        13
LIST OF FIGURES								15
LIST OF SYMBOLS								19
CHAPTER 1 - INTRODUCTION						21
1.1	OBJECTIVES							24
1.2	MAIN CONTRIBUTIONS						24
1.3	ORGANISATION OF THE THESIS					26
CHAPTER 2 - INDUCTIVE LEARNING						29
2.1	INTRODUCTION							29
2.2	SUPERVISED LEARNING						32
2.3	REGRESSION PROBLEMS						38
2.3.1	A Formalisation of Regression Problems				40
2.3.2	Measuring the Accuracy of Regression Models			42
2.4	EXISTING REGRESSION METHODS					45
2.4.1	Statistical Methods						45
2.4.2	Artificial Neural Networks					50
2.4.3	Machine Learning Methods					52
CHAPTER 3 - TREE-BASED REGRESSION					57
3.1	TREE-BASED MODELS					        58
3.2	LEAST SQUARES REGRESSION TREES				        62
3.2.1	Efficient Growth of LS Regression Trees				66
3.2.2	Splits on Continuous Variables					69
3.2.3	Splits on Discrete Variables					70
3.2.4	Some Practical Considerations					73
3.3	LEAST ABSOLUTE DEVIATION REGRESSION TREES			75
3.3.1	Splits on Continuous Variables					77
3.3.2	Splits on Discrete Variables					86
3.4	LAD VS. LS REGRESSION TREES					90
3.5	CONCLUSIONS							97
3.5.1	Open Research Issues						98
CHAPTER 4 - OVERFITTING AVOIDANCE IN REGRESSION TREES			105
4.1	INTRODUCTION							106
4.2	AN OVERVIEW OF EXISTING APPROACHES				108
4.2.1	Error-Complexity Pruning in CART				108
4.2.2	Pruning Based on m-Estimates in RETIS				110
4.2.3	MDL-Based Pruning in CORE					113
4.2.4	Pruning in M5							114
4.3	PRUNING BY TREE SELECTION					115
4.3.1	Generating Alternative Pruned Trees				117
4.3.2	Methods for Comparing Alternative Pruned Trees			122
4.3.3	Choosing the Final Tree					        140
4.3.4	An Experimental Comparison of Pruning by Tree Selection Methods	141
4.3.5	Summary							        154
4.4	COMPARISONS WITH OTHER PRUNING METHODS				155
4.4.1	A Few Remarks Regarding Tree Size				159
4.4.2	Comments Regarding the Significance of the Experimental Results	161
4.5	CONCLUSIONS						        162
4.5.1	Open Research Issues						163
CHAPTER 5 - LOCAL REGRESSION TREES					167
5.1	INTRODUCTION							168
5.2	LOCAL MODELLING							171
5.2.1	Kernel Models							172
5.2.2	Local Polynomial Regression					175
5.2.3	Semi-parametric Models						177
5.3	INTEGRATING LOCAL MODELLING WITH REGRESSION TREES		178
5.3.1	Method of Integration						181
5.3.2	An Illustrative Example						183
5.3.3	Relations to Other Work						185
5.4	AN EXPERIMENTAL EVALUATION OF LOCAL REGRESSION TREES		186
5.4.1	Local Regression Trees vs. Standard Regression Trees		187
5.4.2	Local Regression Trees vs. Local Models				189
5.4.3	Local Regression Trees vs. Linear Regression Trees		193
5.4.4	Local Regression Trees vs. Existing Regression Methods		194
5.5	CONCLUSIONS						        196
5.5.1	Open Research Issues						197
CHAPTER 6 - CONCLUSIONS							199
6.1	SUMMARY								199
6.1.1	Growing Regression Trees					200
6.1.2	Pruning Regression Trees					201
6.1.3	Local Regression Trees						203
6.2	FUTURE RESEARCH DIRECTIONS					204
ANNEX A - MATERIAL AND METHODS						207
A.1.	THE EXPERIMENTAL METHODOLOGY					207
A.2.	THE BENCHMARK DATA SETS USED					212
A.3.	THE LEARNING SYSTEMS USED IN THE COMPARISONS			218
ANNEX B - EXPERIMENTAL RESULTS						221
B.4.	EXPERIMENTS WITH TREE GENERATION METHODS			221
B.5.	CART TREE-MATCHING VS. OUR PROPOSAL				224
B.6.	COMPARISON OF METHODS FOR GENERATING PRUNED TREES		224
B.7.	TUNING OF SELECTION METHODS					227
B.8.	COMPARISON OF METHODS OF EVALUATING A TREE			229
B.9.	COMPARISONS WITH OTHER PRUNING ALGORITHMS			230
REFERENCES								235