Contents
ACKNOWLEDGEMENTS 5
ABSTRACT 7
CONTENTS 9
LIST OF TABLES 13
LIST OF FIGURES 15
LIST OF SYMBOLS 19
CHAPTER 1 - INTRODUCTION 21
1.1 OBJECTIVES 24
1.2 MAIN CONTRIBUTIONS 24
1.3 ORGANISATION OF THE THESIS 26
CHAPTER 2 - INDUCTIVE LEARNING 29
2.1 INTRODUCTION 29
2.2 SUPERVISED LEARNING 32
2.3 REGRESSION PROBLEMS 38
2.3.1 A Formalisation of Regression Problems 40
2.3.2 Measuring the Accuracy of Regression Models 42
2.4 EXISTING REGRESSION METHODS 45
2.4.1 Statistical Methods 45
2.4.2 Artificial Neural Networks 50
2.4.3 Machine Learning Methods 52
CHAPTER 3 - TREE-BASED REGRESSION 57
3.1 TREE-BASED MODELS 58
3.2 LEAST SQUARES REGRESSION TREES 62
3.2.1 Efficient Growth of LS Regression Trees 66
3.2.2 Splits on Continuous Variables 69
3.2.3 Splits on Discrete Variables 70
3.2.4 Some Practical Considerations 73
3.3 LEAST ABSOLUTE DEVIATION REGRESSION TREES 75
3.3.1 Splits on Continuous Variables 77
3.3.2 Splits on Discrete Variables 86
3.4 LAD VS. LS REGRESSION TREES 90
3.5 CONCLUSIONS 97
3.5.1 Open Research Issues 98
CHAPTER 4 - OVERFITTING AVOIDANCE IN REGRESSION TREES 105
4.1 INTRODUCTION 106
4.2 AN OVERVIEW OF EXISTING APPROACHES 108
4.2.1 Error-Complexity Pruning in CART 108
4.2.2 Pruning based on m estimates in RETIS 110
4.2.3 MDL-based pruning in CORE 113
4.2.4 Pruning in M5 114
4.3 PRUNING BY TREE SELECTION 115
4.3.1 Generating Alternative Pruned Trees 117
4.3.2 Methods for Comparing Alternative Pruned Trees 122
4.3.3 Choosing the Final Tree 140
4.3.4 An Experimental Comparison of Pruning by Tree Selection Methods 141
4.3.5 Summary 154
4.4 COMPARISONS WITH OTHER PRUNING METHODS 155
4.4.1 A Few Remarks Regarding Tree Size 159
4.4.2 Comments Regarding the Significance of the Experimental Results 161
4.5 CONCLUSIONS 162
4.5.1 Open Research Issues 163
CHAPTER 5 - LOCAL REGRESSION TREES 167
5.1 INTRODUCTION 168
5.2 LOCAL MODELLING 171
5.2.1 Kernel Models 172
5.2.2 Local Polynomial Regression 175
5.2.3 Semi-parametric Models 177
5.3 INTEGRATING LOCAL MODELLING WITH REGRESSION TREES 178
5.3.1 Method of Integration 181
5.3.2 An Illustrative Example 183
5.3.3 Relations to Other Work 185
5.4 AN EXPERIMENTAL EVALUATION OF LOCAL REGRESSION TREES 186
5.4.1 Local Regression Trees vs. Standard Regression Trees 187
5.4.2 Local Regression Trees vs. Local Models 189
5.4.3 Local Regression Trees vs. Linear Regression Trees 193
5.4.4 Local Regression Trees vs. Existing Regression Methods 194
5.5 CONCLUSIONS 196
5.5.1 Open Research Issues 197
CHAPTER 6 - CONCLUSIONS 199
6.1 SUMMARY 199
6.1.1 Growing Regression Trees 200
6.1.2 Pruning Regression Trees 201
6.1.3 Local Regression Trees 203
6.2 FUTURE RESEARCH DIRECTIONS 204
ANNEX A - MATERIAL AND METHODS 207
A.1. THE EXPERIMENTAL METHODOLOGY 207
A.2. THE USED BENCHMARK DATA SETS 212
A.3. THE LEARNING SYSTEMS USED IN THE COMPARISONS 218
ANNEX B - EXPERIMENTAL RESULTS 221
B.4. EXPERIMENTS WITH TREE GENERATION METHODS 221
B.5. CART TREE-MATCHING VS. OUR PROPOSAL 224
B.6. COMPARISON OF METHODS FOR GENERATING PRUNED TREES 224
B.7. TUNING OF SELECTION METHODS 227
B.8. COMPARISON OF METHODS OF EVALUATING A TREE 229
B.9. COMPARISONS WITH OTHER PRUNING ALGORITHMS 230
REFERENCES 235