3.3 The experiments

The goal of these experiments was to compare the results of the models learned by ML systems with those of other types of systems that do not belong to the symbolic learning field.

We compare our models to the 12 methods used in the DGOR competition. The comparison uses three error statistics: the mean absolute error (MAE), the root mean squared error (RMSE) and Theil's coefficient (TU). They are calculated using the following formulas:
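
As a rough sketch, the three statistics can be computed as shown here. The MAE and RMSE follow their standard definitions. For Theil's coefficient we assume the variant that compares the model's errors with those of a naive "no change" forecast, since the text does not say which variant the competition used. The function names and the `last_observed` parameter are ours, chosen for illustration only.

```python
import math


def mae(actual, predicted):
    """Mean absolute error over paired actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error over paired actual and predicted values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def theil_u(actual, predicted, last_observed):
    """Theil's coefficient, ASSUMED here to be the ratio between the model's
    error and the error of a naive forecast that repeats the previous
    actual value. last_observed is the final value of the learning period."""
    previous = [last_observed] + list(actual[:-1])
    model_sq = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    naive_sq = sum((a - q) ** 2 for a, q in zip(actual, previous))
    # naive_sq is zero only when the series never changes; not handled here.
    return math.sqrt(model_sq / naive_sq)
```

Under this assumed definition, a value below 1 means the model beats the naive forecast and a value above 1 means it does worse.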

The goal of the experiments was to predict the value of the goal variable for the next 12 time periods. Errors were calculated on the basis of those 12 predictions. In summary, for each of our 5 data sets we constructed one set of examples for each candidate attribute introduction strategy and, after learning, used the resulting model to predict those 12 future values.
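
The set-up can be illustrated with a minimal sketch. It assumes that the last 12 values of a series are held out for evaluation and that one simple attribute introduction strategy uses the previous `n_lags` values as attributes. Both function names are hypothetical, and the strategies actually used in the experiments may differ.

```python
def split_holdout(series, horizon=12):
    """Separate the learning period from the final `horizon` values,
    which are kept aside to measure the prediction errors."""
    return series[:-horizon], series[-horizon:]


def make_lagged_examples(series, n_lags):
    """Build (attributes, target) examples in which the attributes are the
    n_lags values that precede the target. This is an ASSUMED strategy,
    used only to illustrate how a set of examples could be constructed."""
    return [
        (series[i - n_lags:i], series[i])
        for i in range(n_lags, len(series))
    ]
```

For a series of 20 values and `n_lags=2`, the learning period has 8 values and yields 6 examples; the remaining 12 values are the ones to be predicted.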

The 12 methods used in the DGOR competition included smoothing average variants, Kalman filters, Box-Jenkins [1] methods, etc. In the table of results we refer to them as m1 to m12.

We summarize the results of the experiments in Table 1, which shows the results on the MAE statistic. Within each data set, the models are ordered by increasing MAE, with the best model at the top.

ZR03                  ZR04                  ZR06                  ZR11                  ZR15
Model         MAE     Model         MAE     Model         MAE     Model         MAE     Model         MAE
m5            9.9     ZR04_t2       0.97    m10           3.28    m8            208     m11           26.8
m12           12.2    ZR04_t3       0.98    m11           3.38    m1            217     m4            28.8
m3            14.8    ZR04_t4       0.98    m7            3.73    m7            222     m8            30.9
m8            16.9    ZR04_t5       0.98    m5            3.97    m11           243     m5            36.5
m11           17.9    ZR04_t2d1     0.98    m12           3.98    m6            251     m6            37.3
m7            19.7    ZR04_t3d1     0.98    m3            4.1     m3            255     m7            40
m9            22.3    ZR04_t4d1     0.98    m8            4.15    m2            257     m1            40.4
m4            25.1    ZR04_t5d1     0.98    m1            4.26    m9            292     ZR15_t1       42.26
m10           25.9    ZR04_t3d2     0.98    m2            4.87    m12           292     m10           44.2
ZR03_t1       26.48   ZR04_t4d2     0.98    m9            5.33    m5            295     m12           45.8
m2            26.5    ZR04_t5d2     0.98    m4            5.65    m10           308     m3            46.8
m1            29.4    ZR04_sm3      0.98    ZR06_t3d1     6.64    m4            317     ZR15_t2d1     48.2
ZR03_t3d1     30.69   ZR04_t4sm4w   0.98    ZR06_t2d1     6.71    ZR11_t4d2     422.27  ZR15_t5       48.44
ZR03_t3d2     30.69   ZR04_t4sm4wv  0.98    ZR06_sm3      6.71    ZR11_t4sm4w   432.85  m2            48.6
ZR03_t2       31.63   m4            1       m6            6.79    ZR11_t4sm4wv  440.84  ZR15_t4       48.74
ZR03_t2d1     32.18   ZR04_t4sm4    1       ZR06_t5       6.83    ZR11_t4sm4    466.3   ZR15_t3d1     48.74
ZR03_smt5d2   32.67   m2            1.08    ZR06_t3d2     6.87    ZR11_t1       471.88  ZR15_t4d1     48.74
ZR03_t4d1     34      m10           1.08    ZR06_t4d1     6.88    ZR11_t2       477.8   ZR15_t5d1     48.74
ZR03_t5d1     34      m11           1.09    ZR06_t5d1     6.98    ZR11_t2d1     477.8   ZR15_t4d2     48.74
ZR03_t3       34.53   ZR04_smt5d2   1.16    ZR06_t5d2     6.98    ZR11_sm3      478.37  ZR15_t5d2     48.74
ZR03_t5d2     34.55   ZR04_t1       1.25    ZR06_t4       7.29    ZR11_t3d1     518.93  ZR15_sm3      48.74
ZR03_t5       35.09   m6            1.29    ZR06_t4sm4wv  7.29    ZR11_t4d1     526.02  ZR15_smt5d2   48.74
ZR03_t4d2     35.27   m5            1.38    ZR06_t3       7.53    ZR11_t5d2     546.57  ZR15_t4sm4    48.74
ZR03_t4       35.45   m12           1.44    ZR06_t4d2     7.58    ZR11_t5d1     579.36  ZR15_t4sm4w   48.74
ZR03_sm3      35.47   m8            1.45    ZR06_t4sm4    7.63    ZR11_t3d2     585.25  ZR15_t4sm4wv  48.74
ZR03_t4sm4wv  35.55   m3            1.54    ZR06_smt5d2   8.17    ZR11_t4       591.29  ZR15_t2       49.84
ZR03_t4sm4w   35.56   m7            1.57    ZR06_t4sm4w   8.19    ZR11_smt5d2   769.3   ZR15_t3       54.01
ZR03_t4sm4    36.81   m9            1.89    ZR06_t2       8.5     ZR11_t3       827.32  m9            58.6
m6            37.9    m1            1.9     ZR06_t1       8.96    ZR11_t5       841.71  ZR15_t3d2     64.82

Table 1. Summary of comparative results with other fields' methods.

The ordering for the other statistics is similar, so we omit the corresponding tables for space reasons.

As we can see from Table 1, the results of our methods vary a lot from domain to domain. They range from the surprisingly good results of M5 on ZR04 to the very bad results on ZR11. This variation also occurs with the other methods, as there seems to be no clear winner across all problems. However, our models are generally not ranked among the best positions. This indicates that further research is needed before these ML-based models are ready for this kind of competition. We expect that tuning some of the learning parameters of M5 can improve these results. Automatic feature selection is another technique that may improve our rankings. We should also try other learning algorithms, such as RETIS.

We have tried to understand the reason for the poor outcome on the ZR11 data. We observed that M5 was using a branch of its learned model tree whose leaf was labelled not with a regression formula but with an average value (which can happen in M5). As a result, the system made almost the same prediction for all 12 values of this data set. Since this prediction was an average of values seen during learning, the system could not predict anything outside the range of values in the learning set. This clearly indicates that further investigation is needed, either on improving the behaviour of M5 through parameter tuning or on developing other systems that do not have this behaviour.
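
The effect can be reproduced with a toy sketch. It is not M5 itself: `constant_leaf_predict` is a hypothetical stand-in for a leaf that stores only the average of the goal values it covers, and the numbers are invented for illustration.

```python
def constant_leaf_predict(training_targets, horizon=12):
    """Mimic a leaf labelled with an average value: every future period
    receives the same prediction, the mean of the learning targets."""
    mean = sum(training_targets) / len(training_targets)
    return [mean] * horizon


# Invented upward-trending series: learning values, then 12 future values.
train = [100.0, 120.0, 140.0, 160.0]
future = [180.0 + 20.0 * i for i in range(12)]

predictions = constant_leaf_predict(train)
# Every prediction equals the learning mean, so it stays inside the range
# of the learning values even though the real series keeps growing.
errors = [abs(f - p) for f, p in zip(future, predictions)]
```

Because a mean always lies between the smallest and largest learning value, the error of such a leaf grows steadily whenever the series keeps moving away from that range.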
