Data Mining With R - learning with case studies

Errors/typos found in the book

page 45: this is not actually an error but a warning. Instead of qq.plot(algae$mxPH,main='Normal QQ plot of maximum pH') you should use qqPlot(algae$mxPH,main='Normal QQ plot of maximum pH'), because the function qq.plot() of package car was deprecated in recent versions of the package and replaced by the function qqPlot(). In this particular case the result is exactly the same.
page 45: in the same code snippet, to obtain exactly the result shown in Figure 2.2 (page 46) you should add the parameter ylab='' to the calls to both hist() and qqPlot() (thanks Ehud Ben-Reuven for detecting this).
page 103: there is a missing end quote (') after src='oanda (thanks Ehud Ben-Reuven for detecting this).
page 109: there is an unfortunate error in the code of the T.ind() function (thanks Andy Couldrake for spotting this). Where you have
```
for(x in 1:n.days) r[,x] <- Next(Delt(v,k=x),x)
```
there should be
```
for(x in 1:n.days) r[,x] <- Next(Delt(Cl(quotes),v,k=x),x)
```
instead. Unfortunately, this error means that the data used in the subsequent modeling attempts in the chapter is wrong, which in turn means that the results will most probably be different if you try the code with the corrected target variable values.
page 110: the first sentence says that the function Next() can be used to move the values both forward and backward. This is wrong: it can only move them in one direction. To move them in the other direction you should use the function Lag() (thanks Hongcheng Li for detecting this error).
page 120: where I wrote where 0 <= Beta <= 1, ... it should read where Beta is a non-negative real value, .... I can also add that values between 0 and 1 give more importance to precision, while values above 1 give more importance to recall (with Beta=1 being the case of equal importance) (thanks Ehud Ben-Reuven for detecting this).
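To make the role of Beta concrete, here is a minimal sketch in R. It assumes the standard F-measure formula, (1 + Beta^2) * precision * recall / (Beta^2 * precision + recall), which may differ in notation from the book. The function name f_beta and the precision/recall values are purely illustrative and do not come from the book.

```r
# Sketch of the F-measure for a non-negative beta (standard formula, assumed here).
# f_beta is a hypothetical helper name, not a function from the book or its package.
f_beta <- function(precision, recall, beta = 1) {
  stopifnot(beta >= 0)
  denom <- beta^2 * precision + recall
  if (denom == 0) return(0)   # avoid 0/0 when both inputs are zero
  (1 + beta^2) * precision * recall / denom
}

# Illustrative values only: a model with high precision and low recall
p <- 0.9
r <- 0.5

f_half <- f_beta(p, r, beta = 0.5)  # beta < 1: precision weighs more
f_one  <- f_beta(p, r, beta = 1)    # beta = 1: equal importance
f_two  <- f_beta(p, r, beta = 2)    # beta > 1: recall weighs more

print(c(f_half, f_one, f_two))
```

Because precision is higher than recall in this example, the score drops as Beta grows; with Beta=0 the measure reduces to precision alone.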
page 126: a typo in the beginning of section 3.4.2.2, where you should read (SVMs) and not (SMVs) (thanks Ehud Ben-Reuven for detecting this).
pages 144, 146: the definition of, and references to, the function single() should use another name, because of conflicts with a function of the same name in the base package. The function single() is defined on page 144; change its name to singleModel, for instance. The code that uses this function on page 146 must then be changed accordingly: within the call to experimentalComparison(), the first do.call that you see has one line that reads c(list('single',... You should change 'single' into 'singleModel'. If in doubt, use the code given on this web page in the section "R Code", where the code is already corrected.

page 147: at the top of the page, in the code where you have
```
varsRootName=paste('single',td,sep='.')))
```
you should change to
```
varsRootName=paste('grow',td,sep='.')))
```

page 239: where it says ...use it do download... it should read ...use it to download... (thanks Ehud Ben-Reuven for detecting this).
page 240: the given definition of the shorth function is wrong. It should be: "It is calculated as the mean of the values in the shortest interval containing 50% of the observations". (thanks Hongcheng Li for detecting this error)
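The corrected definition can be sketched in base R as follows. This is a simplified illustration, not the implementation used in the book: the name shorth_sketch is hypothetical, the window is taken to hold ceiling(n/2) sorted observations, and ties between equally short windows are resolved by taking the first one.

```r
# Simplified sketch of the shorth: the mean of the values in the shortest
# interval containing (at least) 50% of the observations.
# shorth_sketch is a hypothetical name; window size and tie handling are assumptions.
shorth_sketch <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  stopifnot(n >= 1)
  h <- ceiling(n / 2)                       # observations per candidate window
  starts <- seq_len(n - h + 1)              # every possible window start
  widths <- x[starts + h - 1] - x[starts]   # width of each window of h sorted values
  i <- which.min(widths)                    # first shortest window
  mean(x[i:(i + h - 1)])
}

# Illustrative data with a tight cluster and a few large outliers
x <- c(1, 2, 2.1, 2.2, 2.3, 10, 50, 100)
res <- shorth_sketch(x)
print(res)
```

In this example the shortest window holding half of the points is 2 to 2.3, so the result stays close to the cluster and is unaffected by the outliers.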
page 259: in the code presented at the end of the page there is a mistake in the call to the function knn(), namely in the columns of the train and test data frames given as the first two arguments of that function. These first two arguments should be:
```
train[,varsSets[[v]]]
```
and
```
test[,varsSets[[v]]]
```
respectively.
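The following sketch shows the corrected argument pattern in a self-contained setting. It assumes that knn() is the function from package class, and it uses the built-in iris data in place of the book's data set. The contents of varsSets, the value of v, the split and k = 3 are all hypothetical choices made only for illustration.

```r
library(class)  # assumed source of knn(); a recommended package in most R installations

set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

# Hypothetical stand-in for the book's varsSets: a named list of predictor sets
varsSets <- list(sepal = c("Sepal.Length", "Sepal.Width"),
                 petal = c("Petal.Length", "Petal.Width"))
v <- "petal"

# Train and test are restricted to the same columns, as in the correction
preds <- knn(train[, varsSets[[v]]],
             test[, varsSets[[v]]],
             cl = train$Species,
             k = 3)

print(table(preds, test$Species))
```

The point of the correction is that both data frames are subset with the same variable set, so the distances are computed only on the selected predictors.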