The file with the training cases (*.data) contains one case per line. Each line is a list of attribute values separated by comma, space or tab characters. The last value in this list is the value of the goal variable (i.e. the variable to be predicted by the learned model).
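As an illustration of the case format described above, here is a minimal Python sketch that splits one line of a data file into its attribute values and its goal value. The function name parse_case is hypothetical and not part of the tool, and treating a run of several separators as a single one is an assumption, since the text does not say how repeated separators are handled.

```python
import re

def parse_case(line):
    """Split one line of a .data file into (attribute_values, goal_value)."""
    # Values may be separated by commas, spaces or tabs. Assumption: any run
    # of these characters counts as a single separator.
    values = [v for v in re.split(r"[,\s]+", line.strip()) if v]
    if not values:
        raise ValueError("empty case line")
    # The last value on the line is the goal variable.
    return values[:-1], values[-1]

if __name__ == "__main__":
    attributes, goal = parse_case("5.1, 3.5\tred 0.2")
    print(attributes, goal)
```

All values are returned as strings; converting them to numbers or symbols would depend on the variable types declared in the domain file.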
The file with the information on the variables used (*.domain) contains one line per variable, and each line describes one variable. The order of the lines is exactly the order in which the variable values appear in each case of the data file. This means that the first line describes the variable whose value appears in the first position of each case, and so on. Thus the last line describes the goal variable.
The description of each variable has the following format:
variable_name : variable_type.
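The line format above could be read with a short Python sketch like the following. The names parse_domain_line and parse_domain are hypothetical, and the type name some_type is only a placeholder, not one of the tool's actual types. The sketch also assumes that the final period is a line terminator rather than part of the type.

```python
def parse_domain_line(line):
    """Parse 'variable_name : variable_type.' into (name, type)."""
    text = line.strip()
    # Assumption: a trailing period ends the description and is not
    # part of the type itself.
    if text.endswith("."):
        text = text[:-1]
    name, sep, var_type = text.partition(":")
    if not sep:
        raise ValueError("missing ':' in domain line: %r" % line)
    return name.strip(), var_type.strip()

def parse_domain(lines):
    """Parse all non-empty lines; the last entry describes the goal variable."""
    return [parse_domain_line(l) for l in lines if l.strip()]

if __name__ == "__main__":
    # 'some_type' is a placeholder type name used only for illustration.
    domain = parse_domain(["x1 : some_type.", "y : some_type."])
    print("goal variable:", domain[-1][0])
```

Because the lines follow the order of the values in each case, the position of an entry in the returned list is also the position of that variable's value in a data line.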
The types of the variables can be:
An example:
smail : LIACC-CIUP, R. Campo Alegre, 823, 4150 PORTO, PORTUGAL
UP LIACC MLgroup, Luís Torgo
email : ltorgo@liacc.up.pt
WWW : http://www.liacc.up.pt/~ltorgo
phone : (+351) 2 6078830
fax : (+351) 2 6003654