ILP: Compute Once, Reuse Often
Nuno A. Fonseca, Ricardo Rocha, Rui Camacho and VĂtor Santos Costa
September 2007
Abstract
Inductive Logic Programming (ILP) is a powerful and well-developed
abstraction for multi-relational data mining techniques. However, ILP
systems are not particularly fast, most of their execution time is
spent evaluating the hypotheses they construct. The evaluation time
needed to assess the quality of each hypothesis depends mainly on the
number of examples and the theorem proving effort required to
determine if an example is entailed by the hypothesis. We propose a
technique that reduces the theorem proving effort to a bare minimum
and stores valuable information to compute the number of examples
entailed by each hypothesis (using a tree data structure). The
information is computed only once (pre-compiled) per
example. Evaluation of hypotheses requires only basic and efficient
operations on
trees. This proposal avoids re-computation of
hypothesis' value in theory-level search and cross-validation
algorithms, whenever the same data set is used with different
parameters. In an empirical evaluation the technique yielded
considerable speedups.
Bibtex
@InProceedings{fonseca-mrdm07,
author = {N. A. Fonseca and R. Rocha and R. Camacho and V. Santos Costa},
title = {{ILP: Compute Once, Reuse Often}},
booktitle = {Proceedings of the 6th Workshop on Multi-Relational Data Mining (MRDM 2007)},
pages = {34--45},
editor = {D. Malerba and A. Appice and M. Ceci},
month = {September},
year = {2007},
address = {Warsaw, Poland},
}
Download Paper
PDF file