ILP: Compute Once, Reuse Often

Nuno A. Fonseca, Ricardo Rocha, Rui Camacho and VĂ­tor Santos Costa

September 2007


Abstract

Inductive Logic Programming (ILP) is a powerful and well-developed abstraction for multi-relational data mining techniques. However, ILP systems are not particularly fast, most of their execution time is spent evaluating the hypotheses they construct. The evaluation time needed to assess the quality of each hypothesis depends mainly on the number of examples and the theorem proving effort required to determine if an example is entailed by the hypothesis. We propose a technique that reduces the theorem proving effort to a bare minimum and stores valuable information to compute the number of examples entailed by each hypothesis (using a tree data structure). The information is computed only once (pre-compiled) per example. Evaluation of hypotheses requires only basic and efficient operations on trees. This proposal avoids re-computation of hypothesis' value in theory-level search and cross-validation algorithms, whenever the same data set is used with different parameters. In an empirical evaluation the technique yielded considerable speedups.

Bibtex

@InProceedings{fonseca-mrdm07,
  author =    {N. A. Fonseca and R. Rocha and R. Camacho and V. Santos Costa},
  title =     {{ILP: Compute Once, Reuse Often}},
  booktitle = {Proceedings of the 6th Workshop on Multi-Relational Data Mining (MRDM 2007)},
  pages =     {34--45},
  editor =    {D. Malerba and A. Appice and M. Ceci},
  month =     {September},
  year =      {2007},
  address =   {Warsaw, Poland},
}

Download Paper

PDF file