Compile the hypothesis space: do it once, use it often
Nuno A. Fonseca, Rui Camacho, Ricardo Rocha and VĂtor Santos Costa
2008
Abstract
Inductive Logic Programming (ILP) is a powerful and well-developed
abstraction for multi-relational data mining techniques. Despite the
considerable success of ILP, deployed ILP systems still have
efficiency problems when applied to complex problems. In this paper we
propose a novel technique that avoids the procedure of deducing each
example to evaluate each constructed clause. The technique is based on
the Mode Directed Inverse Entailment approach to ILP, where a bottom
clause is generated for each example and the generated clauses are
subsets of the literals of such bottom clause. We propose to store in
a prefix-tree all clauses that can be generated from all bottom
clauses together with some extra information. We show that this
information is sufficient to estimate the number of examples that can
be deduced from a clause and present an ILP algorithm that exploits
this representation. We also present an extension of the algorithm
where each prefix-tree is computed only once (compiled) per
example. The evaluation of hypotheses requires only basic and
efficient operations on trees. This proposal avoids re-computation of
hypothesis' value in theory-level search, in cross-validation
evaluation procedures and in parameter tuning. Both proposals are
empirically evaluated on real applications and considerable speedups
were observed.
Bibtex
@Article{fonseca-fi08,
author = {N. A. Fonseca and R. Camacho and R. Rocha and V. Santos Costa},
title = {{Compile the hypothesis space: do it once, use it often}},
journal = {Fundamenta Informaticae},
pages = {45--67},
volume = {89},
number = {1},
year = {2008},
}
Download Paper
PDF file
IOS Press