Technical Report: DCC-2003-03

On the Implementation of an ILP System with Prolog

Nuno Fonseca(1), Vitor Santos Costa(2), Fernando Silva(1), Rui Camacho(3)

(1) DCC-FC & LIACC, Universidade do Porto
R. do Campo Alegre 823, 4150-180 Porto, Portugal

(2) COPPE/Sistemas, Universidade Federal do Rio de Janeiro
Centro de Tecnologia, Bloco H-319, Cx. Postal 68511 Rio de Janeiro, Brasil

(3) Faculdade de Engenharia & LIACC, Universidade do Porto
Rua Dr. Roberto Frias, s/n 4200-465 Porto, Portugal

October 2003


Inductive Logic Programming (ILP) is a set of Machine Learning techniques that have been quite successful in knowledge discovery in relational domains. ILP systems implemented in Prolog are among the most successful. They challenge the limits of Prolog systems through heavy resource usage, such as database accesses and memory consumption, and very long execution times. In this work we discuss the fundamental performance issues found in an ILP engine, the April system. In particular, we evaluate the impact of a fundamental technique, called coverage caching, that stores previous results in order to avoid recomputation. To understand the results obtained, we profiled April's execution and present initial results. We argue that the indexing mechanisms used in the YAP Prolog database are inefficient and that improving them may lead to significant performance gains in Prolog-based ILP systems.