A MapReduce Construct for Yap Prolog

Joana Côrte-Real

July 2013


Abstract

This work’s aim was to design and implement a high-level Prolog primitive, based on the MapReduce programming paradigm. MapReduce is a programming model made popular by Google in 2008, even though its origins are more remote. It is composed by two simple operations, map and reduce, which can easily be applied to numerous algorithms. On the other hand, Prolog is a first- order logic predicate language with significant declarative power. This allowing the programmer to focus on the resolution strategies for a problem in preference to the execution technicalities. Prolog is also especially suited for data storage and processing; in fact, ILP deals with making inferences from that data. A MapReduce construct applied in these circumstances would be able to efficiently scale that process and thus significantly reduce execution times.
Including a MapReduce programming primitive in Prolog has three benefits: (i) to make available a high-level abstract construct in the MapReduce functional model which maintains the declarative nature of the programs; (ii) to give access to a previously nonexistent Prolog construct which is relevant to applications in numerous fields of knowledge; (iii) to allow for parallelism, thus speeding-up the execution of programs using this construct. The latter is particularly relevant now that multicore processors have become the favourite choice to assemble machines, even those for personal use. This, along with the fact that there are increasingly larger data processing requirements in everyday life, renders a framework using multicore architectures for efficient data processing highly relevant.
MapReduce for Prolog’s focus are multicore architectures, but our primitive supports hybrid environments (shared and distributed memory), implicitly and transparently. MapReduce for Prolog was implemented in the Yap system and it follows a master-slave paradigm, in which the master is responsible for dividing and assigning the work and the slaves for processing the chunks dispatched to them. This construct’s interface has various customisation levels, and our aim is that it will come to integrate the Yap Prolog system as built-in construct. Our system was successfully tested using four distinct applications common in the literature: two of these were numeric, and the other two were composed of Prolog terms. The test were made using two implementations for the same programming interface, one for a cluster of machines and another for a multicore architecture. It was determined that our construct scaled almost ideally for these datasets, both in shared and distributed memory. Four scheduling methods were also developed and assessed, and the two more efficient ones will be made available in the final version of the library. An evaluation of the effect of the chunk size variation for different datasets and scheduling methods was performed as well, in order to define standard parameters for MapReduce for Prolog.

Bibtex

@MastersThesis{cortereal-msc,
  author =  {J. Côrte-Real},
  title =   {{A MapReduce Construct for Yap Prolog}},
  school =  {University of Porto},
  address = {Portugal},
  month =   {July},
  year =    {2013},
  type =    {{MSc Thesis}},
}

Download Thesis

PDF file