A MapReduce Construct for Yap Prolog
Joana Côrte-Real
July 2013
Abstract
This work’s aim was to design and implement a high-level Prolog
primitive, based on the MapReduce programming paradigm. MapReduce is a
programming model made popular by Google in 2008, even though its
origins are more remote. It is composed by two simple operations, map
and reduce, which can easily be applied to numerous algorithms. On the
other hand, Prolog is a first- order logic predicate language with
significant declarative power. This allowing the programmer to focus
on the resolution strategies for a problem in preference to the
execution technicalities. Prolog is also especially suited for data
storage and processing; in fact, ILP deals with making inferences from
that data. A MapReduce construct applied in these circumstances would
be able to efficiently scale that process and thus significantly
reduce execution times.
Including a MapReduce programming primitive in Prolog has three
benefits: (i) to make available a high-level abstract construct in the
MapReduce functional model which maintains the declarative nature of
the programs; (ii) to give access to a previously nonexistent Prolog
construct which is relevant to applications in numerous fields of
knowledge; (iii) to allow for parallelism, thus speeding-up the
execution of programs using this construct. The latter is particularly
relevant now that multicore processors have become the favourite
choice to assemble machines, even those for personal use. This, along
with the fact that there are increasingly larger data processing
requirements in everyday life, renders a framework using multicore
architectures for efficient data processing highly relevant.
MapReduce for Prolog’s focus are multicore architectures, but our
primitive supports hybrid environments (shared and distributed
memory), implicitly and transparently. MapReduce for Prolog was
implemented in the Yap system and it follows a master-slave paradigm,
in which the master is responsible for dividing and assigning the work
and the slaves for processing the chunks dispatched to them. This
construct’s interface has various customisation levels, and our aim is
that it will come to integrate the Yap Prolog system as built-in
construct. Our system was successfully tested using four distinct
applications common in the literature: two of these were numeric, and
the other two were composed of Prolog terms. The test were made using
two implementations for the same programming interface, one for a
cluster of machines and another for a multicore architecture. It was
determined that our construct scaled almost ideally for these
datasets, both in shared and distributed memory. Four scheduling
methods were also developed and assessed, and the two more efficient
ones will be made available in the final version of the library. An
evaluation of the effect of the chunk size variation for different
datasets and scheduling methods was performed as well, in order to
define standard parameters for MapReduce for Prolog.
Bibtex
@MastersThesis{cortereal-msc,
author = {J. Côrte-Real},
title = {{A MapReduce Construct for Yap Prolog}},
school = {University of Porto},
address = {Portugal},
month = {July},
year = {2013},
type = {{MSc Thesis}},
}
Download Thesis
PDF file