
Penelope

This section presents Penelope [28], a hierarchical distributed scheduler for the Plosys system. Penelope aims at reducing the number and size of messages exchanged between processors, as well as at maintaining data locality.

In the Plosys system, a Prolog program is executed by several workers. Each worker consists of three threads: an exporter, an importer, and a Prolog engine. The load of a worker consists of the alternatives not yet explored in one or more choicepoints. Each worker sends its own load to the scheduler through the engine. A worker can be in one of three load states: (1) idle: it has no Prolog task to do; (2) quiet: the worker is active, but does not have enough work to share with an idle worker; and (3) overloaded: the worker has an amount of work above a threshold and can therefore share work with idle workers.
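As a minimal sketch of this classification, the C fragment below shows how a worker's load state might be derived from its count of unexplored alternatives. The threshold value, struct layout, and names are our illustrative assumptions, not the actual Plosys data structures.

/* Hypothetical sketch of per-worker load classification; the
 * threshold and field names are assumptions for illustration. */

typedef enum { IDLE, QUIET, OVERLOADED } load_status;

#define SHARE_THRESHOLD 4   /* assumed: minimum number of unexplored
                               alternatives before work is worth sharing */

typedef struct {
    int unexplored;         /* alternatives left in open choicepoints */
} worker_load;

load_status classify_load(const worker_load *w)
{
    if (w->unexplored == 0)
        return IDLE;                  /* no Prolog task to do */
    if (w->unexplored < SHARE_THRESHOLD)
        return QUIET;                 /* active, nothing to share */
    return OVERLOADED;                /* can export work to idle workers */
}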

The scheduling strategy originally applied by the Plosys system was centralised. The scheduler therefore needed to maintain a global view of the system load, using information sent periodically by the workers. The system uses full stack copying to transfer work from one worker to another. It also chooses the topmost choicepoint to share with another worker, on the assumption that topmost work is coarser-grained than bottommost work. In order to know which choicepoint is older, the system marks each choicepoint at creation with a timestamp, corresponding to its depth in the search tree. The strategy adopted by Penelope instead chooses choicepoints according to their complexity, as given by the GRANLOG tool. It also tries to maintain locality by exporting/importing work between workers that have already communicated in the past.
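The two selection policies can be contrasted with a short sketch. The field names and the integer granularity estimate are assumptions; in the real system the complexity estimate comes from GRANLOG.

#include <stddef.h>

/* Illustrative choicepoint record; 'depth' is the creation
 * timestamp (depth in the search tree) and 'complexity' stands
 * for the GRANLOG granularity estimate. */
typedef struct choicepoint {
    int depth;
    int complexity;
    struct choicepoint *next;   /* towards the bottom of the stack */
} choicepoint;

/* Original Plosys policy: export the topmost (oldest, smallest
 * depth) choicepoint, assumed to carry the coarsest-grained work. */
choicepoint *pick_topmost(choicepoint *cp_stack)
{
    choicepoint *best = cp_stack;
    for (choicepoint *cp = cp_stack; cp != NULL; cp = cp->next)
        if (cp->depth < best->depth)
            best = cp;
    return best;
}

/* Penelope's policy: export the choicepoint with the highest
 * estimated complexity instead. */
choicepoint *pick_by_complexity(choicepoint *cp_stack)
{
    choicepoint *best = cp_stack;
    for (choicepoint *cp = cp_stack; cp != NULL; cp = cp->next)
        if (cp->complexity > best->complexity)
            best = cp;
    return best;
}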

Penelope also implements a symmetrically initiated strategy, where idle workers ask for work and overloaded workers inform idle workers that work is available.
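A symmetrically initiated protocol reduces to two message kinds, one sent from each side. The sketch below reuses the load_status enum from the earlier fragment; the message tags and the two helpers are hypothetical names, not Plosys identifiers.

/* Assumed helpers; declared here so the sketch is self-contained. */
void export_work_to(int worker);    /* hypothetical */
void import_work_from(int worker);  /* hypothetical */

typedef enum {
    MSG_WORK_REQUEST,    /* sent by an idle worker asking for work */
    MSG_WORK_AVAILABLE   /* sent by an overloaded worker advertising load */
} msg_kind;

void on_message(msg_kind kind, int sender, load_status my_status)
{
    switch (kind) {
    case MSG_WORK_REQUEST:
        if (my_status == OVERLOADED)
            export_work_to(sender);    /* answer the idle worker */
        break;
    case MSG_WORK_AVAILABLE:
        if (my_status == IDLE)
            import_work_from(sender);  /* take the advertised work */
        break;
    }
}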

In the centralised scheduling model used by Plosys, a single worker is responsible for collecting information about the workload in the system. This can produce significant overhead when a large number of processors is used. In order to avoid a bottleneck at this worker, we implemented a distributed hierarchical scheduler for the Plosys system.

In order to reduce the number of messages exchanged between processors, we established a partial order among the processors, so that parent processes communicate only with their children and vice versa. In order to reduce the size of the messages exchanged in the system, we do not implement full stack copying as in the original Plosys design. Instead, we perform incremental stack copying, where workers import/export only the parts of the stacks that are not yet common to the two workers involved in the communication. The stack copying algorithm used is similar to the one used by the Muse system [2].
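The following sketch illustrates incremental copying in the style of Muse: only the part of the exporter's stack beyond the segment both workers already hold is transferred. The flat byte-array stack model and the names are simplifying assumptions.

#include <string.h>

/* Illustrative stack model: a contiguous byte array. */
typedef struct {
    char  *base;    /* bottom of the stack */
    size_t top;     /* bytes currently in use */
} stack;

/* 'common' is the number of bytes both workers already share,
 * tracked from their last communication.  A full copy would
 * transfer from->top bytes; here only the difference moves. */
void incremental_copy(const stack *from, stack *to, size_t common)
{
    size_t delta = from->top - common;            /* new bytes only */
    memcpy(to->base + common, from->base + common, delta);
    to->top = from->top;
}

With good locality, workers that communicate repeatedly keep a large common prefix, so delta stays small and message size drops accordingly.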

The important point about incremental copying is that workers that share work hold common parts of the Prolog stacks. In order to achieve this and maintain good locality, we adopt a bottom-up communication style, where work is transferred to other workers from the leaves to the root of the execution tree.

As all information flows from the leaves to the root of the execution tree, each scheduler keeps information about its descendants and about its parent.
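A sketch of the per-scheduler bookkeeping this implies is shown below: each node knows its parent and records the load last reported by each child, forwarding the aggregate upwards. The fan-out, structure, and field names are assumptions, not the actual Penelope data structures.

#include <stddef.h>

#define MAX_CHILDREN 8   /* assumed fan-out of the hierarchy */

typedef struct sched_node {
    struct sched_node *parent;                /* NULL at the root */
    struct sched_node *child[MAX_CHILDREN];
    int n_children;
    int load[MAX_CHILDREN];                   /* last load reported
                                                 by each child */
} sched_node;

/* Load reports flow upwards: a node sums its children's reports
 * and the caller forwards the total to the node's parent, if any. */
int aggregate_load(const sched_node *s)
{
    int total = 0;
    for (int i = 0; i < s->n_children; i++)
        total += s->load[i];
    return total;
}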

