> Lecture 1: 17.09.2018 (Miguel Areias)
Course Introduction
Syllabus, bibliography and assessment rules.
> Lecture 2: 17.09.2018 (Miguel Areias)
Foundations of Parallel Computing
Motivation for going parallel, grand challenge problems, concurrency and
parallelism, implicit vs. explicit parallelism, parallel architectures
and Flynn's taxonomy. Multiprocessors vs. multicomputers. Shared memory
architectures. Distributed memory architectures. Recent multicore
architectures. Parallel programming.
> Lecture 3: 24.09.2018 (Miguel Areias)
Foundations of Parallel Computing
Foster's methodology for designing parallel algorithms:
decomposition, communication patterns, agglomeration, and
mapping. Granularity. Parallel overheads, latency,
bandwidth.
Main parallel programming models: shared memory, distributed memory
and hybrid. An example of a program in MPI (numerical
integration). Parallel programming paradigms: master-slave,
single-program-multiple-data, data pipelining, tree-search and
divide-and-conquer.
> Lecture 4: 24.09.2018 (Miguel Areias)
Programming for Distributed Memory with MPI.
The MPI (Message-Passing Interface): history, main concepts and
goals. The SPMD model (Single Program Multiple Data). Basic structure
of an MPI program. The universal communicator (MPI_COMM_WORLD). Basic
data types. An example of a program using MPI.
Sending and receiving messages in standard, synchronous or buffered
modes.
> Lecture 5: 01.10.2018 (Miguel Areias)
Programming for Distributed Memory with MPI.
Types of communication: blocking versus non-blocking. Defining new
derived data types (type vector and struct). Example programs. Packing
and unpacking data in communications. Examples. Collective
communication functions: broadcast, reduce, scatter, gather, and
allreduce. Examples.
> Lecture 6: 01.10.2018 (Miguel Areias)
Programming for Distributed Memory with MPI.
Laboratory: setting up OpenMPI and exercises.
> Lecture 7: 08.10.2018 (Miguel Areias)
Programming for Distributed Memory with MPI.
Defining MPI groups and communicators of
processes. Examples. Measuring execution time.
> Lecture 8: 08.10.2018 (Miguel Areias)
Programming for Distributed Memory with MPI.
MPI functions for specialized topologies, such as grids. An example of the
parallelization of an algorithm to calculate "pi" using a Monte Carlo
method. Standard I/O. Compilation and execution of MPI programs using
OpenMPI.
Introduction to Practical Assignment-1.
> Lecture 9: 15.10.2018 (Miguel Areias)
Programming for shared memory with processes.
The rank sort algorithm. Process creation. Shared memory: shm and
mmap. Implementation of rank sort with shm and mmap.
> Lecture 10: 15.10.2018 (Miguel Areias)
Programming for Distributed Memory with MPI.
Laboratory: exercises in MPI.
> Lecture 11: 22.10.2018 (Miguel Areias)
Programming for shared memory with processes.
Synchronization mechanisms for shared memory. Test-and-set
lock. Read-write spinlocks. Semaphores. Examples: multiple readers and
writers with a bounded buffer, sleeping barber.
> Lecture 12: 22.10.2018 (Miguel Areias)
Programming for shared memory with processes.
Laboratory: exercises.
> Lecture 13: 29.10.2018 (Eduardo R. B. Marques)
Parallel programming using threads.
Threads and processes, multi-threaded processes, the pthreads library.
Thread creation and joining ("Hello pthreads" example).
Dot product example: work sharing, prevention of races using mutexes.
Heat diffusion example: using barriers for synchronized execution.
> Lecture 14: 29.10.2018 (Eduardo R. B. Marques)
Parallel programming using threads - part 2.
Heat diffusion example (cont.).
Condition variables. Example uses of
condition variables: fixed-capacity blocking queue, barrier implementation.
> Lecture 15: 05.11.2018 (Eduardo R. B. Marques)
Parallel programming using OpenMP (part 1)
Introduction to OpenMP. The 'omp parallel' directive,
basic OpenMP functions, private and shared variables.
Simple data parallelism: the 'omp for' directive and the 'reduction' clause.
Mutual exclusion in OpenMP: 'omp atomic', 'omp critical', and the use of
locks. Other synchronization constructs: 'omp master', 'omp single',
'omp barrier', 'nowait' and 'ordered' clauses.
> Lecture 16: 05.11.2018 (Eduardo R. B. Marques)
Pthreads and OpenMP - lab exercises.
> Lecture 17: 12.11.2018 (Eduardo R. B. Marques)
Parallel programming using OpenMP (cont.)
Functional parallelism using 'omp sections'.
The 'schedule' clause for 'omp for' loops.
Static vs dynamic schedules with irregular computation.
Loop-carried dependencies: detection, classification, and removal.
> Lecture 18: 12.11.2018 (Eduardo R. B. Marques)
OpenMP - lab exercises.
> Lecture 19: 19.11.2018 (Eduardo R. B. Marques)
Parallel programming using OpenMP (cont.)
Loop-carried dependencies (cont.). OpenMP orphaned constructs.
Data sharing in the context of function calls.
Introduction to project assignment II.
> Lecture 20: 19.11.2018 (Eduardo R. B. Marques)
Project II - familiarisation with provided code.
OpenMP - lab exercises (cont.).
> Lecture 21: 26.11.2018 (Miguel Areias)
Lock-Free Hash Tries.
Introduction to the key concepts of progress in concurrent systems and
to the lock-free progress guarantee. Why lock-freedom is important in
highly concurrent environments. Linearizability. The internals of a
lock-free hash map. Performance comparison of state-of-the-art
concurrent hash map designs.
> Lecture 22: 27.11.2018 (Eduardo R. B. Marques)
Lab exercises (project 2 + OpenMP).
> Lecture 23: 03.12.2018 (Eduardo R. B. Marques)
Cache memories and multi-core/multi-processor architectures.
Moore's Law and the "memory wall". The memory hierarchy and cache memories.
Basic cache characteristics. Spatial and temporal locality
in programs. Locality and synchronization in shared-memory multi-core
programming: loop scheduling, false sharing, inconsistent parallelisation.
The cost of synchronization: barriers, critical sections, and locks.
Loop tiling for maximising cache hit ratios.
> Lecture 24: 03.12.2018 (Eduardo R. B. Marques)
Lab exercises - memory locality in OpenMP programs.
> Lecture 25: 04.01.2019 (Eduardo R. B. Marques)
Performance Analysis metrics
Speedup and efficiency. Linear/sub-linear/super-linear speedup.
Efficiency in relation to growth in processing units and/or problem size.
Amdahl's law and Gustafson-Barsis' law.
The Karp-Flatt metric. The isoefficiency relation and the
scalability function.