MAP-I doctoral program
 Cluster and Grid Computing
 Summary
This document presents a Ph.D. level course, as a joint FCUP-FEUP-DIUM proposal in the MAP-I doctoral program, corresponding to a Curriculum Unit - UCT - credited with 5 ECTS.  
 I.  Theme, Justification and Context

Introduction

This course is intended for Ph.D. level students whose main research theme is related to parallel and distributed systems in a broad context. As such, the course is structured in 3 main parts. The first part is dedicated to basic concepts of parallel and distributed computing, focusing on parallel programming models, parallel architectures (including modern multi-cores and multi-processors), performance metrics, performance analysis, parallel benchmarks, and the state of the art in parallel and distributed computing. The second part shifts to what is currently considered the mainstream in distributed computing worldwide: cluster computing. In this part of the course, the student will have the opportunity to understand what a cluster is, how to develop applications for a cluster, how to configure a cluster (which is more than placing off-the-shelf computers together in a room or on a shelf, or buying an expensive ready-made solution), and the limitations faced by applications that demand more resources than those available in a local network. In the third part of the course, we will concentrate on today's hot topics, grid computing and data grids, where the student will have the opportunity to understand what a grid is and how it functions, to appreciate current developments in grid computing and data grids, and to identify limitations of the algorithms and techniques used to implement grid-aware systems.

The relevance of these topics is well demonstrated by the number of articles published every year in mainstream, high-impact scientific events and periodicals, such as the Journal of Grid Computing, the Journal of High Performance Applications, the Journal of Parallel and Distributed Computing, the IEEE Transactions on Distributed Systems, the Grid Computing Conference, the Conference on Cluster and Grid Computing, and the Conference on Middleware for Grid Computing, among many others.

Regarding the first part of the course, a significant research effort is being devoted to the development of efficient parallel applications for modern computer architectures, which are moving towards multi-core machines. This connects well with the second part of the course, since the machines that belong to clusters very often have this kind of multi-core or multi-processor architecture. The ability to understand and develop applications that take advantage of such architectures is a mandatory skill for the next generation of Ph.D. students.

With respect to the second part of the course, its relevance comes from the fact that geographically dispersed clusters are becoming closer thanks to high-bandwidth connections among countries and continents, which makes it feasible to interconnect several different clusters in order to increase the number of resources available to resource-demanding applications. This interconnection of geographically distant resources brings us to what is becoming the standard way to integrate resources worldwide: the grid. Traditionally, grid computing only existed in the realms of high performance computing and technical computer farms. Increasing demand for computing power and data sharing, combined with technological advances, particularly in network bandwidth, has extended the scope of the grid beyond its traditional bounds.

Today, large-scale production Grid infrastructures such as EGEE in Europe, OSG in the US, CNGrid in China, and NAREGI in Japan are offering their services to many scientific and industrial applications, from domains as diverse as Astronomy, Biomedicine, Computational Chemistry, Earth Sciences,  Financial Simulations, and High Energy Physics. Grid infrastructures provide these applications a new means for collaborative research by facilitating the sharing of computational and data resources at an unprecedented scale.

Experience has shown that the technologies, algorithms and techniques involved in sharing resources across the Internet (i.e., grid computing) are very different from, and require different prerequisite knowledge than, the ones used in high-performance cluster computing. Cluster computing builds on knowledge of algorithms and of message passing programming in the C or Fortran programming languages. In contrast, grid computing builds on knowledge of network security, the client/server paradigm, XML, web services and network programming in the Java programming language. Moreover, grid systems operate at a different scale from traditional distributed systems, one where there is no room for centralized queueing systems, single points of failure, or traditional scheduling algorithms.

Grid protocols and technologies are being adopted in a wide variety of academic, government, and industrial environments, and there is a growing body of research-oriented literature in grid computing. As such, there is a need for education in the fields of Cluster and Grid Computing. The next generation of scientists and engineers needs to be prepared for a technological workplace that is changing the world.

Cluster Computing

Clustering is a powerful concept and technique for deriving extended capabilities from existing classes of components. In the field of computing systems, clustering is being applied to render new systems structures from existing computing elements to deliver capabilities that through other approaches could easily cost ten times as much. In recent years clustering hardware and software have evolved so that today potential user institutions have a plethora of choices in terms of form, scale, environments, cost, and means of implementation to meet their scalable computing requirements. Some of the largest computers in the world are cluster systems. 

Beowulf-class cluster systems have become extremely popular, providing exceptional price/performance, flexibility of configuration and upgrade, and scalability. For many applications they replace previous-generation monolithic vector supercomputers and MPPs (Massively Parallel Processors). Beowulf integrates widely available and easily accessible low-cost or no-cost system software to provide many of the capabilities required of a system environment. As a result of these attributes and the opportunities they imply, Beowulf-class clusters have penetrated almost every aspect of computing and are rapidly coming to dominate the medium to high end. Computing with a Beowulf cluster engages four distinct but interrelated areas of consideration: i) hardware system structure, ii) resource administration and management environment, iii) distributed programming libraries and tools and iv) parallel algorithms.

Hardware system structure encompasses all aspects of the hardware node components and their capabilities, the dedicated network controllers and switches, and the interconnection topology that determines the system's global organization. 
The resource management environment is the battery of system software and tools that govern all phases of system operation from installation, configuration, and initialization, through administration and task management, to system status monitoring, fault diagnosis, and maintenance. 

The distributed programming libraries and tools determine the paradigm by which the end user coordinates the distributed computing resources to execute simultaneously and cooperatively the many concurrent logical components constituting the parallel application program. 

Finally, the domain of parallel algorithms provides the models and approaches for organizing a user's application to exploit the intrinsic parallelism of the problem while operating within the practical constraints of effective performance. 

Grid Computing

Computational grids have been accepted as one of the most powerful tools to tackle large-scale problems in computational science and engineering, thus promoting the transformation of traditional applications into grid-enabled software packages.

Grid computing is an important new approach to distributed computing that uses geographically distributed computers collectively to achieve higher performance computing and resource sharing. Its combination of regular structures and dynamic algorithms will deliver large and sustained data intensive, knowledge intensive and computational resources across heterogeneous contributing sites. It is enabling rapid advances in many disciplines. 

At the forefront of these applications are domains where scientists need the grid to analyze the huge amounts of data produced by sophisticated experiments such as the Large Hadron Collider at CERN. 

Grids underpin the rapidly emerging e-Infrastructure and cyberinfrastructure. They will enable global collaborations to make rapid advances in addressing challenges in science, economics, design, engineering and medicine, and indeed in all walks of life. 

Description of the Course

The Cluster and Grid Computing (CGC) Curriculum Unit is offered as a fully comprehensive discussion of the foundations and practices for the operation, development and application of commodity clusters and Grids. 

The course is divided into 3 parts that cover broad topics in the area of parallel and distributed systems, including high-performance parallel computing, cluster computing and grid computing.
     
Part I is dedicated to basic concepts of parallel and distributed computing focusing on parallel programming models, parallel architectures, including modern multi-cores and multi-processors, performance metrics, performance analysis, parallel benchmarks, and state-of-the-art in parallel and distributed computing. 

Part II begins with an overview of high-performance computing from the perspective of clusters. It introduces the basic terminology, gives a broad overview of the different architectures and presents some of their limitations. It also describes the hardware components that make up a Beowulf system and shows how to assemble such a system, which is more than placing off-the-shelf computers together in a room or on a shelf. Cluster planning determines what we want a cluster to do; this part gives an overview of the different types of interconnection and communication technologies and of the cluster file system alternatives, and takes the cluster out for an initial spin using some readily available parallel benchmarks. Finally, we also explain how to manage the resources of Beowulf systems, including system administration and task scheduling, monitoring and accounting.

Part III of the course provides an introduction to the Grid systems and implementations that underpin e-infrastructure and cyberinfrastructure, offering an opportunity to hear about the latest achievements from Europe, North America and Asia, and to experience a variety of Grid systems. It will present a conceptual framework to provide students with the theory, technology and implementation skills that form the foundation of the field of grid computing. The student will also study the main limitations of current systems regarding scheduling algorithms, data location, and network traffic control, as well as new research topics raised by the use of this new technology.


Related Courses 

To our knowledge there is no equivalent course in other educational institutions. However, there are several other courses and training events built on recent advances in cluster and grid technologies, such as the various editions of the International Summer School on Grid Computing or the training events organized in the context of the various grid projects around the world. These events provide an in-depth introduction to cluster and grid technologies, through lectures on the principles, technologies, experience and exploitation of grids. 

 II. Objectives

The key distinction between clusters and grids lies mainly in the way resources are managed. In the case of clusters, resource allocation is performed by a centralised resource manager and all nodes work cooperatively as a single unified resource. In the case of Grids, each node has its own resource manager and does not aim at providing a single system view. Another difference relates to scale. Grid environments usually involve tens of thousands of resources, while cluster environments are limited to a few thousand resources.
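
The contrast between the two management styles can be illustrated with a small sketch. The Python fragment below is not taken from any real cluster or grid middleware: the names ClusterManager, GridSite and grid_allocate, as well as the per-site request cap, are assumptions introduced purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ClusterManager:
    """Single, centralised manager that sees every node in one cluster."""
    free_nodes: list[str]

    def allocate(self, count: int) -> list[str]:
        # With a global view, the manager can grant or refuse at once.
        if count > len(self.free_nodes):
            return []
        granted = self.free_nodes[:count]
        self.free_nodes = self.free_nodes[count:]
        return granted


@dataclass
class GridSite:
    """One autonomous site that keeps its own manager and local policy."""
    name: str
    manager: ClusterManager
    max_per_request: int  # hypothetical local policy, unknown to other sites

    def offer(self, count: int) -> list[str]:
        # The site decides locally how much of the request it will serve.
        return self.manager.allocate(min(count, self.max_per_request))


def grid_allocate(sites: list[GridSite], count: int) -> dict[str, list[str]]:
    """Hypothetical broker: asks each site in turn, with no single system view."""
    grants: dict[str, list[str]] = {}
    remaining = count
    for site in sites:
        if remaining == 0:
            break
        nodes = site.offer(remaining)
        if nodes:
            grants[site.name] = nodes
            remaining -= len(nodes)
    return grants


if __name__ == "__main__":
    sites = [
        GridSite("A", ClusterManager(["a1", "a2"]), max_per_request=1),
        GridSite("B", ClusterManager(["b1", "b2", "b3"]), max_per_request=2),
    ]
    print(grid_allocate(sites, 3))
```

In this sketch the cluster manager answers a request in one step, whereas the grid request is assembled from independent answers given by each site under its own policy.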

We identify two sets of components that can help to structure the course material in cluster and grid computing. Each set consists of several lectures organized in parts that conclude with a case study. 
 
The work will be challenging but rewarding, enhancing each student's ability to work in this rapidly advancing field and giving participants experience with widely used Grid middleware, so that students will:  
* be familiar with the fundamental components of Cluster and Grid environments, such as: authentication, authorization, resource discovery and resource access; 
* be able to use Cluster and Grid environments for basic and advanced job submission, and distributed data management; 
* be conversant with Grid achievements worldwide; 
* be alert to emerging Grid applications; appreciate the potential of e-infrastructure; and be aware of new research opportunities.


Target Audience

We will assume that students have diverse backgrounds and will build on that diversity. They may come from any country and from computer science or computational science, ranging from enthusiasts who have recently heard about this emergent technology to ambitious researchers planning to work on cluster and Grid projects.  

Students may be planning to engage in fundamental distributed systems research, to be involved in advanced middleware design and development, or to develop new methods or applications in any discipline that depends on the emerging capabilities of e-Infrastructure. Other students may plan to use advanced Grid middleware or to be engaged in Grid computing and operations. 

 III. Learning Outcomes

By the end of the course students should be able to develop parallel applications that can run on a wide range of computing platforms, from multi-core machines to Grids, or be able to engage in research work on grid middleware aspects such as scheduling or data placement, among others. 

Students should leave the course with: 

* A good understanding of the principles and foundations of cluster and Grid computing 
* An appreciation of the variations in technologies and views of cluster and Grid computing
* An awareness of good strategies for using these emerging technologies in their research
* Confidence in using several of the available technologies
* An introduction to the research challenges and the future of Grid computing
* Insight into the architectural implications of Grid-scale computation 
* Awareness of current research issues in Grid architecture and infrastructure, and in the integration of applications across autonomous organisations 
* Practical experience of current Grid technologies and the associated standards 
* Skills in utilising current Grid tools and technologies 
* Appreciation of the weaknesses of existing tools and technologies, and potential areas for improvement
* The ability to identify and characterize several types of parallelism and the architectures that support them
* The ability to design, implement and optimize concurrent, parallel, distributed and grid-enabled applications
* The ability to characterize and develop scalable algorithms, selecting the best parallelization for a particular parallel system.

In addition, students will also gain proficiency in the following areas:
* Cluster Administration - advanced scientific parallel computing on the grid requires the installation and configuration of a cluster that is accessible using grid protocols. 
* Parallel Computing - developers of advanced scientific parallel computing on the grid need to know how to use tools and parallel programming languages and libraries to develop and run parallel applications. 
* Grid Administration - grid computing requires the installation, configuration, and administration of a grid development and execution environment.
* Security - administrators, developers, and grid users all need to understand the use of public key encryption, certificates, and how security is implemented in the Grid. 
* Grid Access and Management - administrators and grid users all need to know how to use commands and tools to submit and execute different grid applications, how to locate, access, and manage data repositories and other grid resources, and how to interface to advanced grid computing toolkits. 

 
 IV. Course Contents
Part I.  Parallel and Distributed Programming Models
Programming Paradigms
* Shared memory 
* Message passing
* Workflows
Development of parallel and distributed applications
* Design phases
* Common parallel patterns
* Performance metrics and profiling
* Optimization techniques
* Mapping and scheduling
* State-of-the-art on parallel and distributed systems and applications
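
As a rough illustration of the performance metrics topic listed above, the following sketch computes speedup and efficiency from measured run times, together with the upper bound given by Amdahl's law (a standard textbook result that this outline does not name explicitly). The function names and the sample timings are assumptions chosen for illustration and are not part of the course material.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Efficiency E = S / p, the fraction of ideal speedup achieved on p workers."""
    return speedup(t_serial, t_parallel) / workers


def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: upper bound on speedup when only a fraction f
    of the work can be parallelised, S(p) = 1 / ((1 - f) + f / p)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical timings: 100 s on one core, 25 s on eight cores.
    print(speedup(100.0, 25.0))        # 4.0
    print(efficiency(100.0, 25.0, 8))  # 0.5
    print(amdahl_speedup(0.9, 8))
```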

Part II. Cluster Computing

Overview of Cluster Computing
* The Role of Clusters 
* Definition and Taxonomy  
* Distributed Computing  
* Limitations 
Cluster Planning
* Architecture and Cluster Software
* Design Decisions
* Network Hardware
* Network Software
* Protocols
* Distributed File Systems 
* Virtualization technologies
* Benchmarks
Part III.  Introduction to Grid Computing

Overview of Grid Computing
* Global Computing
* Network Resources
* General Attributes
* Emerging Technologies 
* Architectural attributes
* Emerging standards
* Grid Initiatives and Applications
* Job Submission Service
* Application Development

Grid Computing Components
* Operations Infrastructure
* Middleware and Application
* Virtual Organization management 
* Resource Discovery and Info Services
* Data Access Integration and Management
* Web Services
* Grid Portals

 V. Methodology

The overall aim is to give students a broad and well balanced understanding of Cluster and Grid computing that will serve well as a foundation for more specific work or research. 
The Course will include lectures and discussions on the principles, technologies, experience and exploitation of clusters and Grids, through which students will receive an integrated and well-structured introduction to the fields of Cluster and Grid computing, their applications and deployment. 
Lectures will also review the research horizon and report recent significant successes.
The methodology will also favour critical discussion and reasoning about large-scale distributed system architectures, infrastructures and technologies. 
Students will be exposed to cluster computing platforms and solutions in order to study, first, how to install, configure and manage a computer cluster and, second, how to apply high-performance parallel computing using cluster architectures and models based on technologies such as Beowulf. This will expose principles, research challenges, technical issues and leading views. 
Two case studies of widely used middleware produced by projects in the USA and Europe have been selected to provide experience of collaboratively using Cluster and Grid-based resources, and to support the hands-on sessions. 
The test bed will be connected to SeARCH and EGEE to provide a rich environment for learning and experimentation. Exercises and team work will encourage students to learn by using this test bed.

There is no set text for this module. Books and research papers will be distributed as required and technical manuals and related documentation will be issued as part of the practical activities.  Specific lecture notes for each course unit will be written.


 VI. Grading
70% Examination 
30% Course work 

 VII. Bibliographic References

Education: Integrating Parallel and Distributed Computing in Computer Science Curricula, Marcin Paprzycki, IEEE Distributed Systems Online, vol. 7, no. 2, Feb. 2006;
Grid computing in the undergraduate classroom: Topics, exercises, and experiences, Mache, J., Apon, A., Proceedings of the 5th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2005;
SeARCH, Services and Advanced Research Computing with HTC/HPC clusters, 2005, http://www.di.uminho.pt/search;
Classroom exercises for grid services, Apon, A., Mache, J., Yara, Y., Landrus, K., Proceedings of the Linux Cluster Institute International Conference on High Performance Computing, 2004;
EGEE Consortium. EGEE: Enabling Grids for E-Science in Europe, Project funded by the European Union under contract IST-2003-508833, June 2004;
High-Performance Linux Clusters, with Oscar, Rocks, openMosix & MPI, Joseph D. Sloan, O'Reilly, 2004;
Beowulf Cluster Computing with Linux, Second Edition, William Gropp, Ewing Lusk and Thomas Sterling, MIT Press, ISBN 0-262-69292-9, 2003;
Grid Computing:  Making the Global Infrastructure a Reality, F. Berman, G. C. Fox and A. J. G. Hey editors, Wiley, 2003, ISBN 0-470-85319-0.
 
 Annex: Team

This course is supported by a team involving researchers from the University of Porto, FCUP (Inês de Castro Dutra) and FEUP (Jorge Barbosa), and the University of Minho, DI-CCTC (António Manuel da Silva Pina, João Luís Sobral and Alberto Proença). The team already has a record of effective collaboration, both on research projects and on the supervision of Ph.D. and M.Sc. students. 
In what follows we give a brief presentation of each researcher, which includes, for each of them, up to 5 key publications related to the scientific area in which this course is proposed. 

Inês de Castro Dutra: is a lecturer at the Science Faculty of Porto University, FCUP. 
She obtained her B.Sc. degree in Computer Science from State University of Rio de Janeiro, in 1985, and her M.Sc. degree in Systems Engineering and Computer Science from Federal University of Rio de Janeiro, in 1988, and the Ph.D. degree from Bristol University, in 1995. 
In 1998, she was a lecturer in the Department of Systems Engineering and Computer Science of COPPE, an institution for postgraduate studies in Engineering, at Federal University of Rio de Janeiro.
During the periods between October 2001 and December 2002, and between April 2004 and March 2005, she worked as a visiting researcher at University of Wisconsin-Madison, USA, in the department of Biostatistics and Medical Informatics. During these periods, she worked for machine learning projects funded by NSF, DARPA and American Air Force (projects COLLEAGUE, EELD and EAGLE) and started to work with applications that demanded a huge amount of resources. At this time, she had the opportunity to work with the Condor team, and to largely use the Condor resource manager to run her experiments. Having some experience on running and managing thousands of experiments in a grid environment, she supervised a D.Sc. student on a new middleware for managing applications, and an M.Sc. student on a tool to manage the submission and monitoring of machine learning applications.
She has been invited to the committees of several workshops and conferences, including the Brazilian Workshop on Operating Systems, the Brazilian Workshop on High Performance Computing (held with the International Symposium on Computer Architecture and High Performance Computing), the International Conference on Logic Programming, the International Conference on Complex Open Distributed Systems, LA-Grid and CCGrid, and is the program chair of the Brazilian Meeting on Artificial Intelligence. She has published over 40 scientific articles in conferences and journals. Currently she is a member of the EELA (E-infrastructure shared between Europe and Latin America) project, whose main objective is to establish an infrastructure of hardware and software between Europe and Latin America to form a human research network.

Supervision:

Currently, she is supervising three M.Sc. students in the area of grid computing. One of them is proposing and implementing a grid interface to access data from geographically distant databases. Another one is working on a meta-scheduler for grid computing based on reinforcement learning. The third one is working on supporting scheduling of MPI tasks across domains using several different scheduling strategies over the Samba-Grid system developed by Plastino et al. at Federal University Fluminense, Brazil.

* Ph.D. students:
o Adriano Caminha, Algorithms for breakpoint detection and identification in DNA sequences (co-supervised by Prof. Amit Bhaya), Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2008;
o Rogerio Lopes Salvini, Obtaining Efficient Classifiers from Induced Rules, Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2008;
o Luciana Itida Ferrari: Analysis of Immunoblots, Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2009;
o Laci Mary Barbosa Manhaes, Multi-Classification in Inductive Logic Programming, Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2009;
o Alberto Arkader Kopiler, Managing infrastructures using systems inspired on the nature, Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2009.

* M.Sc. students:
o Bernardo Fortunato Costa: meta-scheduler for bag-of-tasks, based on reinforcement learning, Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2007;
o Andre Oliveira: meta-scheduling of MPI processes, over Samba-Grid, Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2008;
o Joao Victor de Almeida Pap: Grid querying, Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro, 2008.

Research Projects: 

EELA project: its main objective is to bring the less-experienced and less-resourced countries of the Latin American region to the level of European developments in terms of the e-Infrastructures. With the networking infrastructure reaching stability through the ALICE project, the focus of the EELA project will be on Grid infrastructure and related e-Science applications. 
She is working in the EELA project (E-infrastructure shared between Europe and Latin America) as deputy of Work Package 3 (Applications), which is divided into three tasks, and as leader of task 3.3 (e-learning, climate and other applications). In 2006, she organized the third grid tutorial in the context of the EELA project, in Rio de Janeiro. 

Selected Publications (cluster and Grid computing): 

o P. K. Vargas, I. C. Dutra, V. D. Nascimento, L. A. S. Santos, L. C. Silva, C. F. R. Geyer and B. Schulze, GRAND: Toward Scalability in a Grid Environment, Concurrency and Computation: Practice and Experience, November, 2006, Wiley InterScience, doi: 10.1002/cpe.1138;
o J. Davis, D. Page, E. Burnside, I. C. Dutra, R. Ramakrishnam, V. Santos Costa, and J. Shavlik, View Learning for Statistical Relational Learning: with an Application to Mammography, International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, July, 2005;
o J. A. L. Sanches, P. K. Vargas, I. C. Dutra, V. Santos Costa, and C. F. R Geyer, ReGS: user-level Reliability in a Grid Environment, IEEE International Symposium on Cluster Computing and the Grid, May 9--12, 2005, Cardiff, UK;
o P. K. Vargas, L. A. S. Santos, C. F. R. Geyer, I. C. Dutra, An Implementation of the GRAND Hierarchical Application Management Model using the ISAM/EXEHDA system, III Workshop on Computational Grid and Applications, LNCC, Petrópolis, RJ, January 31st -- February 2nd, 2005;
o I. C. Dutra, D. Page, V. Santos Costa, J. Shavlik and M. Waddell, Toward Automatic Management of Embarrassingly Parallel Applications , Euro-Par'03, Klagenfurt, Austria, August 26 - 29, pp. 509--516, 2003.

António Manuel Silva Pina: is a lecturer at the Department of Computer Science of the University of Minho, and a research member of CCTC. His scientific research activities are centred on the areas of parallel computing modelling and programming, and the development and deployment of cluster and grid platforms.

Supervision

* Ph.D. students:
o Albano Alves, Rocmeu: orientação ao recurso na modelação de aplicações paralelas e exploração cooperativa de clusters multi-SAN, Eng /G/GC-6300, Universidade do Minho, December 2004;
o José Carlos Rufino, Domus: Tabelas de Hash Distribuídas em Clusters de Nós de Computação Heterogéneos, Universidade do Minho (awaiting public defense), 2007;
o José Luís Exposto, Partição multi-objectivo para descarga eficiente de recursos na WWW, Universidade do Minho, (to be concluded) July 2007.

* MSc students 
o Albano Serrano, Infra-estrutura para a Computação Multi-cluster em Ambiente Grid, Universidade do Porto, Jan. 2007;
o Cecília Moreira, CoRes - Computação orientada ao Recurso - uma Especificação, Universidade do Minho, 2001;
o António Tavares, Realização de um Modelo de Computação Baseado em Agentes, Universidade do Minho, 1997.

Research Projects: 

* AspectGrid: Pluggable Grid Aspects for Scientific Applications, member of the research team, GRID/GRI/81880/2006, (submitted);
* CROSS-Fire - Collaborative Resources Online to Support Simulations on Forest Fires: a Grid Platform to Integrate Geo-referenced Web Services for Real-Time Management, member of the research team, GRID/GRI/81795/2006, (submitted);
* SAVoIR: SAN-aware distributed dictionaries over Virtual Clusters for high-performance and data-intensive applications, Principal Investigator, PTDC-EIA-73934-2006, (submitted);
* IGIDE - Interactive Global illumination within Dynamic Environments, member of the research team, PTDC/EIA/65965/2006, (submitted);
* SeARCH: Services and Advanced Research Computing with HTC/HPC clusters, member of the research team, Ref.: CONC-REEQ/443/2001, funded;
* SIRe: A Scalable Information Retrieval environment, Principal Investigator, POSI/2001/CHS/41739, funded.

Selected Publications (cluster and Grid computing): 
o A. Alves, A. Pina, J. Rufino, J. Exposto, "Deploying Applications in Multi-SAN SMP Clusters", full version, Special Issue of International Journal of Computational Science and Engineering (IJCSE). ISSN (print): 1742-7185, ISSN (on-line): 1742-7193. Inderscience Publishers; 
o J. Rufino, A. Pina, A. Alves, J. Exposto, pDomus: a prototype for Cluster-oriented Distributed Hash Tables, 15th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP 2007), Naples, Italy, February 2007 (to be presented);
o J. Exposto, J. Macedo, A. Pina, A. Alves, J. Rufino, "Geographical Partition for Distributed Web Crawling", 2nd International Workshop on Geographic Information Retrieval (GIR 2005), ACM Press, pp. 55-60, Bremen, 2005;
o A. Alves, A. Pina, "Bridging the gap between cluster and grid computing", 6th International Conference on Parallel Processing and Applied Mathematics (PPAM 2005), LNCS 3911, Springer, Poznan, 2005; 
o A. Alves, A. Pina, J. Rufino, J. Exposto, "meu: unifying application modeling and cluster exploitation", 16th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2004), IEEE Computer Society, pp. 132-139, Foz do Iguaçu, Brazil, 2004.
Selected Technical / Scientific Activities (cluster and grid)
o Member of the Scientific Committee of the 1st IBERIAN INFRASTRUCTURE FOR DISTRIBUTED COMPUTING CONFERENCE, Santiago de Compostela, Spain, May 2007;
o Technical, scientific and administrative co-supervision of the communication, computation and storage infrastructure of SeARCH, Universidade do Minho, since 2005;
o Review of scientific papers submitted to the 11th International Euro-Par 2005, Parallel Processing, Lisbon, 2005.

João Luís Ferreira Sobral: is a lecturer at the Department of Computer Science of the University of Minho, and a research member of CCTC. His scientific research activities are centred on the areas of parallel programming paradigms and cluster and grid platforms. He has extensive experience in teaching parallel computing to undergraduate and graduate students at the University of Minho.

Supervision

* MSc students 
o Carlos Cunha, Reusable Aspect-Oriented Implementations of Concurrency Patterns and Mechanisms, Universidade do Minho, Jan. 2007
o António Sousa, Localização Automática de Objectos em Sequências de Imagens, Universidade do Minho, May 2005

Research Projects (cluster and Grid Computing): 

* PPC-VM: Parallel Computing Based on Virtual Machines (March 2004 - March 2007), Principal Investigator, Project funded by FCT (POSI/CHS/47158/2002)
* ViAr - Arqueologia Virtual e Acessível com Computação Adaptativa em Cluster (March 2002 - March 2005), member of the research team, Project funded by FCT (POSI/42041/CHS/2001)
* AspectGrid: Pluggable Grid Aspects for Scientific Applications, member of the research team, GRID/GRI/81880/2006, Principal Investigator, (submitted)
* GasPar: General-purpose ASpect-Oriented framework for PArallel computing, Principal Investigator, (submitted)
* CROSS-Fire - Collaborative Resources Online to Support Simulations on Forest Fires: a Grid Platform to Integrate Geo-referenced Web Services for Real-Time Management, member of the research team, GRID/GRI/81795/2006, (submitted)
* P-found: GRID computing and distributed data warehousing of protein folding and unfolding simulations, member of the research team (submitted)

Selected Publications (cluster and Grid computing):

o C. Cunha, J. Sobral. An Annotation-Based Framework for Parallel Computing, 15th Euromicro Conference on Parallel, Distributed and Network-based Processing (PDP 2007), Naples, February 2007, IEEE Computer Society 2007
o J. Sobral, C. Cunha, M. Monteiro, Aspect-Oriented Pluggable Support for Parallel Computing, Proceedings of the 6th International Meeting of Vector and Parallel Processing (VecPar'2006), Rio de Janeiro, Brazil, June 2006.
o J. Fernando, J. Sobral, A. Proença. JaSkel: A Java Skeleton-Based Framework for Structured Cluster and Grid Computing, 6th IEEE International Symposium on Cluster Computing and the Grid (CCGrid'2006), Singapore, May 2006, IEEE Computer Society, 2006.
o J. Sobral. Incrementally Developing Parallel Applications with AspectJ, 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS'06), Rhodes, Greece, April 2006, IEEE Computer Society, 2006
o C. Cunha, J. Sobral, M. Monteiro, Reusable Aspect-Oriented Implementation of Concurrency Patterns and Mechanisms, Fifth ACM International Conference on Aspect Oriented Software Development (AOSD 06), Bonn, Germany, March 2006.
o J. Sobral, J. Fernando. ParC#: Parallel Computing in .Net, Parallel Computing Technologies 2005 (PaCT'05), Russia, September 2005, LNCS vol. 3606, Springer 2005. 

Jorge Manuel Gomes Barbosa: is a lecturer at the Department of Electrical and Computer Engineering of the University of Porto and a research member of INEB (Biomedical Engineering Institute), where his research activities relate to parallel algorithms for biomedical applications, scheduling, and performance modelling. He obtained his BSc degree in Computer Science from the Faculty of Engineering of the University of Porto in 1992, his MSc degree in Digital Systems from the University of Manchester Institute of Science and Technology in 1993, and his PhD in Computer Science from the Faculty of Engineering of the University of Porto in 2001. Since 2003, he has been responsible for the course "Parallel and Distributed Programming" of the MSc in Informatics at the Faculty of Engineering of the University of Porto.


Supervision
* Ph.D. students:
o Daniel Moura, "Geometric and physical modulation for the vertebral torsion analysis". Co-supervisor: João Tavares (DEMEGI). Faculty of Engineering, University of Porto. Started in 2006.

* MSc students 
o Carlos Firmino, Parallelization of Multi-objective Meta-heuristics Algorithms, University of Porto, to be concluded in 2007.

Financed Research Projects: 
* Re-equipment project: in 2002, collaborated on the proposal presented by the Biomedical Engineering Institute to acquire a cluster infrastructure with 72 processors, which was installed in December 2006.
* POSI/EEA-SRI/55386/2004: collaborated on the approved project proposal "Segmentação, Seguimento e Análise de Movimento de Objectos Deformáveis (2D/3D) usando Princípios Físicos".
* POS-Conhecimento program, reference 242/4.2/C/REG/2006: collaborates as an information systems expert in the project "ACTIDEF - Avaliação Computacional e Tecnológica Integrada do Desempenho e Funcionalidade de Cidadãos com Incapacidades Músculo-esqueléticas". Proponent entity: Centro de Reabilitação Profissional de Gaia.
* Principal Researcher of the project "Scheduler of parallel tasks on heterogeneous computer systems and high level user interface for job submission", approved in 2004 by the Department of Electrical and Computer Engineering at the Faculty of Engineering of the University of Porto.

Publications: 
o J. Barbosa, J. Tavares, A.J. Padilha, "Optimizing Dense Linear Algebra Algorithms on Heterogeneous Machines", Algorithms and Tools for Parallel Computing On Heterogeneous Clusters, Nova Science Publisher, N.Y., pp. 17-31, ISBN: 1-60021-049-X, 2006;
o R. Nóbrega, J. Barbosa, A.P. Monteiro, "BioGrid Application Toolkit: a Grid-based Problem Solving Environment Tool for Biomedical Data Analysis", VECPAR'06 - 7th International Meeting on High Performance Computing for Computational Science, Rio de Janeiro, pp. 1-13, 2006;
o C. N. Morais, J. Barbosa, P. Tadeu, "Data parallel scheduling of Operations in Linear Algebra on heterogeneous clusters", WSEAS Transactions on Computers, Vol. 4, no. 10, pp. 1440-1448, ISSN: 1109-2750, 2005;
o J. Barbosa, C. Morais, R. Nóbrega, A.P. Monteiro, "Static scheduling of dependent parallel tasks on heterogeneous clusters", Heteropar05 - 4th International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks, IEEE CS Press, Boston, USA, pp. 1-8, 2005;
o J. Barbosa, C. Morais, A.J. Padilha, "Simulation of data distribution strategies for LU factorization on heterogeneous machines", 12th Heterogeneous Computing Workshop, IEEE CS Press, Nice, pp. 1-8, 2003;
o J. Barbosa, J. Tavares, A.J. Padilha, "A Group Block Distribution Strategy for a Heterogeneous Machine", Proceedings of the IASTED International Conference Applied Informatics, February 2002.
Scientific Activities 
o Since 2003, member of the Scientific Committee of the "Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks - Heteropar", integrated in the IEEE Cluster conference;
o Member of the Scientific Committee of the IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2007;
o Reviewer of scientific papers submitted to the "Vecpar'06 - International Meeting on High Performance Computing for Computational Science".
o Reviewer of scientific papers submitted to the "IPDPS 2007 - IEEE International Parallel & Distributed Processing Symposium".


Alberto José Proença: has been a full professor at the Dep. of Computer Science, University of Minho, since January 2001, and he led the creation of the research center CCTC, presented to FCT, of which he is a member. He received MSc and PhD degrees from the University of Manchester, UK, in 1979 and 1982, respectively. He is currently the Program Director of the Lic. Computer Science degree. His current R&D interests lie in high performance and computer imaging, namely the exploitation of system and device capabilities to increase the performance of graphics/vision applications, including functional parallelism (vector co-processors) and CPU parallelism in cluster and grid environments.

Coordination of Projects:

ViAr: Affordable Interactive Virtual Archaeology with Adaptive Cluster Computing
o POSI/CHS/42041/2001 (FCT)
o Apr'02 - Sep'05
o Budget: €68k

SeARCH: Services and Advanced Research Computing with HTC/HPC clusters
o REEQ/443/EEI/2005 (Infrastructures)
o Jan'05 - Dec'06
o Budget: €286.6k

CROSS-Fire - Collaborative Resources Online to Support Simulations on Forest Fires: a Grid Platform to Integrate Geo-referenced Web Services for Real-Time Management
o GRID/GRI/81795/2006 (FCT)
o Aug'07 - Aug'10
o Budget: €170k

CICH: Computer Imaging in the Cultural Heritage: techniques to foster the use of reflective transformation imaging with large artefacts
o PTDC/EIA/70556/2006 (submitted to FCT)
o Oct'07 - Sep'10
o Budget: €200k


Selected Publications:

J. Fernando, J. Sobral, A. Proença, "JaSkel: A Java Skeleton-Based Framework for Structured Cluster and Grid Computing"; 6th IEEE Int. Symp. Cluster Computing and the Grid (CCGrid'2006); IEEE Computer Society; Singapore, May 2006

A. Oliveira, L.P. Santos, A. Proença, "Refinement Criteria for High Fidelity Interactive Walkthroughs"; 4th Int. Conf. Computer Graphics and Interactive Techniques in Australasia and South-east Asia (Graphite'2006), pp. 453-460; ACM SIGGRAPH; Kuala Lumpur, Malaysia, December 2006

L.P. Santos, V. Coelho, P. Bernardes, A. Proença, "High Fidelity Walkthroughs in Archaeology Sites"; 6th Int. Symp. Virtual Reality, Archaeology and Cultural Heritage (VAST'2005) (short paper); Italy, November 2005

L.P. Santos, A. Proença, "Scheduling Under Conditions of Uncertainty: a Bayesian Approach"; EuroPar'2004: Parallel Processing, LNCS 3149, Springer-Verlag, pp. 222-229, September 2004