Collective communication operations: experimental results vs. theory

Collective communication operations (CCOs) are among the most powerful tools for parallel processing on distributed-memory architectures. On the theoretical side, a major effort has gone into designing optimal algorithms for these operations, especially for massively parallel processors (MPPs). However, despite the increasing availability of MPPs, only a few limited experimental checks of the different theories exist, so their real value is hard to assess. This paper addresses that gap for the most common CCOs, considering practical algorithms that can be included in a generic communication library. The main result is a new algorithm for building a quasi-optimal broadcast tree that is much simpler than, and as efficient as, previously available algorithms. To investigate the advantages and drawbacks of the proposed algorithms, a large set of experimental data has been collected on an IBM SP2 parallel system. The data demonstrate the efficiency of the approach in a number of interesting cases. Finally, all the experimental results are related to the model used in designing the algorithms.

Bernaschi M;Iannello G

1998-01-01

Year

1998

Appears in the categories:

1.1 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12610/2572
