
Chapter 4. Program development 35
4.3.3 BLACS
Both BLAS and LAPACK were developed for single-processor computations.
To solve linear algebra problems on parallel machines, where matrices
can be distributed over several processors, data must be communicated
between the processors. A special library called BLACS (Basic Linear
Algebra Communication Subprograms) exists for this purpose.
The routines in BLACS can be divided into three classes. First, there
are communication routines for sending and receiving parts of matrices
between two or more processors. Second, there are global reduction
routines in which all processors take part. An example is finding the
element with the largest absolute value in a distributed matrix. Third,
there are a few general support routines for setting up the
communication network.
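The idea behind a global reduction can be sketched without any parallel
library. The following Python fragment is only a simulation under stated
assumptions: plain nested lists stand in for the local blocks held by
each processor, the function names local_absmax and global_absmax are
hypothetical, and no BLACS or message-passing calls are made. In BLACS
itself this kind of combine operation is, to our understanding, provided
by the GAMX2D family of routines.

```python
# Illustrative simulation only: nested lists play the role of the local
# matrix blocks on each processor. Every block is assumed to be non-empty.

def local_absmax(block):
    """Return (magnitude, value, row, col) of the largest-magnitude entry
    in one local block."""
    best = None
    for i, row in enumerate(block):
        for j, value in enumerate(row):
            if best is None or abs(value) > best[0]:
                best = (abs(value), value, i, j)
    return best


def global_absmax(blocks):
    """Combine the per-processor results into (value, owner, row, col),
    where row and col are local indices on the owning processor."""
    winner = None
    for owner, block in enumerate(blocks):
        magnitude, value, i, j = local_absmax(block)
        if winner is None or magnitude > winner[0]:
            winner = (magnitude, value, owner, i, j)
    _, value, owner, i, j = winner
    return value, owner, i, j


# Hypothetical 4 x 2 matrix split row-wise over two "processors".
blocks = [
    [[1.0, -7.5], [0.5, 2.0]],     # rows held by processor 0
    [[3.0, 4.0], [-9.25, 6.0]],    # rows held by processor 1
]
print(global_absmax(blocks))       # prints (-9.25, 1, 1, 0)
```

Each processor first reduces its own block, and only the small local
results are combined. This is why such reductions need far less
communication than gathering the whole matrix on one processor.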
4.3.4 PBLAS and ScaLAPACK
PBLAS (Parallel BLAS) and ScaLAPACK (Scalable LAPACK) are parallelized
versions of BLAS and LAPACK, respectively. The names of the
multiprocessor routines in these libraries are almost the same as those
of the corresponding single-processor routines, except for an initial P
for “parallel”.
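The naming rule can be written down as a one-line helper. This is a
minimal sketch: the function parallel_name is hypothetical and only
manipulates names. It does not check that the resulting routine exists
in a given installation, and the argument lists of the parallel routines
are not the same as those of the serial ones.

```python
def parallel_name(serial_name):
    """Return the PBLAS/ScaLAPACK-style name for a serial routine name
    by prepending the letter P (illustration only)."""
    return "P" + serial_name.upper()


# SGEMM is a BLAS routine and SGESV a LAPACK routine.
for name in ("SGEMM", "SGESV"):
    print(name, "->", parallel_name(name))
```

For the two names above the helper gives PSGEMM and PSGESV.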
For some obscure reason, Cray has not documented the PBLAS library at
all, apart from a short note on the ScaLAPACK manual pages stating that
PBLAS is supported. For ScaLAPACK the situation is somewhat better,
since there is a manual page for every available routine. On the other
hand, the current implementation of ScaLAPACK on the T3E does not
support all routines available in the public domain version. See
Table 4.1 for the existing ScaLAPACK routines.
4.3.5 Details
All of the libraries mentioned above follow a naming convention: any
subroutine operating on single precision floating point numbers has a
name beginning with S. Correspondingly, routines accepting double
precision floating point numbers as arguments have a name beginning
with D (not counting the letter P of the parallel versions, which
precedes the actual name).
However, single precision floating point numbers on the T3E occupy
8 bytes, which on most other computers corresponds to double precision.
When you port double precision code from such a machine, you should
therefore change not only the type definitions of the variables but
also all calls to BLAS etc. to their single precision forms. For
example, a call to DGEMM becomes a call to SGEMM. Note that on the T3E
there are no BLAS routines starting with the letter D.
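When porting a larger code, the renaming can be checked mechanically.
The helper below is a sketch under simplifying assumptions: the name
t3e_name is hypothetical, and only names whose precision letter comes
first (optionally after the P of a parallel routine) are handled.
Routines that carry the precision letter elsewhere in the name would
need separate treatment.

```python
def t3e_name(name):
    """Map a double precision routine name to its single precision form.

    Simplification: the precision letter is assumed to be the first
    letter of the name, optionally preceded by the P of a parallel
    routine. Names that are already single precision are returned
    unchanged (in upper case).
    """
    name = name.upper()
    prefix = ""
    # Strip the P of a parallel routine before looking at the precision.
    if name.startswith("P") and len(name) > 1 and name[1] in "SD":
        prefix, name = "P", name[1:]
    # Replace the double precision letter D by the single precision S.
    if name.startswith("D"):
        name = "S" + name[1:]
    return prefix + name


for old in ("DGEMM", "DPOTRF", "PDGESV", "SGEMM"):
    print(old, "->", t3e_name(old))
```

The variable declarations must be changed separately. The helper only
covers the routine names.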