
Chapter 7. Interprocess communication 75
responding call would be
CALL SHMEM_type_op_TO_ALL(target, source, nreduce, &
pe_start, logpe_stride, pe_size, pwrk, psync)
Here type is one of {INT8, INT4, REAL8, REAL4, COMP8, COMP4}, and
op is one of the operations already mentioned.
The call above applies the reduction operation op to data of type type
stored at address source in the memory of every PE involved. The result
is stored at address target. The argument nreduce gives the number of
consecutive data items on which the reduction is performed.
Suppose that we have two PEs, each of which stores a vector of 4
integer elements. If we call shmem_int8_sum_to_all with nreduce = 1,
the result is a single integer equal to the sum of the first elements
of the two vectors. If nreduce equals 4, we get an array of 4 integers,
each element of which is the sum of the corresponding elements
in the original vectors. Thus, if the total sum of all elements in both
vectors is to be calculated, one must first call a SHMEM routine to form
an array of partial sums, and then finish the calculation by summing up
the elements of the resulting array with, e.g., a BLAS routine.
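The two-PE example above can be modelled in a few lines of plain Python.
This is only a sketch of the element-wise semantics, not a call to the
SHMEM library: the helper name sum_to_all and the sample vectors are
hypothetical, and each PE's memory is represented by an ordinary list.

```python
def sum_to_all(vectors, nreduce):
    """Model an element-wise sum over the first nreduce items of each PE's vector.

    vectors holds one list per PE (hypothetical stand-in for the source array).
    """
    return [sum(v[i] for v in vectors) for i in range(nreduce)]


# Hypothetical data: two PEs, each holding a vector of 4 integers.
pe0 = [1, 2, 3, 4]
pe1 = [10, 20, 30, 40]

first_only = sum_to_all([pe0, pe1], 1)  # one value: sum of the first elements
partial = sum_to_all([pe0, pe1], 4)     # array of 4 partial sums
total = sum(partial)                    # finishing step done locally

print(first_only, partial, total)
```

The last step corresponds to the local summation that the text suggests
doing with, e.g., a BLAS routine.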
The triple pe_start, logpe_stride, pe_size defines the so-called
active set, which consists of the PEs taking part in the reduction
operation. The value of pe_start is the number of the first PE in
the active set. The value of logpe_stride is the logarithm (in base 2) of
the stride between the PEs, and pe_size is the number of PEs in the
active set. Thus {pe_start, logpe_stride, pe_size} = {0, 1, 5} gives a
stride of 2^1 = 2, so the active set consists of the PEs 0, 2, 4, 6, 8.
As another example, {pe_start, logpe_stride, pe_size} = {0, 0, n} gives
a stride of 2^0 = 1, so the active set is the PEs {0, 1, ..., n - 1}.
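The rule for building the active set can be written out as a short
Python sketch. The function name active_set is hypothetical and is not
part of SHMEM; it only enumerates the PE numbers implied by the triple.

```python
def active_set(pe_start, logpe_stride, pe_size):
    """Return the PE numbers selected by (pe_start, logpe_stride, pe_size)."""
    stride = 2 ** logpe_stride  # logpe_stride is the base-2 logarithm of the stride
    return [pe_start + i * stride for i in range(pe_size)]


every_second = active_set(0, 1, 5)  # the first example from the text
first_four = active_set(0, 0, 4)    # the second example with n = 4

print(every_second)
print(first_four)
```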
Note: all the PEs in an active set, and only these, should call a
collective routine!
Finally, pwrk and psync are symmetric work arrays. The argument
psync is of integer type and of size shmem_reduce_sync_size. This
constant is defined in the files mpp/shmem.h and mpp/shmem.fh; the
appropriate one should be included at the beginning of any code that
uses the SHMEM library. The array psync must be initialized so that the
value of each entry is equal to shmem_sync_value. After initialization
it is a good idea to call a barrier routine to guarantee
synchronization before using psync.
The argument pwrk should have the same data type as the data handled
by the reduction routine, and its size should be
max(nreduce/2 + 1, shmem_reduce_min_wrkdata_size).
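The sizing and initialization rules for the two work arrays can be
illustrated with a small Python model. The function names are
hypothetical, the constants are passed in as parameters because their
actual values come from the SHMEM header files, and nreduce/2 is assumed
here to be integer division.

```python
def pwrk_size(nreduce, reduce_min_wrkdata_size):
    """Size of pwrk: max(nreduce/2 + 1, shmem_reduce_min_wrkdata_size).

    Assumption: nreduce/2 is integer division.
    """
    return max(nreduce // 2 + 1, reduce_min_wrkdata_size)


def make_psync(reduce_sync_size, sync_value):
    """Model of psync: every entry initialized to shmem_sync_value."""
    return [sync_value] * reduce_sync_size


# Placeholder constants for illustration only, NOT the real header values.
MIN_WRKDATA = 8
SYNC_SIZE = 3
SYNC_VALUE = -1

small = pwrk_size(4, MIN_WRKDATA)    # the minimum size dominates
large = pwrk_size(100, MIN_WRKDATA)  # nreduce/2 + 1 dominates
psync = make_psync(SYNC_SIZE, SYNC_VALUE)

print(small, large, psync)
```

For small values of nreduce the minimum work-array size dominates; for
large values the nreduce/2 + 1 term does.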