Collective communication involves every process in a given communicator; the default communicator containing all available processes is called MPI_COMM_WORLD. When a collective call is made, it must be called by every process in the communicator. Collective communications will not interfere with point-to-point communications, nor will point-to-point communications interfere with collective communications. Collective communications also do not use tags. The send and receive buffers specified in a collective call must match for the call to work, and there is no guarantee that a routine is synchronizing (except for barrier). All of the collective communication operations described here are blocking. These are some things to keep in mind while using collective communication operations.
Collective communication operations are made of the following types:
Barrier Synchronization – Blocks until all processes have reached a synchronization point
Data Movement (or Global Communication) – Broadcast, Scatter, Gather, and All-to-All transmission of data across the communicator.
Collective Operations (or Global Reduction) – One process from the communicator collects data from each process and performs an operation on that data to compute a result.
MPI_Barrier( MPI_Comm comm );
Parameter | Meaning of Parameter |
comm | communicator (handle) |
MPI_Barrier blocks until all processes in the communicator have reached this routine.
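A minimal sketch of MPI_Barrier in use (this example is not from the original text; compile with mpicc and run with mpirun):

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    printf( "Rank %d reached the barrier\n", rank );
    MPI_Barrier( MPI_COMM_WORLD );   /* no rank passes until every rank arrives */
    printf( "Rank %d passed the barrier\n", rank );

    MPI_Finalize();
    return 0;
}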
MPI_Bcast( void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm );
Parameter | Meaning of Parameter |
buffer | starting address of buffer (choice) |
count | number of entries in buffer (integer) |
datatype | datatype of buffer (handle) |
root | rank of broadcast root (integer) |
comm | communicator (handle) |
MPI_Bcast broadcasts a message from the process with rank "root" to all other processes of the group.
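A short sketch of MPI_Bcast, assuming rank 0 is the root and a single integer is broadcast (the value 42 is just an illustration):

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank, value = 0;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    if ( rank == 0 )
        value = 42;                 /* only the root has the data initially */

    /* every rank calls MPI_Bcast with the same root; afterwards all ranks hold 42 */
    MPI_Bcast( &value, 1, MPI_INT, 0, MPI_COMM_WORLD );
    printf( "Rank %d has value %d\n", rank, value );

    MPI_Finalize();
    return 0;
}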
MPI_Scatter( void *sendbuf, int sendcnt, MPI_Datatype sendtype, void *recvbuf, int recvcnt, MPI_Datatype recvtype, int root, MPI_Comm comm );
Parameter | Meaning of Parameter |
sendbuf | address of send buffer (choice, significant only at root) |
sendcnt | number of elements sent to each process (integer, significant only at root) |
sendtype | data type of send buffer elements (significant only at root) (handle) |
recvbuf | address of receive buffer (choice) |
recvcnt | number of elements in receive buffer (integer) |
recvtype | data type of receive buffer elements (handle) |
root | rank of sending process (integer) |
comm | communicator (handle) |
MPI_Scatter splits the send buffer on the root and sends one portion of it to each task in the group (the root receives a portion as well).
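A minimal sketch of MPI_Scatter, assuming the root (rank 0) prepares one integer per process; the buffer contents are arbitrary example values:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] )
{
    int rank, size, mine;
    int *sendbuf = NULL;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    if ( rank == 0 ) {
        /* the root fills one integer per process */
        sendbuf = malloc( size * sizeof(int) );
        for ( int i = 0; i < size; i++ )
            sendbuf[i] = i * 10;
    }

    /* each rank (including the root) receives its own one-element slice */
    MPI_Scatter( sendbuf, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD );
    printf( "Rank %d received %d\n", rank, mine );

    if ( rank == 0 ) free( sendbuf );
    MPI_Finalize();
    return 0;
}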
MPI_Gather( void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm );
Parameter | Meaning of Parameter |
sendbuf | starting address of send buffer (choice) |
sendcount | number of elements in send buffer (integer) |
sendtype | data type of send buffer elements (handle) |
recvbuf | address of receive buffer (choice, significant only at root) |
recvcount | number of elements for any single receive (integer, significant only at root) |
recvtype | data type of receive buffer elements (significant only at root) (handle) |
root | rank of receiving process (integer) |
comm | communicator (handle) |
MPI_Gather gathers together values from a group of processes.
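A short sketch of MPI_Gather, assuming each rank contributes a single integer and rank 0 is the root that collects them:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] )
{
    int rank, size, mine;
    int *recvbuf = NULL;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    mine = rank * rank;             /* each rank contributes one value */
    if ( rank == 0 )
        recvbuf = malloc( size * sizeof(int) );   /* only the root needs the full buffer */

    MPI_Gather( &mine, 1, MPI_INT, recvbuf, 1, MPI_INT, 0, MPI_COMM_WORLD );

    if ( rank == 0 ) {
        for ( int i = 0; i < size; i++ )
            printf( "Value from rank %d: %d\n", i, recvbuf[i] );
        free( recvbuf );
    }
    MPI_Finalize();
    return 0;
}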
MPI_Allgather( void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm );
Parameter | Meaning of Parameter |
sendbuf | starting address of send buffer (choice) |
sendcount | number of elements in send buffer (integer) |
sendtype | data type of send buffer elements (handle) |
recvbuf | address of receive buffer (choice) |
recvcount | number of elements received from any process (integer) |
recvtype | data type of receive buffer elements (handle) |
comm | communicator (handle) |
MPI_Allgather gathers data from all tasks and distributes it to all tasks.
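A minimal sketch of MPI_Allgather; unlike MPI_Gather, every rank allocates the full receive buffer because every rank ends up with the complete result:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] )
{
    int rank, size, mine;
    int *recvbuf;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    mine = rank + 1;
    recvbuf = malloc( size * sizeof(int) );   /* every rank needs the full buffer */

    /* like MPI_Gather, but every rank receives the complete gathered array */
    MPI_Allgather( &mine, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD );
    printf( "Rank %d sees: first=%d last=%d\n", rank, recvbuf[0], recvbuf[size-1] );

    free( recvbuf );
    MPI_Finalize();
    return 0;
}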
MPI_Reduce( void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm );
Parameter | Meaning of Parameter |
sendbuf | address of send buffer (choice) |
recvbuf | address of receive buffer (choice, significant only at root) |
count | number of elements in send buffer (integer) |
datatype | data type of elements in send buffer (handle) |
op | reduction operation (handle) |
root | rank of root process (integer) |
comm | communicator (handle) |
MPI_Reduce reduces values on all processes to a single value on the root. The example illustration uses the operation MPI_SUM. The op handle can be one of many predefined operations provided by MPI, or it can be a user-defined operation. The following table lists the predefined reduction operations available for you to use; a short example using MPI_SUM follows the table:
MPI Reduction Operation | Meaning | C Data Types |
MPI_MAX | Maximum | integer, float |
MPI_MIN | Minimum | integer, float |
MPI_SUM | Sum | integer, float |
MPI_PROD | Product | integer, float |
MPI_LAND | Logical AND | integer |
MPI_BAND | Bitwise AND | integer, MPI_BYTE |
MPI_LOR | Logical OR | integer |
MPI_BOR | Bitwise OR | integer, MPI_BYTE |
MPI_LXOR | Logical XOR | integer |
MPI_BXOR | Bitwise XOR | integer, MPI_BYTE |
MPI_MAXLOC | Maximum Value and Location | float, double and long double |
MPI_MINLOC | Minimum Value and Location | float, double and long double |
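A minimal sketch of MPI_Reduce with MPI_SUM, assuming each rank contributes one integer and rank 0 is the root that receives the total:

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank, size, mine, sum;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    mine = rank + 1;                /* ranks contribute 1, 2, ..., size */

    /* the root (rank 0) receives the sum of all contributions */
    MPI_Reduce( &mine, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD );

    if ( rank == 0 )
        printf( "Sum of 1..%d is %d\n", size, sum );

    MPI_Finalize();
    return 0;
}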
MPI_Allreduce( void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm );
Parameter | Meaning of Parameter |
sendbuf | address of send buffer (choice) |
recvbuf | starting address of receive buffer (choice) |
count | number of elements in send buffer (integer) |
datatype | data type of elements in send buffer (handle) |
op | operation (handle) |
comm | communicator (handle) |
MPI_Allreduce combines values from all processes and distributes the result back to all processes. The illustration uses the operation MPI_SUM.
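A short sketch of MPI_Allreduce with MPI_SUM; it behaves like the MPI_Reduce example above except that every rank, not just the root, receives the total:

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank, mine, total;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    mine = rank + 1;

    /* like MPI_Reduce with MPI_SUM, but every rank receives the total */
    MPI_Allreduce( &mine, &total, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD );
    printf( "Rank %d sees total %d\n", rank, total );

    MPI_Finalize();
    return 0;
}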
MPI_Reduce_scatter( void *sendbuf, void *recvbuf, int *recvcounts, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm );
Parameter | Meaning of Parameter |
sendbuf | address of send buffer (choice) |
recvbuf | starting address of receive buffer (choice) |
recvcounts | integer array specifying the number of elements in result distributed to each process. Array must be identical on all calling processes. |
datatype | data type of elements of input buffer (handle) |
op | operation (handle) |
comm | communicator (handle) |
MPI_Reduce_scatter combines values and scatters the results back across the processes. The illustration uses the operation MPI_SUM.
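A minimal sketch of MPI_Reduce_scatter with MPI_SUM, assuming each rank contributes an array with one element destined for every rank (recvcounts is all ones, so each rank receives one element of the summed result):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] )
{
    int rank, size, result;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    /* each rank contributes an array of length size; element i is destined for rank i */
    int *sendbuf = malloc( size * sizeof(int) );
    int *recvcounts = malloc( size * sizeof(int) );
    for ( int i = 0; i < size; i++ ) {
        sendbuf[i] = rank + i;
        recvcounts[i] = 1;          /* every rank receives one element of the summed result */
    }

    MPI_Reduce_scatter( sendbuf, &result, recvcounts, MPI_INT, MPI_SUM, MPI_COMM_WORLD );
    printf( "Rank %d received reduced element %d\n", rank, result );

    free( sendbuf );
    free( recvcounts );
    MPI_Finalize();
    return 0;
}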
MPI_Scan( void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm );
Parameter | Meaning of Parameter |
sendbuf | starting address of send buffer (choice) |
recvbuf | starting address of receive buffer (choice) |
count | number of elements in input buffer (integer) |
datatype | data type of elements of input buffer (handle) |
op | operation (handle) |
comm | communicator (handle) |
MPI_Scan computes the scan (partial reductions) of data on a collection of processes. The illustration uses the operation MPI_SUM.
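A short sketch of MPI_Scan with MPI_SUM; rank i receives the sum of the contributions from ranks 0 through i (an inclusive prefix sum):

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank, mine, partial;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    mine = rank + 1;

    /* rank i receives the sum of contributions from ranks 0..i */
    MPI_Scan( &mine, &partial, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD );
    printf( "Rank %d partial sum = %d\n", rank, partial );

    MPI_Finalize();
    return 0;
}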
MPI gives us the ability to create our own operations that can be used with the MPI_Reduce or MPI_Scan calls. To do this you must declare a C function using this prototype:
typedef void MPI_User_function( void *invec, void *inoutvec, int *len, MPI_Datatype *datatype );
Then you need to create a handle to the function using MPI_Op_create; here are the details of the call:
Make sure you declare a variable for the handle before calling MPI_Op_create. EX:
MPI_Op myOp;
MPI_Op_create( MPI_User_function *function, int commute, MPI_Op *myOp );
Parameter | Meaning of Parameter |
function | user defined function (function) |
commute | true if commutative; false otherwise. |
myOp | operation (handle) |
MPI_Op_create creates a user-defined combination function handle.
This new handle (the op variable) can now be used in calls to MPI_Reduce and MPI_Scan; a short sketch combining MPI_Op_create, MPI_Reduce, and MPI_Op_free follows the description of MPI_Op_free below.
MPI_Op_free( MPI_Op *op );
Parameter | Meaning of Parameter |
op | operation (handle) |
MPI_Op_free frees a user-defined combination function handle.
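A minimal sketch putting these pieces together. The function name abs_max and the example data are my own illustration, not from the original text: it defines a commutative user operation (element-wise maximum of absolute values), registers it with MPI_Op_create, uses it in MPI_Reduce, and frees the handle:

#include <mpi.h>
#include <stdio.h>

/* User-defined reduction: element-wise maximum of absolute values.
   The signature must match MPI_User_function; the datatype argument is
   ignored here because we only ever pass MPI_INT. */
void abs_max( void *invec, void *inoutvec, int *len, MPI_Datatype *datatype )
{
    int *in = (int *) invec;
    int *inout = (int *) inoutvec;
    for ( int i = 0; i < *len; i++ ) {
        int a = in[i]    < 0 ? -in[i]    : in[i];
        int b = inout[i] < 0 ? -inout[i] : inout[i];
        inout[i] = ( a > b ) ? a : b;   /* combine into the in/out buffer */
    }
}

int main( int argc, char *argv[] )
{
    int rank, mine, result;
    MPI_Op myOp;                               /* declare the handle first */
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    mine = ( rank % 2 == 0 ) ? rank : -rank;   /* a mix of signs for illustration */

    MPI_Op_create( abs_max, 1, &myOp );        /* 1 = the operation is commutative */
    MPI_Reduce( &mine, &result, 1, MPI_INT, myOp, 0, MPI_COMM_WORLD );
    if ( rank == 0 )
        printf( "Largest absolute value: %d\n", result );

    MPI_Op_free( &myOp );                      /* release the handle when done */
    MPI_Finalize();
    return 0;
}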