cds_rb (which stands for "reduce-broadcast") effectively deqs a region from a specified cell in each process in the from-set, reduces those regions into a single region (by applying a commutative operation of the programmer's choice to the first item in each region to produce the first item in the result region, the second item in each region to produce the second, etc.), and then enqs that resulting region into a specified cell in each process in the to-set. Note that if the from-set contains only a single process, then cds_rb is effectively a broadcast (i.e. it just moves a single region from one process into a cell in one or more other processes), and if the to-set contains a single process, then cds_rb is effectively a reduction (i.e. it takes a region from one or more processes, reduces them into one region, and places that into a cell in the single process of the to-set).
cds_tpose (which stands for "transpose") serves to gather and/or scatter data from a specified cell in the from-set to a specified cell in the to-set. Specifically, if the from-set contains f process IDs, and the to-set contains t process IDs, then cds_tpose effectively deqs a region containing t data items from a specified cell in each process in the from-set, and enqs a region containing f data items into a specified cell in each process in the to-set. The ith data item in the region enq'd into the jth process is equal to the jth data item in the region deq'd from the ith process. Note that, if the from-set contains only a single process, then this scatters a single region over the to-set processes, and if the to-set contains only a single process, this gathers data from the from-set processes into a single region.
Both cds_rb and cds_tpose may be expensive to start up, since each may spend a significant amount of time analyzing the from-set and to-set to determine the most efficient way to perform the operation. Partially for this reason, each of these operations takes a replicator argument--i.e. an integer count, which can be infinity--and the operation is performed that number of times before completing. One iteration does not necessarily complete before the next begins, but regions will always be deq'd or enq'd from any particular cell in the proper order. The iterator has the effect of converting some cells to full-time use by the many-to-many operations for some period, thereby making a broadcast or reduction during that period as simple as enqing the inputs and deqing the results.
cds_enqm (rgid,nprocs,procs,cntxt,cell,perm)nprocs and procs are the number of processes and a vector of process IDS. iter is the number of times to perform the operation (and may be CDS_INFIN). insetn, fmset, fmcntxt, and fmcell are, respectively, the number of elements in the from-set, the from-set itself (as a vector of process IDs), and the context and cell to obtain the input regions. tosetn, toset, tocntxt, and tocell are the corresponding values for the to-set. nitems is the number of items to be read and written in each region, type is the data type of each of those elements, and reduct is the reduction to perform from the list (very similar to MPI): maximum (CDS_MAX), minimum (CDS_MIN), sum (CDS_SUM), product (CDS_PROD), bit-wise and (CDS_AND), bit-wise or (CDS_OR), bit-wise xor (CDS_XOR), max value with location (CDS_MAXLOC), and min value with location (CDS_MINLOC). ithrd is the thread ID for the non-blocking versions.
cds_writem(rgid,nprocs,procs,cntxt,cell,perm,waitflg)
cds_rb(iter,nitems,type,reduct,
fmsetn,fmset,fmcntxt,fmcell,
tosetn,toset,tocntxt,tocell)
cds_tpose(iter,nbytes,
fmsetn,fmset,fmcntxt,fmcell,
tosetn,toset,tocntxt,tocell)
cds_irb(&ithrd,iter,nitems,type,reduct,
fmsetn,fmset,fmcntxt,fmcell,
tosetn,toset,tocntxt,tocell)
cds_itpose(&ithrd,iter,nbytes,
fmsetn,fmset,fmcntxt,fmcell,
tosetn,toset,tocntxt,tocell)
In cases where the from-set is identical to the to-set, it is possible
(and recommended) that tosetn be set to CDS_SAME,
and toset be specified as zero (i.e. null pointer).
In other cases, where the from-set and to-set do not intersect at all,
the caller can make the call slightly more efficient by negating the value
of tosetn.
Copyright 2000 © elepar All rights reserved