For programmers familiar with a distributed shared-memory
paradigm, the CDS primitives can be used to mimic a similar style, specifically
one which exhibits release consistency and where each shared memory region
has a home process(or) to which it returns when not being accessed.
This very portable approach is detailed further here, primarily as a stepping
stone for translating existing shared-memory programs to CDS. Once
converted, these programs will often benefit by broadening their use of
other available CDS primitives. In terms of shared-memory terminology,
such broadening can be considered as using multiple locks (and thereby
home locations) for a single shared region and/or a single lock to cover
multiple "versions" of a shared region, thereby obviating some amount of
superfluous communication and/or providing greater potential concurrency.
This exclusive-access paradigm extends easily to one that allows multiple readers but only a single writer, through judicious use of the read, deq, and rgmod primitives. A reader performs a read on the lock (i.e., cell) to access the region, which leaves the region available to other readers and writers, and performs an rgfree when finished. A writer performs a deq on the lock (cell) and an rgmod of the region and, when finished, performs a write back to the lock (cell). Since each process logically gets its own copy of the region, this protocol ensures that a reader does not logically interfere with any other process; once a writer has taken the lock, however, no other reader or writer will successfully obtain it until the writer is done. (See the Physical Model section, below, for a discussion of how to optimize this logical model in various ways based on architectural and algorithmic properties.)
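The reader/writer protocol described above can be modeled in a few lines of C. The sketch below is a toy, single-process model and not the CDS API: `toy_cell`, `toy_read`, `toy_deq`, and `toy_write` are hypothetical names chosen for illustration. The functions return false where the real primitives, called with CDS_BLOCK, would block, and rgfree and rgmod are not modeled because each copy here is just a local buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_REGION_SIZE 16

/* Hypothetical toy model (not the CDS API): a cell holding at most one region. */
typedef struct {
    bool full;                    /* true while the region sits in the cell */
    char data[TOY_REGION_SIZE];   /* contents of the region */
} toy_cell;

/* Reader acquire: copy the region out and leave it in the cell. */
bool toy_read(const toy_cell *cell, char *copy)
{
    if (!cell->full) {
        return false;             /* a writer currently holds the region */
    }
    memcpy(copy, cell->data, TOY_REGION_SIZE);
    return true;
}

/* Writer acquire: copy the region out and remove it from the cell. */
bool toy_deq(toy_cell *cell, char *copy)
{
    if (!cell->full) {
        return false;             /* another writer already took it */
    }
    memcpy(copy, cell->data, TOY_REGION_SIZE);
    cell->full = false;
    return true;
}

/* Writer release: put the (possibly modified) region back into the cell. */
void toy_write(toy_cell *cell, const char *copy)
{
    assert(!cell->full);          /* only the holder of the region may release it */
    memcpy(cell->data, copy, TOY_REGION_SIZE);
    cell->full = true;
}
```

In this model, any number of `toy_read` calls succeed while the region sits in the cell. After a `toy_deq`, both `toy_read` and `toy_deq` fail until `toy_write` returns the region.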
To facilitate the use of cells in this way, CDS provides mnemonic locking routines that are actually semantically equivalent to pre-existing CDS primitives, as follows:
The use of these routines is illustrated here.
To allocate a region (from the local comm heap) and assign it to a lock (i.e. a comm cell) on process proc in context cntxt:
    cds_rgalloc(&rgid,regionsize);
    (Initialize the region, if desired)
    cds_rlswl(rgid,proc,cntxt,lock)
To block until a write lock to the region is acquired:
    cds_acqwl(&rgid,proc,cntxt,lock,CDS_BLOCK,0)
which translates into
    cds_deq(&rgid,proc,cntxt,lock,CDS_PWRIT,CDS_BLOCK)
Note that acqwl contains an extra last argument. If it is non-zero, it is used as the waitflg argument on an explicit rgmod. That is, if the last argument had been 1 rather than 0, above, then it would have translated into:
    cds_deq(&rgid,proc,cntxt,lock,CDS_PREAD,CDS_BLOCK)
    cds_rgmod(rgid,1)
(The value of this extra argument actually has no logical effect on the behavior of the program, and the potential efficiency differences are discussed under the Physical Model of the basic primitives.)
To release a write lock to the region:
    cds_rlswl(rgid,proc,cntxt,lock)
To block until a read lock to the region is acquired:
    cds_acqrl(&rgid,proc,cntxt,lock,CDS_BLOCK)
To release a read lock to the region:
    cds_rlsrl(rgid,proc,cntxt,lock)
(Note that all arguments to rlsrl except the first are ignored, since rgfree takes only the region id. The other arguments are only to provide symmetry with rlswl.)
To convert a write lock to a read lock:
    cds_wl2rl(rgid,proc,cntxt,lock)
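The effect of acqwl's extra last argument, as described above, can be sketched with a small trace. This does not call CDS. `toy_trace`, `toy_log`, and `toy_acqwl` are hypothetical stand-ins that only record which primitive the text says would be invoked, and the argument name `modflg` is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace buffer: records the names of the primitives "called". */
typedef struct {
    char log[64];
} toy_trace;

/* Append one call name to the trace, separating entries with ';'. */
void toy_log(toy_trace *t, const char *call)
{
    assert(strlen(t->log) + strlen(call) + 2 < sizeof t->log);
    if (t->log[0] != '\0') {
        strcat(t->log, ";");
    }
    strcat(t->log, call);
}

/* Mirrors the translation described in the text:
 *   modflg == 0  ->  deq with CDS_PWRIT
 *   modflg != 0  ->  deq with CDS_PREAD, then rgmod with modflg as waitflg */
void toy_acqwl(toy_trace *t, int modflg)
{
    if (modflg == 0) {
        toy_log(t, "deq(CDS_PWRIT)");
    } else {
        char buf[32];
        toy_log(t, "deq(CDS_PREAD)");
        snprintf(buf, sizeof buf, "rgmod(%d)", modflg);
        toy_log(t, buf);
    }
}
```

Either path leaves the caller holding a writable region. Only the sequence of underlying primitives differs, which matches the statement that the argument has no logical effect on the program.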
Certainly, the "lock" (i.e., cell) used for a logical shared memory region should be in or near the process or processes where that region will be accessed most frequently. In fact, by using multiple "locks" (i.e., cells), located in the different processes where the region will be used, and always "unlocking" the region in the process where it will be used next, the region will effectively be predictively forwarded to its next destination, helping to hide latency (if any). (This approach is not always possible, depending upon the nature of the algorithm being programmed.)
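As a rough illustration of why forwarding helps, the sketch below counts how many acquires find the region in another process and would therefore wait for a transfer. It is a simplified cost model and an assumption of this example, not a measurement of CDS: `toy_remote_acquires` is a hypothetical function, processes are plain integers, and the sequence of users is assumed to be known in advance.

```c
#include <assert.h>

/* Count the acquires that find the region in some other process.
 *   users   : process ids, in the order they access the region
 *   n       : number of accesses
 *   home    : process whose cell initially holds the region
 *   forward : nonzero to release into the next user's cell,
 *             zero to always release back into the home cell */
int toy_remote_acquires(const int *users, int n, int home, int forward)
{
    int where = home;   /* process whose cell currently holds the region */
    int remote = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (users[i] != where) {
            remote++;   /* this acquire waits for the region to arrive */
        }
        /* Release: forward to the next user if known, otherwise go home. */
        where = (forward && i + 1 < n) ? users[i + 1] : home;
    }
    return remote;
}
```

In this model, the transfer still happens when forwarding is used, but it is started by the release rather than by the next acquire, so it can overlap with other work.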
The more general form of cells, which may contain multiple regions,
can (in some cases) be regarded as simply queuing multiple versions of
a shared region (and therefore queuing the lock). Again, the utility
of such an interpretation will depend upon the algorithm being programmed.
In any case, the basic primitives (rather than these shared memory mnemonics)
should probably be used whenever dealing with cells that might contain
multiple regions.
Copyright 2000 © elepar All rights reserved