Friday, February 4, 2011

Volatile Keyword and Memory Barriers

The volatile keyword instructs the compiler to generate an acquire-fence on every read from that field, and a release-fence on every write to that field. An acquire-fence prevents other reads/writes from being moved before the fence; a release-fence prevents other reads/writes from being moved after the fence.
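
As a concrete illustration, here is a minimal sketch of the classic flag-plus-data pattern (the class and field names are mine, not taken from any library): the release-fence on the volatile write keeps the write to _answer from moving after the flag, and the acquire-fence on the volatile read keeps the read of _answer from moving before the flag check.

using System;

class Completion
{
  int _answer;                      // ordinary field
  volatile bool _complete;          // volatile flag

  void Produce()                    // executed on one thread
  {
    _answer = 123;                  // ordinary write
    _complete = true;               // volatile write (release-fence): the write
                                    // to _answer cannot move after this line
  }

  void Consume()                    // executed on another thread
  {
    if (_complete)                  // volatile read (acquire-fence): the read of
      Console.WriteLine(_answer);   // _answer cannot move before the flag check,
  }                                 // so it prints 123 whenever the flag reads true
}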

Intel’s x86 and x64 processors always apply acquire-fences to reads and release-fences to writes, whether or not you use the volatile keyword, so the keyword has no effect on the hardware if you’re running on these processors. However, volatile does have an effect on optimizations performed by the compiler and the CLR, and it matters on processors with weaker memory models such as Itanium. This means that you cannot be more relaxed by virtue of your clients running a particular type of CPU.
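
To see why the compiler/CLR half still matters on x86, consider this sketch (the names are illustrative): without volatile, the JIT is allowed to hoist the read of _stop out of the loop and cache it in a register, so the worker thread may spin forever even after the flag is set; marking the field volatile forbids that optimization.

using System;
using System.Threading;

class Worker
{
  volatile bool _stop;              // without volatile, the loop below may never exit

  void Run()
  {
    var worker = new Thread(() =>
    {
      while (!_stop)                // volatile forces a fresh read on every iteration
        ;                           // spin
      Console.WriteLine("Worker stopped.");
    });
    worker.Start();

    Thread.Sleep(1000);
    _stop = true;                   // volatile write becomes visible to the worker
    worker.Join();
  }
}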

The effect of applying volatile to fields can be summarized as follows:

First instruction    Second instruction    Can they be swapped?
Read                 Read                  No
Read                 Write                 No
Write                Write                 No (the CLR ensures that write-write operations are never swapped, even without the volatile keyword)
Write                Read                  Yes!


Notice that applying volatile doesn’t prevent a write followed by a read from being swapped, and this can create brainteasers. Joe Duffy illustrates the problem well with the following example: if Test1 and Test2 run simultaneously on different threads, it’s possible for a and b to both end up with a value of 0 (despite the use of volatile on both x and y):
class MyVolatile
{
  volatile int x, y;

  void Test1() // Executed on one thread
  {
    x = 1; // Volatile write (release-fence)
    int a = y; // Volatile read (acquire-fence)
  }

  void Test2() // Executed on another thread
  {
    y = 1; // Volatile write (release-fence)
    int b = x; // Volatile read (acquire-fence)
  }
}
The MSDN documentation states that use of the volatile keyword ensures that the most up-to-date value is present in the field at all times. This is incorrect: as Joe Duffy's example above shows, a write followed by a read can be reordered, so a stale value can still be observed.
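
If the Test1/Test2 example needs the stronger guarantee, one option (a sketch of my own, not something the text prescribes) is to put a full fence between each write and the following read with Thread.MemoryBarrier(); unlike volatile's half-fences, a full fence prevents the write-read swap, so a and b can no longer both end up as 0.

using System.Threading;

class MyVolatileFixed
{
  volatile int x, y;

  void Test1() // Executed on one thread
  {
    x = 1;
    Thread.MemoryBarrier(); // Full fence: the read of y below cannot
    int a = y;              // move before the write to x above
  }

  void Test2() // Executed on another thread
  {
    y = 1;
    Thread.MemoryBarrier(); // Full fence: the read of x below cannot
    int b = x;              // move before the write to y above
  }
}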