Be Driven
  Device Drivers in the BeOS

    Locking Stories

Narratives on locking, and what you can and cannot do.
Newsgroups only. (Be sure to refer to the Be articles.)

A Locking Example

"
...
Here's a real-world situation of when one might need to synchronize with
an interrupt. Consider a driver for a piece of hardware that fills a
buffer. The driver maintains a count of how many bytes are in the buffer
-- hardware fills the buffer, thread-level code empties the buffer. Thus,
a thread-level program must decrement the bytecount and an interrupt will
increment it.

volatile std::uint32_t gByteCount;


void LowerByteCount(std::uint32_t howMuch) 
{ 
    gByteCount = gByteCount - howMuch; 
} 

 
void HandleInterrupt() 
{ 
    if (<it's the correct interrupt>) 
    { 
        std::uint32_t howMuchToAdd = <figure out how much to add>;
        // This may involve reading the hardware, for example
        gByteCount = gByteCount + howMuchToAdd;
        <clear the interrupt> 
    } 
}

Nice and neat, huh? But there's a race condition; it's small, but it
exists (and this is an example anyway). Consider the situation where the
byte count is being lowered when an interrupt comes in. LowerByteCount
has already read the old value. The interrupt handler reads the byte
count, increments it, writes it back, and exits. Then LowerByteCount
resumes, decrements the stale value it read earlier, and writes that back.

gByteCount is now incorrect because it ignores the intervening interrupt.
In this example, gByteCount is the shared resource to which access must be
synchronized.

The obvious solution to the problem is to use atomic_add, which will
absolutely work and is the best answer. However, because this is a
teaching example, we'll use locks to demonstrate how they might work.

Traditional synchronization techniques would use semaphores (N.B.
benaphores are equally useful and can be substituted for semaphores in the
following examples and discussion) and would look like this:

sem_id gSemaphore; 
volatile std::uint32_t gByteCount; 

void LowerByteCount(std::uint32_t howMuch) 
{ 
    acquire_sem(gSemaphore); 
    gByteCount = gByteCount - howMuch; 
    release_sem(gSemaphore); 
}

void HandleInterrupt() 
{ 
    if (<it's the correct interrupt>) 
    { 
        acquire_sem(gSemaphore); // !!! may block -- illegal here
        std::uint32_t howMuchToAdd = <figure out how much to add>;
        // This may involve reading the hardware, for example
        gByteCount = gByteCount + howMuchToAdd;
        release_sem(gSemaphore); // !!!
        <clear the interrupt> 
    } 
} 

The problem with this solution is that acquire_sem cannot be called from interrupt code because it may block (see definitions above). If our shared resource were only shared with non-interrupt code, semaphores would be the best solution. However, if you must synchronize with an interrupt, you cannot use semaphores ever.

Period.

The first approximation of a solution is to disable interrupts. As follows:

volatile std::uint32_t gByteCount; 


void LowerByteCount(std::uint32_t howMuch) 
{ 
    const cpu_status theStatus = disable_interrupts(); 
    gByteCount = gByteCount - howMuch; 
    restore_interrupts(theStatus); 
} 

void  HandleInterrupt() 
{ 
    if (<it's the correct interrupt>) 
    { 
        std::uint32_t howMuchToAdd = <figure out how much to add>;
        // This may involve reading the hardware, for example
        gByteCount = gByteCount + howMuchToAdd;
        <clear the interrupt> 
    } 
} 

This code is almost correct. After disabling interrupts, LowerByteCount is guaranteed that an interrupt cannot come in and change gByteCount under its nose. Because disable_interrupts also disables pre-emption (which is, really, driven by an interrupt), the thread knows that it won't be pre-empted. Thus, we've achieved mutual exclusion. Similarly, when the interrupt code is entered, we know that we have exclusive access to the shared resource.

The problem with this solution is that it is not MP-friendly.

Consider the situation where LowerByteCount is executing on one CPU when the interrupt fires and is dispatched on another CPU. disable_interrupts only disables interrupts on the current CPU, so we still have the race condition. Spinlocks solve this problem by providing a non-blocking way to achieve mutual exclusion between multiple CPUs.

spinlock gSpinlock; 
volatile std::uint32_t gByteCount; 


void LowerByteCount(std::uint32_t howMuch) 
{ 
    const cpu_status theStatus = disable_interrupts(); 
    acquire_spinlock(&gSpinlock); 
    gByteCount = gByteCount - howMuch; 
    release_spinlock(&gSpinlock); 
    restore_interrupts(theStatus); 
} 

void HandleInterrupt() 
{ 
    if (<it's the correct interrupt>) 
    { 
        acquire_spinlock(&gSpinlock); 
        std::uint32_t howMuchToAdd = <figure out how much to add>;
        // This may involve reading the hardware, for example
        gByteCount = gByteCount + howMuchToAdd;
        release_spinlock(&gSpinlock); 
        <clear the interrupt> 
    } 
} 

Now we are completely safe, even in an MP situation. The one remaining issue is that thread-level code should hold the spinlock only for a very short time, because while it holds the lock it is increasing the interrupt latency of your device and, potentially, of the entire system. This is a design issue to which driver writers need to be sensitive.

So, the high points:

  1. Semaphores are completely reasonable for synchronizing access to
    shared resources between any non-interrupt level code anywhere.

  2. Disabling interrupts and acquiring a spinlock is the way to provide
    synchronized access to a resource shared with an interrupt.
    "
    Eric Berdahl
    Email to BeDevTalk@be.com

Misconceptions with Locking


"
(P1 runs on CPU1 and P2 on CPU2)
What about the following conditions?


P1 ->Interrupt Started

Start Accessing Shared Memory
Half way through interrupt when ...

P2 ->Application code masks off interrupts in parallel..

Acquires SpinLock
Access Shared Memory


P1 ->Enter Interrupt A
P2 ->Enter Interrupt B
P1 ->Access Shared Memory Pool
P2 ->Access Shared Memory Pool

-
Neither of these can happen if you did your locking correctly.
The interrupt handler must also acquire the spinlock before
accessing the shared objects, exactly for this reason.

So it will look like this, rather (P1 runs on CPU1 and P2 on CPU2):

P1 ->Interrupt Started (interrupts disabled on CPU1)

Acquire spinlock
Start Accessing Shared Memory
Half way through interrupt when ...

P2 ->Application code masks off interrupts in parallel.. (on CPU2 only)

Spins on the spinlock until P1 has finished with the shared memory.

P1->Finish using shared memory

Release spinlock
Return from interrupt handler (interrupts restored on CPU1)

P2->Return from acquire_spinlock

Access Shared Memory
Finish using it
Release spinlock
Restore interrupts on CPU2

"
Jonathan Perret
Email to BeDevTalk@be.com


The Communal Be Documentation Site
1999 - bedriven.miffy.org