Wednesday, April 13, 2016

Wait Queues in Kernel

When one or more processes/threads need to sleep until one or more events occur, the kernel provides wait queues to handle the scenario. Wait queues are a higher-level mechanism implemented inside the Linux kernel for putting processes/threads to sleep and waking them up.

Important Points :-

  • A Wait queue for an event is a list of nodes
  • Each node points to the thread/process waiting for that particular event. 
  • An individual node in the list is called a wait queue entry.
  • On the occurrence of the event, one or more processes on the list are woken.
  • After waking up, the processes remove themselves from the list.

A wait queue is defined and initialised in the following way :-
  • wait_queue_head_t my_event;
  • init_waitqueue_head(&my_event);

The same can also be achieved with a single macro, which both declares and initialises the wait queue head:
DECLARE_WAIT_QUEUE_HEAD(my_event);

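Any thread/process that wants to wait for my_event can use either of the following options:
  • wait_event(my_event, (event_present == 1)); // Uninterruptible sleep
  • wait_event_interruptible(my_event, (event_present == 1)); // Interruptible sleep; returns -ERESTARTSYS if a signal arrives first

Note that the first argument is the wait queue head itself, not a pointer to it. The second argument is the condition being waited for: the caller sleeps until the condition evaluates to true, and the condition is re-checked every time the wait queue is woken.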


How to wake up threads/processes that are sleeping on a wait queue:
  • wake_up(&my_event); - wakes up all non-exclusive waiters on the wait queue, plus at most one exclusive waiter. Waiters queued by wait_event() are non-exclusive, so all of them are woken.
  • wake_up_all(&my_event); - wakes up all the processes on the wait queue, including every exclusive waiter.
  • wake_up_interruptible(&my_event); - behaves like wake_up(), but only wakes processes that are in interruptible sleep.
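
The difference between the wake-up variants can be sketched with a small model. This is a user-space toy, not kernel code: struct waiter, struct wait_queue, wq_add(), wq_wake() and the model_* helpers are hypothetical names invented for the illustration. The model assumes only the documented rule that a wake-up visits the waiters in queue order and stops once it has woken the requested number of exclusive waiters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy task states, named after the kernel's but unrelated to its values. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE };

struct waiter {
    enum task_state state; /* how the waiter is sleeping */
    bool exclusive;        /* queued as an exclusive waiter? */
};

#define MAX_WAITERS 8

struct wait_queue {
    struct waiter *entries[MAX_WAITERS];
    size_t count;
};

static void wq_add(struct wait_queue *q, struct waiter *w,
                   enum task_state state, bool exclusive)
{
    assert(q->count < MAX_WAITERS);
    w->state = state;
    w->exclusive = exclusive;
    q->entries[q->count++] = w;
}

/*
 * Wake sleeping waiters in queue order.
 * interruptible_only: skip waiters in uninterruptible sleep.
 * nr_exclusive: stop after waking this many exclusive waiters; 0 = no limit.
 * Returns the number of waiters woken.
 */
static int wq_wake(struct wait_queue *q, bool interruptible_only,
                   int nr_exclusive)
{
    int woken = 0;

    for (size_t i = 0; i < q->count; i++) {
        struct waiter *w = q->entries[i];

        if (w->state == TASK_RUNNING)
            continue;                       /* already awake */
        if (interruptible_only && w->state != TASK_INTERRUPTIBLE)
            continue;                       /* not eligible for this wake-up */
        w->state = TASK_RUNNING;
        woken++;
        if (w->exclusive && --nr_exclusive == 0)
            break;                          /* exclusive quota used up */
    }
    return woken;
}

static int model_wake_up(struct wait_queue *q)
{
    return wq_wake(q, false, 1);
}

static int model_wake_up_all(struct wait_queue *q)
{
    return wq_wake(q, false, 0);
}

static int model_wake_up_interruptible(struct wait_queue *q)
{
    return wq_wake(q, true, 1);
}

/* Queue two non-exclusive waiters followed by two exclusive ones. */
static void demo_fill(struct wait_queue *q, struct waiter w[4])
{
    q->count = 0;
    wq_add(q, &w[0], TASK_INTERRUPTIBLE, false);
    wq_add(q, &w[1], TASK_UNINTERRUPTIBLE, false);
    wq_add(q, &w[2], TASK_INTERRUPTIBLE, true);
    wq_add(q, &w[3], TASK_INTERRUPTIBLE, true);
}
```

With the queue filled by demo_fill(), model_wake_up() wakes the two non-exclusive waiters and the first exclusive one, model_wake_up_all() wakes all four, and model_wake_up_interruptible() skips the waiter in uninterruptible sleep.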

Below is an example from a kernel driver (drivers/mmc/core/core.c):

/*
 * Sleep until the ongoing data request has completed or a new
 * request has been signalled, whichever happens first.
 */
static int mmc_wait_for_data_req_done(struct mmc_host *host,
				      struct mmc_request *mrq,
				      struct mmc_async_req *next_req)
{
	struct mmc_context_info *context_info = &host->context_info;

	while (1) {
		/* The wait queue head is passed by name, not by address */
		wait_event_interruptible(context_info->wait,
				(context_info->is_done_rcv ||
				 context_info->is_new_req));
		.........
		.........
		.........
	}
}


/*
 * Wakes up mmc context, passed as a callback to host controller driver
 */
static void mmc_wait_data_done(struct mmc_request *mrq)
{
	mrq->host->context_info.is_done_rcv = true;
	wake_up_interruptible(&mrq->host->context_info.wait);
}




Tuesday, April 12, 2016

The Classic Lost Wake up in Linux Kernel

The lost wake-up problem arises out of a race condition that occurs while a thread/process goes to conditional sleep. It is a classic problem in operating systems.

Consider two threads/processes, A and B. 
  • Thread/process A consumes entries from a list (the consumer).
  • Thread/process B adds entries to this list (the producer).
  • When the list is empty, thread/process A sleeps.
  • Thread/process B wakes A up whenever it appends anything to the list.

The code looks like this:

 Process/Thread A (Processing the List):

1  spin_lock(&list_lock);
2  if(list_empty(&list_head)) {
3      spin_unlock(&list_lock);  
4      set_current_state(TASK_INTERRUPTIBLE);  
5      schedule();
6      spin_lock(&list_lock);
7  }
8
9  /* Rest of the code ... */
10 spin_unlock(&list_lock);


Process/Thread B( Adding to the List ):

100  spin_lock(&list_lock);
101  list_add_tail(new_node, &list_head);
102  spin_unlock(&list_lock);
103  wake_up_process(processa_task);

This code has one problem, described below.

STEP 1: After thread/process A executes line 3 but before it executes line 4, thread/process B may run on another processor. In other words, A has found the list empty and released the lock, but has not yet set its state to TASK_INTERRUPTIBLE (line 4) when B runs.

STEP 2: B executes all of its instructions, lines 100 through 103. It therefore performs a wake-up on A, which has not yet gone to sleep. Because A is still in the TASK_RUNNING state, the wake-up has no effect.

STEP 3: Now A sets its state to TASK_INTERRUPTIBLE and calls schedule(), so it goes to sleep.

Thus the wake-up from thread/process B is lost.
This is known as the lost wake-up problem: thread/process A sleeps even though there are nodes available on the list.
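
The three steps can be replayed deterministically in a small user-space model. This is only a sketch: struct model, model_wake_up_process() and model_schedule() are hypothetical stand-ins for the task state, wake_up_process() and schedule(), and the interleaving is hard-coded rather than produced by real threads.

```c
#include <assert.h>
#include <stdbool.h>

enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

/* Hypothetical model of consumer A and the shared list. */
struct model {
    enum task_state a_state; /* A's task state */
    bool a_on_runqueue;      /* can the scheduler still pick A? */
    int list_len;            /* number of nodes on the shared list */
};

/* Models wake_up_process(): make the task runnable again. */
static void model_wake_up_process(struct model *m)
{
    m->a_state = TASK_RUNNING;
    m->a_on_runqueue = true;
}

/* Models schedule(): a task that is not TASK_RUNNING leaves the run queue. */
static void model_schedule(struct model *m)
{
    if (m->a_state != TASK_RUNNING)
        m->a_on_runqueue = false;
}

/* Replays STEP 1 to STEP 3; true means A sleeps while work is pending. */
static bool buggy_interleaving_loses_wakeup(void)
{
    struct model m = { TASK_RUNNING, true, 0 };

    bool was_empty = (m.list_len == 0); /* A: line 2, the list is empty */
                                        /* A: line 3, lock released */
    m.list_len++;                       /* B: lines 100-102, node added */
    model_wake_up_process(&m);          /* B: line 103, A is still running */
    if (was_empty) {
        m.a_state = TASK_INTERRUPTIBLE; /* A: line 4, too late */
        model_schedule(&m);             /* A: line 5, A goes to sleep */
    }
    return !m.a_on_runqueue && m.list_len > 0;
}
```

In this model buggy_interleaving_loses_wakeup() returns true: A has left the run queue although the list holds a node, and nothing is left to wake it.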

SOLUTION to the problem :- Thread/process A must be rewritten so that it does not miss any wake-up.

Process A:

1  set_current_state(TASK_INTERRUPTIBLE);
2  spin_lock(&list_lock);
3  if(list_empty(&list_head)) {
4         spin_unlock(&list_lock);
5         schedule();
6         spin_lock(&list_lock);
7  }
8  set_current_state(TASK_RUNNING);
9
10 /* Rest of the code ... */
11 spin_unlock(&list_lock);

How did this solve the problem?

1. Thread/process A now sets its state to TASK_INTERRUPTIBLE (line 1) before it checks the list. If B calls wake_up_process() after A executes line 4 but before A calls schedule(), the wake-up puts A back into the TASK_RUNNING state. Hence the wake-up from thread/process B to thread/process A is not missed.

2. More generally, if a wake-up is delivered by thread/process B at any point after the check for list_empty is made, the state of thread/process A is changed back to TASK_RUNNING. The call to schedule() then does not put thread/process A to sleep; at most it schedules A out for a while, and A stays on the run queue.
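
The effect of the reordering can be checked with a small user-space model that replays the same race against the rewritten consumer. As before this is only a sketch: struct model, model_wake_up_process() and model_schedule() are hypothetical stand-ins, and the interleaving is hard-coded.

```c
#include <assert.h>
#include <stdbool.h>

enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

/* Hypothetical model of consumer A and the shared list. */
struct model {
    enum task_state a_state; /* A's task state */
    bool a_on_runqueue;      /* can the scheduler still pick A? */
    int list_len;            /* number of nodes on the shared list */
};

/* Models wake_up_process(): make the task runnable again. */
static void model_wake_up_process(struct model *m)
{
    m->a_state = TASK_RUNNING;
    m->a_on_runqueue = true;
}

/* Models schedule(): a task that is not TASK_RUNNING leaves the run queue. */
static void model_schedule(struct model *m)
{
    if (m->a_state != TASK_RUNNING)
        m->a_on_runqueue = false;
}

/* B runs between A's unlock and A's schedule(); true means A stays runnable. */
static bool fixed_interleaving_keeps_wakeup(void)
{
    struct model m = { TASK_RUNNING, true, 0 };

    m.a_state = TASK_INTERRUPTIBLE;     /* A: line 1, state set first */
    bool was_empty = (m.list_len == 0); /* A: line 3, the list is empty */
                                        /* A: line 4, lock released */
    m.list_len++;                       /* B: node added */
    model_wake_up_process(&m);          /* B: resets A to TASK_RUNNING */
    if (was_empty)
        model_schedule(&m);             /* A: line 5, A is not dequeued */
    m.a_state = TASK_RUNNING;           /* A: line 8 */
    return m.a_on_runqueue && m.list_len > 0;
}
```

Here fixed_interleaving_keeps_wakeup() returns true: A is still on the run queue and goes on to process the new node. Kernel code usually goes one step further and re-checks the condition in a loop, or uses wait_event_interruptible(), which packages the same ordering.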


Monday, April 4, 2016

Sleeping in the Kernel - Part 1

We will discuss here a simple way of putting a thread to sleep and waking it up in the kernel.

In the Linux kernel, there are scenarios where a thread voluntarily suspends itself to wait for something else to happen. In such a case, the thread should be allowed to sleep for as long as possible and should not be scheduled unnecessarily. The thread sleeps while it waits for events (asynchronous or synchronous). In other words, a process/thread is allowed to voluntarily relinquish the CPU.

One classic scenario is a thread that sleeps while waiting for an interrupt to occur, and is then put back on the scheduler's run queue to do its work.

In Linux, the ready-to-run processes are maintained on a run queue. A ready-to-run process has the state TASK_RUNNING. Once the time slice of a running process is over, the Linux scheduler picks another appropriate process from the run queue and allocates CPU time to that process.

Two process or task states are important when discussing sleep :-

TASK_RUNNING - A process/thread that is either running or ready to run on the run queue.

TASK_INTERRUPTIBLE - The process/task is suspended and waiting for some event to occur. It is woken either by an explicit wake-up from kernel code or by the delivery of a signal.

For all other task states, such as TASK_UNINTERRUPTIBLE, refer to include/linux/sched.h in the kernel source.

Sleep scenario with kernel code

At times, threads want to wait until a certain event occurs, such as a device finishing initialisation, I/O completing or a timer expiring. In such a case, the thread is said to sleep on that event. A thread can go to sleep using the schedule() function. The following code puts the executing thread to sleep:

sleeping_task = current;
set_current_state(TASK_INTERRUPTIBLE);
schedule();
func1();
/* The rest of the code */

sleeping_task stores a pointer to the current task's task structure so that the task can be woken up later, on an interrupt, a timer expiration or any similar event.

set_current_state() takes a task state as its argument and switches the current task to that state. By itself this does not put the task to sleep; it only marks the task so that the next call to schedule() removes it from the run queue. The task then stays asleep until some other code wakes it up, which gives the code control over when the task is scheduled.

When the schedule() function is called with the state set to TASK_INTERRUPTIBLE, as in this case, the currently executing process is moved off the run queue before another process is scheduled. The effect is that the executing process goes to sleep, as it is no longer on the run queue. Hence, it is not scheduled again until it is woken up. That is how a process can sleep.

The schedule() function is used by the thread in this case to indicate to the scheduler that it can schedule some other process on the processor.

Let's wake it up now. Given a reference to a task structure, the thread can be woken up by calling:
wake_up_process(sleeping_task);
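
The sleep and wake-up sequence above can be summarised in a small user-space model. It is only a sketch of the state transitions: struct task and the model_* functions are hypothetical names, not kernel APIs. The one detail borrowed from the kernel is the return convention of wake_up_process(), which returns 1 if the task was woken and 0 if it was already running.

```c
#include <assert.h>
#include <stdbool.h>

enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

/* Hypothetical stand-in for the parts of a task that matter here. */
struct task {
    enum task_state state;
    bool on_runqueue;
};

/* Models set_current_state(): only the state changes, nothing else. */
static void model_set_state(struct task *t, enum task_state s)
{
    t->state = s;
}

/* Models schedule(): a task that is not TASK_RUNNING leaves the run queue. */
static void model_schedule(struct task *t)
{
    if (t->state != TASK_RUNNING)
        t->on_runqueue = false;
}

/* Models wake_up_process(): 1 if the task was woken, 0 if already running. */
static int model_wake_up_process(struct task *t)
{
    if (t->state == TASK_RUNNING)
        return 0;
    t->state = TASK_RUNNING;
    t->on_runqueue = true;
    return 1;
}

/* Walks through the sequence; true if every step behaves as described. */
static bool demo_sleep_then_wake(void)
{
    struct task t = { TASK_RUNNING, true };

    model_set_state(&t, TASK_INTERRUPTIBLE);
    if (!t.on_runqueue)
        return false;   /* setting the state alone must not dequeue */
    model_schedule(&t);
    if (t.on_runqueue)
        return false;   /* schedule() must put the task to sleep */
    if (model_wake_up_process(&t) != 1)
        return false;   /* first wake-up really wakes the task */
    return t.on_runqueue && model_wake_up_process(&t) == 0;
}
```

The model makes the division of labour explicit: changing the state only marks the task, schedule() is what takes it off the run queue, and a wake-up on a task that is already running does nothing.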








Sunday, April 3, 2016

Introduction to Kthreads in Linux Kernel

Threads are programming abstractions used in concurrent processing. A kernel thread is a way to implement background tasks inside the kernel. A background task can be busy handling asynchronous events or can be asleep, waiting for an event to occur. Kernel threads are similar to user processes, except that they live in kernel space and have access to kernel functions and data structures. Like user processes, kernel threads appear to monopolize the processor because of preemptive scheduling.

To see these threads, run ps -ef from the command line and note that the processes shown in [square brackets] at the beginning of the listing are kernel threads.
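
The same distinction can be made programmatically. The sketch below is a user-space helper, with is_kernel_thread() being a hypothetical name. It assumes a Linux system with /proc mounted, the /proc/<pid>/stat field layout described in proc(5) (the ninth field is the task flags word), and the PF_KTHREAD value copied from include/linux/sched.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Flag set on kernel threads; value copied from include/linux/sched.h. */
#define PF_KTHREAD 0x00200000u

/*
 * Returns true if /proc/<pid>/stat marks the task as a kernel thread.
 * Returns false if it does not, or if the file cannot be read or parsed.
 */
static bool is_kernel_thread(pid_t pid)
{
    char path[64];
    char buf[1024];
    char state;
    int ppid, pgrp, session, tty_nr, tpgid;
    unsigned int flags;
    size_t n;
    char *p;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return false;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* The command name may contain spaces or ')', so skip to the last ')'. */
    p = strrchr(buf, ')');
    if (!p)
        return false;

    /* Fields 3 to 9: state ppid pgrp session tty_nr tpgid flags */
    if (sscanf(p + 1, " %c %d %d %d %d %d %u", &state, &ppid, &pgrp,
               &session, &tty_nr, &tpgid, &flags) != 7)
        return false;

    return (flags & PF_KTHREAD) != 0;
}
```

Calling is_kernel_thread(getpid()) from an ordinary program returns false. On a typical system outside a PID namespace, PID 2 is kthreadd, the parent of the kernel threads, so is_kernel_thread(2) would be expected to return true there.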





Examples of kthreads

(A). [ksoftirqd/n] is a kthread that supports the implementation of softirqs in the kernel, where 'n' is the core number on SMP systems. Soft IRQs are raised by interrupt handlers to request “bottom half” processing of portions of the interrupt handler whose execution can be deferred. The idea is to minimize the code inside interrupt handlers, which results in reduced interrupt-off times in the system and therefore lower latencies. ksoftirqd ensures that a high load of soft IRQs neither starves the soft IRQs nor overwhelms the system. On Symmetric Multi-Processing (SMP) machines, where multiple thread instances can run on different processors in parallel, one instance of ksoftirqd is created per processor to improve throughput.

(B). The [events/n] threads (where n is the processor number) help implement work queues, which are another way of deferring work in the kernel. If a part of the kernel wants to defer execution of work, it can either create its own work queue or make use of the default events/worker thread. Since kernel 2.6.36 these worker threads appear as [kworker/...] instead.

(C). The [pdflush] kernel thread flushes dirty pages from the page cache. It exists only on older kernels; from 2.6.32 onwards it is replaced by per-device flusher threads.

(D). The [khubd] thread, part of the Linux USB core, monitors the machine’s USB hub and configures USB devices when they are hot-plugged into the system. Newer kernels handle hub events on a workqueue instead of a dedicated khubd thread.


Example from Kernel Source

File :- kernel/time/clocksource.c

/* Returns non-zero when a clocksource needs to be reselected */
static int __clocksource_watchdog_kthread(void)
{
	/* --- CODE --- */
}

/* Thread function: runs once under clocksource_mutex and then exits */
static int clocksource_watchdog_kthread(void *data)
{
	mutex_lock(&clocksource_mutex);
	if (__clocksource_watchdog_kthread())
		clocksource_select();
	mutex_unlock(&clocksource_mutex);
	return 0;
}

Mutex examples from Linux Kernel

(A). The ADC read function of the TWL6030 (power-management integrated circuit) device driver uses a mutex to create a critical section. The same single mutex is taken everywhere the shared state is touched, so all of those functions are protected by it :-

int twl6030_gpadc_read_raw( )
{
	/* Mutex lock */

	/* Start conversion of the GPADC channel */

	/* Wait for the IRQ handler to signal completion, with a given timeout */
	wait_for_completion_interruptible_timeout();

	/* ---- CODE ---- */

	/* Mutex unlock */
}


static irqreturn_t twl6030_gpadc_irq_handler(int irq, void *indio_dev)
{
	/* ----- CODE ---- */

	/* Signal completion to the waiter in twl6030_gpadc_read_raw() */
	complete(irq_complete);

	return IRQ_HANDLED;
}
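
The timed wait in the read function has three possible outcomes, and a caller has to tell them apart. wait_for_completion_interruptible_timeout() returns 0 on timeout, a negative value (-ERESTARTSYS) if a signal interrupted the wait, and a positive number of remaining jiffies if the completion was signalled. The helper below is a user-space sketch of one way to map that result to an error code; completion_result_to_errno() and MODEL_ERESTARTSYS are hypothetical names, and the choice of -ETIMEDOUT and -EINTR is an assumption rather than a quotation from the driver.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal error number; not exported by the user-space <errno.h>. */
#define MODEL_ERESTARTSYS 512

/*
 * Map the result of a wait_for_completion_interruptible_timeout()-style
 * call to 0 on success or a negative errno value on failure.
 */
static int completion_result_to_errno(long ret)
{
    if (ret == 0)
        return -ETIMEDOUT;  /* the IRQ never signalled completion */
    if (ret < 0)
        return -EINTR;      /* a signal interrupted the wait */
    return 0;               /* completed with ret jiffies to spare */
}
```

For example, completion_result_to_errno(0) yields -ETIMEDOUT, completion_result_to_errno(-MODEL_ERESTARTSYS) yields -EINTR, and any positive value yields 0.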

(B). The device tree (OF) clock code uses a mutex to keep its list of clock providers consistent. One example is the function below, which deletes a node while holding the mutex. Because the same single mutex is taken at every access to the list, no other code can touch the list during the deletion :-

/* Remove a previously registered clock provider */
void of_clk_del_provider(struct device_node *np)
{
	mutex_lock(&of_clk_mutex);

	/* Find the node and delete it */

	mutex_unlock(&of_clk_mutex);
}