completions - wait for completion handling
==========================================

This document was originally written based on 3.18.0 (linux-next)

Introduction:
-------------

If you have one or more threads of execution that must wait for some process
to have reached a point or a specific state, completions can provide a
race-free solution to this problem. Semantically they are somewhat like a
pthread_barrier and have similar use-cases.

Completions are a code synchronization mechanism which is preferable to any
misuse of locks. Any time you think of using yield() or some quirky
msleep(1) loop to allow something else to proceed, you probably want to
look into using one of the wait_for_completion*() calls instead. The
advantages of using completions are clear intent of the code and more
efficient code, as both threads can continue until the result is actually
needed.

Completions are built on top of the generic event infrastructure in Linux,
with the event reduced to a simple flag (appropriately called "done") in
struct completion that tells the waiting threads of execution if they
can continue safely.

As completions are scheduling related, the code is found in
kernel/sched/completion.c.


Usage:
------

There are three parts to using completions: the initialization of the
struct completion, the waiting part through a call to one of the variants of
wait_for_completion(), and the signaling side through a call to complete()
or complete_all(). Further there are some helper functions for checking the
state of completions.

To use completions one needs to include <linux/completion.h> and
create a variable of type struct completion. The structure used for
handling of completions is:

	struct completion {
		unsigned int done;
		wait_queue_head_t wait;
	};

providing the wait queue to place tasks on for waiting and the flag for
indicating the state of affairs.

Completions should be named to convey the intent of the waiter. A good
example is:

	wait_for_completion(&early_console_added);

	complete(&early_console_added);

Good naming (as always) helps code readability.


Initializing completions:
-------------------------

Initialization of dynamically allocated completions, often embedded in
other structures, is done with:

	void init_completion(struct completion *done);

Initialization is accomplished by initializing the wait queue and setting
the default state to "not available", that is, "done" is set to 0.

The re-initialization function, reinit_completion(), simply resets the
done element to "not available", thus again to 0, without touching the
wait queue. Calling init_completion() twice on the same completion object is
most likely a bug as it re-initializes the queue to an empty queue and
enqueued tasks could get "lost" - use reinit_completion() in that case.
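
As a rough sketch, initialization of a completion embedded in a dynamically
allocated object might look like the following (the structure and function
names here are hypothetical, for illustration only):

	#include <linux/completion.h>

	struct foo_device {
		struct completion cmd_done;
		/* ... other members ... */
	};

	static void foo_setup(struct foo_device *foo)
	{
		/* one-time initialization of the embedded completion */
		init_completion(&foo->cmd_done);
	}

	static void foo_issue_cmd(struct foo_device *foo)
	{
		/*
		 * Reset "done" to 0 before reusing the completion; do not
		 * call init_completion() again here.
		 */
		reinit_completion(&foo->cmd_done);
		/* ... start the command; some other context calls complete() ... */
		wait_for_completion(&foo->cmd_done);
	}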

For static declaration and initialization, macros are available. These are:

	static DECLARE_COMPLETION(setup_done)

used for static declarations in file scope. Within functions the static
initialization should always use:

	DECLARE_COMPLETION_ONSTACK(setup_done)

which is suitable for automatic/local variables on the stack and will make
lockdep happy. Note also that one needs to make *sure* the completion passed
to work threads remains in-scope, and no references remain to on-stack data
when the initiating function returns.

Using on-stack completions for code that calls any of the _timeout or
_interruptible/_killable variants is not advisable as they will require
additional synchronization to prevent the on-stack completion object in
the timeout/signal cases from going out of scope. Consider using dynamically
allocated completions when intending to use the _interruptible/_killable
or _timeout variants of wait_for_completion().
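
As a hedged sketch of an on-stack completion (the request structure and the
work handler are made up for illustration), using the plain, non-timeout wait
so that the stack objects cannot go out of scope while still referenced:

	#include <linux/completion.h>
	#include <linux/workqueue.h>

	struct setup_request {
		struct work_struct work;
		struct completion *done;
	};

	static void setup_work_fn(struct work_struct *work)
	{
		struct setup_request *req =
			container_of(work, struct setup_request, work);

		/* ... do the setup ... */
		complete(req->done);		/* wake the waiter */
	}

	static void do_setup_and_wait(void)
	{
		DECLARE_COMPLETION_ONSTACK(setup_done);
		struct setup_request req = { .done = &setup_done };

		INIT_WORK_ONSTACK(&req.work, setup_work_fn);
		schedule_work(&req.work);

		/*
		 * Plain wait: this only returns after the worker has called
		 * complete(), so req and setup_done stay valid long enough.
		 */
		wait_for_completion(&setup_done);
		destroy_work_on_stack(&req.work);
	}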


Waiting for completions:
------------------------

For a thread of execution to wait for some concurrent work to finish, it
calls wait_for_completion() on the initialized completion structure.
A typical usage scenario is:

	struct completion setup_done;
	init_completion(&setup_done);
	initialize_work(...,&setup_done,...);

	/* run non-dependent code */              /* do setup */

	wait_for_completion(&setup_done);         complete(&setup_done);

This does not imply any temporal order of wait_for_completion() and the
call to complete() - if the call to complete() happened before the call
to wait_for_completion() then the waiting side simply will continue
immediately as all dependencies are satisfied; if not, it will block until
completion is signaled by complete().
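
The scenario above might, for example, look roughly like the following sketch
when the setup runs in a separate kernel thread (the thread function and
names are hypothetical):

	#include <linux/completion.h>
	#include <linux/kthread.h>
	#include <linux/err.h>

	static struct completion setup_done;

	static int setup_thread_fn(void *data)
	{
		/* do setup */
		complete(&setup_done);		/* signal the waiter */
		return 0;
	}

	static int start_and_wait_for_setup(void)
	{
		struct task_struct *tsk;

		init_completion(&setup_done);
		tsk = kthread_run(setup_thread_fn, NULL, "setup");
		if (IS_ERR(tsk))
			return PTR_ERR(tsk);

		/* run non-dependent code */

		/* continues immediately if complete() already ran, blocks otherwise */
		wait_for_completion(&setup_done);
		return 0;
	}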

Note that wait_for_completion() is calling spin_lock_irq()/spin_unlock_irq(),
so it can only be called safely when you know that interrupts are enabled.
Calling it from hard-irq or irqs-off atomic contexts will result in
hard-to-detect spurious enabling of interrupts.

wait_for_completion():

	void wait_for_completion(struct completion *done)

The default behavior is to wait without a timeout and to mark the task as
uninterruptible. wait_for_completion() and its variants are only safe
in process context (as they can sleep) but not in atomic context,
interrupt context, with irqs disabled, or with preemption disabled - see also
try_wait_for_completion() below for handling completion in atomic/interrupt
context.

As all variants of wait_for_completion() can (obviously) block for a long
time, you probably don't want to call this while holding mutexes.


Variants available:
-------------------

The below variants all return status and this status should be checked in
most(/all) cases - in cases where the status is deliberately not checked you
probably want to make a note explaining this (e.g. see
arch/arm/kernel/smp.c:__cpu_up()).

A common problem is sloppy assignment of the return value, so care should be
taken to assign return values to variables of the proper type. Checking the
specific meaning of return values has also been found to be quite inaccurate,
e.g. constructs like
if (!wait_for_completion_interruptible_timeout(...)) would execute the same
code path for successful completion and for the interrupted case - which is
probably not what you want.
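
For instance, a hedged sketch of checking the return value with a variable of
the proper type (long), distinguishing all three outcomes instead of
collapsing them with a !-test (the completion, timeout and error codes are
illustrative):

	long ret;

	ret = wait_for_completion_interruptible_timeout(&setup_done,
							msecs_to_jiffies(100));
	if (ret == -ERESTARTSYS)
		return ret;		/* interrupted by a signal */
	if (ret == 0)
		return -ETIMEDOUT;	/* timed out */
	/* ret > 0: completed, ret is the remaining time in jiffies */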

	int wait_for_completion_interruptible(struct completion *done)

This function marks the task TASK_INTERRUPTIBLE. If a signal was received
while waiting it will return -ERESTARTSYS; 0 otherwise.

	unsigned long wait_for_completion_timeout(struct completion *done,
		unsigned long timeout)

The task is marked as TASK_UNINTERRUPTIBLE and will wait at most 'timeout'
(in jiffies). If a timeout occurs it returns 0, else the remaining time in
jiffies (but at least 1). Timeouts are preferably calculated with
msecs_to_jiffies() or usecs_to_jiffies(). If the returned timeout value is
deliberately ignored a comment should probably explain why (e.g. see
drivers/mfd/wm8350-core.c wm8350_read_auxadc()).
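
A short sketch of the timeout variant, with the timeout derived via
msecs_to_jiffies() and the unsigned long return value checked against 0 (the
completion and the 500 ms timeout are illustrative):

	unsigned long timeleft;

	timeleft = wait_for_completion_timeout(&cmd_done,
					       msecs_to_jiffies(500));
	if (!timeleft)
		return -ETIMEDOUT;	/* timed out, not completed */
	/* completed; timeleft is the remaining time in jiffies (>= 1) */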

	long wait_for_completion_interruptible_timeout(
		struct completion *done, unsigned long timeout)

This function passes a timeout in jiffies and marks the task as
TASK_INTERRUPTIBLE. If a signal was received it will return -ERESTARTSYS;
otherwise it returns 0 if the completion timed out or the remaining time in
jiffies if completion occurred.

Further variants include _killable which uses TASK_KILLABLE as the
designated task state and will return -ERESTARTSYS if it is interrupted or
else 0 if completion was achieved.  There is a _timeout variant as well:

	long wait_for_completion_killable(struct completion *done)
	long wait_for_completion_killable_timeout(struct completion *done,
		unsigned long timeout)

The _io variants, wait_for_completion_io(), behave the same as the non-_io
variants, except for accounting waiting time as waiting on IO, which has
an impact on how the task is accounted in scheduling stats.

	void wait_for_completion_io(struct completion *done)
	unsigned long wait_for_completion_io_timeout(struct completion *done,
		unsigned long timeout)


Signaling completions:
----------------------

A thread that wants to signal that the conditions for continuation have been
achieved calls complete() to signal exactly one of the waiters that it can
continue.

	void complete(struct completion *done)

or calls complete_all() to signal all current and future waiters.

	void complete_all(struct completion *done)

The signaling will work as expected even if completions are signaled before
a thread starts waiting. This is achieved by the waiter "consuming"
(decrementing) the done element of struct completion. Waiting threads are
woken up in the same order in which they were enqueued (FIFO order).

If complete() is called multiple times then this will allow for that number
of waiters to continue - each call to complete() will simply increment the
done element. Calling complete_all() multiple times is a bug though. Both
complete() and complete_all() can be called in hard-irq/atomic context safely.
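
For example, releasing an arbitrary number of waiters with a single
complete_all() call might look like this sketch (the names are made up, and
hw_ready is assumed to have been set up with init_completion() beforehand):

	static struct completion hw_ready;

	/* called exactly once, when the hardware becomes usable */
	static void announce_hw_ready(void)
	{
		complete_all(&hw_ready);	/* wake all current and future waiters */
	}

	/* may be called from any number of threads, before or after the above */
	static void wait_until_hw_ready(void)
	{
		wait_for_completion(&hw_ready);
	}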

There can only be one thread calling complete() or complete_all() on a
particular struct completion at any time - serialized through the wait
queue spinlock. Any such concurrent calls to complete() or complete_all()
probably are a design bug.

Signaling completion from hard-irq context is fine as it will appropriately
lock with spin_lock_irqsave/spin_unlock_irqrestore and it will never sleep.


try_wait_for_completion()/completion_done():
--------------------------------------------

The try_wait_for_completion() function will not put the thread on the wait
queue but rather returns false if it would need to enqueue (block) the thread,
else it consumes one posted completion and returns true.

	bool try_wait_for_completion(struct completion *done)

Finally, to check the state of a completion without changing it in any way,
call completion_done(), which returns false if there are no posted
completions that were not yet consumed by waiters (implying that there are
waiters) and true otherwise.

	bool completion_done(struct completion *done)

Both try_wait_for_completion() and completion_done() are safe to be called in
hard-irq or atomic context.
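
As a final, hedged sketch (reusing the hypothetical foo_device from the
initialization section), an interrupt handler or other atomic context could
consume a posted completion without blocking, falling back to other handling
when nothing has been posted yet:

	/* callable from hard-irq or other atomic context */
	static bool foo_try_consume(struct foo_device *foo)
	{
		if (try_wait_for_completion(&foo->cmd_done))
			return true;	/* one posted completion consumed */

		/*
		 * Nothing posted yet; completion_done() would report the same
		 * state without consuming anything.
		 */
		return false;
	}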
