Doing Things "Whenever" - Asynchronous Procedure Calls in NT
The NT Insider, Vol 5, Issue 1, Jan-Feb 1998 | Published: 15-Feb-98 | Modified: 20-Aug-02
In Windows NT, the concept of "Asynchronous Procedure Calls" (APCs) is mentioned numerous times, although precisely what an APC is and how it is used are entirely absent from the standard Microsoft DDK documentation. Despite this oversight, an understanding of APCs is essential to understanding how Windows NT works.
Of course, those of you intimately familiar with the Win32 API are no doubt aware that it fully supports APCs (see QueueUserAPC, a standard Win32 API function). The Win32 APC abstraction on the Windows NT platform is built upon the native APC support present in the kernel.
In the May 1997 issue of The NT Insider we discussed I/O completion. One of its key elements was the "second stage" of I/O completion, which had to run in the context of the thread that started the I/O. Normally, the I/O Manager does this by using an APC, because a key property of an APC is that it runs in a specific thread context. Note that we did find a knowledge base article (Q126416) that erroneously states that APCs run in arbitrary thread context.
Asynchronous Procedure Calls have the following interesting properties:
An APC always runs in a specific thread context.
An APC runs at OS-predetermined times.
APCs can cause pre-emption of the currently running thread.
APC routines can be pre-empted.
In Windows NT, each running thread is represented by the operating system using a data structure known as the "thread structure". Inside this structure is not one but two APC queues. One of these queues is used to store "user mode" APC objects, and the other is used to store "kernel mode" APC objects. In turn, each of these queues has two flavors of APC objects: regular or special.
Before we describe the distinction between user mode and kernel mode APCs, it would be a good idea to describe the control object in this case: the APC object. While not explicitly listed in the DDK, the APC object is declared in NTDDK.H. This declaration is shown in Figure 1.
From this declaration of the APC object, many of its properties can easily be described. For example, the APC is a thread-specific data structure, so the APC object contains a pointer to its associated thread. Like any of the standard Windows NT control objects, an APC object contains a single LIST_ENTRY, which is used to enqueue the object precisely once.
Were this a normal APC object (a kernel mode APC), it would have a valid KernelRoutine(…) function pointer, while a user mode APC would have a valid NormalRoutine(…) function pointer. Either one may optionally have a valid RundownRoutine(…) function pointer, which is called whenever the OS needs to discard the contents of the APC queue (such as when the thread exits). An APC without such a routine is simply deleted. In either case, neither the KernelRoutine(…) nor the NormalRoutine(…) for the APC object is called under these circumstances, just the RundownRoutine(…).
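The rundown behavior described above can be illustrated with a small user-mode model. This is only a sketch under stated assumptions: SimApc, sim_apc_new and sim_flush_queue are hypothetical names invented for this illustration, not kernel interfaces, and the model assumes that a rundown routine, when present, takes over responsibility for releasing the APC object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel APC object. */
typedef struct SimApc SimApc;
typedef void (*SimRoutine)(SimApc *apc);

struct SimApc {
    SimApc    *next;             /* single linkage: queued at most once   */
    SimRoutine kernel_routine;   /* never invoked while flushing          */
    SimRoutine normal_routine;   /* never invoked while flushing          */
    SimRoutine rundown_routine;  /* optional cleanup hook, may be NULL    */
};

static int g_rundown_calls = 0;  /* counts rundown invocations            */
static int g_routine_calls = 0;  /* counts kernel/normal invocations      */

/* Sample rundown routine: assumed to own and release the APC object. */
static void sample_rundown(SimApc *apc) { g_rundown_calls++; free(apc); }

/* Sample kernel/normal routine: only records that it was called. */
static void sample_routine(SimApc *apc) { (void)apc; g_routine_calls++; }

/* Allocate an APC whose rundown routine may be NULL. */
static SimApc *sim_apc_new(SimRoutine rundown) {
    SimApc *apc = malloc(sizeof *apc);
    assert(apc != NULL);
    apc->next = NULL;
    apc->kernel_routine = sample_routine;
    apc->normal_routine = sample_routine;
    apc->rundown_routine = rundown;
    return apc;
}

/* Discard every queued APC, as when the owning thread exits.
 * Returns the number of APC objects discarded. */
static int sim_flush_queue(SimApc **head) {
    int discarded = 0;
    SimApc *apc = *head;
    *head = NULL;
    while (apc != NULL) {
        SimApc *next = apc->next;      /* read before the object is freed */
        if (apc->rundown_routine != NULL)
            apc->rundown_routine(apc); /* rundown handles the cleanup     */
        else
            free(apc);                 /* no rundown: just delete it      */
        discarded++;
        apc = next;
    }
    return discarded;
}
```

In this model, flushing a queue that holds one APC with a rundown routine and one without invokes the rundown routine once and never touches the kernel or normal routines.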
As mentioned previously, there are two APC queues, and each of them can hold both normal and special APC objects. For kernel mode operations, special APCs are used frequently; indeed, that is how the I/O Manager does I/O completion. It creates a "special" kernel mode APC and, in the context of that APC, completes the thread-specific portion of the I/O (such as copying the results into the appropriate output locations). Special user mode APCs, however, are uncommon. They are used, for example, when the thread is being terminated. In either case, a special APC is inserted at the front of the APC queue, before any normal APC objects. This ensures that a special APC runs before any normal APC runs.
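The ordering rule can be sketched with a toy singly linked queue. Everything here is hypothetical and for illustration only (SimApc, SimApcQueue, sim_insert_apc and sim_remove_head are not kernel names). The model also makes one assumption the text does not state: a new special APC is placed after special APCs that are already queued, so that specials keep their arrival order while still preceding every normal APC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified APC record for a queue-ordering model. */
typedef struct SimApc {
    struct SimApc *next;
    int is_special;   /* nonzero for a "special" APC       */
    int id;           /* label used to observe the ordering */
} SimApc;

typedef struct SimApcQueue {
    SimApc *head;
} SimApcQueue;

/* Insert an APC: specials go ahead of all normal APCs (but behind
 * specials already queued, an assumption of this model); normal
 * APCs go to the tail. */
static void sim_insert_apc(SimApcQueue *q, SimApc *apc) {
    SimApc **link = &q->head;
    if (apc->is_special) {
        while (*link != NULL && (*link)->is_special)
            link = &(*link)->next;   /* skip the existing specials */
    } else {
        while (*link != NULL)
            link = &(*link)->next;   /* walk to the end of the queue */
    }
    apc->next = *link;
    *link = apc;
}

/* Remove and return the APC that would be delivered next, or NULL. */
static SimApc *sim_remove_head(SimApcQueue *q) {
    SimApc *apc = q->head;
    if (apc != NULL) {
        q->head = apc->next;
        apc->next = NULL;
    }
    return apc;
}
```

Queuing normal 1, special 2, normal 3 and special 4 in that order yields the delivery order 2, 4, 1, 3 in this model.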
For a file system, managing APC delivery is essential to ensuring correct behavior. Indeed, one of the hardest parts of developing file systems on Windows NT is the locking model, which is complicated by interactions between the VM system and the file system. The danger for an FSD is that an APC might cause an additional I/O to be triggered into the file system.
For example, suppose that I/O to some file, say X, has just completed. If a kernel APC for the same thread (a common case) then starts I/O on a different file, say Y, it is possible a deadlock condition could result. Indeed, even if the FSD developer has defined a locking order between files X and Y, the introduction of code written outside the FSD’s control might not obey those rules.
To alleviate this situation, a typical Windows NT file system disables kernel mode APC delivery by calling KeEnterCriticalRegion(…) (albeit under the name FsRtlEnterFileSystem(…), which in current versions of Windows NT is defined to be KeEnterCriticalRegion(…)). This disables the delivery of normal kernel mode APCs, but it still allows special kernel mode APCs, such as the I/O Manager's, to be delivered. These APCs are safe because they do not re-enter the file system and hence do not introduce any risk of deadlock.
Another way that APC delivery can be disabled is to raise the IRQL to APC_LEVEL. This disables the delivery of all APCs, of any type. For example, the Windows NT Memory Manager issues I/O operations at APC_LEVEL under certain circumstances. This ensures that no APCs, especially I/O completion APCs, are delivered while it is starting a new paging I/O operation.
Some of the synchronization primitives in Windows NT raise the IRQL to APC_LEVEL in order to ensure that code cannot be re-entered on the currently running thread. The notable case here is the fast mutex operations. ExAcquireFastMutex(…) raises the IRQL to APC_LEVEL, and the IRQL is lowered again when the driver calls ExReleaseFastMutex(…). Thus, while a fast mutex is held, no APCs are delivered to this thread.
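The two ways of holding off kernel mode APCs can be summarized as a small decision function. This is a user-mode model, not kernel code: SimThread, sim_enter_critical_region, sim_leave_critical_region and sim_can_deliver are hypothetical names, the IRQL is kept in the thread record purely for simplicity, and the nesting counter is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified IRQL values; only their relative order matters here. */
enum { SIM_PASSIVE_LEVEL = 0, SIM_APC_LEVEL = 1 };

typedef enum {
    SIM_NORMAL_KERNEL_APC,
    SIM_SPECIAL_KERNEL_APC
} SimApcKind;

/* Hypothetical per-thread state relevant to APC delivery. */
typedef struct SimThread {
    int irql;                   /* modeled per thread for simplicity   */
    int critical_region_depth;  /* > 0 while inside a critical region  */
} SimThread;

/* Model of entering a critical region: blocks normal kernel APCs. */
static void sim_enter_critical_region(SimThread *t) {
    t->critical_region_depth++;
}

/* Model of leaving a critical region; must pair with an enter. */
static void sim_leave_critical_region(SimThread *t) {
    assert(t->critical_region_depth > 0);
    t->critical_region_depth--;
}

/* Decide whether a kernel mode APC of the given kind may be delivered. */
static bool sim_can_deliver(const SimThread *t, SimApcKind kind) {
    if (t->irql >= SIM_APC_LEVEL)
        return false;           /* APC_LEVEL blocks every APC          */
    if (kind == SIM_NORMAL_KERNEL_APC && t->critical_region_depth > 0)
        return false;           /* critical region blocks normal APCs  */
    return true;                /* special APCs still get through      */
}
```

The asymmetry is the point of the model: a critical region filters only normal kernel APCs, while APC_LEVEL filters everything.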
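As a rough illustration of the acquire and release pairing, the following user-mode model records the previous IRQL on acquire and restores it on release. It is a sketch only: SimFastMutex and the sim_ functions are hypothetical, the model is single-threaded and ignores contention entirely, and the real fast mutex implementation is not shown in this article.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified IRQL values; only their relative order matters here. */
enum { SIM_PASSIVE_LEVEL = 0, SIM_APC_LEVEL = 1 };

/* Modeled IRQL of the (single) processor in this toy example. */
static int sim_current_irql = SIM_PASSIVE_LEVEL;

/* Hypothetical stand-in for a fast mutex. */
typedef struct SimFastMutex {
    bool held;
    int  old_irql;   /* IRQL to restore when the mutex is released */
} SimFastMutex;

/* Acquire: remember the current IRQL, then raise to APC_LEVEL. */
static void sim_acquire_fast_mutex(SimFastMutex *m) {
    assert(!m->held);            /* no contention in this model */
    m->old_irql = sim_current_irql;
    sim_current_irql = SIM_APC_LEVEL;
    m->held = true;
}

/* Release: drop the mutex and restore the saved IRQL. */
static void sim_release_fast_mutex(SimFastMutex *m) {
    assert(m->held);
    m->held = false;
    sim_current_irql = m->old_irql;
}

/* APCs can only be delivered below APC_LEVEL. */
static bool sim_apcs_deliverable(void) {
    return sim_current_irql < SIM_APC_LEVEL;
}
```

Between the acquire and the release, sim_apcs_deliverable() returns false, which mirrors the statement that no APCs reach the thread while the fast mutex is held.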
User mode APCs aren't enabled and disabled in the same way that kernel mode APCs are. Instead, the kernel examines the user APC queue on certain events, including while the thread is blocked in an alertable wait for an event to occur, such as with KeWaitForSingleObject(…) and its variants. Another common event that triggers APC delivery is exiting a system service call. Regardless, code using APCs cannot rely upon the existing system behavior. Instead, the code must be properly written to synchronize access to data structures.
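The "delivered only at certain points" behavior of user mode APCs can be sketched as follows. This is a user-mode toy, not the kernel mechanism: SimThread, sim_queue_user_apc and sim_wait are hypothetical names, the fixed-size array is an arbitrary simplification, and the model covers only the alertable-wait case, in which pending user APCs run when the thread waits alertably and stay queued otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_USER_APCS 8

typedef void (*SimUserRoutine)(void *context);

/* One pending user mode APC: a routine plus its context argument. */
typedef struct SimUserApc {
    SimUserRoutine routine;
    void          *context;
} SimUserApc;

/* Hypothetical thread record holding its pending user APCs. */
typedef struct SimThread {
    SimUserApc pending[SIM_MAX_USER_APCS];
    int        count;
} SimThread;

static void sim_thread_init(SimThread *t) { t->count = 0; }

/* Queue a user APC; returns false if the toy queue is full. */
static bool sim_queue_user_apc(SimThread *t, SimUserRoutine routine,
                               void *context) {
    if (t->count >= SIM_MAX_USER_APCS)
        return false;
    t->pending[t->count].routine = routine;
    t->pending[t->count].context = context;
    t->count++;
    return true;
}

/* Model a wait. Pending user APCs run, in queue order, only when the
 * wait is alertable. Returns the number of APCs delivered. */
static int sim_wait(SimThread *t, bool alertable) {
    int delivered = 0;
    if (!alertable)
        return 0;                /* APCs stay queued */
    for (int i = 0; i < t->count; i++) {
        t->pending[i].routine(t->pending[i].context);
        delivered++;
    }
    t->count = 0;
    return delivered;
}

/* Sample APC routine: increments the int the context points at. */
static void sim_increment(void *context) { (*(int *)context)++; }
```

Because the routine runs at a time chosen by the system rather than by the code that queued it, any data it shares with the rest of the thread still needs explicit synchronization.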
While APC objects are used throughout the operating system, how to create one is never documented within the DDK. A few routines document that they support APC functions, but there are no examples in the DDK using those routines. For example, the function ZwReadFile(…) accepts a function pointer and context argument. The parameter names suggest that these are, in fact, APC routines. Unfortunately, the DDK documentation is terse on this point, stating only that "Device and intermediate drivers should set this pointer to NULL."
While the warning is true, it gives little guidance to highest level drivers, such as file systems or file servers, on how they should use these parameters. Typically, this results in using events, rather than APC routines, with their correspondingly higher overhead and lower performance. Of course, Win32 uses this functionality to implement overlapped I/O, as the Platform SDK documentation clearly states: "Note that the ReadFileEx(…), SetWaitableTimer(…), and WriteFileEx(…) functions are implemented using an APC as the completion notification callback mechanism."
For Win32 applications, using APCs is considerably simpler than it is for kernel level applications. They can simply call QueueUserAPC, passing a function, a thread handle, and a context argument. Assuming they have the appropriate permissions, the OS will construct an APC object and insert it into the user APC queue for the target thread.
While APC objects cannot be directly created by kernel mode applications, Microsoft does provide an operation for file system drivers that can be used to ensure code runs within a particular process context. The key routines are KeAttachProcess(…) and KeDetachProcess(…). While they do not allow one to specify a particular thread context, they do ensure that the resources for a given process are available, notably its address space. Prototypes for these functions are shown in Figure 2. They can also be found in NTIFS.H.
NTKERNELAPI VOID KeAttachProcess (IN PRKPROCESS Process);
NTKERNELAPI VOID KeDetachProcess (VOID);
Figure 2
A file system driver may use KeAttachProcess(…) in order to force a switch to the specific process address space. Upon return from this call, the thread is now running in the process address space specified in the call to KeAttachProcess(…). The file system driver can then operate on data within the attached process. When the operations have been completed, the thread restores the process context by calling KeDetachProcess(…).
In general, use of KeAttachProcess(…) and KeDetachProcess(…) should be avoided. These functions, while they do allow your file system driver to run in a specific process context, are quite expensive. For many file systems, using these functions is unnecessary. However, they can be helpful for file systems that need to copy data between address spaces in an efficient manner.
To summarize then, Windows NT uses asynchronous procedure calls to run an arbitrary procedure in a known thread context. It does this by maintaining per-thread queues of APC objects and checking those queues at defined points to determine if there is any work to do. Indeed, APCs are essential to how Windows NT handles fundamental system operations, such as I/O completion. For most drivers, the existence of APCs is immaterial. For file system drivers, properly controlling APC delivery is essential to correct behavior.