Time, Delays, and Deferred Work (Linux Device Drivers, full text)

Processor-Specific Registers
If you need to measure very short time intervals or you need extremely high precision in your figures, you can resort to platform-dependent resources, a choice of precision over portability.
In modern processors, the pressing demand for empirical performance figures is thwarted by the intrinsic unpredictability of instruction timing in most CPU designs due to cache memories, instruction scheduling, and branch prediction. As a response, CPU manufacturers introduced a way to count clock cycles as an easy and reliable way to measure time lapses. Therefore, most modern processors include a counter register that is steadily incremented once at each clock cycle. Nowadays, this clock counter is the only reliable way to carry out high-resolution timekeeping tasks.
The details differ from platform to platform: the register may or may not be readable from user space, it may or may not be writable, and it may be 64 or 32 bits wide. In the last case, you must be prepared to handle overflows just like we did with the jiffy counter. The register may even not exist for your platform, or it can be implemented in an external device by the hardware designer, if the CPU lacks the feature and you are dealing with a special-purpose computer.
Whether or not the register can be zeroed, we strongly discourage resetting it, even when hardware permits. You might not, after all, be the only user of the counter at any given time; on some platforms supporting SMP, for example, the kernel depends on such a counter to be synchronized across processors. Since you can always measure differences between values, as long as that difference doesn’t exceed the overflow time, you can get the work done without claiming exclusive ownership of the register by modifying its current value.
The most renowned counter register is the TSC (timestamp counter), introduced in x86 processors with the Pentium and present in all CPU designs ever since—including the x86_64 platform. It is a 64-bit register that counts CPU clock cycles; it can be read from both kernel space and user space.
After including <asm/msr.h> (an x86-specific header whose name stands for “machine-specific registers”), you can use one of these macros:
rdtsc(low32,high32);
rdtscl(low32);
rdtscll(var64);
The first macro atomically reads the 64-bit value into two 32-bit variables; the next one (“read low half”) reads the low half of the register into a 32-bit variable, discarding the high half; the last reads the 64-bit value into a long long variable, hence, the name. All of these macros store values into their arguments.
Reading the low half of the counter is enough for most common uses of the TSC. A 1-GHz CPU overflows it only once every 4.2 seconds, so you won’t need to deal with multiregister variables if the time lapse you are benchmarking reliably takes less time. However, as CPU frequencies rise over time and as timing requirements increase, you’ll most likely need to read the 64-bit counter more often in the future.
As an example using only the low half of the register, the following lines measure the execution of the instruction itself:
unsigned long ini, end;
rdtscl(ini); rdtscl(end);
printk("time lapse: %li\n", end - ini);
Some of the other platforms offer similar functionality, and kernel headers offer an architecture-independent function that you can use instead of rdtsc. It is called get_cycles, defined in <asm/timex.h> (included by <linux/timex.h>). Its prototype is:
 #include <linux/timex.h>
 cycles_t get_cycles(void);
This function is defined for every platform, and it always returns 0 on the platforms that have no cycle-counter register. The cycles_t type is an appropriate unsigned type to hold the value read.
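As a rough, hypothetical sketch (not taken from the book's sample code), a portable cycle-based measurement could look like the following helper; on platforms without a cycle-counter register both readings are 0, so the printed difference is meaningful only where get_cycles is really implemented:
#include <linux/timex.h>
#include <linux/kernel.h>

static void time_one_call(void (*fn)(void))   /* fn: whatever is being measured */
{
    cycles_t c1 = get_cycles();
    fn();
    printk(KERN_INFO "cycles elapsed: %llu\n",
           (unsigned long long)(get_cycles() - c1));
}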
Knowing the Current Time——获知当前时间
Kernel code can always retrieve a representation of the current time by looking at the value of jiffies. Usually, the fact that the value represents only the time since the last boot is not relevant to the driver, because its life is limited to the system uptime. As shown, drivers can use the current value of jiffies to calculate time intervals across events (for example, to tell double-clicks from single-clicks in input device drivers or calculate timeouts). In short, looking at jiffies is almost always sufficient when you need to measure time intervals. If you need very precise measurements for short time lapses, processor-specific registers come to the rescue (although they bring in serious portability issues).
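For instance, a driver can remember the jiffies value at one event and compare it at the next. The fragment below is a minimal sketch of a double-click style check; the function name, the static variable, and the 40 ms threshold are illustrative, not from the book:
#include <linux/jiffies.h>

static unsigned long last_press;   /* jiffies value recorded at the previous press */

static int is_double_click(void)
{
    int dbl = time_before(jiffies, last_press + msecs_to_jiffies(40));
    last_press = jiffies;
    return dbl;
}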
It’s quite unlikely that a driver will ever need to know the wall-clock time, expressed in months, days, and hours; the information is usually needed only by user programs such as cron and syslogd. Dealing with real-world time is usually best left to user space, where the C library offers better support; besides, such code is often too policy-related to belong in the kernel. There is a kernel function that turns a wall-clock time into a jiffies value, however:
#include <linux/time.h>
unsigned long mktime (unsigned int year, unsigned int mon,
                      unsigned int day, unsigned int hour,
                      unsigned int min, unsigned int sec);
To repeat: dealing directly with wall-clock time in a driver is often a sign that policy is being implemented and should therefore be questioned.
While you won’t have to deal with human-readable representations of the time, sometimes you need to deal with absolute timestamps even in kernel space. To this aim, <linux/time.h> exports the do_gettimeofday function. When called, it fills a struct timeval pointer—the same one used in the gettimeofday system call—with the familiar seconds and microseconds values. The prototype for do_gettimeofday is:
 #include <linux/time.h>
 void do_gettimeofday(struct timeval *tv);
The source states that do_gettimeofday has “near microsecond resolution,” because it asks the timing hardware what fraction of the current jiffy has already elapsed. The precision varies from one architecture to another, however, since it depends on the actual hardware mechanisms in use. For example, some m68knommu processors, Sun3 systems, and other m68k systems cannot offer more than jiffy resolution. Pentium systems, on the other hand, offer very fast and precise subtick measures by reading the timestamp counter described earlier in this chapter.
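As a small sketch of this kind of measurement (assuming the interval is short enough that the subtraction cannot overflow), a driver can take two snapshots and compute the difference in microseconds; the helper name and the function pointer are illustrative:
#include <linux/time.h>
#include <linux/kernel.h>

static long measure_usecs(void (*fn)(void))   /* fn: the operation being measured */
{
    struct timeval tv1, tv2;

    do_gettimeofday(&tv1);
    fn();
    do_gettimeofday(&tv2);
    return (tv2.tv_sec - tv1.tv_sec) * 1000000L + (tv2.tv_usec - tv1.tv_usec);
}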
The current time is also available (though with jiffy granularity) from the xtime variable, a struct timespec value. Direct use of this variable is discouraged because it is difficult to atomically access both the fields. Therefore, the kernel offers the utility function current_kernel_time:
#include <linux/time.h>
struct timespec current_kernel_time(void);
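A minimal use just copies the returned structure and looks at its seconds and nanoseconds fields, for example:
struct timespec ts = current_kernel_time();
printk(KERN_INFO "now: %li.%09li\n", (long)ts.tv_sec, ts.tv_nsec);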
Code for retrieving the current time in the various ways described above is available within the jit (“just in time”) module in the source files provided on O’Reilly’s FTP site. jit creates a file called /proc/currentime, which returns the following items in ASCII when read:
● The current jiffies and jiffies_64 values as hex numbers
● The current time as returned by do_gettimeofday
● The timespec returned by current_kernel_time
We chose to use a dynamic /proc file to keep the boilerplate code to a minimum—it’s not worth creating a whole device just to return a little textual information.
The file returns text lines continuously as long as the module is loaded; each read system call collects and returns one set of data, organized in two lines for better readability. Whenever you read multiple data sets in less than a timer tick, you’ll see the difference between do_gettimeofday, which queries the hardware, and the other values that are updated only when the timer ticks.
In the screenshot above, there are two interesting things to note. First, the current_kernel_time value, though expressed in nanoseconds, has only clock-tick granularity; do_gettimeofday consistently reports a later time but not later than the next timer tick. Second, the 64-bit jiffies counter has the least-significant bit of the upper 32-bit word set. This happens because the default value for INITIAL_JIFFIES, used at boot time to initialize the counter, forces a low-word overflow a few minutes after boot time to help detect problems related to that very overflow. This initial bias in the counter has no effect, because jiffies is unrelated to wall-clock time. In /proc/uptime, where the kernel extracts the uptime from the counter, the initial bias is removed before conversion.
Delaying Execution
Device drivers often need to delay the execution of a particular piece of code for a period of time, usually to allow the hardware to accomplish some task. In this section we cover a number of different techniques for achieving delays. The circumstances of each situation determine which technique is best to use; we go over them all, and point out the advantages and disadvantages of each.
One important thing to consider is how the delay you need compares with the clock tick, considering the range of HZ across the various platforms. Delays that are reliably longer than the clock tick, and don’t suffer from its coarse granularity, can make use of the system clock. Very short delays typically must be implemented with software loops. In between these two cases lies a gray area. In this chapter, we use the phrase “long” delay to refer to a multiple-jiffy delay, which can be as low as a few milliseconds on some platforms, but is still long as seen by the CPU and the kernel.
The following sections talk about the different delays by taking a somewhat long path from various intuitive but inappropriate solutions to the right solution. We chose this path because it allows a more in-depth discussion of kernel issues related to timing. If you are eager to find the right code, just skim through the section.
Long Delays
Occasionally a driver needs to delay execution for relatively long periods—more than one clock tick. There are a few ways of accomplishing this sort of delay; we start with the simplest technique, then proceed to the more advanced techniques.
Busy waiting
If you want to delay execution by a multiple of the clock tick, allowing some slack in the value, the easiest (though not recommended) implementation is a loop that monitors the jiffy counter. The busy-waiting implementation usually looks like the following code, where j1 is the value of jiffies at the expiration of the delay:
while (time_before(jiffies, j1))
    cpu_relax( );
The call to cpu_relax invokes an architecture-specific way of saying that you’re not doing much with the processor at the moment. On many systems it does nothing at all; on symmetric multithreaded (“hyperthreaded”) systems, it may yield the core to the other thread. In any case, this approach should definitely be avoided whenever possible. We show it here because on occasion you might want to run this code to better understand the internals of other code.
So let’s look at how this code works. The loop is guaranteed to work because jiffies is declared as volatile by the kernel headers and, therefore, is fetched from memory any time some C code accesses it. Although technically correct (in that it works as designed), this busy loop severely degrades system performance. If you didn’t configure your kernel for preemptive operation, the loop completely locks the processor for the duration of the delay; the scheduler never preempts a process that is running in kernel space, and the computer looks completely dead until time j1 is reached. The problem is less serious if you are running a preemptive kernel, because, unless the code is holding a lock, some of the processor’s time can be recovered for other uses. Busy waits are still expensive on preemptive systems, however.
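For reference, the complete (and still discouraged) busy-wait, with the expiration value computed from the current jiffies, looks like this sketch for a one-second delay:
unsigned long j1 = jiffies + HZ;    /* one second from now */

while (time_before(jiffies, j1))
    cpu_relax();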
Still worse, if interrupts happen to be disabled when you enter the loop, jiffies won’t be updated, and the while condition remains true forever. Running a preemptive kernel won’t help either, and you’ll be forced to hit the big red button.
This implementation of delaying code is available, like the following ones, in the jit module. The /proc/jit* files created by the module delay a whole second each time you read a line of text, and lines are guaranteed to be 20 bytes each. If you want to test the busy-wait code, you can read /proc/jitbusy, which busy-loops for one second for each line it returns.
Be sure to read, at most, one line (or a few lines) at a time from /proc/jitbusy. The simplified kernel mechanism to register /proc files invokes the read method over and over to fill the data buffer the user requested. Therefore, a command such as cat /proc/jitbusy, if it reads 4 KB at a time, freezes the computer for 205 seconds.
The suggested command to read /proc/jitbusy is dd bs=20 < /proc/jitbusy, optionally specifying the number of blocks as well. Each 20-byte line returned by the file represents the value the jiffy counter had before and after the delay. This is a sample run on an otherwise unloaded computer:
All looks good: delays are exactly one second (1000 jiffies), and the next read system call starts immediately after the previous one is over. But let’s see what happens on a system with a large number of CPU-intensive processes running (and nonpreemptive kernel):
Here, each read system call delays exactly one second, but the kernel can take more than 5 seconds before scheduling the dd process so it can issue the next system call. That’s expected in a multitasking system; CPU time is shared between all running processes, and a CPU-intensive process has its dynamic priority reduced. (A discussion of scheduling policies is outside the scope of this book.)
The test under load shown above has been performed while running the load50 sample program. This program forks a number of processes that do nothing, but do it in a CPU-intensive way. The program is part of the sample files accompanying this book, and forks 50 processes by default, although the number can be specified on the command line. In this chapter, and elsewhere in the book, the tests with a loaded system have been performed with load50 running in an otherwise idle computer.
If you repeat the command while running a preemptible kernel, you’ll find no noticeable difference on an otherwise idle CPU and the following behavior under load:
Here, there is no significant delay between the end of a system call and the beginning of the next one, but the individual delays are far longer than one second: up to 3.8 seconds in the example shown and increasing over time. These values demonstrate that the process has been interrupted during its delay, scheduling other processes. The gap between system calls is not the only scheduling option for this process, so no special delay can be seen there.
Yielding the processor
As we have seen, busy waiting imposes a heavy load on the system as a whole; we would like to find a better technique. The first change that comes to mind is to explicitly release the CPU when we’re not interested in it. This is accomplished by calling the schedule function, declared in <linux/sched.h>:
while (time_before(jiffies, j1)) {
    schedule( );
}
This loop can be tested by reading /proc/jitsched as we read /proc/jitbusy above. However, it still isn’t optimal. The current process does nothing but release the CPU, but it remains in the run queue. If it is the only runnable process, it actually runs (it calls the scheduler, which selects the same process, which calls the scheduler, which...). In other words, the load of the machine (the average number of running processes) is at least one, and the idle task (process number 0, also called swapper for historical reasons) never runs. Though this issue may seem irrelevant, running the idle task when the computer is idle relieves the processor’s workload, decreasing its temperature and increasing its lifetime, as well as the duration of the batteries if the computer happens to be your laptop. Moreover, since the process is actually executing during the delay, it is accountable for all the time it consumes.
The behavior of /proc/jitsched is actually similar to running /proc/jitbusy under a preemptive kernel. This is a sample run, on an unloaded system:
It’s interesting to note that each read sometimes ends up waiting a few clock ticks more than requested. This problem gets worse and worse as the system gets busy, and the driver could end up waiting longer than expected. Once a process releases the processor with schedule, there are no guarantees that the process will get the processor back anytime soon. Therefore, calling schedule in this manner is not a safe solution to the driver’s needs, in addition to being bad for the computing system as a whole. If you test jitsched while running load50, you can see that the delay associated to each line is extended by a few seconds, because other processes are using the CPU when the timeout expires.
Timeouts
The suboptimal delay loops shown up to now work by watching the jiffy counter without telling anyone. But the best way to implement a delay, as you may imagine, is usually to ask the kernel to do it for you. There are two ways of setting up jiffy-based timeouts, depending on whether your driver is waiting for other events or not.
If your driver uses a wait queue to wait for some other event, but you also want to be sure that it runs within a certain period of time, it can use wait_event_timeout or wait_event_interruptible_timeout:
#include <linux/wait.h>
long wait_event_timeout(wait_queue_head_t q, condition, long timeout);
long wait_event_interruptible_timeout(wait_queue_head_t q,
                      condition, long timeout);
These functions sleep on the given wait queue, but they return after the timeout (expressed in jiffies) expires. Thus, they implement a bounded sleep that does not go on forever. Note that the timeout value represents the number of jiffies to wait, not an absolute time value. The value is represented by a signed number, because it sometimes is the result of a subtraction, although the functions complain through a printk statement if the provided timeout is negative. If the timeout expires, the functions return 0; if the process is awakened by another event, it returns the remaining delay expressed in jiffies. The return value is never negative, even if the delay is greater than expected because of system load.
The /proc/jitqueue file shows a delay based on wait_event_interruptible_timeout, although the module has no event to wait for, and uses 0 as a condition:
wait_queue_head_t wait;
init_waitqueue_head (&wait);
wait_event_interruptible_timeout(wait, 0, delay);
The observed behaviour, when reading /proc/jitqueue, is nearly optimal, even under load:
Since the reading process (dd above) is not in the run queue while waiting for the timeout, you see no difference in behavior whether the code is run in a preemptive kernel or not.
wait_event_timeout and wait_event_interruptible_timeout were designed with a hardware driver in mind, where execution could be resumed in either of two ways: either somebody calls wake_up on the wait queue, or the timeout expires. This doesn’t apply to jitqueue, as nobody ever calls wake_up on the wait queue (after all, no other code even knows about it), so the process always wakes up when the timeout expires. To accommodate for this very situation, where you want to delay execution waiting for no specific event, the kernel offers the schedule_timeout function so you can avoid declaring and using a superfluous wait queue head:
#include <linux/sched.h>
signed long schedule_timeout(signed long timeout);
Here, timeout is the number of jiffies to delay. The return value is 0 unless the function returns before the given timeout has elapsed (in response to a signal). schedule_timeout requires that the caller first set the current process state, so a typical call looks like:
set_current_state(TASK_INTERRUPTIBLE);
schedule_timeout (delay);
The previous lines (from /proc/jitschedto) cause the process to sleep until the given time has passed. Since wait_event_interruptible_timeout relies on schedule_timeout internally, we won’t bother showing the numbers jitschedto returns, because they are the same as those of jitqueue. Once again, it is worth noting that an extra time interval could pass between the expiration of the timeout and when your process is actually scheduled to execute.
In the example just shown, the first line calls set_current_state to set things up so that the scheduler won’t run the current process again until the timeout places it back in TASK_RUNNING state. To achieve an uninterruptible delay, use TASK_UNINTERRUPTIBLE instead. If you forget to change the state of the current process, a call to schedule_timeout behaves like a call to schedule (i.e., the jitsched behavior), setting up a timer that is not used.
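As a sketch of the uninterruptible variant just described, a delay of at least 100 milliseconds could be written as follows, with msecs_to_jiffies converting milliseconds to jiffies:
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(msecs_to_jiffies(100));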
If you want to play with the four jit files under different system situations or different kernels, or try other ways to delay execution, you may want to configure the amount of the delay when loading the module by setting the delay module parameter.
Short Delays
When a device driver needs to deal with latencies in its hardware, the delays involved are usually a few dozen microseconds at most. In this case, relying on the clock tick is definitely not the way to go.
The kernel functions ndelay, udelay, and mdelay serve well for short delays, delaying execution for the specified number of nanoseconds, microseconds, or milliseconds respectively. Their prototypes are:
#include <linux/delay.h>
void ndelay(unsigned long nsecs);
void udelay(unsigned long usecs);
void mdelay(unsigned long msecs);
The actual implementations of the functions are in <asm/delay.h>, being architecture-specific, and sometimes build on an external function. Every architecture implements udelay, but the other functions may or may not be defined; if they are not, <linux/delay.h> offers a default version based on udelay. In all cases, the delay achieved is at least the requested value but could be more; actually, no platform currently achieves nanosecond precision, although several ones offer submicrosecond precision. Delaying more than the requested value is usually not a problem, as short delays in a driver are usually needed to wait for the hardware, and the requirements are to wait for at least a given time lapse.
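Typical use is a one-liner between hardware accesses. For example, a hypothetical device that needs 10 µs to settle after a command is written might be handled like this sketch; the structure, the register fields, and the 10 µs figure are all made up for illustration:
#include <linux/delay.h>
#include <asm/io.h>

struct mydev { void __iomem *ctrl_reg, *status_reg; };   /* illustrative device */

static u8 issue_command(struct mydev *dev, u8 cmd)
{
    writeb(cmd, dev->ctrl_reg);   /* write the (hypothetical) command register */
    udelay(10);                   /* assumed settling time from the datasheet */
    return readb(dev->status_reg);
}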
The implementation of udelay (and possibly ndelay too) uses a software loop based on the processor speed calculated at boot time, using the integer variable loops_per_jiffy. If you want to look at the actual code, however, be aware that the x86 implementation is quite a complex one because of the different timing sources it uses, based on what CPU type is running the code.
To avoid integer overflows in loop calculations, udelay and ndelay impose an upper bound in the value passed to them. If your module fails to load and displays an unresolved symbol, __bad_udelay, it means you called udelay with too large an argument. Note, however, that the compile-time check can be performed only on constant values and that not all platforms implement it. As a general rule, if you are trying to delay for thousands of nanoseconds, you should be using udelay rather than ndelay; similarly, millisecond-scale delays should be done with mdelay and not one of the finer-grained functions.
It’s important to remember that the three delay functions are busy-waiting; other tasks can’t be run during the time lapse. Thus, they replicate, though on a different scale, the behavior of jitbusy. Thus, these functions should only be used when there is no practical alternative.
There is another way of achieving millisecond (and longer) delays that does not involve busy waiting. The file <linux/delay.h> declares these functions:
void msleep(unsigned int millisecs);
unsigned long msleep_interruptible(unsigned int millisecs);
void ssleep(unsigned int seconds);
The first two functions put the calling process to sleep for the given number of millisecs. A call to msleep is uninterruptible; you can be sure that the process sleeps for at least the given number of milliseconds. If your driver is sitting on a wait queue and you want a wakeup to break the sleep, use msleep_interruptible. The return value from msleep_interruptible is normally 0; if, however, the process is awakened early, the return value is the number of milliseconds remaining in the originally requested sleep period. A call to ssleep puts the process into an uninterruptible sleep for the given number of seconds.
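A short sketch of the difference between the two calls:
#include <linux/delay.h>

unsigned long left;

msleep(20);                        /* sleeps at least 20 ms, cannot be interrupted */
left = msleep_interruptible(100);  /* may return early if the process is awakened */
if (left)
    printk(KERN_INFO "woken early, %lu ms were left\n", left);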
In general, if you can tolerate longer delays than requested, you should use schedule_timeout, msleep, or ssleep.
Kernel Timers
Whenever you need to schedule an action to happen later, without blocking the current process until that time arrives, kernel timers are the tool for you. These timers are used to schedule execution of a function at a particular time in the future, based on the clock tick, and can be used for a variety of tasks; for example, polling a device by checking its state at regular intervals when the hardware can’t fire interrupts. Other typical uses of kernel timers are turning off the floppy motor or finishing another lengthy shut down operation. In such cases, delaying the return from close would impose an unnecessary (and surprising) cost on the application program. Finally, the kernel itself uses the timers in several situations, including the implementation of schedule_timeout.
A kernel timer is a data structure that instructs the kernel to execute a user-defined function with a user-defined argument at a user-defined time. The implementation resides in <linux/timer.h> and kernel/timer.c and is described in detail in the section “The Implementation of Kernel Timers.”
The functions scheduled to run almost certainly do not run while the process that registered them is executing. They are, instead, run asynchronously. Until now, everything we have done in our sample drivers has run in the context of a process executing system calls. When a timer runs, however, the process that scheduled it could be asleep, executing on a different processor, or quite possibly has exited altogether.
This asynchronous execution resembles what happens when a hardware interrupt happens (which is discussed in detail in Chapter 10). In fact, kernel timers are run as the result of a “software interrupt.” When running in this sort of atomic context, your code is subject to a number of constraints. Timer functions must be atomic in all the ways we discussed in the section “Spinlocks and Atomic Context” in Chapter 1, but there are some additional issues brought about by the lack of a process context. We will introduce these constraints now; they will be seen again in several places in later chapters. Repetition is called for because the rules for atomic contexts must be followed assiduously, or the system will find itself in deep trouble.
A number of actions require the context of a process in order to be executed. When you are outside of process context (i.e., in interrupt context), you must observe the following rules:
● No access to user space is allowed. Because there is no process context, there is no path to the user space associated with any particular process.
● The current pointer is not meaningful in atomic mode and cannot be used since the relevant code has no connection with the process that has been interrupted.
● No sleeping or scheduling may be performed. Atomic code may not call schedule or a form of wait_event, nor may it call any other function that could sleep. For example, calling kmalloc(..., GFP_KERNEL) is against the rules. Semaphores also must not be used since they can sleep.
Kernel code can tell if it is running in interrupt context by calling the function in_interrupt( ), which takes no parameters and returns nonzero if the processor is currently running in interrupt context, either hardware interrupt or software interrupt.
A function related to in_interrupt( ) is in_atomic( ). Its return value is nonzero whenever scheduling is not allowed; this includes hardware and software interrupt contexts as well as any time when a spinlock is held. In the latter case, current may be valid, but access to user space is forbidden, since it can cause scheduling to happen. Whenever you are using in_interrupt( ), you should really consider whether in_atomic( ) is what you actually mean. Both functions are declared in <asm/hardirq.h>.
One other important feature of kernel timers is that a task can reregister itself to run again at a later time. This is possible because each timer_list structure is unlinked from the list of active timers before being run and can, therefore, be immediately relinked elsewhere. Although rescheduling the same task over and over might appear to be a pointless operation, it is sometimes useful. For example, it can be used to implement the polling of devices.
It’s also worth knowing that in an SMP system, the timer function is executed by the same CPU that registered it, to achieve better cache locality whenever possible. Therefore, a timer that reregisters itself always runs on the same CPU.
An important feature of timers that should not be forgotten, though, is that they are a potential source of race conditions, even on uniprocessor systems. This is a direct result of their being asynchronous with other code. Therefore, any data structures accessed by the timer function should be protected from concurrent access, either by being atomic types (discussed in the section “Atomic Variables” in Chapter 1) or by using spinlocks (discussed in Chapter 5).
The Timer API
The kernel provides drivers with a number of functions to declare, register, and remove kernel timers. The following excerpt shows the basic building blocks:
#include <linux/timer.h>
struct timer_list {
        /* ... */
        unsigned long expires;
        void (*function)(unsigned long);
        unsigned long data;
};
void init_timer(struct timer_list *timer);
struct timer_list TIMER_INITIALIZER(_function, _expires, _data);
void add_timer(struct timer_list * timer);
int del_timer(struct timer_list * timer);
The data structure includes more fields than the ones shown, but those three are the ones that are meant to be accessed from outside the timer code itself. The expires field represents the jiffies value when the timer is expected to run; at that time, the function function is called with data as an argument. If you need to pass multiple items in the argument, you can bundle them as a single data structure and pass a pointer cast to unsigned long, a safe practice on all supported architectures and pretty common in memory management (as discussed in Chapter 15). The expires value is not a jiffies_64 item because timers are not expected to expire very far in the future, and 64-bit operations are slow on 32-bit platforms.
The structure must be initialized before use. This step ensures that all the fields are properly set up, including the ones that are opaque to the caller. Initialization can be performed by calling init_timer or assigning TIMER_INITIALIZER to a static structure, according to your needs. After initialization, you can change the three public fields before calling add_timer. To disable a registered timer before it expires, call del_timer.
The jit module includes a sample file, /proc/jitimer (for “just in timer”), that returns one header line and six data lines. The data lines represent the current environment where the code is running; the first one is generated by the read file operation and the others by a timer. The following output was recorded while compiling a kernel:
In this output, the time field is the value of jiffies when the code runs, delta is the change in jiffies since the previous line, inirq is the Boolean value returned by in_interrupt, pid and command refer to the current process, and cpu is the number of the CPU being used (always 0 on uniprocessor systems).
If you read /proc/jitimer while the system is unloaded, you’ll find that the context of the timer is process 0, the idle task, which is called “swapper” mainly for historical reasons.
The timer used to generate /proc/jitimer data is run every 10 jiffies by default, but you can change the value by setting the tdelay (timer delay) parameter when loading the module.
The following code excerpt shows the part of jit related to the jitimer timer. When a process attempts to read our file, we set up the timer as follows:
unsigned long j = jiffies;
/* fill the data for our timer function */
data->prevjiffies = j;
data->buf = buf2;
data->loops = JIT_ASYNC_LOOPS;
/* register the timer */
data->timer.data = (unsigned long)data;
data->timer.function = jit_timer_fn;
data->timer.expires = j + tdelay; /* parameter */
add_timer(&data->timer);
/* wait for the buffer to fill */
wait_event_interruptible(data->wait, !data->loops);
The actual timer function looks like this:
void jit_timer_fn(unsigned long arg)
{
    struct jit_data *data = (struct jit_data *)arg;
    unsigned long j = jiffies;
    data->buf += sprintf(data->buf, "%9li  %3li     %i    %6i   %i   %s\n",
                 j, j - data->prevjiffies, in_interrupt( ) ? 1 : 0,
                 current->pid, smp_processor_id( ), current->comm);
    if (--data->loops) {
        data->timer.expires += tdelay;
        data->prevjiffies = j;
        add_timer(&data->timer);
    } else {
        wake_up_interruptible(&data->wait);
    }
}
The timer API includes a few more functions than the ones introduced above. The following set completes the list of kernel offerings:
int mod_timer(struct timer_list *timer, unsigned long expires);
Updates the expiration time of a timer, a common task for which a timeout timer is used (again, the motor-off floppy timer is a typical example). mod_timer can be called on inactive timers as well, where you normally use add_timer.
int del_timer_sync(struct timer_list *timer);
Works like del_timer, but also guarantees that when it returns, the timer function is not running on any CPU. del_timer_sync is used to avoid race conditions on SMP systems and is the same as del_timer in UP kernels. This function should be preferred over del_timer in most situations. This function can sleep if it is called from a nonatomic context but busy waits in other situations. Be very careful about calling del_timer_sync while holding locks; if the timer function attempts to obtain the same lock, the system can deadlock. If the timer function reregisters itself, the caller must first ensure that this reregistration will not happen; this is usually accomplished by setting a “shutting down” flag, which is checked by the timer function (a sketch of this pattern is shown after this list).
int timer_pending(const struct timer_list * timer);
Returns true or false to indicate whether the timer is currently scheduled to run by reading one of the opaque fields of the structure.
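As promised above, here is a sketch of the “shutting down” flag pattern used with a self-rearming timer; all names are illustrative, and real code would also need whatever locking or memory barriers the rest of the driver requires:
static struct timer_list my_timer;   /* initialized elsewhere with init_timer */
static int shutting_down;            /* set nonzero before tearing the timer down */

static void my_timer_fn(unsigned long arg)
{
    /* ... periodic work ... */
    if (!shutting_down) {            /* rearm only while the driver is still active */
        my_timer.expires = jiffies + HZ;
        add_timer(&my_timer);
    }
}

/* in the cleanup path: */
shutting_down = 1;
del_timer_sync(&my_timer);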
The Implementation of Kernel Timers
... I'll leave this section for now and come back to it when I get the chance.
Tasklets
Another kernel facility related to timing issues is the tasklet mechanism. It is mostly used in interrupt management (we’ll see it again in Chapter 10.)
Tasklets resemble kernel timers in some ways. They are always run at interrupt time, they always run on the same CPU that schedules them, and they receive an unsigned long argument. Unlike kernel timers, however, you can’t ask to execute the function at a specific time. By scheduling a tasklet, you simply ask for it to be executed at a later time chosen by the kernel. This behavior is especially useful with interrupt handlers, where the hardware interrupt must be managed as quickly as possible, but most of the data management can be safely delayed to a later time. Actually, a tasklet, just like a kernel timer, is executed (in atomic mode) in the context of a “soft interrupt,” a kernel mechanism that executes asynchronous tasks with hardware interrupts enabled.
A tasklet exists as a data structure that must be initialized before use. Initialization can be performed by calling a specific function or by declaring the structure using certain macros:
#include <linux/interrupt.h>
struct tasklet_struct {
      /* ... */
      void (*func)(unsigned long);
      unsigned long data;
};
void tasklet_init(struct tasklet_struct *t,
                  void (*func)(unsigned long), unsigned long data);
DECLARE_TASKLET(name, func, data);
DECLARE_TASKLET_DISABLED(name, func, data);
Tasklets offer a number of interesting features:
● A tasklet can be disabled and re-enabled later; it won’t be executed until it is enabled as many times as it has been disabled.
● Just like timers, a tasklet can reregister itself.
● A tasklet can be scheduled to execute at normal priority or high priority. The latter group is always executed first.
● Tasklets may be run immediately if the system is not under heavy load but never later than the next timer tick.
● A tasklet can be concurrent with other tasklets but is strictly serialized with respect to itself—the same tasklet never runs simultaneously on more than one processor. Also, as already noted, a tasklet always runs on the same CPU that schedules it.
The jit module includes two files, /proc/jitasklet and /proc/jitasklethi, that return the same data as /proc/jitimer, introduced in the section “Kernel Timers.” When you read one of the files, you get back a header and six data lines. The first data line describes the context of the calling process, and the other lines describe the context of successive runs of a tasklet procedure. This is a sample run while compiling a kernel:
As confirmed by the above data, the tasklet is run at the next timer tick as long as the CPU is busy running a process, but it is run immediately when the CPU is otherwise idle. The kernel provides a set of ksoftirqd kernel threads, one per CPU, just to run “soft interrupt” handlers, such as the tasklet_action function. Thus, the final three runs of the tasklet take place in the context of the ksoftirqd kernel thread associated to CPU 0. The jitasklethi implementation uses a high-priority tasklet, explained in an upcoming list of functions.
The actual code in jit that implements /proc/jitasklet and /proc/jitasklethi is almost identical to the code that implements /proc/jitimer, but it uses the tasklet calls instead of the timer ones. The following list lays out in detail the kernel interface to tasklets after the tasklet structure has been initialized:
void tasklet_disable(struct tasklet_struct *t);
This function disables the given tasklet. The tasklet may still be scheduled with tasklet_schedule, but its execution is deferred until the tasklet has been enabled again. If the tasklet is currently running, this function busy-waits until the tasklet exits; thus, after calling tasklet_disable, you can be sure that the tasklet is not running anywhere in the system.
void tasklet_disable_nosync(struct tasklet_struct *t);
Disable the tasklet, but without waiting for any currently-running function to exit. When it returns, the tasklet is disabled and won’t be scheduled in the future until re-enabled, but it may be still running on another CPU when the function returns.
void tasklet_enable(struct tasklet_struct *t);
Enables a tasklet that had been previously disabled. If the tasklet has already been scheduled, it will run soon. A call to tasklet_enable must match each call to tasklet_disable, as the kernel keeps track of the “disable count” for each tasklet.
void tasklet_schedule(struct tasklet_struct *t);
Schedule the tasklet for execution. If a tasklet is scheduled again before it has a chance to run, it runs only once. However, if it is scheduled while it runs, it runs again after it completes; this ensures that events occurring while other events are being processed receive due attention. This behavior also allows a tasklet to reschedule itself.
void tasklet_hi_schedule(struct tasklet_struct *t);
Schedule the tasklet for execution with higher priority. When the soft interrupt handler runs, it deals with high-priority tasklets before other soft interrupt tasks, including “normal” tasklets. Ideally, only tasks with low-latency requirements (such as filling the audio buffer) should use this function, to avoid the additional latencies introduced by other soft interrupt handlers. Actually, /proc/jitasklethi shows no human-visible difference from /proc/jitasklet.
void tasklet_kill(struct tasklet_struct *t);
This function ensures that the tasklet is not scheduled to run again; it is usually called when a device is being closed or the module removed. If the tasklet is scheduled to run, the function waits until it has executed. If the tasklet reschedules itself, you must prevent it from rescheduling itself before calling tasklet_kill, as with del_timer_sync.
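Putting the pieces together, a minimal hypothetical use of the tasklet interface might look like the following sketch; the handler reschedules itself a fixed number of times, and the cleanup path uses tasklet_kill. The names and the counter are illustrative, not from the jit module:
#include <linux/interrupt.h>

static struct tasklet_struct my_tasklet;
static int runs_left = 5;

static void my_tasklet_fn(unsigned long data)
{
    /* runs in soft-interrupt (atomic) context: no sleeping allowed here */
    if (--runs_left > 0)
        tasklet_schedule(&my_tasklet);    /* a tasklet may reschedule itself */
}

/* at initialization time: */
tasklet_init(&my_tasklet, my_tasklet_fn, 0);
/* when the event of interest occurs (for example, in an interrupt handler): */
tasklet_schedule(&my_tasklet);
/* in the cleanup path: */
tasklet_kill(&my_tasklet);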
Tasklets are implemented in kernel/softirq.c. The two tasklet lists (normal and high-priority) are declared as per-CPU data structures, using the same CPU-affinity mechanism used by kernel timers. The data structure used in tasklet management is a simple linked list, because tasklets have none of the sorting requirements of kernel timers.
Workqueues
Workqueues are, superficially, similar to tasklets; they allow kernel code to request that a function be called at some future time. There are, however, some significant differences between the two, including:
● Tasklets run in software interrupt context with the result that all tasklet code must be atomic. Instead, workqueue functions run in the context of a special kernel process; as a result, they have more flexibility. In particular, workqueue functions can sleep.
● Tasklets always run on the processor from which they were originally submitted. Workqueues work in the same way, by default.
● Kernel code can request that the execution of workqueue functions be delayed for an explicit interval.

The key difference between the two is that tasklets execute quickly, for a short period of time, and in atomic mode, while workqueue functions may have higher latency but need not be atomic. Each mechanism has situations where it is appropriate.
Workqueues have a type of struct workqueue_struct, which is defined in <linux/workqueue.h>. A workqueue must be explicitly created before use, using one of the following two functions:
工作队列是 struct workqueue_struct 类型, 在 <linux/workqueue.h> 中定义. 一个工作队列必须在使用前用下列两个函数之一显式地创建:
struct workqueue_struct *create_workqueue(const char *name);
struct workqueue_struct *create_singlethread_workqueue(const char *name);
Each workqueue has one or more dedicated processes (“kernel threads”), which run functions submitted to the queue. If you use create_workqueue, you get a workqueue that has a dedicated thread for each processor on the system. In many cases, all those threads are simply overkill; if a single worker thread will suffice, create the workqueue with create_singlethread_workqueue instead.
每个工作队列有一个或多个专用的进程("内核线程"), 它们运行提交给这个队列的函数. 如果你使用 create_workqueue, 你得到的工作队列在系统的每个处理器上都有一个专用的线程. 在很多情况下, 这么多线程纯属浪费; 如果单个工作者线程就足够了, 则改用 create_singlethread_workqueue 来创建工作队列.
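As a sketch (the names my_wq and "mydrv" are assumptions, not taken from the sample modules), a driver that is satisfied with a single worker thread might create its queue like this:
#include <linux/workqueue.h>

static struct workqueue_struct *my_wq;    /* hypothetical queue pointer */

    /* in the module initialization code */
    my_wq = create_singlethread_workqueue("mydrv");
    if (!my_wq)
        return -ENOMEM;                   /* creation can fail; check before using the queue */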
To submit a task to a workqueue, you need to fill in a work_struct structure. This can be done at compile time as follows:
要提交一个任务给工作队列, 你需要填充一个 work_struct 结构. 这可以在编译时完成, 如下:
DECLARE_WORK(name, void (*function)(void *), void *data);
Where name is the name of the structure to be declared, function is the function that is to be called from the workqueue, and data is a value to pass to that function. If you need to set up the work_struct structure at runtime, use the following two macros:
这里 name 是被声明的结构的名称, function 是将从工作队列中被调用的函数, data 是传递给这个函数的值. 如果你需要在运行时建立 work_struct 结构, 使用下面两个宏:
INIT_WORK(struct work_struct *work, void (*function)(void *), void *data);
PREPARE_WORK(struct work_struct *work, void (*function)(void *), void *data);
INIT_WORK does a more thorough job of initializing the structure; you should use it the first time that structure is set up. PREPARE_WORK does almost the same job, but it does not initialize the pointers used to link the work_struct structure into the workqueue. If there is any possibility that the structure may currently be submitted to a workqueue, and you need to change that structure, use PREPARE_WORK rather than INIT_WORK.
INIT_WORK 对结构做更全面的初始化工作; 你应当在第一次建立该结构时使用它. PREPARE_WORK 做几乎同样的工作, 但是它不初始化用来把 work_struct 结构链接到工作队列中的指针. 如果这个结构有可能当前已被提交给某个工作队列, 而你又需要修改这个结构, 应使用 PREPARE_WORK 而不是 INIT_WORK.
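A minimal runtime-setup sketch using the three-argument INIT_WORK form shown above; my_work, my_work_fn, and my_private_data are hypothetical names:
static struct work_struct my_work;
static int my_private_data;               /* hypothetical context handed to the function */

static void my_work_fn(void *data)
{
    int *value = data;                    /* recover the pointer given to INIT_WORK */
    printk(KERN_DEBUG "deferred work saw %d\n", *value);
}

    /* at runtime, for example in the driver's open( ) method */
    INIT_WORK(&my_work, my_work_fn, &my_private_data);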
There are two functions for submitting work to a workqueue:
有 2 个函数来提交工作给一个工作队列:
int queue_work(struct workqueue_struct *queue, struct work_struct *work);
int queue_delayed_work(struct workqueue_struct *queue,
                       struct work_struct *work, unsigned long delay);
Either one adds work to the given queue. If queue_delayed_work is used, however, the actual work is not performed until at least delay jiffies have passed. The return value from these functions is 0 if the work was successfully added to the queue; a nonzero result means that this work_struct structure was already waiting in the queue, and was not added a second time.
两者都把工作添加到给定的队列. 但是, 如果使用 queue_delayed_work, 实际的工作直到至少 delay 个 jiffies 过去之后才会进行. 如果工作被成功加入到队列, 这些函数的返回值是 0; 一个非零结果意味着这个 work_struct 结构已经在队列中等待, 不会被第二次加入.
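Continuing the hypothetical sketch above, immediate and delayed submission might look like this; the half-second delay of HZ/2 jiffies is only an example value:
    /* run as soon as the worker thread gets around to it */
    queue_work(my_wq, &my_work);

    /* or: run no sooner than about half a second from now */
    queue_delayed_work(my_wq, &my_work, HZ / 2);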
At some time in the future, the work function will be called with the given data value. The function will be running in the context of the worker thread, so it can sleep if need be—although you should be aware of how that sleep might affect any other tasks submitted to the same workqueue. What the function cannot do, however, is access user space. Since it is running inside a kernel thread, there simply is no user space to access.
在将来的某个时间, 这个工作函数将以给定的 data 值被调用. 这个函数将在工作者线程的上下文中运行, 因此如果需要它可以睡眠 -- 尽管你应当知道这个睡眠可能怎样影响提交给同一个工作队列的其他任务. 但是, 这个函数不能做的是存取用户空间. 因为它运行在一个内核线程中, 完全没有用户空间可以存取.
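Since workqueue functions run in process context, they are allowed to sleep; a minimal sketch (the function name is hypothetical, and msleep is used only as an example of something a tasklet could never do):
#include <linux/delay.h>

static void my_slow_work_fn(void *data)
{
    msleep(10);     /* sleeping is legal here; it would not be in a tasklet or timer function */
    /* ... talk to slow hardware, allocate memory with GFP_KERNEL, and so on ... */
}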
Should you need to cancel a pending workqueue entry, you may call:
如果你需要取消一个挂起的工作队列入口, 你可以调用:
int cancel_delayed_work(struct work_struct *work);
The return value is nonzero if the entry was canceled before it began execution. The kernel guarantees that execution of the given entry will not be initiated after a call to cancel_delayed_work. If cancel_delayed_work returns 0, however, the entry may have already been running on a different processor, and might still be running after a call to cancel_delayed_work. To be absolutely sure that the work function is not running anywhere in the system after cancel_delayed_work returns 0, you must follow that call with a call to:
如果这个入口在开始执行之前被取消, 返回值是非零. 内核保证在调用 cancel_delayed_work 之后, 给定入口的执行不会再被启动. 但是, 如果 cancel_delayed_work 返回 0, 这个入口可能已经在另一个处理器上运行, 并且在调用 cancel_delayed_work 之后可能仍然在运行. 要绝对确保在 cancel_delayed_work 返回 0 之后这个工作函数没有在系统的任何地方运行, 你必须在这个调用之后再调用:
void flush_workqueue(struct workqueue_struct *queue);
After flush_workqueue returns, no work function submitted prior to the call is running anywhere in the system.
在 flush_workqueue 返回后, 在这个调用之前提交的任何工作函数都不再在系统中的任何地方运行.
When you are done with a workqueue, you can get rid of it with:
当你用完一个工作队列后, 可以用下面的函数去掉它:
void destroy_workqueue(struct workqueue_struct *queue);
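A conservative teardown sketch for the hypothetical private queue, combining the calls above (my_work and my_wq are the assumed names from the earlier sketches):
    /* cancel_delayed_work returning 0 means the function may still be running somewhere */
    if (!cancel_delayed_work(&my_work))
        flush_workqueue(my_wq);           /* wait until every submitted function has finished */
    destroy_workqueue(my_wq);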
The Shared Queue——共享队列
A device driver, in many cases, does not need its own workqueue. If you only submit tasks to the queue occasionally, it may be more efficient to simply use the shared, default workqueue that is provided by the kernel. If you use this queue, however, you must be aware that you will be sharing it with others. Among other things, that means that you should not monopolize the queue for long periods of time (no long sleeps), and it may take longer for your tasks to get their turn in the processor.
在许多情况下, 一个设备驱动不需要它自己的工作队列. 如果你只是偶尔提交任务给队列, 简单地使用内核提供的共享的缺省工作队列可能更有效. 但是, 如果你使用这个队列, 你必须明白你将和其他人共享它. 除了别的以外, 这意味着你不应当长时间独占这个队列(不要长时间睡眠), 而且你的任务可能要等更长时间才能轮到处理器.
The jiq (“just in queue”) module exports two files that demonstrate the use of the shared workqueue. They use a single work_struct structure, which is set up this way:
jiq ("just in queue") 模块输出 2 个文件来演示共享队列的使用. 它们使用一个单个 work_struct structure, 这个结构这样建立:
static struct work_struct jiq_work;
    /* this line is in jiq_init( ) */
    INIT_WORK(&jiq_work, jiq_print_wq, &jiq_data);
When a process reads /proc/jiqwq, the module initiates a series of trips through the shared workqueue with no delay. The function it uses is:
当一个进程读 /proc/jiqwq 时, 这个模块不带延迟地发起一系列经过共享工作队列的行程. 它使用的函数是:
int schedule_work(struct work_struct *work);
Note that a different function is used when working with the shared queue; it requires only the work_struct structure for an argument. The actual code in jiq looks like this:
注意, 使用共享队列时用的是另一个函数; 它只需要 work_struct 结构作为参数. jiq 中的实际代码如下:
prepare_to_wait(&jiq_wait, &wait, TASK_INTERRUPTIBLE);
schedule_work(&jiq_work);
schedule( );
finish_wait(&jiq_wait, &wait);
The actual work function prints out a line just like the jit module does, then, if need be, resubmits the work_struct structure into the workqueue. Here is jiq_print_wq in its entirety:
实际的工作函数打印出一行, 就像 jit 模块所做的那样, 然后在需要时把 work_struct 结构重新提交到工作队列中. 下面是 jiq_print_wq 的全部内容:
static void jiq_print_wq(void *ptr)
{
    struct clientdata *data = (struct clientdata *) ptr;
    if (! jiq_print (ptr))
        return;
    if (data->delay)
        schedule_delayed_work(&jiq_work, data->delay);
    else
        schedule_work(&jiq_work);
}
If the user is reading the delayed device (/proc/jiqwqdelay), the work function resubmits itself in the delayed mode with schedule_delayed_work:
如果用户在读延后的设备 (/proc/jiqwqdelay), 这个工作函数使用 schedule_delayed_work 以延后模式重新提交它自己:
int schedule_delayed_work(struct work_struct *work, unsigned long delay);
When /proc/jiqwq is read, there is no obvious delay between the printing of each line. When, instead, /proc/jiqwqdelay is read, there is a delay of exactly one jiffy between each line. In either case, we see the same process name printed; it is the name of the kernel thread that implements the shared workqueue. The CPU number is printed after the slash; we never know which CPU will be running when the /proc file is read, but the work function will always run on the same processor thereafter.
当 /proc/jiqwq 被读时, 每行的打印之间没有明显的延迟. 相反, 当 /proc/jiqwqdelay 被读时, 每行之间有恰好一个 jiffy 的延时. 在这两种情况下, 我们看到打印出同样的进程名字; 它是实现共享工作队列的内核线程的名字. CPU 号打印在斜线后面; 我们无法知道读 /proc 文件时哪个 CPU 正在运行, 但此后这个工作函数将一直运行在同一个处理器上.
If you need to cancel a work entry submitted to the shared queue, you may use cancel_delayed_work, as described above. Flushing the shared workqueue requires a separate function, however:
如果你需要取消一个已提交给共享队列的工作入口, 你可以使用上面描述的 cancel_delayed_work. 但是, 刷新共享队列需要一个单独的函数:
void flush_scheduled_work(void);
Since you do not know who else might be using this queue, you never really know how long it might take for flush_scheduled_work to return.
因为你不知道还有谁可能在使用这个队列, 你永远无法真正知道 flush_scheduled_work 返回需要多长时间.
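The same pattern, sketched for the shared queue with the hypothetical my_work entry from the earlier sketches:
    if (!cancel_delayed_work(&my_work))
        flush_scheduled_work();    /* may take a while, since other code shares this queue */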


