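After two days of warm-up, I have started reading the Linux kernel source, using ULK2 as my main reference. Because I had consulted the Chinese edition of ULK1 back when I was reading the FreeBSD source, I started from Chapter 5 of ULK2. I haven't finished it yet, but here are my notes on what I have read so far. :-)
Note1: Critical Regions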
A critical region is any section of code that must be completely executed by any kernel control path that enters it before another kernel control path can enter it.
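In other words: once a kernel control path has started executing the code inside a critical region, no other kernel control path may enter that region until the first one has finished executing it.
Note2: Atomic Operations
The atomic_t type: typedef struct { volatile int counter; } atomic_t;
Key points in the implementation of atomic operations:
1. The volatile keyword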
The ‘volatile’ keyword indicates that the instruction has important side-effects. GCC will not delete a volatile `asm' if it is reachable. (The instruction can still be deleted if GCC can prove that control-flow will never reach the location of the instruction.) In addition, GCC will not reschedule instructions across a volatile ‘asm’ instruction. [GCC]
2. lock
The ‘lock’ prefix forces an atomic operation to insure exclusive use of shared memory in a multiprocessor environment. [Intel]
3. __builtin_constant_p
You can use the built-in function ‘__builtin_constant_p’ to determine if a value is known to be constant at compile-time and hence that GCC can perform constant-folding on expressions involving that value. The argument of the function is the value to test. The function returns the integer 1 if the argument is known to be a compile-time constant and 0 if it is not known to be a compile-time constant. A return of 0 does not indicate that the value is _not_ a constant, but merely that GCC cannot prove it is a constant with the specified value of the ‘-O’ option. [GCC]
[code:1]
/* E.g.1: the atomic_add(int i, atomic_t *v) function */
static __inline__ void atomic_add(int i, atomic_t *v)
{
	__asm__ __volatile__(
		LOCK "addl %1,%0"
		:"=m" (v->counter)
		:"ir" (i), "m" (v->counter));
}
[/code:1]
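Both volatile and LOCK genuinely protect atomic_add on MP systems: __volatile__ keeps GCC from deleting the asm statement or rescheduling instructions across it, and the LOCK prefix keeps other CPUs from accessing the memory at v->counter while the instruction executes.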
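As a rough user-space sketch of the same idea (not the kernel's code): the names my_atomic_t, my_atomic_add and my_atomic_sum are made up for illustration, the lock prefix is written out literally instead of going through the kernel's LOCK macro, and on non-x86 targets the sketch assumes GCC's __atomic_fetch_add builtin is an acceptable substitute. It needs GCC or Clang.

```c
/* User-space sketch of a LOCK-prefixed atomic add (GCC/Clang only).
 * my_atomic_t, my_atomic_add and my_atomic_sum are illustrative names,
 * not kernel APIs. */
typedef struct { volatile int counter; } my_atomic_t;

void my_atomic_add(int i, my_atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "+m" tells GCC that v->counter is both read and written. */
    __asm__ __volatile__(
        "lock; addl %1,%0"
        : "+m" (v->counter)
        : "ir" (i)
        : "cc");
#else
    /* Assumption: elsewhere, let the compiler choose the atomic instruction. */
    __atomic_fetch_add(&v->counter, i, __ATOMIC_SEQ_CST);
#endif
}

/* Adds 1..n to a counter that starts at zero and returns the final value. */
int my_atomic_sum(int n)
{
    my_atomic_t v = { 0 };
    for (int k = 1; k <= n; k++)
        my_atomic_add(k, &v);
    return v.counter;
}
```

Calling my_atomic_sum(10) runs ten locked adds on one counter. The point of the lock prefix is that the same my_atomic_add could be called from several CPUs at once without losing an update.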
[code:1]
/* E.g.2: test_bit(nr, addr) */
#define test_bit(nr, addr) \
	(__builtin_constant_p(nr) ? \
	 constant_test_bit((nr),(addr)) : \
	 variable_test_bit((nr),(addr)))
[/code:1]
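If nr is known to be a constant at compile time, constant_test_bit is used; otherwise variable_test_bit is used.
Open question: I am not clear about synchronization in this case. How does the system protect it?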
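The compile-time dispatch in test_bit can be tried out in user space. The following is only a sketch: the my_ prefixed names are invented here, both helpers are written in plain C (the kernel's variable_test_bit uses inline assembly instead), and it assumes GCC or Clang for __builtin_constant_p.

```c
#include <limits.h>

#define MY_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Illustrative stand-ins for constant_test_bit / variable_test_bit. */
int my_constant_test_bit(int nr, const volatile unsigned long *addr)
{
    return (int)((addr[nr / MY_BITS_PER_LONG] >> (nr % MY_BITS_PER_LONG)) & 1UL);
}

int my_variable_test_bit(int nr, const volatile unsigned long *addr)
{
    unsigned long mask = 1UL << (nr % MY_BITS_PER_LONG);
    return (addr[nr / MY_BITS_PER_LONG] & mask) != 0;
}

/* Same shape as the kernel macro: pick a helper based on what GCC can prove. */
#define my_test_bit(nr, addr) \
    (__builtin_constant_p(nr) ? \
     my_constant_test_bit((nr), (addr)) : \
     my_variable_test_bit((nr), (addr)))

/* Counts the set bits among the first n, using a run-time bit index. */
int my_count_bits(const volatile unsigned long *addr, int n)
{
    int count = 0;
    for (int nr = 0; nr < n; nr++)
        count += my_test_bit(nr, addr);
    return count;
}
```

With a literal index such as my_test_bit(2, bits), GCC can take the constant branch and drop the other one entirely. Inside my_count_bits the index is a loop variable, so which branch is taken depends on the optimization level. Both branches return the same answer.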
Note3: Memory Barrier
A memory barrier primitive ensures that the operations placed before the primitive are finished before starting the operations placed after the primitive. Thus, a memory barrier is like a firewall that cannot be passed by any assembly language instruction.
Let's pick out a few of the key functions/macros and look at them:
[code:1]
#define mb() __asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory")
#define rmb() mb()
#ifdef CONFIG_X86_OOSTORE
#define wmb() __asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory")
#else
#define wmb() __asm__ __volatile__ ("": : :"memory")
#endif
[/code:1]
These macros take effect on both MP and UP systems.
"memory": If your assembler instruction modifies memory in an unpredictable fashion, add `memory' to the list of clobbered registers. This will cause GCC to not keep memory values cached in registers across the assembler instruction. You will also want to add the `volatile' keyword if the memory affected is not listed in the inputs or outputs of the `asm', as the `memory' clobber does not count as a side-effect of the `asm'. [GCC]
wmb() is simpler than mb() because, as a rule, Intel processors never reorder writes to memory.
Open question: which instructions does a memory barrier actually protect? In other words, how are we supposed to use one?
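One common way barriers get used is when one control path publishes data that another one reads. This is offered as a sketch, not as something taken from ULK2. All names here (my_wmb, my_rmb, my_publish, my_consume) are made up, and both barriers are simply mapped onto GCC's __sync_synchronize() full barrier, which is an assumption made for portability and is stronger than the kernel's wmb().

```c
/* Sketch of where barriers go when one path hands data to another.
 * my_wmb/my_rmb are illustrative names mapped onto a GCC full barrier. */
#define my_wmb() __sync_synchronize()
#define my_rmb() __sync_synchronize()

static volatile int my_payload;   /* the data being handed over */
static volatile int my_ready;     /* 0 = nothing published yet */

/* Writer side: store the data first, then the flag that announces it. */
void my_publish(int value)
{
    my_payload = value;
    my_wmb();          /* payload store must be visible before the flag store */
    my_ready = 1;
}

/* Reader side: returns 1 and fills *out if data was published, else 0. */
int my_consume(int *out)
{
    if (!my_ready)
        return 0;
    my_rmb();          /* flag load must complete before the payload load */
    *out = my_payload;
    return 1;
}
```

Without the two barriers, the compiler or a CPU with weaker ordering could let the reader see my_ready == 1 while my_payload still holds the old value. The barriers constrain the order of the memory accesses on either side of them, not any particular instruction.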
Note4: Spin Locks
Spin locks are only meaningful on MP systems; see <asm-i386/spinlock.h>.
Definition of a spin lock:
[code:1]
typedef struct {
	volatile unsigned int lock;
#if SPINLOCK_DEBUG
	unsigned magic;
#endif
} spinlock_t;

/* E.g.: spin_lock(spinlock_t *lock) */
static inline void spin_lock(spinlock_t *lock)
{
#if SPINLOCK_DEBUG
	__label__ here;
here:
	if (lock->magic != SPINLOCK_MAGIC) {
		printk("eip: %p\n", &&here);
		BUG();
	}
#endif
	__asm__ __volatile__(
		spin_lock_string
		:"=m" (lock->lock) : : "memory");
}
[/code:1]
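Since spin_lock_string hides the actual loop, here is a rough user-space sketch of what a spin lock does. It is not the i386 implementation: the my_ names are invented, the sketch uses 0 for "free" and 1 for "held" as its own convention, and it relies on the GCC builtins __sync_lock_test_and_set and __sync_lock_release.

```c
/* User-space sketch of a test-and-set spin lock (GCC/Clang builtins).
 * my_spinlock_t and the my_spin_* functions are illustrative, not kernel APIs. */
typedef struct { volatile unsigned int lock; } my_spinlock_t;   /* 0 = free, 1 = held */

/* Returns 1 if the lock was taken, 0 if somebody else already holds it. */
int my_spin_trylock(my_spinlock_t *l)
{
    /* Atomically store 1 and fetch the old value; old value 0 means we won. */
    return __sync_lock_test_and_set(&l->lock, 1) == 0;
}

void my_spin_lock(my_spinlock_t *l)
{
    while (!my_spin_trylock(l)) {
        /* Busy-wait with plain reads until the lock looks free, then retry. */
        while (l->lock)
            ;
    }
}

void my_spin_unlock(my_spinlock_t *l)
{
    __sync_lock_release(&l->lock);   /* stores 0 with release semantics */
}
```

The inner loop only reads the lock word, so a waiting CPU does not keep issuing locked writes while it spins. On a UP system there is nobody else to spin against, which is why the note above says spin locks only make sense on MP.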
Note5: Read/Write(R/W) Spin Locks
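Clearly, a read lock can be shared by several control paths, whereas a write lock can be held by only one control path at a time. The lock is represented by the rwlock_t data structure, whose key field is lock.
Bias: 0x01000000 is the initial value of lock for this type of lock. Acquiring a read lock decrements lock by 1; releasing a read lock increments it by 1. Acquiring a write lock subtracts the bias from lock; releasing a write lock adds the bias back.
If the lock cannot be acquired, the requester spins until it gets the lock or its time slice runs out. The helper function write_trylock tries to acquire the write lock and simply returns if it cannot. Linux currently (2.4.20) does not implement read_trylock.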
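The counter arithmetic is easy to check with a small user-space sketch. Everything with a my_ prefix is invented for illustration, GCC's __atomic builtins stand in for the kernel's atomic operations, and my_read_trylock is purely hypothetical (as noted above, 2.4.20 has no read_trylock). Only the try variants are shown, so nothing here spins.

```c
#define MY_RW_LOCK_BIAS 0x01000000

typedef struct { volatile int lock; } my_rwlock_t;   /* starts at MY_RW_LOCK_BIAS */

/* Hypothetical helper: take a read lock if no writer holds the lock. */
int my_read_trylock(my_rwlock_t *rw)
{
    if (__atomic_sub_fetch(&rw->lock, 1, __ATOMIC_SEQ_CST) >= 0)
        return 1;                       /* counter stayed non-negative: no writer */
    __atomic_add_fetch(&rw->lock, 1, __ATOMIC_SEQ_CST);   /* undo the decrement */
    return 0;
}

void my_read_unlock(my_rwlock_t *rw)
{
    __atomic_add_fetch(&rw->lock, 1, __ATOMIC_SEQ_CST);
}

/* Take the write lock only if the counter drops exactly to zero. */
int my_write_trylock(my_rwlock_t *rw)
{
    if (__atomic_sub_fetch(&rw->lock, MY_RW_LOCK_BIAS, __ATOMIC_SEQ_CST) == 0)
        return 1;                       /* no readers and no other writer */
    __atomic_add_fetch(&rw->lock, MY_RW_LOCK_BIAS, __ATOMIC_SEQ_CST);   /* undo */
    return 0;
}

void my_write_unlock(my_rwlock_t *rw)
{
    __atomic_add_fetch(&rw->lock, MY_RW_LOCK_BIAS, __ATOMIC_SEQ_CST);
}
```

With n readers inside, lock equals the bias minus n, so a writer's subtraction lands on -n instead of 0 and fails. With a writer inside, lock is 0, so a reader's decrement goes negative and fails.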
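This is similar to FreeBSD's lockmgr locks. Linux uses a spin mechanism, which keeps the code concise; the FreeBSD implementation is somewhat more complex, but its control is more flexible.
The read/write semaphores covered later on seem fairly similar to FreeBSD's s/x locks.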
[GCC] gcc info
[Intel] Intel OS volume