From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Fri, 14 Jan 2005 22:09:17 +0000 (GMT) From: Hugh Dickins Subject: Re: smp_rmb in mm/memory.c in 2.6.10 In-Reply-To: <20050114211441.59635.qmail@web14305.mail.yahoo.com> Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-linux-mm@kvack.org Return-Path: To: Kanoj Sarcar Cc: Anton Blanchard , Andi Kleen , William Lee Irwin III , Andrea Arcangeli , linux-mm@kvack.org, davem@redhat.com List-ID: On Fri, 14 Jan 2005, Kanoj Sarcar wrote: > > Here are the relevant steps of the two procedures: > > do_no_page() > 1. sequence = atomic_read(&mapping->truncate_count); > 2. smp_rmb(); > 3. vma->vm_ops->nopage() > 4. spin_lock(&mm->page_table_lock); > 5. Retry if sequence != > atomic_read(&mapping->truncate_count) > 5a. See later. > 6. update_mmu_cache() > 7. spin_unlock(&mm->page_table_lock); > > unmap_mapping_range() > 8. spin_lock(&mapping->i_mmap_lock); /* irrelevant */ > 9. atomic_inc(&mapping->truncate_count); > 10.zap_page_range():spin_lock(&mm->page_table_lock); > zap_page_range():tlbcleaning > zap_page_range():spin_unlock(&mm->page_table_lock) > 11. spin_unlock(&mapping->i_mmap_lock); Yes (except that 8 is somewhat relevant to removing atomicity; I say somewhat because there's also an exclusive i_sem protecting). > --- Hugh Dickins wrote: > > On Thu, 13 Jan 2005, Kanoj Sarcar wrote: > > > > > > Thanks, I think this explains it. IE, if > > do_no_page() > > > reads truncate_count, and then later goes on to > > > acquire a lock in nopage(), the smp_rmb() is > > > guaranteeing that the read of truncate_count > > completes > > > before nopage() starts executing. > > > > > > For x86 at least, it seems to me that since the > > > spin_lock (in nopage()) uses a "lock" instruction, > > > that itself guarantees that the truncate_count > > read is > > > completed, even without the smp_rmb(). (Refer to > > IA32 > > > SDM Vol 3 section 7.2.4 last para page 7-11). Thus > > for > > > x86, the smp_rmb is superfluous. > > > > You're making me nervous. If you look at 2.6.11-rc1 > > you'll find > > that I too couldn't see the point of that smp_rmb(), > > on any architecture, > > and so removed it; while also removing the > > "atomicity" of truncate_count. > > I haven't looked at the 2.6.11 code, Please do if you have time. > but you could look at atomicity and smp_rmb() > as two different changes. Definitely (oh, the shame that I put them together in one patch!) > I believe the ordering of the C code in steps > 8 and 9 could be interchanged without any problems, ie > truncate_count is not protected by i_mmap_lock. In > that case, you would need truncate_count to be atomic, > unless you can guarantee unmap_mapping_range() is > single threaded wrt "mapping" from callers. Right, but given the ordering 8 before 9, there is no point to truncate_count being atomic. > > Here was my comment to that patch: > > > Why is mapping->truncate_count atomic? It's > > incremented inside > > > i_mmap_lock (and i_sem), and the reads don't need > > it to be atomic. > > > > > > And why smp_rmb() before call to ->nopage? The > > compiler cannot reorder > > > the initial assignment of sequence after the call > > to ->nopage, and no > > > cpu (yet!) can read from the future, which is all > > that matters there. > > > > Now I'm not so convinced by that "no cpu can read > > from the future". > > > > I don't entirely follow your remarks above, but I do > > think people > > on this thread have a better grasp of these matters > > than I have: > > does anyone now think that smp_rmb() needs to be > > restored? > > As to the smp_rmb() part, I believe it is required; we > are not talking about compiler reorderings, Did need to be considered, but I still agree with myself that the function call makes it no problem. > rather cpu > reorderings. Given just steps 1 and 3 above, there is > no guarantee from the cpu that the read of > truncate_count would not be performed before nopage() > is almost complete, even though the compiler generated > the proper instruction order (ie the cpu could pull > down the read of truncate_count). This is your crucial point. Now I think you're right. But I have remembered how I was thinking at the time, what's behind my "no cpu can read from the future" remark. Suppose unmap_mapping_range is incrementing truncate_count from 0 to 1. I could conceive of do_no_page's read into "sequence" not completing until the spin_lock at step 4. But I believed that the read issued before ->nopage could only err on the safe side, sometimes fetching 0 instead of 1 when 1 would already be safe, but never seeing 1 too soon. That belief was naive, wasn't it? I was thinking in terms of "slow" instructions rather than reordered instructions. > Whoever wrote this code did a careful job. It was Andrea (one reason I've copied him now - as I did when posting the patch to remove it). Unless someone sees this differently, I should send a patch to restore the smp_rmb(), with a longer code comment on what it's for. Thanks a lot for your detailed answer. Hugh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: aart@kvack.org