From: Hugh Dickins <hugh@veritas.com>
To: Kanoj Sarcar <kanojsarcar@yahoo.com>
Cc: Anton Blanchard <anton@samba.org>, Andi Kleen <ak@suse.de>,
William Lee Irwin III <wli@holomorphy.com>,
Andrea Arcangeli <andrea@suse.de>,
linux-mm@kvack.org, davem@redhat.com
Subject: Re: smp_rmb in mm/memory.c in 2.6.10
Date: Fri, 14 Jan 2005 22:09:17 +0000 (GMT)
Message-ID: <Pine.LNX.4.44.0501142127430.3050-100000@localhost.localdomain>
In-Reply-To: <20050114211441.59635.qmail@web14305.mail.yahoo.com>
On Fri, 14 Jan 2005, Kanoj Sarcar wrote:
>
> Here are the relevant steps of the two procedures:
>
> do_no_page()
> 1. sequence = atomic_read(&mapping->truncate_count);
> 2. smp_rmb();
> 3. vma->vm_ops->nopage()
> 4. spin_lock(&mm->page_table_lock);
> 5. Retry if sequence != atomic_read(&mapping->truncate_count)
> 5a. See later.
> 6. update_mmu_cache()
> 7. spin_unlock(&mm->page_table_lock);
>
> unmap_mapping_range()
> 8. spin_lock(&mapping->i_mmap_lock); /* irrelevant */
> 9. atomic_inc(&mapping->truncate_count);
> 10. zap_page_range(): spin_lock(&mm->page_table_lock);
>     zap_page_range(): tlb cleaning
>     zap_page_range(): spin_unlock(&mm->page_table_lock);
> 11. spin_unlock(&mapping->i_mmap_lock);
Yes (except that step 8 is somewhat relevant to removing atomicity,
since truncate_count is incremented inside i_mmap_lock; I say somewhat
because the exclusive i_sem protects it too).
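A minimal user-space sketch of the eleven steps above, in C11 rather than
kernel code. Everything with a _sketch suffix, the demo_ctx struct and the
racing_nopage callback are hypothetical stand-ins, not the 2.6.10 source;
atomic_thread_fence(memory_order_acquire) is assumed here as a rough
(slightly stronger) substitute for smp_rmb(), and setting pte_present
stands in for installing the pte.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel spinlock. */
typedef struct { atomic_flag held; } spinlock_sketch;

static void spin_lock_sketch(spinlock_sketch *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void spin_unlock_sketch(spinlock_sketch *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical, heavily reduced stand-ins for address_space and mm_struct. */
struct mapping_sketch {
    spinlock_sketch i_mmap_lock;
    atomic_int truncate_count;
};

struct mm_sketch {
    spinlock_sketch page_table_lock;
    int pte_present;            /* 1 once the fault has installed a pte */
};

/* Steps 8-11: bump the count, then zap under page_table_lock. */
static void unmap_mapping_range_sketch(struct mapping_sketch *mapping,
                                       struct mm_sketch *mm)
{
    spin_lock_sketch(&mapping->i_mmap_lock);            /* 8 */
    atomic_fetch_add(&mapping->truncate_count, 1);      /* 9 */
    spin_lock_sketch(&mm->page_table_lock);             /* 10 */
    mm->pte_present = 0;                                /* zap */
    spin_unlock_sketch(&mm->page_table_lock);
    spin_unlock_sketch(&mapping->i_mmap_lock);          /* 11 */
}

/* Steps 1-7; returns how many times the fault had to retry. */
static int do_no_page_sketch(struct mapping_sketch *mapping,
                             struct mm_sketch *mm,
                             void (*nopage)(void *), void *arg)
{
    int retries = 0;

    assert(nopage != NULL);
    for (;;) {
        int sequence = atomic_load_explicit(&mapping->truncate_count,
                                            memory_order_relaxed);    /* 1 */
        atomic_thread_fence(memory_order_acquire); /* 2: assumed smp_rmb() stand-in */
        nopage(arg);                                                  /* 3 */
        spin_lock_sketch(&mm->page_table_lock);                       /* 4 */
        if (sequence != atomic_load_explicit(&mapping->truncate_count,
                                             memory_order_relaxed)) { /* 5 */
            spin_unlock_sketch(&mm->page_table_lock);
            retries++;
            continue;
        }
        mm->pte_present = 1;             /* 5a-6: stand-in for installing the pte */
        spin_unlock_sketch(&mm->page_table_lock);                     /* 7 */
        return retries;
    }
}

/* Demo: a nopage that lets a truncate slip in during the first attempt only. */
struct demo_ctx {
    struct mapping_sketch *mapping;
    struct mm_sketch *mm;
    int calls;
};

static void racing_nopage(void *arg)
{
    struct demo_ctx *ctx = arg;

    if (ctx->calls++ == 0)
        unmap_mapping_range_sketch(ctx->mapping, ctx->mm);
}

static int demo_truncate_during_fault(int *pte_present_out)
{
    struct mapping_sketch mapping;
    struct mm_sketch mm;
    struct demo_ctx ctx = { &mapping, &mm, 0 };
    int retries;

    atomic_flag_clear(&mapping.i_mmap_lock.held);
    atomic_init(&mapping.truncate_count, 0);
    atomic_flag_clear(&mm.page_table_lock.held);
    mm.pte_present = 0;

    retries = do_no_page_sketch(&mapping, &mm, racing_nopage, &ctx);
    *pte_present_out = mm.pte_present;
    return retries;
}
```

In demo_truncate_during_fault() the first attempt sees truncate_count change
at step 5 and retries; the second attempt finds it unchanged and installs.
The demo is single-threaded, so it exercises the retry logic only, not the
memory ordering.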
> --- Hugh Dickins <hugh@veritas.com> wrote:
> > On Thu, 13 Jan 2005, Kanoj Sarcar wrote:
> > >
> > > Thanks, I think this explains it. IE, if do_no_page()
> > > reads truncate_count, and then later goes on to
> > > acquire a lock in nopage(), the smp_rmb() is
> > > guaranteeing that the read of truncate_count completes
> > > before nopage() starts executing.
> > >
> > > For x86 at least, it seems to me that since the
> > > spin_lock (in nopage()) uses a "lock" instruction,
> > > that itself guarantees that the truncate_count read is
> > > completed, even without the smp_rmb(). (Refer to IA32
> > > SDM Vol 3 section 7.2.4 last para page 7-11). Thus for
> > > x86, the smp_rmb is superfluous.
> >
> > You're making me nervous. If you look at 2.6.11-rc1 you'll find
> > that I too couldn't see the point of that smp_rmb(), on any architecture,
> > and so removed it; while also removing the "atomicity" of truncate_count.
>
> I haven't looked at the 2.6.11 code,
Please do if you have time.
> but you could look at atomicity and smp_rmb()
> as two different changes.
Definitely (oh, the shame that I put them together in one patch!)
> I believe the ordering of the C code in steps
> 8 and 9 could be interchanged without any problems, ie
> truncate_count is not protected by i_mmap_lock. In
> that case, you would need truncate_count to be atomic,
> unless you can guarantee unmap_mapping_range() is
> single threaded wrt "mapping" from callers.
Right, but given the ordering 8 before 9,
there is no point to truncate_count being atomic.
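A hedged user-space illustration of that point, assuming C11 atomics: when
every increment happens inside one lock, the writer needs no atomic
read-modify-write (which is what atomic_inc is); a load followed by a store
is enough. The names counter_sketch, counter_init, bump_truncate_count and
read_truncate_count are hypothetical. Portable C11 still wants an atomic
type because readers look at the counter without the lock, so relaxed loads
and stores stand in here for the kernel's plain accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct counter_sketch {
    atomic_flag lock;            /* stand-in for i_mmap_lock */
    atomic_uint truncate_count;  /* only ever written with the lock held */
};

static void counter_init(struct counter_sketch *c)
{
    assert(c != NULL);
    atomic_flag_clear(&c->lock);
    atomic_init(&c->truncate_count, 0);
}

/* Writer: serialized by the lock, so a separate load and store suffice. */
static void bump_truncate_count(struct counter_sketch *c)
{
    unsigned int v;

    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ; /* spin */
    v = atomic_load_explicit(&c->truncate_count, memory_order_relaxed);
    atomic_store_explicit(&c->truncate_count, v + 1, memory_order_relaxed);
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Reader: lockless; it only needs to see some whole value of the counter. */
static unsigned int read_truncate_count(struct counter_sketch *c)
{
    return atomic_load_explicit(&c->truncate_count, memory_order_relaxed);
}
```

If steps 8 and 9 were swapped, so that the increment ran outside the lock,
two concurrent callers could both load the same value and one increment
would be lost; that is the case where an atomic increment would be needed.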
> > Here was my comment to that patch:
> > > Why is mapping->truncate_count atomic? It's incremented inside
> > > i_mmap_lock (and i_sem), and the reads don't need it to be atomic.
> > >
> > > And why smp_rmb() before call to ->nopage? The compiler cannot reorder
> > > the initial assignment of sequence after the call to ->nopage, and no
> > > cpu (yet!) can read from the future, which is all that matters there.
> >
> > Now I'm not so convinced by that "no cpu can read from the future".
> >
> > I don't entirely follow your remarks above, but I do think people
> > on this thread have a better grasp of these matters than I have:
> > does anyone now think that smp_rmb() needs to be restored?
>
> As to the smp_rmb() part, I believe it is required; we
> are not talking about compiler reorderings,
Compiler reordering did need to be considered, but I still agree with
myself that the function call to ->nopage makes it no problem.
> rather cpu
> reorderings. Given just steps 1 and 3 above, there is
> no guarantee from the cpu that the read of
> truncate_count would not be performed before nopage()
> is almost complete, even though the compiler generated
> the proper instruction order (ie the cpu could pull
> down the read of truncate_count).
This is your crucial point. Now I think you're right.
But I have remembered how I was thinking at the time,
what's behind my "no cpu can read from the future" remark.
Suppose unmap_mapping_range is incrementing truncate_count
from 0 to 1. I could conceive of do_no_page's read into
"sequence" not completing until the spin_lock at step 4.
But I believed that the read issued before ->nopage could
only err on the safe side, sometimes fetching 0 instead of 1
when 1 would already be safe, but never seeing 1 too soon.
That belief was naive, wasn't it? I was thinking in terms
of "slow" instructions rather than reordered instructions.
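The difference between a "slow" read and a reordered read can be shown with
a small deterministic model, offered as a sketch only. It assumes that the
load inside ->nopage which matters is a page cache lookup (the thread above
does not spell that out), and that truncation bumps the count, zaps the pte
and drops the page in one go. All names here are hypothetical. The model
asks: if the cpu performs the two loads in a given order and a truncate
lands at a given point, does a truncated page end up mapped?

```c
#include <assert.h>
#include <stdbool.h>

struct state_sketch {
    int truncate_count;
    bool page_in_cache;
    bool pte_installed;
};

/* Truncate side, collapsed into one event for the model. */
static void truncate_sketch(struct state_sketch *s)
{
    s->truncate_count++;
    s->pte_installed = false;   /* zap */
    s->page_in_cache = false;   /* page removed from the cache */
}

enum load_order {
    SEQ_THEN_PAGE,  /* order enforced by the barrier */
    PAGE_THEN_SEQ   /* cpu performed the truncate_count read late */
};

/*
 * truncate_at: 0 = before both loads, 1 = between them, 2 = after both
 * but before the recheck at step 5.
 * Returns true if a pte is left pointing at a truncated page.
 */
static bool stale_mapping_after_fault(enum load_order order, int truncate_at)
{
    struct state_sketch s = { 0, true, false };
    int sequence = 0;
    bool found = false;

    assert(truncate_at >= 0 && truncate_at <= 2);
    for (int step = 0; step < 3; step++) {
        if (step == truncate_at)
            truncate_sketch(&s);
        if (step == 2)
            break;
        bool load_seq = (order == SEQ_THEN_PAGE) ? (step == 0) : (step == 1);
        if (load_seq)
            sequence = s.truncate_count;   /* step 1 of do_no_page */
        else
            found = s.page_in_cache;       /* assumed load inside ->nopage */
    }
    /* Steps 4-6: recheck the count, install only if it is unchanged. */
    if (found && sequence == s.truncate_count)
        s.pte_installed = true;
    return s.pte_installed && !s.page_in_cache;
}
```

With the loads in program order, no truncate position leaves a stale
mapping: the read errs on the safe side, as I had believed. With the count
read performed after the lookup, a truncate landing between the two loads
makes the recheck pass while the page is already gone, which is the "1 too
soon" case. The model returns false wherever the real code would retry or
find no page.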
> Whoever wrote this code did a careful job.
It was Andrea (one reason I've copied him now -
as I did when posting the patch to remove it).
Unless someone sees this differently, I should send a patch to
restore the smp_rmb(), with a longer code comment on what it's for.
Thanks a lot for your detailed answer.
Hugh