From: Nick Piggin <npiggin@suse.de>
To: Hugh Dickins <hugh@veritas.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Linux Memory Management List <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [patch] mm: fix race in COW logic
Date: Mon, 23 Jun 2008 03:49:40 +0200 [thread overview]
Message-ID: <20080623014940.GA29413@wotan.suse.de> (raw)
In-Reply-To: <Pine.LNX.4.64.0806221854050.5466@blonde.site>
On Sun, Jun 22, 2008 at 07:10:41PM +0100, Hugh Dickins wrote:
> On Sun, 22 Jun 2008, Linus Torvalds wrote:
> > On Sun, 22 Jun 2008, Hugh Dickins wrote:
> >
> > > One thing though, in moving the page_remove_rmap in that way, aren't
> > > you assuming that there's an appropriate wmbarrier between the two
> > > locations? If that is necessarily so (there's plenty happening in
> > > between), it may deserve a comment to say just where that barrier is.
> >
> > In this case, I don't think memory ordering matters.
> >
> > What matters is that the map count never goes down to one - and by
> > re-ordering the inc/dec accesses, that simply won't happen. IOW, memory
> > ordering is immaterial, only the ordering of count updates (from the
> > standpoint of the faulting CPU - so that's not even an SMP issue) matters.
>
> I'm puzzled. The page_remove_rmap has moved to the other side of the
> page_add_new_anon_rmap, but they are operating on different pages.
> It's true that the total of their mapcounts doesn't go down to one
> in the critical area, but that total isn't computed anywhere.
>
> After asking, I thought the answer was going to be that page_remove_rmap
> uses atomic_add_negative, and atomic ops which return a value do
> themselves provide sufficient barrier. I'm wondering if that's so
> obvious that you've generously sought out a different meaning to my query.
I was initially thinking an smp_wmb might have been in order (excuse the pun),
but then I rethought it and added the 2d paragraph to my comment. But I may
still have been wrong. Let's ignore the barriers implicit in the rmap
functions for now, and if we find they are required we can add a nice
/* smp_wmb() for ..., provided by ...! */
Now. The critical memory operations AFAIKS are:
dec page->mapcount
load page->mapcount (== 1)
store pte (RW)
store via pte
load via pte
ptep_clear_flush
store new pte
Note that I don't believe the page_add_new_anon_rmap is part of the critical
ordering. Unless that is for a different issue?
Now if we move the decrement of page->mapcount to below the ptep_clear_flush,
then our TLB shootdown protocol *should* guarantee that nothing may load via
that pte after page->mapcount has been decremented, right?
Now we only have pairwise barrier semantics, so if the leftmost process is
not part of the TLB shootdown (which it is not), then it is possible that
it may see the store to decrement the mapcount before the store to clear the
pte. Maybe. I was hoping causality would not allow a subsequent store through
the pte to be seen by the rightmost guy before the TLB flush. But maybe I
was wrong? (at any rate, page_remove_rmap gives us smp_wmb if required, so
the code is not technically wrong)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-06-23 1:49 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-22 15:30 Nick Piggin
2008-06-22 17:11 ` Hugh Dickins
2008-06-22 17:35 ` Linus Torvalds
2008-06-22 18:10 ` Hugh Dickins
2008-06-22 18:18 ` Linus Torvalds
2008-06-23 1:49 ` Nick Piggin [this message]
2008-06-23 10:04 ` Hugh Dickins
2008-06-23 12:18 ` Nick Piggin
2008-06-23 12:30 ` Nick Piggin
2008-06-23 15:39 ` Hugh Dickins
2008-06-27 9:19 ` Peter Zijlstra
2008-06-27 9:13 ` Peter Zijlstra
2008-06-23 1:52 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080623014940.GA29413@wotan.suse.de \
--to=npiggin@suse.de \
--cc=akpm@linux-foundation.org \
--cc=hugh@veritas.com \
--cc=linux-mm@kvack.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox