From: Linus Torvalds <torvalds@linux-foundation.org>
To: Hugh Dickins <hugh@veritas.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>,
"David S. Miller" <davem@davemloft.net>,
Zach Amsden <zach@vmware.com>,
Jeremy Fitzhardinge <jeremy@goop.org>
Subject: Re: tlb_gather_mmu() and semantics of "fullmm"
Date: Thu, 26 Mar 2009 09:38:18 -0700 (PDT)
Message-ID: <alpine.LFD.2.00.0903260927320.3032@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.64.0903261232060.27412@blonde.anvils>

On Thu, 26 Mar 2009, Hugh Dickins wrote:
> On Thu, 26 Mar 2009, Benjamin Herrenschmidt wrote:
> >
> > I'd like to clarify something about the semantics of the "full_mm_flush"
> > argument of tlb_gather_mmu().
> >
> > The reason is that it can either mean:
> >
> > - All the mappings for that mm are being flushed
> >
> > or
> >
> > - The above +plus+ the mm is dead and has no remaining user. IE, we
> > can relax some of the rules because we know the mappings cannot be
> > accessed concurrently, and thus the PTEs cannot be reloaded into the
> > TLB.
>
> No remaining user in the sense of no longer connected to any user task,
> but may still be active_mm on some cpus.

Side note: this means that CPUs that do speculative TLB fills may still
touch the user entries. They won't _care_ about what they get, though. So
you should be able to do any optimizations you want, as long as it doesn't
cause machine checks or similar (i.e. another CPU doing a speculative access
and then being really unhappy about a totally invalid page table entry).

> Although it looks as if there's a TLB flush at the end of every batch,
> isn't that deceptive (on x86 anyway)?

You need to. Again. Even on that CPU the TLB may have gotten re-loaded
speculatively, even if nothing _meant_ to touch user pages.

So you can't just flush the TLB once, and then expect that since you
flushed it, and nothing else accessed those user addresses, you don't need
to flush it again.

And doing things the other way around - only flushing once at the end - is
incorrect because the whole point is that we can only free the page
directory once we've flushed all the translations that used it. So we need
to flush before the real release, and we need to flush after we've
unmapped everything. Thus the repeated flushes.

It shouldn't be that costly, since kernel mappings should be marked
global.

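The ordering argument above can be made concrete with a toy model. This is only a sketch under loud assumptions: it is not kernel code, and every name in it (`toy_mm`, `toy_speculative_fill`, `toy_teardown`, and so on) is hypothetical. It models one page-table page, a TLB that may speculatively cache any entry that is still present, and a teardown that unmaps in batches. It then reports whether the page-table page was released while the TLB still held translations through it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Toy model, NOT kernel code: one page-table page and one TLB. */
struct toy_mm {
    bool pte_present[TOY_NPAGES]; /* entries still mapped in the page table */
    bool tlb_cached[TOY_NPAGES];  /* translations currently held in the TLB */
    bool pgtable_freed;           /* page-table page has been released */
    int flushes;                  /* number of TLB flushes issued so far */
};

void toy_init(struct toy_mm *mm)
{
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        mm->pte_present[i] = true;
        mm->tlb_cached[i] = false;
    }
    mm->pgtable_freed = false;
    mm->flushes = 0;
}

/* Assumption: hardware may refill the TLB at any time from present entries,
 * even if no code deliberately touches those addresses. */
void toy_speculative_fill(struct toy_mm *mm)
{
    for (size_t i = 0; i < TOY_NPAGES; i++)
        if (mm->pte_present[i])
            mm->tlb_cached[i] = true;
}

void toy_flush(struct toy_mm *mm)
{
    for (size_t i = 0; i < TOY_NPAGES; i++)
        mm->tlb_cached[i] = false;
    mm->flushes++;
}

/* True if the TLB caches a translation the page table no longer backs. */
bool toy_tlb_is_stale(const struct toy_mm *mm)
{
    for (size_t i = 0; i < TOY_NPAGES; i++)
        if (mm->tlb_cached[i] && (mm->pgtable_freed || !mm->pte_present[i]))
            return true;
    return false;
}

/* Unmap in batches, optionally flushing after each batch, then release the
 * page-table page. Returns true only if no stale translation survived. */
bool toy_teardown(struct toy_mm *mm, size_t batch, bool flush_each_batch)
{
    assert(batch > 0);
    for (size_t start = 0; start < TOY_NPAGES; start += batch) {
        toy_speculative_fill(mm); /* may happen before every batch */
        for (size_t i = start; i < start + batch && i < TOY_NPAGES; i++)
            mm->pte_present[i] = false;
        if (flush_each_batch)
            toy_flush(mm);
    }
    mm->pgtable_freed = true;
    return !toy_tlb_is_stale(mm);
}
```

In this model a single up-front `toy_flush()` followed by `toy_teardown(mm, 4, false)` still ends with stale translations, because the speculative fill repopulates the TLB before the entries are cleared. Flushing after each batch leaves nothing cached by the time the page-table page is released.
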
> I'm thinking that the first flush_tlb_mm() will end up calling
> leave_mm(), and the subsequent ones do nothing because the cpu_vm_mask
> is then empty.

The subsequent ones shouldn't need to do anything on _other_ CPUs,
because the other CPUs will have changed their active_mm to NULL, and no
longer use that VM at all. The unmapping process still uses the old VM in
the general case.

(The "do_exit()" case is special, and in that case we should not need to
do any of this at all, but on x86 doing different paths depending on the
"full" bit is unlikely to be worth it - it shouldn't be all that
noticeable. You could _try_, though).

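The cpu_vm_mask behaviour discussed above can be sketched the same way. Again this is a toy under stated assumptions, not the x86 implementation: `toy_mask_mm`, `toy_cpu` and `toy_flush_mm` are hypothetical names, and the model assumes that a CPU which only borrows the mm lazily drops out of the mask the first time it is asked to flush. The function returns how many remote CPUs a flush has to disturb.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NCPUS 4

/* Toy model, NOT kernel code. */
struct toy_mask_mm {
    unsigned int cpu_mask; /* bit n set: CPU n may hold TLB entries for this mm */
};

struct toy_cpu {
    bool lazy; /* true: CPU merely borrows the mm, no user task of it runs there */
};

/* Flush the mm from CPU `self`. Returns the number of remote CPUs that had
 * to be interrupted. A lazy remote CPU leaves the mm when interrupted, so
 * its bit is cleared and later flushes skip it. */
int toy_flush_mm(struct toy_mask_mm *mm, const struct toy_cpu *cpus, int self)
{
    int remote = 0;

    assert(self >= 0 && self < TOY_NCPUS);
    for (int cpu = 0; cpu < TOY_NCPUS; cpu++) {
        unsigned int bit = 1u << cpu;

        if (cpu == self || !(mm->cpu_mask & bit))
            continue;
        remote++;
        if (cpus[cpu].lazy)
            mm->cpu_mask &= ~bit; /* this CPU no longer uses the mm */
    }
    return remote;
}
```

With CPU 0 doing the unmapping and CPUs 1 to 3 lazy, the first call disturbs three remote CPUs and every later call disturbs none. The local flush on CPU 0 is deliberately left out of the count, since that one is still needed for each batch.
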
> Hmm, but the cpu which is actually doing the flush_tlb_mm() calls
> leave_mm() without considering cpu_vm_mask: won't we get repeated
> unnecessary load_cr3(swapper_pg_dir)s from that?

Yes, but see above: it's necessary for the non-full case, and I doubt it
matters much for the full case.

But nobody has done timings as far as I know.

		Linus