From: Andy Lutomirski <luto@amacapital.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>, X86 ML <x86@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Borislav Petkov <bp@alien8.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Brian Gerst <brgerst@gmail.com>
Subject: Re: [RFC 09/13] x86/mm: Disable interrupts when flushing the TLB using CR3
Date: Wed, 13 Jan 2016 15:35:46 -0800
Message-ID: <CALCETrVT7ePZPAySF45hhnhZ5cBKH0EvDGmxftHvUmZw2YxZjQ@mail.gmail.com>
In-Reply-To: <CA+55aFy=mNDvedPwSF01F-QHEsFdGu63qiGPvmp_Cnhb0CvG+A@mail.gmail.com>

On Fri, Jan 8, 2016 at 6:20 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Jan 8, 2016 at 4:18 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>
>>>  - on pcid setups, wouldn't invpcid_flush_single_context() be better?
>>
>> I played with that and it was slower.  I don't pretend that makes any sense.
>
> Ugh. I guess reading and writing cr3 has been optimized.
>
>>> And yes, that means that we'd require X86_FEATURE_INVPCID in order to
>>> use X86_FEATURE_PCID, but that seems fine.
>>
>> I have an SNB "Extreme" with PCID but not INVPCID, and there could be
>> a whole generation of servers like that.  I think we should fully
>> support them.
>
> Can you check the timings? IOW, is it a win on SNB?

~80ns gain on SNB.  It's actually quite impressive on SNB: it knocks
the penalty for mm switches down to 20ns or so, which I find to be
fairly amazing.  (This is at 3.8GHz or thereabouts.)
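
For scale, those times can be turned into rough cycle counts.  This is
only a sketch: it assumes a fixed 3.8GHz clock (the figure above is
approximate), and ns_to_cycles() is a hypothetical helper written for
this illustration, not anything in the kernel.

```c
/*
 * Hypothetical helper for illustration only: convert a duration in
 * nanoseconds to CPU cycles, given the clock frequency in MHz.
 * Integer arithmetic keeps the result exact for these inputs.
 */
static unsigned long ns_to_cycles(unsigned long ns, unsigned long freq_mhz)
{
	/* cycles = ns * (MHz / 1000), since 1 GHz is one cycle per ns */
	return ns * freq_mhz / 1000;
}
```

With the assumed 3800MHz clock, ns_to_cycles(80, 3800) is 304 and
ns_to_cycles(20, 3800) is 76, so the gain is on the order of 300
cycles per mm switch.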

>
> I think originally Intel only had two actual bits of process context
> ID in the TLB, and it was meant to be used for virtualization or
> something. Together with the hashing (to make it always appear as 12
> bits to software - a nice idea but also means that the hardware ends
> up invalidating more than software really expects), it may not work
> all that well.
>
> That _could_ explain why the original patch from intel didn't work.
>
>> We might be able to get away with just disabling preemption instead of
>> IRQs, at least if mm == active_mm.
>
> I'm not convinced it is all that much faster. Of course, it's nicer on
> non-preempt, but nobody seems to run things that way.

My current testing version now has three different code paths.  If
INVPCID and PCID are both available, it uses INVPCID.  If PCID is
available but INVPCID is not, it disables interrupts with
raw_local_irq_save() around the CR3 read/write.  If PCID is not
available, it just does the CR3 read/write.
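
The selection logic can be sketched in plain userspace C.  This is not
the code from the patch series: the enum, its values and
choose_flush_path() are hypothetical names invented for the
illustration, and the two booleans stand in for the X86_FEATURE_PCID
and X86_FEATURE_INVPCID checks.

```c
#include <stdbool.h>

/* Hypothetical labels for the three paths described above. */
enum flush_path {
	FLUSH_INVPCID,		/* PCID and INVPCID: flush with INVPCID */
	FLUSH_CR3_IRQS_OFF,	/* PCID only: CR3 read/write, interrupts off */
	FLUSH_CR3_PLAIN,	/* no PCID: plain CR3 read/write */
};

/*
 * Pick a flush path from the two CPU feature bits.  INVPCID is only
 * consulted when PCID is present; without PCID the plain CR3
 * read/write is used regardless.
 */
static enum flush_path choose_flush_path(bool has_pcid, bool has_invpcid)
{
	if (has_pcid && has_invpcid)
		return FLUSH_INVPCID;
	if (has_pcid)
		return FLUSH_CR3_IRQS_OFF;
	return FLUSH_CR3_PLAIN;
}
```

An SNB box with PCID but no INVPCID would land on
FLUSH_CR3_IRQS_OFF in this sketch, i.e. choose_flush_path(true, false).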

Yeah, it's ugly, and it's a big blob of code to do something trivial,
but it seems to work and it should be the right thing to do in most
cases.

Can anyone here ask a hardware or microcode person what's going on
with CR3 writes possibly being faster than INVPCID?  Is there some
trick to it?

--Andy
