From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf1-f199.google.com (mail-pf1-f199.google.com [209.85.210.199])
    by kanga.kvack.org (Postfix) with ESMTP id 8AFF36B3BF6
    for ; Sun, 26 Aug 2018 13:25:15 -0400 (EDT)
Received: by mail-pf1-f199.google.com with SMTP id a23-v6so9898036pfo.23
    for ; Sun, 26 Aug 2018 10:25:15 -0700 (PDT)
Received: from mail-sor-f65.google.com (mail-sor-f65.google.com. [209.85.220.65])
    by mx.google.com with SMTPS id d29-v6sor3874149pfj.97.2018.08.26.10.25.14
    for (Google Transport Security); Sun, 26 Aug 2018 10:25:14 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (1.0)
Subject: Re: TLB flushes on fixmap changes
From: Andy Lutomirski
In-Reply-To:
Date: Sun, 26 Aug 2018 10:25:11 -0700
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <20180822153012.173508681@infradead.org>
    <20180822154046.823850812@infradead.org>
    <20180822155527.GF24124@hirez.programming.kicks-ass.net>
    <20180823134525.5f12b0d3@roar.ozlabs.ibm.com>
    <776104d4c8e4fc680004d69e3a4c2594b638b6d1.camel@au1.ibm.com>
    <20180823133958.GA1496@brain-police>
    <20180824084717.GK24124@hirez.programming.kicks-ass.net>
    <20180824180438.GS24124@hirez.programming.kicks-ass.net>
    <56A9902F-44BE-4520-A17C-26650FCC3A11@gmail.com>
    <9A38D3F4-2F75-401D-8B4D-83A844C9061B@gmail.com>
    <8E0D8C66-6F21-4890-8984-B6B3082D4CC5@gmail.com>
    <20180826112341.f77a528763e297cbc36058fa@kernel.org>
    <952A64F0-90B3-4E2F-B410-7E20BE90D617@amacapital.net>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Kees Cook
Cc: Andy Lutomirski, Masami Hiramatsu, Nadav Amit, Linus Torvalds,
    Paolo Bonzini, Jiri Kosina, Peter Zijlstra, Will Deacon,
    Benjamin Herrenschmidt, Nick Piggin, the arch/x86 maintainers,
    Borislav Petkov, Rik van Riel, Jann Horn, Adin Scannell, Dave Hansen,
    Linux Kernel Mailing List, linux-mm, David Miller, Martin Schwidefsky,
    Michael Ellerman

> On Aug 26, 2018, at 9:47 AM, Kees Cook wrote:
>
>> On Sun, Aug 26, 2018 at 7:20 AM, Andy
Lutomirski wrote:
>>
>>
>>>> On Aug 25, 2018, at 9:43 PM, Kees Cook wrote:
>>>>
>>>>> On Sat, Aug 25, 2018 at 9:21 PM, Andy Lutomirski wrote:
>>>>> On Sat, Aug 25, 2018 at 7:23 PM, Masami Hiramatsu wrote:
>>>>> On Fri, 24 Aug 2018 21:23:26 -0700
>>>>> Andy Lutomirski wrote:
>>>>>> Couldn't text_poke() use kmap_atomic()? Or, even better, just change CR3?
>>>>>
>>>>> No, since kmap_atomic() is only for x86_32 and highmem support kernel.
>>>>> In x86-64, it seems that returns just a page address. That is not
>>>>> good for text_poke, since it needs to make a writable alias for RO
>>>>> code page. Hmm, maybe, can we mimic copy_oldmem_page(), it uses ioremap_cache?
>>>>>
>>>>
>>>> I just re-read text_poke(). It's, um, horrible. Not only is the
>>>> implementation overcomplicated and probably buggy, but it's SLOOOOOW.
>>>> It's totally the wrong API -- poking one instruction at a time
>>>> basically can't be efficient on x86. The API should either poke lots
>>>> of instructions at once or should be text_poke_begin(); ...;
>>>> text_poke_end();.
>>>>
>>>> Anyway, the attached patch seems to boot. Linus, Kees, etc: is this
>>>> too scary of an approach? With the patch applied, text_poke() is a
>>>> fantastic exploit target. On the other hand, even without the patch
>>>> applied, text_poke() is every bit as juicy.
>>>
>>> I tried to convince Ingo to use this method for doing "write rarely"
>>> and he soundly rejected it. :) I've always liked this because AFAICT,
>>> it's local to the CPU. I had proposed it in
>>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/write-rarely&id=9ab0cb2618ebbc51f830ceaa06b7d2182fe1a52d
>>
>> Ingo, can you clarify why you hate it? I personally would rather use CR3, but CR0 seems like a fine first step, at least for text_poke.
>
> Sorry, it looks like it was tglx, not Ingo:
>
> https://lkml.kernel.org/r/alpine.DEB.2.20.1704071048360.1716@nanos
>
> This thread is long, and one thing that I think went unanswered was
> "why do we want this to be fast?" the answer is: for doing page table
> updates. Page tables are becoming a bigger target for attacks now, and
> it'd be nice if they could stay read-only unless they're getting
> updated (with something like this).
>
>

It kind of sounds like tglx would prefer the CR3 approach. And indeed my patch has a serious problem wrt the NMI code.
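
For illustration only, here is a rough user-space sketch of the "text_poke_begin(); ...; text_poke_end();" shape mentioned in the quoted text above. It is not kernel code and not the patch under discussion. It uses mprotect() on an anonymous page as a stand-in for whatever opens the writable window in the kernel (CR0.WP or a temporary CR3 mapping), and it only shows why batching helps: the protection is toggled twice per batch instead of twice per poke. Only the begin/end names come from the thread; struct poke_ctx, struct poke, poke_ctx_init() and text_poke_batch() are hypothetical names made up for this sketch.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for kernel text: one page that is normally read-only. */
struct poke_ctx {
	unsigned char *text;	/* start of the protected page */
	size_t len;		/* size of the page in bytes */
	unsigned int toggles;	/* number of protection changes so far */
};

/* One pending modification (hypothetical layout). */
struct poke {
	size_t off;		/* offset into ctx->text */
	const void *bytes;	/* replacement bytes */
	size_t len;		/* number of bytes to write */
};

/* Map one page, fill it with 0x90 (x86 NOP), then make it read-only. */
static int poke_ctx_init(struct poke_ctx *ctx)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	if (page <= 0)
		return -1;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	ctx->text = p;
	ctx->len = (size_t)page;
	ctx->toggles = 0;
	memset(ctx->text, 0x90, ctx->len);
	return mprotect(ctx->text, ctx->len, PROT_READ);
}

/* Open the writable window once. */
static int text_poke_begin(struct poke_ctx *ctx)
{
	ctx->toggles++;
	return mprotect(ctx->text, ctx->len, PROT_READ | PROT_WRITE);
}

/* Close the writable window once. */
static int text_poke_end(struct poke_ctx *ctx)
{
	ctx->toggles++;
	return mprotect(ctx->text, ctx->len, PROT_READ);
}

/* Apply n pokes inside a single writable window; returns 0 on success. */
static int text_poke_batch(struct poke_ctx *ctx, const struct poke *p, size_t n)
{
	size_t i;

	/* Validate everything first so a bad entry never leaves the page writable. */
	for (i = 0; i < n; i++)
		if (p[i].off > ctx->len || p[i].len > ctx->len - p[i].off)
			return -1;

	if (text_poke_begin(ctx))
		return -1;
	for (i = 0; i < n; i++)
		memcpy(ctx->text + p[i].off, p[i].bytes, p[i].len);
	return text_poke_end(ctx);
}
```

A caller would set up a struct poke_ctx with poke_ctx_init(), fill an array of struct poke, and hand it to text_poke_batch(); the toggles counter then shows two protection changes regardless of how many entries were in the batch. None of the hard kernel-side problems (other CPUs, NMIs, TLB state) are modeled here.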