From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f69.google.com (mail-oi0-f69.google.com [209.85.218.69]) by kanga.kvack.org (Postfix) with ESMTP id CB2726B32C7 for ; Fri, 24 Aug 2018 22:16:30 -0400 (EDT) Received: by mail-oi0-f69.google.com with SMTP id j17-v6so9138078oii.8 for ; Fri, 24 Aug 2018 19:16:30 -0700 (PDT) Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41]) by mx.google.com with SMTPS id w132-v6sor3053188oiw.3.2018.08.24.19.16.29 for (Google Transport Security); Fri, 24 Aug 2018 19:16:29 -0700 (PDT) MIME-Version: 1.0 References: <20180822153012.173508681@infradead.org> <20180822154046.823850812@infradead.org> <20180822155527.GF24124@hirez.programming.kicks-ass.net> <20180823134525.5f12b0d3@roar.ozlabs.ibm.com> <776104d4c8e4fc680004d69e3a4c2594b638b6d1.camel@au1.ibm.com> <20180823133958.GA1496@brain-police> <20180824084717.GK24124@hirez.programming.kicks-ass.net> <20180824180438.GS24124@hirez.programming.kicks-ass.net> <56A9902F-44BE-4520-A17C-26650FCC3A11@gmail.com> <9A38D3F4-2F75-401D-8B4D-83A844C9061B@gmail.com> In-Reply-To: From: Nadav Amit Date: Fri, 24 Aug 2018 19:16:10 -0700 Message-ID: Subject: Re: TLB flushes on fixmap changes Content-Type: multipart/alternative; boundary="00000000000075c9d605743914a2" Sender: owner-linux-mm@kvack.org List-ID: To: Linus Torvalds Cc: Paolo Bonzini , Jiri Kosina , Masami Hiramatsu , Peter Zijlstra , Will Deacon , Benjamin Herrenschmidt , Nick Piggin , Andrew Lutomirski , the arch/x86 maintainers , Borislav Petkov , Rik van Riel , Jann Horn , Adin Scannell , Dave Hansen , Linux Kernel Mailing List , linux-mm , David Miller , Martin Schwidefsky , Michael Ellerman --00000000000075c9d605743914a2 Content-Type: text/plain; charset="UTF-8" On Fri, Aug 24, 2018, 5:58 PM Linus Torvalds wrote: > Adding a few people to the cc. > > On Fri, Aug 24, 2018 at 1:24 PM Nadav Amit wrote: > > > > > > Can you actually find something that changes the fixmaps after boot > > > (again, ignoring kmap)? > > > > At least the alternatives mechanism appears to do so. > > > > IIUC the following path is possible when adding a module: > > > > jump_label_add_module() > > ->__jump_label_update() > > ->arch_jump_label_transform() > > ->__jump_label_transform() > > ->text_poke_bp() > > ->text_poke() > > ->set_fixmap() > > Yeah, that looks a bit iffy. > > But making the tlb flush global wouldn't help. This is running on a > local core, and if there are other CPU's that can do this at the same > time, then they'd just fight about the same mapping. > > Honestly, I think it's ok just because I *hope* this is all serialized > anyway (jump_label_lock? But what about other users of text_poke?). > The users should hold text_mutex. > But I'd be a lot happier about it if it either used an explicit lock > to make sure, or used per-cpu fixmap entries. > My concern is that despite the lock, one core would do a speculative page walk and cache a translation that soon after would become stale. > And the tlb flush is done *after* the address is used, which is bogus > anyway. > It seems to me that it is intended to remove the mapping that might be a security issue. But anyhow, set_fixmap and clear_fixmap perform a local TLB flush, (in __set_pte_vaddr()) so locally things should be fine. > > > And a similar path can happen when static_key_enable/disable() is called. > > Same comments. > > How about replacing that > > local_irq_save(flags); > ... do critical things here ... > local_irq_restore(flags); > > in text_poke() with > > static DEFINE_SPINLOCK(poke_lock); > > spin_lock_irqsave(&poke_lock, flags); > ... do critical things here ... > spin_unlock_irqrestore(&poke_lock, flags); > > and moving the local_flush_tlb() to after the set_fixmaps, but before > the access through the virtual address. > > But changing things to do a global tlb flush would just be wrong. As I noted, I think that locking and local flushes as they are right now are fine (besides the redundant flush). My concern is merely that speculative page walks on other cores would cache stale entries. --00000000000075c9d605743914a2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


= On Fri, Aug 24, 2018, 5:58 PM Linus Torvalds <torvalds@linux-= foundation.org> wrote:
Addin= g a few people to the cc.

On Fri, Aug 24, 2018 at 1:24 PM Nadav Amit <nadav.amit@gmai= l.com> wrote:
> >
> > Can you actually find something that changes the fixmaps after bo= ot
> > (again, ignoring kmap)?
>
> At least the alternatives mechanism appears to do so.
>
> IIUC the following path is possible when adding a module:
>
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0jump_label_add_module()
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->__jump_label_update()
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->arch_jump_label_transform()
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->__jump_label_transform()
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->text_poke_bp()
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->text_poke()
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->set_fixmap()

Yeah, that looks a bit iffy.

But making the tlb flush global wouldn't help.=C2=A0 This is running on= a
local core, and if there are other CPU's that can do this at the same time, then they'd just fight about the same mapping.

Honestly, I think it's ok just because I *hope* this is all serialized<= br> anyway (jump_label_lock? But what about other users of text_poke?).

The user= s should hold text_mutex.



But I'd be a lot happier about it if it either used an explicit lock to make sure, or used per-cpu fixmap entries.
<= div dir=3D"auto">
My concern is that despite the= lock, one core would do a speculative page walk and cache a translation th= at soon after would become stale.



And the tlb flush is done *after* the address is used, which is bogus anywa= y.

It seems to me that it is intended to remove the mapping that might be a = security issue.=C2=A0

Bu= t anyhow, set_fixmap and clear_fixmap perform a local TLB flush, (in __set_= pte_vaddr()) so locally things should be fine.

<= /div>



> And a similar path can happen when static_key_enable/disable() is call= ed.

Same comments.

How about replacing that

=C2=A0 =C2=A0 =C2=A0 =C2=A0 local_irq_save(flags);
=C2=A0 =C2=A0 =C2=A0 =C2=A0... do critical things here ...
=C2=A0 =C2=A0 =C2=A0 =C2=A0 local_irq_restore(flags);

in text_poke() with

=C2=A0 =C2=A0 =C2=A0 =C2=A0 static DEFINE_SPINLOCK(poke_lock);

=C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_lock_irqsave(&poke_lock, flags);
=C2=A0 =C2=A0 =C2=A0 =C2=A0... do critical things here ...
=C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_unlock_irqrestore(&poke_lock, flags);<= br>
and moving the local_flush_tlb() to after the set_fixmaps, but before
the access through the virtual address.

But changing things to do a global tlb flush would just be wrong.


As I noted, I think that locking and local flushes as they ar= e right now are fine (besides the redundant flush).
= My concern is merely that speculative page walks on other cores would cache= stale entries.
--00000000000075c9d605743914a2--