linux-mm.kvack.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org, patches@lists.linux.dev,
	 tglx@linutronix.de, linux-crypto@vger.kernel.org,
	linux-api@vger.kernel.org,  x86@kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	 Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
	"Carlos O'Donell" <carlos@redhat.com>,
	 Florian Weimer <fweimer@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>, Jann Horn <jannh@google.com>,
	 Christian Brauner <brauner@kernel.org>,
	David Hildenbrand <dhildenb@redhat.com>,
	linux-mm@kvack.org
Subject: Re: [PATCH v22 1/4] mm: add MAP_DROPPABLE for designating always lazily freeable mappings
Date: Thu, 11 Jul 2024 10:57:17 -0700
Message-ID: <CAHk-=whGE_w46zVk=7S0zOcWv4Dp3EYtuJtzU92ab3pSnnmpHw@mail.gmail.com>
In-Reply-To: <ZpAR0CgLc28gEkV3@zx2c4.com>

On Thu, 11 Jul 2024 at 10:09, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> When I was working on this patchset this year with the syscall, this is
> similar somewhat to the initial approach I was taking with setting up a
> special mapping. It turned into kind of a mess and I couldn't get it
> working. There's a lot of functionality built around anonymous pages
> that would need to be duplicated (I think?).

Yeah, I was kind of assuming that. You'd need to handle VM_DROPPABLE
in the fault path specially, the way we currently split up based on
vma_is_anonymous(), eg

        if (vma_is_anonymous(vmf->vma))
                return do_anonymous_page(vmf);
        else
                return do_fault(vmf);

in do_pte_missing() etc.
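
For illustration only, the extra branch described above might look roughly like the following userspace sketch. The struct layouts, the VM_DROPPABLE bit value, the fault_path enum and pick_fault_path() are all stand-ins invented for this example; none of this is the kernel's actual code, and the thread does not propose this exact shape.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; NOT the real definitions. */
#define VM_DROPPABLE 0x1UL      /* bit value is an assumption for this sketch */

struct vm_area_struct {
        unsigned long vm_flags;
        const void *vm_ops;     /* NULL for anonymous mappings */
};

struct vm_fault {
        struct vm_area_struct *vma;
};

/* Hypothetical labels for the handler that would be chosen. */
enum fault_path { FAULT_ANON, FAULT_FILE, FAULT_DROPPABLE };

/* Mirrors the kernel's idea that an anonymous vma has no vm_ops. */
static bool vma_is_anonymous(const struct vm_area_struct *vma)
{
        return vma->vm_ops == NULL;
}

/*
 * Hypothetical helper: which path a do_pte_missing()-style dispatcher
 * would take if droppable mappings were a special (non-anonymous) kind.
 * The droppable check has to come first, because in this sketch such a
 * mapping may otherwise look anonymous.
 */
static enum fault_path pick_fault_path(const struct vm_fault *vmf)
{
        if (vmf->vma->vm_flags & VM_DROPPABLE)
                return FAULT_DROPPABLE;
        if (vma_is_anonymous(vmf->vma))
                return FAULT_ANON;
        return FAULT_FILE;
}
```

The point of the sketch is only that a third branch would sit next to the existing anonymous/file split; everything a real droppable handler would have to duplicate from the anonymous path is left out.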

I don't actually think it would be too hard, but it's a more
"conceptual" change, and it's probably not worth it.

> Alright, an hour later of fiddling, and it doesn't actually work (yet?)
> -- the selftest fails. A diff follows below.

May I suggest a slightly different approach: do what we did for "pte_mkwrite()".

It needed the vma too, for not too dissimilar reasons: special dirty
bit handling for the shadow stack. See

  bb3aadf7d446 ("x86/mm: Start actually marking _PAGE_SAVED_DIRTY")
  b497e52ddb2a ("x86/mm: Teach pte_mkwrite() about stack memory")

and now we have "pte_mkwrite_novma()" with the old semantics for the
legacy cases that didn't get converted - whether it's because the
architecture doesn't have the issue, or because it's a kernel pte.

And the conversion was actually quite pain-free, because we have

  #ifndef pte_mkwrite
  static inline pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
  {
        return pte_mkwrite_novma(pte);
  }
  #endif

so all any architecture that didn't want this needed to do was to
rename their pte_mkwrite() to pte_mkwrite_novma() and they were done.
In fact, that was done first as basically semantically no-op patches:

   2f0584f3f4bd ("mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()")
   6ecc21bb432d ("mm: Move pte/pmd_mkwrite() callers with no VMA to _novma()")
   161e393c0f63 ("mm: Make pte_mkwrite() take a VMA")

which made this all very pain-free (and was largely a sed script, I think).
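
As a rough, self-contained illustration of why that conversion was cheap, here is a userspace sketch of the fallback pattern. The pte_t and vm_area_struct layouts and the PTE_WRITE bit are made up for the example; only the shape of the #ifndef fallback follows the snippet quoted above.

```c
/* Userspace stand-ins; NOT the kernel's real types or bit layout. */
typedef struct { unsigned long val; } pte_t;

struct vm_area_struct {
        unsigned long vm_flags;
};

#define PTE_WRITE 0x2UL         /* bit position is an assumption */

/* The old one-argument helper, kept under the renamed _novma name. */
static inline pte_t pte_mkwrite_novma(pte_t pte)
{
        pte.val |= PTE_WRITE;
        return pte;
}

/*
 * Generic fallback. An architecture that cares about the vma would
 * provide its own two-argument version and add
 *     #define pte_mkwrite pte_mkwrite
 * before this point, so that the block below is skipped.
 */
#ifndef pte_mkwrite
static inline pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
{
        (void)vma;              /* unused when the arch has no special case */
        return pte_mkwrite_novma(pte);
}
#endif
```

With this layout, an architecture that has nothing vma-specific to do only needs the rename; callers that really have no vma call the _novma variant directly.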

> -                   !pte_dirty(pte) && !PageDirty(page))
> +                   !pte_dirty(pte) && !PageDirty(page) &&
> +                   !(vma->vm_flags & VM_DROPPABLE))

So instead of this kind of thing, we'd have

> -                   !pte_dirty(pte) && !PageDirty(page))
> +                   !pte_dirty(pte, vma) && !PageDirty(page))

and the advantage here is that you can't miss anybody by mistake. The
compiler will be very unhappy if you don't pass in the vma, and any
places that have no vma to pass would be converted to "pte_dirty_novma()".

We don't actually have all that many users of pte_dirty(), so it
doesn't look too nasty. And if we make the pte_dirty() semantics
depend on the vma, I really think we should do it the same way we did
pte_mkwrite().
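
A minimal userspace sketch of what that could look like follows. To be clear about the assumptions: pte_dirty_novma() does not exist today, the bit values and struct layouts are invented, and the thread does not settle what the vma-aware version should return for a droppable mapping. This sketch simply assumes such mappings never report dirty.

```c
#include <stdbool.h>

/* Userspace stand-ins; NOT the kernel's real types or bit layout. */
typedef struct { unsigned long val; } pte_t;

struct vm_area_struct {
        unsigned long vm_flags;
};

#define PTE_DIRTY    0x40UL     /* assumption for this sketch */
#define VM_DROPPABLE 0x1UL      /* assumption for this sketch */

/* Hypothetical: the current one-argument semantics under a new name. */
static inline bool pte_dirty_novma(pte_t pte)
{
        return (pte.val & PTE_DIRTY) != 0;
}

/*
 * Hypothetical vma-aware variant. ASSUMED semantics: a droppable
 * mapping never reports dirty, so callers do not need their own
 * VM_DROPPABLE checks.
 */
#ifndef pte_dirty
static inline bool pte_dirty(pte_t pte, const struct vm_area_struct *vma)
{
        if (vma->vm_flags & VM_DROPPABLE)
                return false;
        return pte_dirty_novma(pte);
}
#endif
```

Any caller still written as pte_dirty(pte) fails to compile against this signature, which is the "can't miss anybody" property: each call site has to either pass a vma or switch explicitly to the _novma form.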

Long-term, maybe we should just aim to always pass in the vma to the
pte_xyz() functions, but...

          Linus

