linux-mm.kvack.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	 linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	 liubo <liubo254@huawei.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Hugh Dickins <hughd@google.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	John Hubbard <jhubbard@nvidia.com>,  Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH v1 0/4] smaps / mm/gup: fix gup_can_follow_protnone fallout
Date: Mon, 31 Jul 2023 11:23:59 -0700
Message-ID: <CAHk-=wiREarX5MQx9AppxPzV6jXCCQRs5KVKgHoGYwATRL6nPg@mail.gmail.com>
In-Reply-To: <a3349cdb-f76f-eb87-4629-9ccba9f435a1@redhat.com>

On Mon, 31 Jul 2023 at 09:20, David Hildenbrand <david@redhat.com> wrote:
>
> I modified it slightly: FOLL_HONOR_NUMA_FAULT is now set in
> is_valid_gup_args(), such that it will always be set for any GUP users,
> including GUP-fast.

But do we actually want that? It is actively crazy to honor NUMA
faulting at least for get_user_pages_remote().

So right now, GUP-fast requires us to honor NUMA faults, because
GUP-fast doesn't have a vma (which in turn is because GUP-fast doesn't
take any locks).

So GUP-fast can only look at the page table data, and as such *has* to
fail if the page table is inaccessible.

But GUP in general? Why would it want to honor NUMA faulting?
Particularly by default, and _particularly_ for things like
FOLL_REMOTE.

In fact, I feel like this is what the real rule should be: we simply
define that get_user_pages_fast() is about looking up the page in the
page tables.

So if you want something that acts like a page table lookup, you use
that "fast" thing.  It's literally how it is designed. The whole - and
pretty much only - point of it is that it can be used with no locking
at all, because it basically acts like the hardware lookup does.

So then if KVM wants to look up a page in the page table, that is what
KVM should use, and it automatically gets the "honor NUMA faults"
behavior, not because it sets a magic flag, but simply because that is
how GUP-fast *works*.

But if you use the "normal" get/pin_user_pages() function, which looks
up the vma, at that point you are following things at a "software
level", and it wouldn't do NUMA faulting, it would just get the page.

(Ok, we have the whole "FAST_ONLY vs fall back" case, so "fast" can
look up the vma too, but read the above argument as "fast *can* be
done without vma, so fast must honor page table bits as per
hardware").
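
To make the distinction above concrete, here is a minimal user-space
sketch of the two lookup styles. It is a toy model and not kernel code:
struct toy_pte, struct toy_vma, toy_gup_fast() and toy_gup_normal() are
all hypothetical names, and it assumes that vma permission checks, the
fault path, and the FAST_ONLY fallback can be left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code; every name is hypothetical. */
struct toy_pte {
	bool present;   /* a page is mapped at this address */
	bool protnone;  /* access bits cleared, e.g. for NUMA hinting */
};

struct toy_vma {
	unsigned long start, end;  /* address range the mapping covers */
};

/*
 * "Fast" lookup: no locks and therefore no vma. Like the hardware
 * walker, it can only trust the PTE, so an inaccessible PTE fails.
 */
static bool toy_gup_fast(const struct toy_pte *pte)
{
	return pte->present && !pte->protnone;
}

/*
 * "Normal" lookup: the vma is known, so this is a software-level
 * lookup that ignores the NUMA hint and just takes the page.
 * (Assumption: vma permission checks are elided.)
 */
static bool toy_gup_normal(const struct toy_vma *vma, unsigned long addr,
			   const struct toy_pte *pte)
{
	if (!vma || addr < vma->start || addr >= vma->end)
		return false;
	return pte->present;
}

/* Exercises both lookups on a NUMA-hinted PTE and on a plain PTE. */
static void toy_self_check(void)
{
	const struct toy_vma vma = { 0x1000, 0x2000 };
	const struct toy_pte hinted = { .present = true, .protnone = true };
	const struct toy_pte plain = { .present = true, .protnone = false };

	assert(!toy_gup_fast(&hinted));                 /* honors the hint */
	assert(toy_gup_normal(&vma, 0x1000, &hinted));  /* just gets the page */
	assert(toy_gup_fast(&plain));
	assert(toy_gup_normal(&vma, 0x1000, &plain));
	assert(!toy_gup_normal(&vma, 0x2000, &plain));  /* outside the vma */
}
```

In this model the only case where the two lookups disagree is the
present-but-protnone PTE, which is the case the argument is about: the
fast lookup fails without needing a flag, and the normal lookup
succeeds because it has the vma.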

              Linus

