linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Yang Shi <yang@os.amperecomputing.com>
Cc: yangge1116@126.com, david@redhat.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [v2 PATCH] mm: gup: do not call try_grab_folio() in slow path
Date: Thu, 27 Jun 2024 19:19:40 -0400	[thread overview]
Message-ID: <Zn3zjKnKIZjCXGrU@x1n> (raw)
In-Reply-To: <20240627221413.671680-1-yang@os.amperecomputing.com>

Yang,

On Thu, Jun 27, 2024 at 03:14:13PM -0700, Yang Shi wrote:
> The try_grab_folio() is supposed to be used in fast path and it elevates
> folio refcount by using add ref unless zero.  We are guaranteed to have
> at least one stable reference in slow path, so the simple atomic add
> could be used.  The performance difference should be trivial, but the
> misuse may be confusing and misleading.

This first paragraph is IMHO misleading itself..

I think we should mention upfront the important bit, on the user impact.

Here IMO the user impact should be: Linux may fail longterm pin in some
releavnt paths when applied over CMA reserved blocks.  And if to extend a
bit, that include not only slow-gup but also the new memfd pinning, because
both of them used try_grab_folio() which used to be only for fast-gup.

It's great this patch renamed try_grab_folio() to try_grab_folio_fast(), I
think that definitely helps on reducing the abuse in the future.  However
then with that the subject becomes misleading, because it says "do not call
try_grab_folio()" however after this patch we keep using it.

Maybe rename the subject to "mm: Fix longterm pin on slow gup and memfd pin
regress"?

> 
> In another thread [1] a kernel warning was reported when pinning folio
> in CMA memory when launching SEV virtual machine.  The splat looks like:
> 
> [  464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
> [  464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
> [  464.325477] RIP: 0010:__get_user_pages+0x423/0x520
> [  464.325515] Call Trace:
> [  464.325520]  <TASK>
> [  464.325523]  ? __get_user_pages+0x423/0x520
> [  464.325528]  ? __warn+0x81/0x130
> [  464.325536]  ? __get_user_pages+0x423/0x520
> [  464.325541]  ? report_bug+0x171/0x1a0
> [  464.325549]  ? handle_bug+0x3c/0x70
> [  464.325554]  ? exc_invalid_op+0x17/0x70
> [  464.325558]  ? asm_exc_invalid_op+0x1a/0x20
> [  464.325567]  ? __get_user_pages+0x423/0x520
> [  464.325575]  __gup_longterm_locked+0x212/0x7a0
> [  464.325583]  internal_get_user_pages_fast+0xfb/0x190
> [  464.325590]  pin_user_pages_fast+0x47/0x60
> [  464.325598]  sev_pin_memory+0xca/0x170 [kvm_amd]
> [  464.325616]  sev_mem_enc_register_region+0x81/0x130 [kvm_amd]
> 
> Per the analysis done by yangge, when starting the SEV virtual machine,
> it will call pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the
> memory.  But the page is in CMA area, so fast GUP will fail then
> fallback to the slow path due to the longterm pinnalbe check in
> try_grab_folio().
> The slow path will try to pin the pages then migrate them out of CMA
> area.  But the slow path also uses try_grab_folio() to pin the page,
> it will also fail due to the same check then the above warning
> is triggered.
> 
> [1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge1116@126.com/
> 
> Fixes: 57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages != NULL"")
> Cc: <stable@vger.kernel.org> [6.6+]
> Reported-by: yangge <yangge1116@126.com>
> Signed-off-by: Yang Shi <yang@os.amperecomputing.com>

The patch itself looks mostly ok to me.

There's still some "cleanup" part mangled together, e.g., the real meat
should be avoiding the folio_is_longterm_pinnable() check in relevant
paths.  The rest (e.g. switch slow-gup / memfd pin to use folio_ref_add()
not try_get_folio(), and renames) could be good cleanups.

So a smaller fix might be doable, but again I don't have a strong opinion
here.

Thanks,

-- 
Peter Xu



  parent reply	other threads:[~2024-06-27 23:19 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-27 22:14 Yang Shi
2024-06-27 22:54 ` Andrew Morton
2024-06-27 23:02   ` Yang Shi
2024-06-27 23:19 ` Peter Xu [this message]
2024-06-27 23:32   ` Andrew Morton
2024-06-27 23:39     ` Yang Shi
2024-06-27 23:43     ` Peter Xu
2024-06-28  1:25       ` Ge Yang
2024-06-28  6:09       ` Ge Yang
2024-06-28  6:35 ` Christoph Hellwig
2024-06-28 18:27   ` Yang Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zn3zjKnKIZjCXGrU@x1n \
    --to=peterx@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=stable@vger.kernel.org \
    --cc=yang@os.amperecomputing.com \
    --cc=yangge1116@126.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox