From: Hyesoo Yu <hyesoo.yu@samsung.com>
To: Zhaoyang Huang <huangzhaoyang@gmail.com>
Cc: jaewon31.kim@samsung.com, David Hildenbrand <david@redhat.com>,
John Hubbard <jhubbard@nvidia.com>,
"zhaoyang.huang@unisoc.com" <zhaoyang.huang@unisoc.com>,
"surenb@google.com" <surenb@google.com>,
"Steve.Kang@unisoc.com" <Steve.Kang@unisoc.com>,
Jaewon Kim <jaewon31.kim@gmail.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Jang-Hyuck Kim <janghyuck.kim@samsung.com>
Subject: Re: reply: [RFC] pin_user_pages_fast failure count increased
Date: Wed, 28 May 2025 10:23:29 +0900 [thread overview]
Message-ID: <20250528012329.GA1545287@tiffany> (raw)
In-Reply-To: <CAGWkznHPR3qL1Y+EJbp_4hU9LYTVKL5vsjMHHmpY4SM365=D_A@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5361 bytes --]
On Mon, May 26, 2025 at 07:49:57PM +0800, Zhaoyang Huang wrote:
> On Mon, May 26, 2025 at 7:17 PM Jaewon Kim <jaewon31.kim@samsung.com> wrote:
> >
> > >On 26.05.25 11:33, Hyesoo Yu wrote:
> > >> On Mon, May 26, 2025 at 04:05:16PM +0800, Zhaoyang Huang wrote:
> > >>> On Mon, May 26, 2025 at 3:50?PM Hyesoo Yu <hyesoo.yu@samsung.com> wrote:
> > >>>>
> > >>>> On Thu, May 22, 2025 at 07:52:41PM -0700, John Hubbard wrote:
> > >>>>> On 5/22/25 7:37 PM, 김재원 wrote:
> > >>>>> ...
> > >>>>>> I think this is what you meant, please let me know if you have an idea to make this nicer.
> > >>>>>> We may be to able to prepare the patch next week.
> > >>>>>>
> > >>>>>> static long
> > >>>>>> check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
> > >>>>>> {
> > >>>>>> + bool any_unpinnable;
> > >>>>>> LIST_HEAD(movable_folio_list);
> > >>>>>>
> > >>>>>> - collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
> > >>>>>> - if (list_empty(&movable_folio_list))
> > >>>>>> - return 0;
> > >>>>>> + any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
> > >>>>>> + if (list_empty(&movable_folio_list)) {
> > >>>>>> + if (any_unpinnable)
> > >>>>>> + pofs_unpin(pofs);
> > >>>>>
> > >>>>> I think this is correct, although as I mentioned in the other thread,
> > >>>>> that implies that commit 1aaf8c122918 (which didn't add nor remove
> > >>>>> any pof unpinning) is probably not the true or only culprit, right?
> > >>>>>
> > >>>>>> + return any_unpinnable ? -EAGAIN : 0;
> > >>>>>
> > >>>>> Ha, the "?" operator almost always does more harm than good.
> > >>>>>
> > >>>>> Here, for example, it has obscured from you the fact that any_unpinnable
> > >>>>> is being checked twice, when you could have merged those into a single "if".
> > >>>>>
> > >>>>
> > >>>> Hello,
> > >>>>
> > >>>> I was wondering if the original problem - an infinite loop when pages allocated by
> > >>>> cma_alloc() in vm_ops->fault are passed to GUP - still remains unresolved.
> > >>>> (To be honest, I'm not quite sure how such pages end up being pinned via GUP.
> > >>>> Is that the expected behavior, or could it possibly indicate a bug ?)
> > >>> The original problem arises from applying CMA as guestOS's memory
> > >>> slots for kvm which use GUP to setup its 2nd stage mapping(HVA->PFN).
> > >>> You can check KVM code if you are interested.
> > >>>
> > >>
> > >> Thanks for the kind explanation. While I'm not deeply familiar with KVM, my understanding
> > >> is that there are cases where GUP is used on CMA.
> > >>
> > >> So does that mean pinning memory from the CMA was actually intended to succeed ?
> > >
> > >Careful: KVM uses ordinary GUP, not GUP-longterm.
> >
> > Hi. David and Zhaoyang
> >
> > If possible, could you kindly explain the situation where the 1aaf8c122918 was addeded?
> > If KVM does not user FOLL_LONGTERM, then why the function,
> > collect_longterm_unpinnable_folios, was changed at that time?
> >
> > First of all, I'm not a KVM expert. After reading Zhaoyang's mail,
> > I thought CMA free page was initially allocated then migrated by FOLL_LONGTERM,
> > during the get_user_page for KVM's guest OS. If KVM does not use FOLL_LONGTERM,
> > I am confused.
> >
> > Actually I did not understand the infinite loop situation. I thought few times of -EAGAIN
> > might happen during the gup. But calling lru_add_drain_all by collect_longterm_unpinnable_folios
> > would put the page to LRU. And other cma_alloc context or migration context, I guess,
> > put the pages back to LRU if there was race.
> Actually, it is pkvm which was introduced by google in AOSP. I am
> afraid I can just brief the callstack here for security reasons. The
> pin_user_pages will setup the 2nd stage mapping for the hva by the
> vm_ops->fault which is registered by kvm memfd driver and all PFNs are
> from CMA area. The driver will keep the pages out of the LRU which hit
> the original bug as it is counted but have the movable_page_list be
> empty and lead to infinite loop within __gup_longterm_locked
>
> pkvm_xxx_xxx(equal to user_mem_abort in kvm)
> {
> unsigned int flags = FOLL_HWPOISON | FOLL_LONGTERM | FOLL_WRITE;
> ...
> ret = pin_user_pages(hva, 1, flags, &page);
> __gup_longterm_locked
> do {
> nr_pinned_pages = __get_user_pages_locked(mm,
> start, nr_pages,
> pages, locked, gup_flags);
> rc =
> check_and_migrate_movable_pages(nr_pinned_pages, pages);
> } while (rc == -EAGAIN);
> }
Hello, Zhaoyang.
I don't believe commit 1aaf8c was just intended to prevent an infinite loop.
The commit was introduced to allow pinning CMA memory in the pKVM on AOSP.
That leads me to question whether the assumption that CMA can be long-term pinned is actually valid.
In my opinion, it might be more appropriate to revert that commit 1aaf8c and instead ensure
that pKVM avoids using CMA for memory that requires long-term pinning through GUP ?
Alternatively, instead of changing the current logic that prevents longterm GUP from pinning CMA,
it would be better to propose a new patch that specifically addresses the pKVM scenario like adding new FOLL_flags ?
Thanks,
Regards.
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
next prev parent reply other threads:[~2025-05-28 1:25 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CAJrd-UtDD50iN=Yxz4=6kNkAcNAtRFkxhKAbEYiRyyDT-bYPHg@mail.gmail.com>
2025-05-22 10:18 ` 黄朝阳 (Zhaoyang Huang)
2025-05-22 12:22 ` David Hildenbrand
[not found] ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p3>
2025-05-22 13:09 ` Jaewon Kim
2025-05-22 14:06 ` David Hildenbrand
[not found] ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p2>
2025-05-22 14:44 ` 김재원
2025-05-22 15:07 ` David Hildenbrand
2025-05-23 2:48 ` John Hubbard
2025-05-23 2:37 ` 김재원
2025-05-23 2:52 ` John Hubbard
2025-05-26 7:48 ` Hyesoo Yu
2025-05-26 8:05 ` Zhaoyang Huang
2025-05-26 9:33 ` Hyesoo Yu
2025-05-26 9:38 ` David Hildenbrand
[not found] ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p8>
2025-05-26 11:17 ` Jaewon Kim
2025-05-26 11:49 ` Zhaoyang Huang
2025-05-28 1:23 ` Hyesoo Yu [this message]
2025-05-28 2:49 ` Zhaoyang Huang
2025-05-28 3:36 ` Hyesoo Yu
2025-05-28 7:55 ` David Hildenbrand
2025-05-28 10:59 ` Zhaoyang Huang
2025-05-28 12:57 ` David Hildenbrand
2025-06-03 13:12 ` David Hildenbrand
2025-06-04 1:04 ` Zhaoyang Huang
2025-06-04 9:12 ` David Hildenbrand
2025-06-04 9:41 ` Zhaoyang Huang
2025-06-04 9:48 ` David Hildenbrand
[not found] ` <CGME20250604095542epcas2p3f3d2d6fc17115547981a7173215a09d1@epcas2p3.samsung.com>
2025-06-04 9:53 ` Hyesoo Yu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250528012329.GA1545287@tiffany \
--to=hyesoo.yu@samsung.com \
--cc=Steve.Kang@unisoc.com \
--cc=david@redhat.com \
--cc=huangzhaoyang@gmail.com \
--cc=jaewon31.kim@gmail.com \
--cc=jaewon31.kim@samsung.com \
--cc=janghyuck.kim@samsung.com \
--cc=jhubbard@nvidia.com \
--cc=linux-mm@kvack.org \
--cc=surenb@google.com \
--cc=zhaoyang.huang@unisoc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox