From: 김재원 <jaewon31.kim@samsung.com>
To: "David Hildenbrand" <david@redhat.com>,
김재원 <jaewon31.kim@samsung.com>,
"zhaoyang.huang@unisoc.com" <zhaoyang.huang@unisoc.com>,
유혜수 <hyesoo.yu@samsung.com>,
"jhubbard@nvidia.com" <jhubbard@nvidia.com>,
"surenb@google.com" <surenb@google.com>,
"Steve.Kang@unisoc.com" <Steve.Kang@unisoc.com>,
"huangzhaoyang@gmail.com" <huangzhaoyang@gmail.com>
Cc: Jaewon Kim <jaewon31.kim@gmail.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: reply: [RFC] pin_user_pages_fast failure count increased
Date: Fri, 23 May 2025 11:37:09 +0900 [thread overview]
Message-ID: <20250523023709epcms1p236d4f55b79adb9366ec1cf6d5792b06b@epcms1p2> (raw)
In-Reply-To: <1dd70770-f097-46bc-99b1-1bd3249032fb@redhat.com>
>>> On 22.05.25 15:09, Jaewon Kim wrote:
>>>>> On 22.05.25 12:18, ?朝? (Zhaoyang Huang) wrote:
>>>>>>> On Mon, Apr 28, 2025 at 02:12:57PM -0700, John Hubbard wrote:
>>>>>>>> On 4/28/25 1:56 PM, David Hildenbrand wrote:
>>>>>>>>> On 28.04.25 22:14, John Hubbard wrote:
>>>>>>>>>> On 4/28/25 8:17 AM, Jaewon Kim wrote:
>>>>>>>>>>> Hi
>>>>>>>>>>>
>>>>>>>>>>> If pin_user_pages_fast does not pin all the requested number of
>>>>>>>>>>> pages, then drivers calling to pin_user_pages_fast should retry
>>>>>>>>>>> until the gup pins all?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Approaches vary, for handling partial success of pin_user_pages().
>>>>>>>>>>
>>>>>>>>>> * Many drivers unpin everything and either bail out entirely, or
>>>>>>>>>> retry pinning the entire original range.
>>>>>>>>>
>>>>>>>>> Hm, unpinning + trying to repin the entire range can easily result
>>>>>>>>> in an endless loop on persistent errors IIRC?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I vaguely recall a limited number of retries, yes.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>> --
>>>>>>>> John Hubbard
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'd like to report a potential issue introduced by a recent change in
>>>>>>> 1aaf8c122918 mm: gup: fix infinite loop within __get_longterm_locked
>>>>>>>
>>>>>>> Previously, the call to migrate_longterm_unpinnable_folio() was guarded by the
>>>>>>> collected variable. This meant that if a CMA page was temporarily held in the
>>>>>>> pagevec and failed LRU isolation, it wouldn't be added to the
>>>>>>> movable_page_list, but the collected counter would still be incremented.
>>>>>
>>>>> Okay, so we'd also express that way "any longterm_pinnable page found".
>>>>>
>>>>>> There is lru_add_drain_all for dealing with this scenario, so this won't be the case, right?
>>>>>
>>>>> Good point. Only concurrent isolation might be problematic (concurrent reclaim?).
>>>>>
>>>>>>>
>>>>>>> As a result, migrate_longterm_unpinnable_folio() would return -EAGAIN, and
>>>>>>> the process would be retried until migration of the CMA page succeeded.
>>>>>>>
>>>>>>> However, in the recent patch merged into mainline, the logic now only checks
>>>>>>> whether movable_page_list is empty, and no longer relies on the collected
>>>>>>> count.
>>>>>>> This can cause CMA pages that fail isolation to bypass retry logic and remain
>>>>>>> pinned.
>>>>>>>
>>>>>>> Effectively,long-term pinning is now possible for CMA pages ? something that
>>>>>>> previously would have been avoided through repeated attempts.
>>>>>
>>>>> Calling migrate_longterm_unpinnable_folios() when there is nothing to migrate is stupid.
>>>>>
>>>>> Maybe something like:
>>>>>
>>>>> diff --git a/mm/gup.c b/mm/gup.c
>>>>> index 329c5f7acc7a0..58b8e40fc19ed 100644
>>>>> --- a/mm/gup.c
>>>>> +++ b/mm/gup.c
>>>>> @@ -2301,14 +2301,15 @@ static void pofs_unpin(struct pages_or_folios *pofs)
>>>>> }
>>>>>
>>>>> /*
>>>>> - * Returns the number of collected folios. Return value is always >= 0.
>>>>> + * Returns whether any longterm unpinnable folio was found (if isolation
>>>>> + * fails, not all can be added to the movable_folio_list).
>>>>> */
>>>>> -static void collect_longterm_unpinnable_folios(
>>>>> +static bool collect_longterm_unpinnable_folios(
>>>>> struct list_head *movable_folio_list,
>>>>> struct pages_or_folios *pofs)
>>>>> {
>>>>> + bool drain_allow = true, any_unpinnable = false;
>>>>> struct folio *prev_folio = NULL;
>>>>> - bool drain_allow = true;
>>>>> unsigned long i;
>>>>>
>>>>> for (i = 0; i < pofs->nr_entries; i++) {
>>>>> @@ -2320,6 +2321,7 @@ static void collect_longterm_unpinnable_folios(
>>>>>
>>>>> if (folio_is_longterm_pinnable(folio))
>>>>> continue;
>>>>> + any_unpinnable = true;
>>>>>
>>>>> if (folio_is_device_coherent(folio))
>>>>> continue;
>>>>> @@ -2342,6 +2344,8 @@ static void collect_longterm_unpinnable_folios(
>>>>> NR_ISOLATED_ANON + folio_is_file_lru(folio),
>>>>> folio_nr_pages(folio));
>>>>> }
>>>>> +
>>>>> + return any_unpinnable;
>>>>> }
>>>>>
>>>>> /*
>>>>> @@ -2417,11 +2421,12 @@ migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list,
>>>>> static long
>>>>> check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
>>>>> {
>>>>> + bool any_unpinnable;
>>>>> LIST_HEAD(movable_folio_list);
>>>>>
>>>>> - collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
>>>>> + any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
>>>>> if (list_empty(&movable_folio_list))
>>>>> - return 0;
>>>>> + return any_unpinnable ? -EAGAIN : 0;
>>>>>
>>>>> return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs);
>>>>> }
>>>>>
>>>>>
>>>>> --
>>>>> Cheers,
>>>>>
>>>>> David / dhildenb
>>>>
>>>> Hi
>>>>
>>>> Thank you for your comment and patch.
>>>> By the way, what if there are any_unpinnable pages and also pages in the movable_folio_list,
>>>> but migrate_longterm_unpinnable_folios failed to migrate and return other erros instead of -EAGAIN?
>>>> In that case, I think the CMA or other unpinnable pages would be pinned.
>>>
>>> Oh, I think what we have to do is call pofs_unpin(pofs)() in case we
>>> return with -EAGAIN early. That's what
>>> migrate_longterm_unpinnable_folios() would do.
>>
>> I did not understand. Do you mean we need to call pofs_unpin in case of any_unpinnable?
>
>At least that's what we did before 1aaf8c122 when calling
>collect_longterm_unpinnable_folios() I think.
>
>> I think your following code seems to be good to me.
>> return any_unpinnable ? -EAGAIN : 0;
>
>I think we have to call pofs_unpin() here.
>
>Assuming we always call migrate_longterm_unpinnable_folios() with a list
>now, we could drop the list_empty() check in there.
>
Thank for your kind explanation.
I think this is what you meant, please let me know if you have an idea to make this nicer.
We may be to able to prepare the patch next week.
static long
check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
{
+ bool any_unpinnable;
LIST_HEAD(movable_folio_list);
- collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
- if (list_empty(&movable_folio_list))
- return 0;
+ any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
+ if (list_empty(&movable_folio_list)) {
+ if (any_unpinnable)
+ pofs_unpin(pofs);
+ return any_unpinnable ? -EAGAIN : 0;
+ }
return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs);
}
>--
>Cheers,
>
>David / dhildenb
next prev parent reply other threads:[~2025-05-23 2:37 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CAJrd-UtDD50iN=Yxz4=6kNkAcNAtRFkxhKAbEYiRyyDT-bYPHg@mail.gmail.com>
2025-05-22 10:18 ` 黄朝阳 (Zhaoyang Huang)
2025-05-22 12:22 ` David Hildenbrand
[not found] ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p3>
2025-05-22 13:09 ` Jaewon Kim
2025-05-22 14:06 ` David Hildenbrand
[not found] ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p2>
2025-05-22 14:44 ` 김재원
2025-05-22 15:07 ` David Hildenbrand
2025-05-23 2:48 ` John Hubbard
2025-05-23 2:37 ` 김재원 [this message]
2025-05-23 2:52 ` John Hubbard
2025-05-26 7:48 ` Hyesoo Yu
2025-05-26 8:05 ` Zhaoyang Huang
2025-05-26 9:33 ` Hyesoo Yu
2025-05-26 9:38 ` David Hildenbrand
[not found] ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p8>
2025-05-26 11:17 ` Jaewon Kim
2025-05-26 11:49 ` Zhaoyang Huang
2025-05-28 1:23 ` Hyesoo Yu
2025-05-28 2:49 ` Zhaoyang Huang
2025-05-28 3:36 ` Hyesoo Yu
2025-05-28 7:55 ` David Hildenbrand
2025-05-28 10:59 ` Zhaoyang Huang
2025-05-28 12:57 ` David Hildenbrand
2025-06-03 13:12 ` David Hildenbrand
2025-06-04 1:04 ` Zhaoyang Huang
2025-06-04 9:12 ` David Hildenbrand
2025-06-04 9:41 ` Zhaoyang Huang
2025-06-04 9:48 ` David Hildenbrand
[not found] ` <CGME20250604095542epcas2p3f3d2d6fc17115547981a7173215a09d1@epcas2p3.samsung.com>
2025-06-04 9:53 ` Hyesoo Yu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250523023709epcms1p236d4f55b79adb9366ec1cf6d5792b06b@epcms1p2 \
--to=jaewon31.kim@samsung.com \
--cc=Steve.Kang@unisoc.com \
--cc=david@redhat.com \
--cc=huangzhaoyang@gmail.com \
--cc=hyesoo.yu@samsung.com \
--cc=jaewon31.kim@gmail.com \
--cc=jhubbard@nvidia.com \
--cc=linux-mm@kvack.org \
--cc=surenb@google.com \
--cc=zhaoyang.huang@unisoc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox