On Wed, Jun 04, 2025 at 12:07:21PM +0200, David Hildenbrand wrote: > On 04.06.25 11:50, Hyesoo Yu wrote: > > Commit 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked") > > caused CMA pages to become pinned in some cases when handling longterm GUP. > > This happened because migration would return success immediately if no pages > > were in the movable_page_list, without retrying. > > > > However, CMA pages can be temporarily off the LRU (e.g., in pagevecs), and > > A better example might be concurrent isolation. Just imagine two of these > longterm pinnings racing. > I will change the example below. CMA pages may be temporarily off the LRU due to concurrent isolation, for example when multiple longterm GUP requests are racing. > > therefore not appear in movable_page_list, even though they can be migrated > > later. Before commit 1aaf8c, the kernel would retry migration in such cases, > > which helped avoid accidental CMA pinning. > > > > The commit 1aaf8c aimed to support an out-of-tree use case (like pKVM), where > > longterm GUP was applied to non-LRU CMA pages. But allowing CMA pinning > > in general for this corner case could lead to more fragmentation and > > reliability issues. So this patch prevents that. > > > > Instead of retrying, this patch explicitly fails the migration attempt > > (-EBUSY) if no movable pages are found and unpinnable pages remain. > > This avoids infinite loops and gives user a clear signal to retry, > > rather then spinning inside kernel. > > Hmmm, that means we will return EBUSY to the caller. Are all users actually > prepared to deal with that? > > So far we only returned EBUSY in this corner-case > migrate_device_coherent_folio() that most callers never actually trigger. > > Maybe we should do EAGAIN for now (old way of doing it?), and look into > doing EBUSY separately. > Thank you for the feedback. I agree. I'll keep this patch focused on resolving the CMA pinning issue using -EAGAIN. Handling -EBUSY can be considered separately. > > > > Fixes: 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked") > > Signed-off-by: Hyesoo Yu > > --- > > mm/gup.c | 49 ++++++++++++++++++++++++++----------------------- > > 1 file changed, 26 insertions(+), 23 deletions(-) > > > > diff --git a/mm/gup.c b/mm/gup.c > > index e065a49842a8..446938aedcc9 100644 > > --- a/mm/gup.c > > +++ b/mm/gup.c > > @@ -2303,12 +2303,13 @@ static void pofs_unpin(struct pages_or_folios *pofs) > > /* > > * Returns the number of collected folios. Return value is always >= 0. > > */ > > Comment should be removed. > Got it. I'll remove the comment. > > -static void collect_longterm_unpinnable_folios( > > +static bool collect_longterm_unpinnable_folios( > > struct list_head *movable_folio_list, > > struct pages_or_folios *pofs) > > { > > struct folio *prev_folio = NULL; > > bool drain_allow = true; > > + bool any_unpinnable = false; > > unsigned long i; > > for (i = 0; i < pofs->nr_entries; i++) { > > @@ -2321,6 +2322,8 @@ static void collect_longterm_unpinnable_folios( > > if (folio_is_longterm_pinnable(folio)) > > continue; > > + any_unpinnable = true; > > + > > if (folio_is_device_coherent(folio)) > > continue; > > @@ -2342,6 +2345,8 @@ static void collect_longterm_unpinnable_folios( > > NR_ISOLATED_ANON + folio_is_file_lru(folio), > > folio_nr_pages(folio)); > > } > > + > > + return any_unpinnable; > > } > > /* > > @@ -2353,8 +2358,13 @@ static int > > migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list, > > struct pages_or_folios *pofs) > > { > > - int ret; > > + int ret = -EAGAIN; > > unsigned long i; > > + struct migration_target_control mtc = { > > + .nid = NUMA_NO_NODE, > > + .gfp_mask = GFP_USER | __GFP_NOWARN, > > + .reason = MR_LONGTERM_PIN, > > + }; > > Reverse xmas tree while we're at it. > > But, can we do this cleanup here separately, and not as part of the fix? > I'll prepare a separate patch for the cleanup. > > for (i = 0; i < pofs->nr_entries; i++) { > > struct folio *folio = pofs_get_folio(pofs, i); > > @@ -2370,6 +2380,7 @@ migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list, > > gup_put_folio(folio, 1, FOLL_PIN); > > if (migrate_device_coherent_folio(folio)) { > > + pofs_unpin(pofs); > > ret = -EBUSY; > > goto err; > > } > > @@ -2388,27 +2399,11 @@ migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list, > > pofs_clear_entry(pofs, i); > > } > > - if (!list_empty(movable_folio_list)) { > > - struct migration_target_control mtc = { > > - .nid = NUMA_NO_NODE, > > - .gfp_mask = GFP_USER | __GFP_NOWARN, > > - .reason = MR_LONGTERM_PIN, > > - }; > > - > > - if (migrate_pages(movable_folio_list, alloc_migration_target, > > - NULL, (unsigned long)&mtc, MIGRATE_SYNC, > > - MR_LONGTERM_PIN, NULL)) { > > - ret = -ENOMEM; > > - goto err; > > - } > > - } > > - > > - putback_movable_pages(movable_folio_list); > > - > > - return -EAGAIN; > > + if (migrate_pages(movable_folio_list, alloc_migration_target, NULL, > > + (unsigned long)&mtc, MIGRATE_SYNC, MR_LONGTERM_PIN, NULL)) > > + ret = -ENOMEM; > > err: > > - pofs_unpin(pofs); > > putback_movable_pages(movable_folio_list); > > return ret; > > @@ -2417,11 +2412,19 @@ migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list, > > static long > > check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs) > > { > > + bool any_unpinnable; > > + > > LIST_HEAD(movable_folio_list); > > - collect_longterm_unpinnable_folios(&movable_folio_list, pofs); > > - if (list_empty(&movable_folio_list)) > > + any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs); > > + > > + if (list_empty(&movable_folio_list)) { > > + if (any_unpinnable) { > > /* > * If we find any longterm unpinnable page that we failed to > * isolated for migration, it might be because someone else > * concurrently isolated it. Make the caller retry until it > * succeeds. > */ > > I will add the comment. > > + pofs_unpin(pofs); > > + return -EBUSY; > > + } > > return 0; > > + } > > return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs); > > } > > > -- > Cheers, > > David / dhildenb > Thanks, Regards.