* Re: A potential refcount issue during __folio_split [not found] ` <20260222010708.uohpmddmzaa4i4ic@master> @ 2026-02-22 3:00 ` Zi Yan 2026-02-22 10:28 ` Wei Yang 2026-02-23 9:23 ` David Hildenbrand (Arm) 0 siblings, 2 replies; 4+ messages in thread From: Zi Yan @ 2026-02-22 3:00 UTC (permalink / raw) To: Wei Yang; +Cc: David Hildenbrand (Arm), Linux MM +linux-mm list, since the topic is interesting and worth having a record in the list. On 21 Feb 2026, at 20:07, Wei Yang wrote: > On Sun, Feb 22, 2026 at 01:04:25AM +0000, Wei Yang wrote: >> Hi, David & Zi Yan >> >> With some tests, I may find one refcount issue during __folio_split(), when >> >> * folio is isolated and @list is provided >> * @lock_at is not the large folio's head page >> >> The tricky thing is after __folio_freeze_and_split_unmapped() and >> remap_page(), we would release one refcount for each after-split folio except >> @lock_at after-split folio. >> >> But when @list is provided, we would grab extra refcount in >> __folio_freeze_and_split_unmapped() for each tail after-split folio by >> lru_add_split_folio(), except head after-split folio. >> >> If @lock_at is the large folio's head page, it is fine. If not, the >> after-split folio's refcount is not well maintained. >> >> Take anonymous large folio mapped in one process for example, the refcount >> change during uniformly __folio_split() to order-0 would look like this: >> >> after lru_add_split_folio() after remap after unlock if @lockat == head >> f0: 1 1 + 1 = 2 2 >> f1: 1 + 1 = 2 2 + 1 = 3 3 - 1 = 2 >> f2: 1 + 1 = 2 2 + 1 = 3 3 - 1 = 2 >> f3: 1 + 1 = 2 2 + 1 = 3 3 - 1 = 2 >> >> after unlock if @lockat == head + 1 >> 2 - 1 = 1 >> 3 >> 3 - 1 = 2 >> 3 - 1 = 2 >> >> This shows the refcount of f0/f1 is not correct, if @lockat != head page. after lru_add_split_folio(), refcount for head should be 0, since it is frozen, each of the rest subpages should be refcount == 1. Then, head is unfreezed and its refcount goes to 1. remap adds 1 to all subpages refcount. after the unlock loop, every subpage get -1 refcount except lock_at. >> >> The good news is there is no use case in kernel now. Then I am not sure this >> worth a fix. Would like to ask for your opinion first. Hope I don't miss >> something important. >> >> Since there is no real case in kernel, I adjust the current debugfs >> interface(/sys/kernel/debug/split_huge_pages) to trigger it. Below is the diff >> for the change. This change could pass the selftests/split_huge_page_test to >> make sure the code change itself is correct. >> >> Then change the lockat to folio_page(folio, 1) could trigger the issue, when >> trying to split a THP through the debugfs from userspace. >> > > Sorry, the diff is lost: > > From c6d4c3d81e16f5f4b509cff884540bec0f91e6c3 Mon Sep 17 00:00:00 2001 > From: Wei Yang <richard.weiyang@gmail.com> > Date: Thu, 19 Feb 2026 08:44:49 +0800 > Subject: [PATCH] [T]mm/huge_memory: test split with isolation > > --- > mm/huge_memory.c | 31 +++++++++++++++++++++---------- > 1 file changed, 21 insertions(+), 10 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index ed0375ea22d1..65354c5edfef 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -4621,9 +4621,11 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start, > for (addr = vaddr_start; addr < vaddr_end; addr += PAGE_SIZE) { > struct vm_area_struct *vma = vma_lookup(mm, addr); > struct folio_walk fw; > - struct folio *folio; > + struct folio *folio, *folio2; > + struct page *lockat; > struct address_space *mapping; > unsigned int target_order = new_order; > + LIST_HEAD(split_folios); > > if (!vma) > break; > @@ -4660,32 +4662,41 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start, > folio_expected_ref_count(folio) != folio_ref_count(folio)) > goto next; > > - if (!folio_trylock(folio)) > + if (!folio_isolate_lru(folio)) > goto next; > - folio_get(folio); > - folio_walk_end(&fw, vma); > > if (!folio_test_anon(folio) && folio->mapping != mapping) > - goto unlock; > + goto putback; > + > + folio_lock(folio); > + folio_walk_end(&fw, vma); > + lockat = folio_page(folio, 0); > > if (in_folio_offset < 0 || > in_folio_offset >= folio_nr_pages(folio)) { > - if (!split_folio_to_order(folio, target_order)) > + lockat = folio_page(folio, 0); > + if (!split_huge_page_to_list_to_order(lockat, &split_folios, target_order)) > split++; > } else { > struct page *split_at = folio_page(folio, > in_folio_offset); > - if (!folio_split(folio, target_order, split_at, NULL)) > + if (!folio_split(folio, target_order, split_at, &split_folios)) > split++; > } > > -unlock: > + list_add_tail(&folio->lru, &split_folios); > + folio_unlock(page_folio(lockat)); > > - folio_unlock(folio); > - folio_put(folio); > + list_for_each_entry_safe(folio, folio2, &split_folios, lru) { > + list_del(&folio->lru); > + folio_putback_lru(folio); > + } > > cond_resched(); > continue; > + > +putback: > + folio_putback_lru(folio); ^^^^^^^^^^ cannot always put folio here. > next: > folio_walk_end(&fw, vma); > cond_resched(); > -- Your code change is wrong. Because when you are using split_huge_page_to_list_to_order(), the code pattern should be: get_page(lock_at); lock_page(lock_at); split_huge_page_to_list_to_order(lock_at); unlock_page(lock_at); put_page(lock_at); So the extra refcount in lock_at will be decreased by put_page(lock_at); But your code change does not do put_page(lock_at) but always does folio_putback_lru(folio), where folio is the original head. BTW, in the folio world, I do not think it is possible to perform the aforementioned split_huge_page_to_list_to_order() pattern any more, since you always work on folio, the head. Unless there is a need of getting hold of a tail after-split folio after a folio split, the pattern would be: tail_page = folio_page(folio, N); folio_get(folio); folio_lock(folio); folio_split(folio, ..., /* new parameter: lock_at = */ tail_page, ...); tail_folio = page_folio(tail_page); folio_unlock(tail_folio); folio_put(tail_folio); -- Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: A potential refcount issue during __folio_split 2026-02-22 3:00 ` A potential refcount issue during __folio_split Zi Yan @ 2026-02-22 10:28 ` Wei Yang 2026-02-23 9:23 ` David Hildenbrand (Arm) 1 sibling, 0 replies; 4+ messages in thread From: Wei Yang @ 2026-02-22 10:28 UTC (permalink / raw) To: Zi Yan; +Cc: Wei Yang, David Hildenbrand (Arm), Linux MM On Sat, Feb 21, 2026 at 10:00:44PM -0500, Zi Yan wrote: >+linux-mm list, since the topic is interesting and worth having a record in the list. > >On 21 Feb 2026, at 20:07, Wei Yang wrote: > >> On Sun, Feb 22, 2026 at 01:04:25AM +0000, Wei Yang wrote: >>> Hi, David & Zi Yan >>> >>> With some tests, I may find one refcount issue during __folio_split(), when >>> >>> * folio is isolated and @list is provided >>> * @lock_at is not the large folio's head page >>> >>> The tricky thing is after __folio_freeze_and_split_unmapped() and >>> remap_page(), we would release one refcount for each after-split folio except >>> @lock_at after-split folio. >>> >>> But when @list is provided, we would grab extra refcount in >>> __folio_freeze_and_split_unmapped() for each tail after-split folio by >>> lru_add_split_folio(), except head after-split folio. >>> >>> If @lock_at is the large folio's head page, it is fine. If not, the >>> after-split folio's refcount is not well maintained. >>> >>> Take anonymous large folio mapped in one process for example, the refcount >>> change during uniformly __folio_split() to order-0 would look like this: >>> >>> after lru_add_split_folio() after remap after unlock if @lockat == head >>> f0: 1 1 + 1 = 2 2 >>> f1: 1 + 1 = 2 2 + 1 = 3 3 - 1 = 2 >>> f2: 1 + 1 = 2 2 + 1 = 3 3 - 1 = 2 >>> f3: 1 + 1 = 2 2 + 1 = 3 3 - 1 = 2 >>> >>> after unlock if @lockat == head + 1 >>> 2 - 1 = 1 >>> 3 >>> 3 - 1 = 2 >>> 3 - 1 = 2 >>> >>> This shows the refcount of f0/f1 is not correct, if @lockat != head page. > >after lru_add_split_folio(), refcount for head should be 0, since it is frozen, >each of the rest subpages should be refcount == 1. Then, head is unfreezed >and its refcount goes to 1. remap adds 1 to all subpages refcount. >after the unlock loop, every subpage get -1 refcount except lock_at. > > >>> >>> The good news is there is no use case in kernel now. Then I am not sure this >>> worth a fix. Would like to ask for your opinion first. Hope I don't miss >>> something important. >>> >>> Since there is no real case in kernel, I adjust the current debugfs >>> interface(/sys/kernel/debug/split_huge_pages) to trigger it. Below is the diff >>> for the change. This change could pass the selftests/split_huge_page_test to >>> make sure the code change itself is correct. >>> >>> Then change the lockat to folio_page(folio, 1) could trigger the issue, when >>> trying to split a THP through the debugfs from userspace. >>> >> >> Sorry, the diff is lost: >> >> From c6d4c3d81e16f5f4b509cff884540bec0f91e6c3 Mon Sep 17 00:00:00 2001 >> From: Wei Yang <richard.weiyang@gmail.com> >> Date: Thu, 19 Feb 2026 08:44:49 +0800 >> Subject: [PATCH] [T]mm/huge_memory: test split with isolation >> >> --- >> mm/huge_memory.c | 31 +++++++++++++++++++++---------- >> 1 file changed, 21 insertions(+), 10 deletions(-) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index ed0375ea22d1..65354c5edfef 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -4621,9 +4621,11 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start, >> for (addr = vaddr_start; addr < vaddr_end; addr += PAGE_SIZE) { >> struct vm_area_struct *vma = vma_lookup(mm, addr); >> struct folio_walk fw; >> - struct folio *folio; >> + struct folio *folio, *folio2; >> + struct page *lockat; >> struct address_space *mapping; >> unsigned int target_order = new_order; >> + LIST_HEAD(split_folios); >> >> if (!vma) >> break; >> @@ -4660,32 +4662,41 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start, >> folio_expected_ref_count(folio) != folio_ref_count(folio)) >> goto next; >> >> - if (!folio_trylock(folio)) >> + if (!folio_isolate_lru(folio)) >> goto next; >> - folio_get(folio); >> - folio_walk_end(&fw, vma); >> >> if (!folio_test_anon(folio) && folio->mapping != mapping) >> - goto unlock; >> + goto putback; >> + >> + folio_lock(folio); >> + folio_walk_end(&fw, vma); >> + lockat = folio_page(folio, 0); >> >> if (in_folio_offset < 0 || >> in_folio_offset >= folio_nr_pages(folio)) { >> - if (!split_folio_to_order(folio, target_order)) >> + lockat = folio_page(folio, 0); >> + if (!split_huge_page_to_list_to_order(lockat, &split_folios, target_order)) >> split++; >> } else { >> struct page *split_at = folio_page(folio, >> in_folio_offset); >> - if (!folio_split(folio, target_order, split_at, NULL)) >> + if (!folio_split(folio, target_order, split_at, &split_folios)) >> split++; >> } >> >> -unlock: >> + list_add_tail(&folio->lru, &split_folios); >> + folio_unlock(page_folio(lockat)); >> >> - folio_unlock(folio); >> - folio_put(folio); >> + list_for_each_entry_safe(folio, folio2, &split_folios, lru) { >> + list_del(&folio->lru); >> + folio_putback_lru(folio); >> + } >> >> cond_resched(); >> continue; >> + >> +putback: >> + folio_putback_lru(folio); > > ^^^^^^^^^^ cannot always put folio here. > You mean I should put page_folio(lockat)? This is the error patch. After isolation, if folio mapping changes, it release the folio. So the folio is not split yet. For other case, it handles differently. See below. >> next: >> folio_walk_end(&fw, vma); >> cond_resched(); >> -- > >Your code change is wrong. Because when you are using split_huge_page_to_list_to_order(), >the code pattern should be: > >get_page(lock_at); >lock_page(lock_at); >split_huge_page_to_list_to_order(lock_at); >unlock_page(lock_at); >put_page(lock_at); > >So the extra refcount in lock_at will be decreased by put_page(lock_at); > Yes, generally it is. But we seem not forbid provide a list to put after-split folio. And I found there is another requirement, if we want to put after-folio on specified list, we have to isolate folio first. I took sometime to realize it. >But your code change does not do put_page(lock_at) but always does folio_putback_lru(folio), >where folio is the original head. > I put "folio" to @split_folios after split. And then do folio_putback_lru() for each of them. If split successful, @split_folios contains all after-split folios, including @lock_at. If split fails, @split_folios contains the original folio, including @lock_at by some kind. Maybe I misunderstand your point. Would you mind be more specific? >BTW, in the folio world, I do not think it is possible to perform the aforementioned >split_huge_page_to_list_to_order() pattern any more, since you always work on folio, >the head. Unless there is a need of getting hold of a tail after-split folio after >a folio split, the pattern would be: > >tail_page = folio_page(folio, N); > >folio_get(folio); >folio_lock(folio); >folio_split(folio, ..., /* new parameter: lock_at = */ tail_page, ...); >tail_folio = page_folio(tail_page); >folio_unlock(tail_folio); >folio_put(tail_folio); > Yes, in the folio world, we probably won't lockat a middle page. Another thing I found is in commit e9b61f19858a ("thp: reintroduce split_huge_page()"), the comment of split_huge_page_to_list() says "@page can point to any subpage of huge page to split", but the refcount mechanism seems settle down then. So I am afraid actually @page has to be head since then. -- Wei Yang Help you, Help me ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: A potential refcount issue during __folio_split 2026-02-22 3:00 ` A potential refcount issue during __folio_split Zi Yan 2026-02-22 10:28 ` Wei Yang @ 2026-02-23 9:23 ` David Hildenbrand (Arm) 2026-02-23 11:59 ` Wei Yang 1 sibling, 1 reply; 4+ messages in thread From: David Hildenbrand (Arm) @ 2026-02-23 9:23 UTC (permalink / raw) To: Zi Yan, Wei Yang; +Cc: Linux MM > BTW, in the folio world, I do not think it is possible to perform the aforementioned > split_huge_page_to_list_to_order() pattern any more, since you always work on folio, > the head. Unless there is a need of getting hold of a tail after-split folio after > a folio split, the pattern would be: > > tail_page = folio_page(folio, N); > > folio_get(folio); > folio_lock(folio); > folio_split(folio, ..., /* new parameter: lock_at = */ tail_page, ...); > tail_folio = page_folio(tail_page); > folio_unlock(tail_folio); > folio_put(tail_folio); Agreed. Maybe it would be even nicer if the split function could return the new folio directly. folio_get(folio); folio_lock(folio); split_folio = folio_split_XXX(folio, ..., tail_page, ...); if (IS_ERR_VALUE(split_folio)) { ... } folio_unlock(split_folio); folio_put(split__folio); -- Cheers, David ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: A potential refcount issue during __folio_split 2026-02-23 9:23 ` David Hildenbrand (Arm) @ 2026-02-23 11:59 ` Wei Yang 0 siblings, 0 replies; 4+ messages in thread From: Wei Yang @ 2026-02-23 11:59 UTC (permalink / raw) To: David Hildenbrand (Arm); +Cc: Zi Yan, Wei Yang, Linux MM On Mon, Feb 23, 2026 at 10:23:11AM +0100, David Hildenbrand (Arm) wrote: >> BTW, in the folio world, I do not think it is possible to perform the aforementioned >> split_huge_page_to_list_to_order() pattern any more, since you always work on folio, >> the head. Unless there is a need of getting hold of a tail after-split folio after >> a folio split, the pattern would be: >> >> tail_page = folio_page(folio, N); >> >> folio_get(folio); >> folio_lock(folio); >> folio_split(folio, ..., /* new parameter: lock_at = */ tail_page, ...); >> tail_folio = page_folio(tail_page); >> folio_unlock(tail_folio); >> folio_put(tail_folio); > Missed this. Agree. >Agreed. Maybe it would be even nicer if the split function could return the >new folio directly. > >folio_get(folio); >folio_lock(folio); >split_folio = folio_split_XXX(folio, ..., tail_page, ...); >if (IS_ERR_VALUE(split_folio)) { > ... >} >folio_unlock(split_folio); >folio_put(split__folio); > I am afraid it would be complicated? Well, we don't have this usecase now, could decide it when we do need it. >-- >Cheers, > >David -- Wei Yang Help you, Help me ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-02-23 11:59 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20260222010425.gbsjzhrew3pg4qrw@master>
[not found] ` <20260222010708.uohpmddmzaa4i4ic@master>
2026-02-22 3:00 ` A potential refcount issue during __folio_split Zi Yan
2026-02-22 10:28 ` Wei Yang
2026-02-23 9:23 ` David Hildenbrand (Arm)
2026-02-23 11:59 ` Wei Yang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox