From: Shivank Garg <shivankg@amd.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: kernel test robot <lkp@intel.com>,
oe-kbuild-all@lists.linux.dev,
Linux Memory Management List <linux-mm@kvack.org>,
Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: Re: [akpm-mm:mm-unstable 36/67] mm/khugepaged.c:2337:7: error: implicit declaration of function 'folio_expected_ref_count'; did you mean 'folio_ref_count'?
Date: Thu, 29 May 2025 09:16:04 +0530
Message-ID: <2a1e37ab-d9f4-44ed-9833-edb06f420388@amd.com>
In-Reply-To: <20250528114441.abd8980f1d66fcb3e1eaecc2@linux-foundation.org>
On 5/29/2025 12:14 AM, Andrew Morton wrote:
> On Wed, 28 May 2025 18:04:02 +0530 Shivank Garg <shivankg@amd.com> wrote:
>
>> On 5/28/2025 5:46 PM, kernel test robot wrote:
>>> tree: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable
>>> head: 52ce652e7ab0f015b51fee11b2862507b2c0c25d
>>> commit: 3bdddbba5f02f6d97283acb18e2a6e079324fe4b [36/67] mm/khugepaged: fix race with folio split/free using temporary reference
>>> config: arm64-randconfig-002-20250528 (https://download.01.org/0day-ci/archive/20250528/202505282015.F0fVmLmH-lkp@intel.com/config)
>>> compiler: aarch64-linux-gcc (GCC) 7.5.0
>>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250528/202505282015.F0fVmLmH-lkp@intel.com/reproduce)
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <lkp@intel.com>
>>> | Closes: https://lore.kernel.org/oe-kbuild-all/202505282015.F0fVmLmH-lkp@intel.com/
>>>
>>> Note: the akpm-mm/mm-unstable HEAD 52ce652e7ab0f015b51fee11b2862507b2c0c25d builds fine.
>>> It only hurts bisectability.
>>>
>>> All errors (new ones prefixed by >>):
>>>
>>> mm/khugepaged.c: In function 'hpage_collapse_scan_file':
>>>>> mm/khugepaged.c:2337:7: error: implicit declaration of function 'folio_expected_ref_count'; did you mean 'folio_ref_count'? [-Werror=implicit-function-declaration]
>>> if (folio_expected_ref_count(folio) + 1 != folio_ref_count(folio)) {
>>> ^~~~~~~~~~~~~~~~~~~~~~~~
>>> folio_ref_count
>>> cc1: some warnings being treated as errors
>>>
>>>
>>> vim +2337 mm/khugepaged.c
>>
>> folio_expected_ref_count() is introduced with this commit[1] and merged into mm-* tree.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-unstable&id=86ebd50224c0734d965843260d0dc057a9431c61
>>
>
> Well darn. We have a patch in mm-hotfixes-unstable which is cc:stable
> (mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch)
> which is dependent upon a patch
> (mm-add-folio_expected_ref_count-for-reference-count-calculation.patch)
> which is scheduled for 6.16-rc1.
>
> I'll move
> mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch
> into mm-stable, after
> mm-add-folio_expected_ref_count-for-reference-count-calculation.patch
> to remove the bisection hole. This means that when the -stable
> maintainers try to backport
> mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch
> into earlier kernels, the build will fail. When this happens, please
> work with them to come up with a version of
> mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch
> which is suitable for 6.15 and earlier.
>
Hi Andrew,
The patch below does not depend on folio_expected_ref_count() and is
functionally equivalent to the version currently queued, so it can be
backported to the -stable branches (6.15 and earlier).
Please review.
Thanks,
Shivank
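For reviewers who want to see why the two forms are equivalent, here is a
minimal userspace sketch. It is only a model, not kernel code: struct
model_folio, elided_refs and model_refcount_suitable() are hypothetical names,
and elided_refs stands in for whatever the lines not shown in the
is_refcount_suitable() hunk add to the expected count. The point is that
passing extra_refs = 1 expresses the same "+ 1" that the queued version writes
as folio_expected_ref_count(folio) + 1 != folio_ref_count(folio).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state the check reads. */
struct model_folio {
	int ref_count;    /* models folio_ref_count() */
	int mapcount;     /* models folio_mapcount() */
	int elided_refs;  /* assumption: refs added by lines not shown in the hunk */
	bool has_private; /* models folio_test_private() */
};

/* Same shape as the patched helper: expected count plus the caller's own refs. */
static bool model_refcount_suitable(const struct model_folio *f, int extra_refs)
{
	int expected_refcount = f->mapcount + f->elided_refs;

	if (f->has_private)
		expected_refcount++;

	return f->ref_count == expected_refcount + extra_refs;
}

/* One mapping, one elided reference, plus the scanner's temporary reference. */
void model_selfcheck(void)
{
	struct model_folio f = {
		.ref_count = 3, .mapcount = 1, .elided_refs = 1, .has_private = false,
	};

	assert(!model_refcount_suitable(&f, 0)); /* the pin looks like a stray ref */
	assert(model_refcount_suitable(&f, 1));  /* the pin is accounted for */
}
```

Callers that hold no reference of their own keep passing 0, so their behaviour
is unchanged.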
From ee23192f3306cdd2553b084ac23430412a9c61d6 Mon Sep 17 00:00:00 2001
From: Shivank Garg <shivankg@amd.com>
Date: Mon, 26 May 2025 18:28:18 +0000
Subject: [PATCH] mm/khugepaged: fix race with folio split/free using temporary
reference
hpage_collapse_scan_file() calls is_refcount_suitable(), which in turn
calls folio_mapcount(). folio_mapcount() checks folio_test_large() before
proceeding to folio_large_mapcount(), but there is a race window where the
folio may get split/freed between these checks, triggering:
VM_WARN_ON_FOLIO(!folio_test_large(folio), folio)
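Take a temporary reference to the folio in hpage_collapse_scan_file().
This stabilizes the folio during the refcount check and prevents incorrect
large folio detection due to a concurrent split/free. Because the scan now
holds one reference of its own, is_refcount_suitable() takes an extra_refs
argument so that callers can account for references they hold themselves.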
Fixes: 05c5323b2a34 ("mm: track mapcount of large folios in single value")
Reported-by: syzbot+2b99589e33edbe9475ca@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6828470d.a70a0220.38f255.000c.GAE@google.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shivank Garg <shivankg@amd.com>
---
mm/khugepaged.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index cc945c6ab3bd..25a406410463 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -548,7 +548,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
}
}
-static bool is_refcount_suitable(struct folio *folio)
+static bool is_refcount_suitable(struct folio *folio, int extra_refs)
{
int expected_refcount = folio_mapcount(folio);
@@ -558,7 +558,7 @@ static bool is_refcount_suitable(struct folio *folio)
if (folio_test_private(folio))
expected_refcount++;
- return folio_ref_count(folio) == expected_refcount;
+ return folio_ref_count(folio) == expected_refcount + extra_refs;
}
static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
@@ -652,7 +652,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
* but not from this process. The other process cannot write to
* the page, only trigger CoW.
*/
- if (!is_refcount_suitable(folio)) {
+ if (!is_refcount_suitable(folio, 0)) {
folio_unlock(folio);
result = SCAN_PAGE_COUNT;
goto out;
@@ -1402,7 +1402,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
* has excessive GUP pins (i.e. 512). Anyway the same check
* will be done again later the risk seems low.
*/
- if (!is_refcount_suitable(folio)) {
+ if (!is_refcount_suitable(folio, 0)) {
result = SCAN_PAGE_COUNT;
goto out_unmap;
}
@@ -2295,6 +2295,17 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
continue;
}
+ if (!folio_try_get(folio)) {
+ xas_reset(&xas);
+ continue;
+ }
+
+ if (unlikely(folio != xas_reload(&xas))) {
+ folio_put(folio);
+ xas_reset(&xas);
+ continue;
+ }
+
if (folio_order(folio) == HPAGE_PMD_ORDER &&
folio->index == start) {
/* Maybe PMD-mapped */
@@ -2305,23 +2316,27 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
* it's safe to skip LRU and refcount checks before
* returning.
*/
+ folio_put(folio);
break;
}
node = folio_nid(folio);
if (hpage_collapse_scan_abort(node, cc)) {
result = SCAN_SCAN_ABORT;
+ folio_put(folio);
break;
}
cc->node_load[node]++;
if (!folio_test_lru(folio)) {
result = SCAN_PAGE_LRU;
+ folio_put(folio);
break;
}
- if (!is_refcount_suitable(folio)) {
+ if (!is_refcount_suitable(folio, 1)) {
result = SCAN_PAGE_COUNT;
+ folio_put(folio);
break;
}
@@ -2333,6 +2348,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
*/
present += folio_nr_pages(folio);
+ folio_put(folio);
if (need_resched()) {
xas_pause(&xas);
--
2.34.1
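The lookup pattern added to hpage_collapse_scan_file() (try to take a
reference, reload the slot, drop the reference and retry on mismatch) can be
sketched in userspace with C11 atomics. This is an illustration under stated
assumptions, not the kernel implementation: struct obj, obj_try_get(),
obj_put() and slot_get_stable() are hypothetical names, and the model assumes
the object's memory stays valid while it is inspected, which in the kernel is
the job of the surrounding locking and is not part of this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; a count of 0 means "being freed". */
struct obj {
	atomic_int refs;
};

/* Take a reference only if the object is still live (count > 0). */
static bool obj_try_get(struct obj *o)
{
	int old = atomic_load(&o->refs);

	while (old > 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;
}

static void obj_put(struct obj *o)
{
	atomic_fetch_sub(&o->refs, 1);
}

/*
 * Return the object in *slot with one reference held, or NULL if the caller
 * should retry: the slot was empty, the object was already on its way out,
 * or the slot changed between the first load and the reload.
 */
struct obj *slot_get_stable(struct obj *_Atomic *slot)
{
	struct obj *o = atomic_load(slot);

	if (!o || !obj_try_get(o))
		return NULL;

	if (o != atomic_load(slot)) {
		obj_put(o);
		return NULL;
	}
	return o;
}
```

Every path that leaves with a non-NULL result must end in a matching
obj_put(), which is why the patch adds folio_put() before each break and at
the end of the loop body.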