linux-mm.kvack.org archive mirror
From: Shivank Garg <shivankg@amd.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: kernel test robot <lkp@intel.com>,
	oe-kbuild-all@lists.linux.dev,
	Linux Memory Management List <linux-mm@kvack.org>,
	Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: Re: [akpm-mm:mm-unstable 36/67] mm/khugepaged.c:2337:7: error: implicit declaration of function 'folio_expected_ref_count'; did you mean 'folio_ref_count'?
Date: Thu, 29 May 2025 09:16:04 +0530
Message-ID: <2a1e37ab-d9f4-44ed-9833-edb06f420388@amd.com>
In-Reply-To: <20250528114441.abd8980f1d66fcb3e1eaecc2@linux-foundation.org>



On 5/29/2025 12:14 AM, Andrew Morton wrote:
> On Wed, 28 May 2025 18:04:02 +0530 Shivank Garg <shivankg@amd.com> wrote:
> 
>> On 5/28/2025 5:46 PM, kernel test robot wrote:
>>> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable
>>> head:   52ce652e7ab0f015b51fee11b2862507b2c0c25d
>>> commit: 3bdddbba5f02f6d97283acb18e2a6e079324fe4b [36/67] mm/khugepaged: fix race with folio split/free using temporary reference
>>> config: arm64-randconfig-002-20250528 (https://download.01.org/0day-ci/archive/20250528/202505282015.F0fVmLmH-lkp@intel.com/config)
>>> compiler: aarch64-linux-gcc (GCC) 7.5.0
>>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250528/202505282015.F0fVmLmH-lkp@intel.com/reproduce)
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <lkp@intel.com>
>>> | Closes: https://lore.kernel.org/oe-kbuild-all/202505282015.F0fVmLmH-lkp@intel.com/
>>>
>>> Note: the akpm-mm/mm-unstable HEAD 52ce652e7ab0f015b51fee11b2862507b2c0c25d builds fine.
>>>       It only hurts bisectability.
>>>
>>> All errors (new ones prefixed by >>):
>>>
>>>    mm/khugepaged.c: In function 'hpage_collapse_scan_file':
>>>>> mm/khugepaged.c:2337:7: error: implicit declaration of function 'folio_expected_ref_count'; did you mean 'folio_ref_count'? [-Werror=implicit-function-declaration]
>>>       if (folio_expected_ref_count(folio) + 1 != folio_ref_count(folio)) {
>>>           ^~~~~~~~~~~~~~~~~~~~~~~~
>>>           folio_ref_count
>>>    cc1: some warnings being treated as errors
>>>
>>>
>>> vim +2337 mm/khugepaged.c
>>
>> folio_expected_ref_count() is introduced with this commit[1] and merged into mm-* tree.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-unstable&id=86ebd50224c0734d965843260d0dc057a9431c61
>>
> 
> Well darn.  We have a patch in mm-hotfixes-unstable which is cc:stable
> (mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch)
> which is dependent upon a patch
> (mm-add-folio_expected_ref_count-for-reference-count-calculation.patch)
> which is scheduled for 6.16-rc1.
> 
> I'll move
> mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch
> into mm-stable, after
> mm-add-folio_expected_ref_count-for-reference-count-calculation.patch
> to remove the bisection hole.  This means that when the -stable
> maintainers try to backport
> mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch
> into earlier kernels, the build will fail.  When this happens, please
> work with them to come up with a version of
> mm-khugepaged-fix-race-with-folio-split-free-using-temporary-reference.patch
> which is suitable for 6.15 and earlier.
>

Hi Andrew,

The patch below does not depend on folio_expected_ref_count() and is
functionally equivalent to the version that uses it, so it can be
backported to -stable branches.
Please review.

Thanks,
Shivank


From ee23192f3306cdd2553b084ac23430412a9c61d6 Mon Sep 17 00:00:00 2001
From: Shivank Garg <shivankg@amd.com>
Date: Mon, 26 May 2025 18:28:18 +0000
Subject: [PATCH] mm/khugepaged: fix race with folio split/free using temporary
 reference

hpage_collapse_scan_file() calls is_refcount_suitable(), which in turn
calls folio_mapcount(). folio_mapcount() checks folio_test_large() before
proceeding to folio_large_mapcount(), but there is a race window where the
folio may get split/freed between these checks, triggering:

  VM_WARN_ON_FOLIO(!folio_test_large(folio), folio)

Take a temporary reference to the folio in hpage_collapse_scan_file().
This stabilizes the folio during the refcount check and prevents incorrect
large folio detection due to a concurrent split/free.

Fixes: 05c5323b2a34 ("mm: track mapcount of large folios in single value")
Reported-by: syzbot+2b99589e33edbe9475ca@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6828470d.a70a0220.38f255.000c.GAE@google.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shivank Garg <shivankg@amd.com>
---
 mm/khugepaged.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)
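
For reviewers, the pattern used in the hpage_collapse_scan_file() hunk
(take a speculative reference, recheck the slot, then compare the
refcount with one extra reference accounted for) can be sketched as a
user-space model. This is only an illustration under stated assumptions,
not kernel code: struct model_folio, the model_* helpers and the MODEL_*
results are hypothetical names, the atomic pointer stands in for the
XArray slot, and the in_cache term is a simplified stand-in for the part
of is_refcount_suitable() that this diff does not show.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct folio (not the kernel type). */
struct model_folio {
	atomic_int refcount;	/* models folio_ref_count() */
	int mapcount;		/* models folio_mapcount() */
	int nr_pages;		/* models folio_nr_pages() */
	bool in_cache;		/* assumption: cache holds nr_pages references */
	bool has_private;	/* models folio_test_private() */
};

enum model_result { MODEL_RETRY, MODEL_PAGE_COUNT, MODEL_OK };

/* Like folio_try_get(): take a reference unless the count already hit zero. */
static bool model_try_get(struct model_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Like folio_put(), minus the freeing: drop one reference. */
static void model_put(struct model_folio *f)
{
	int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old > 0);
}

/* Mirrors the patched check: extra_refs covers references the caller holds. */
static bool model_refcount_suitable(struct model_folio *f, int extra_refs)
{
	int expected = f->mapcount;

	if (f->in_cache)
		expected += f->nr_pages;
	if (f->has_private)
		expected++;

	return atomic_load(&f->refcount) == expected + extra_refs;
}

/* One loop iteration: pin, revalidate the slot, check, unpin. */
static enum model_result model_scan_slot(_Atomic(struct model_folio *) *slot)
{
	struct model_folio *f = atomic_load(slot);
	enum model_result result;

	if (!f || !model_try_get(f))
		return MODEL_RETRY;	/* folio is being freed */

	if (f != atomic_load(slot)) {	/* slot changed under us */
		model_put(f);
		return MODEL_RETRY;
	}

	/* Our temporary reference is the one extra reference. */
	result = model_refcount_suitable(f, 1) ? MODEL_OK : MODEL_PAGE_COUNT;
	model_put(f);
	return result;
}
```

In this model, a folio whose count has already dropped to zero is never
inspected, and every exit path after a successful model_try_get() drops
the temporary reference, which is what the added folio_put() calls do in
the real loop.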

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index cc945c6ab3bd..25a406410463 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -548,7 +548,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
 	}
 }
 
-static bool is_refcount_suitable(struct folio *folio)
+static bool is_refcount_suitable(struct folio *folio, int extra_refs)
 {
 	int expected_refcount = folio_mapcount(folio);
 
@@ -558,7 +558,7 @@ static bool is_refcount_suitable(struct folio *folio)
 	if (folio_test_private(folio))
 		expected_refcount++;
 
-	return folio_ref_count(folio) == expected_refcount;
+	return folio_ref_count(folio) == expected_refcount + extra_refs;
 }
 
 static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
@@ -652,7 +652,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		 * but not from this process. The other process cannot write to
 		 * the page, only trigger CoW.
 		 */
-		if (!is_refcount_suitable(folio)) {
+		if (!is_refcount_suitable(folio, 0)) {
 			folio_unlock(folio);
 			result = SCAN_PAGE_COUNT;
 			goto out;
@@ -1402,7 +1402,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 		 * has excessive GUP pins (i.e. 512).  Anyway the same check
 		 * will be done again later the risk seems low.
 		 */
-		if (!is_refcount_suitable(folio)) {
+		if (!is_refcount_suitable(folio, 0)) {
 			result = SCAN_PAGE_COUNT;
 			goto out_unmap;
 		}
@@ -2295,6 +2295,17 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 			continue;
 		}
 
+		if (!folio_try_get(folio)) {
+			xas_reset(&xas);
+			continue;
+		}
+
+		if (unlikely(folio != xas_reload(&xas))) {
+			folio_put(folio);
+			xas_reset(&xas);
+			continue;
+		}
+
 		if (folio_order(folio) == HPAGE_PMD_ORDER &&
 		    folio->index == start) {
 			/* Maybe PMD-mapped */
@@ -2305,23 +2316,27 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 			 * it's safe to skip LRU and refcount checks before
 			 * returning.
 			 */
+			folio_put(folio);
 			break;
 		}
 
 		node = folio_nid(folio);
 		if (hpage_collapse_scan_abort(node, cc)) {
 			result = SCAN_SCAN_ABORT;
+			folio_put(folio);
 			break;
 		}
 		cc->node_load[node]++;
 
 		if (!folio_test_lru(folio)) {
 			result = SCAN_PAGE_LRU;
+			folio_put(folio);
 			break;
 		}
 
-		if (!is_refcount_suitable(folio)) {
+		if (!is_refcount_suitable(folio, 1)) {
 			result = SCAN_PAGE_COUNT;
+			folio_put(folio);
 			break;
 		}
 
@@ -2333,6 +2348,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 		 */
 
 		present += folio_nr_pages(folio);
+		folio_put(folio);
 
 		if (need_resched()) {
 			xas_pause(&xas);
-- 
2.34.1




