linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: <hailong.liu@oppo.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>
Cc: Hailong.Liu <hailong.liu@oppo.com>,
	Barry Song <21cnbao@gmail.com>, Baoquan He <bhe@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Tangquan . Zheng" <zhengtangquan@oppo.com>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>
Subject: [RFC PATCH v2] mm/vmalloc: fix incorrect __vmap_pages_range_noflush() if vm_area_alloc_pages() from high order fallback to order0
Date: Thu, 25 Jul 2024 11:53:15 +0800	[thread overview]
Message-ID: <20240725035318.471-1-hailong.liu@oppo.com> (raw)

From: "Hailong.Liu" <hailong.liu@oppo.com>

The scenario where the issue occurs is as follows:
CONFIG: vmap_allow_huge = true && 2M is for PMD_SIZE
kvmalloc(2M, __GFP_NOFAIL|GFP_XXX)
    __vmalloc_node_range(vm_flags=VM_ALLOW_HUGE_VMAP)
        vm_area_alloc_pages(order=9) --->allocs order9 failed and fallback to order0
                                        and phys_addr is aligned with PMD_SIZE
            vmap_pages_range
                vmap_pages_range_noflush
                    __vmap_pages_range_noflush(page_shift = 21) ----> incorrect vmap *huge* here

In fact, as long as page_shift is not equal to PAGE_SHIFT, there
might be issues with the __vmap_pages_range_noflush().

The patch also remove VM_ALLOW_HUGE_VMAP in kvmalloc_node(), There
are several reasons for this:
- This increases memory footprint because ALIGNMENT.
- This increases the likelihood of kvmalloc allocation failures.
- Without this it fixes the origin issue of kvmalloc with __GFP_NOFAIL may return NULL.
Besides if drivers want to vmap huge, user vmalloc_huge instead.

Fix it by disabling fallback and remove VM_ALLOW_HUGE_VMAP in
kvmalloc_node().
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")

CC: Barry Song <21cnbao@gmail.com>
CC: Baoquan He <bhe@redhat.com>
CC: Matthew Wilcox <willy@infradead.org>
Reported-by: Tangquan.Zheng <zhengtangquan@oppo.com>
Signed-off-by: Hailong.Liu <hailong.liu@oppo.com>
---
 mm/util.c    | 2 +-
 mm/vmalloc.c | 9 ---------
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index 669397235787..b23133b738cf 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -657,7 +657,7 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
 	 * protection games.
 	 */
 	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
-			flags, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP,
+			flags, PAGE_KERNEL, 0,
 			node, __builtin_return_address(0));
 }
 EXPORT_SYMBOL(kvmalloc_node);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 03c78fae06f3..1914768f473e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3577,15 +3577,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			page = alloc_pages(alloc_gfp, order);
 		else
 			page = alloc_pages_node(nid, alloc_gfp, order);
-		if (unlikely(!page)) {
-			if (!nofail)
-				break;
-
-			/* fall back to the zero order allocations */
-			alloc_gfp |= __GFP_NOFAIL;
-			order = 0;
-			continue;
-		}

 		/*
 		 * Higher order allocations must be able to be treated as
--
After 1) I check the code and I can't find a resonable band-aid to fix
this. so the v2 patch works but ugly. Glad to hear a better solution :)

[1] https://lore.kernel.org/lkml/20240724182827.nlgdckimtg2gwns5@oppo.com/
2.34.1


             reply	other threads:[~2024-07-25  3:53 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-25  3:53 hailong.liu [this message]
2024-07-25  6:21 ` Barry Song
2024-07-25  9:17   ` Hailong Liu
2024-07-25  9:34     ` Barry Song
2024-07-25  9:58       ` Hailong Liu
2024-07-25 10:22         ` Barry Song
2024-07-25 11:39 ` Baoquan He
2024-07-25 16:40   ` Hailong Liu
2024-07-26  1:31     ` Barry Song
2024-07-26  2:31     ` Baoquan He
2024-07-26  3:53       ` Barry Song
2024-07-29  1:48         ` Baoquan He
2024-07-30  3:24           ` Hailong Liu
2024-07-30  4:29             ` Baoquan He
2024-07-26  4:00       ` Hailong Liu
2024-07-26  5:03         ` Hailong Liu
2024-07-26  5:29           ` Barry Song
2024-07-26  8:37             ` Baoquan He
2024-07-26  8:48               ` Hailong Liu
2024-07-26  9:00                 ` Hailong Liu
2024-07-26  9:29                 ` Baoquan He
2024-07-26  9:15               ` Barry Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240725035318.471-1-hailong.liu@oppo.com \
    --to=hailong.liu@oppo.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=mhocko@suse.com \
    --cc=urezki@gmail.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=zhengtangquan@oppo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox