Date: Thu, 25 Jul 2024 19:39:03 +0800
From: Baoquan He
To: hailong.liu@oppo.com
Cc: Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes,
 Vlastimil Babka, Michal Hocko, Barry Song <21cnbao@gmail.com>,
 Matthew Wilcox, "Tangquan . Zheng", linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2] mm/vmalloc: fix incorrect __vmap_pages_range_noflush() if vm_area_alloc_pages() from high order fallback to order0
References: <20240725035318.471-1-hailong.liu@oppo.com>
In-Reply-To: <20240725035318.471-1-hailong.liu@oppo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 07/25/24 at 11:53am, hailong.liu@oppo.com wrote:
> From: "Hailong.Liu"
>
> The scenario where the issue occurs is as follows:
> CONFIG: vmap_allow_huge = true && 2M is for PMD_SIZE
> kvmalloc(2M, __GFP_NOFAIL|GFP_XXX)
>     __vmalloc_node_range(vm_flags=VM_ALLOW_HUGE_VMAP)
>         vm_area_alloc_pages(order=9) --->allocs order9 failed and fallback to
> order0 and phys_addr is aligned with PMD_SIZE
>             vmap_pages_range
>                 vmap_pages_range_noflush
>                     __vmap_pages_range_noflush(page_shift = 21) ----> incorrect vmap *huge* here
>
> In fact, as long as page_shift is not equal to PAGE_SHIFT, there
> might be issues with the __vmap_pages_range_noflush().
>
> The patch also remove VM_ALLOW_HUGE_VMAP in kvmalloc_node(), There
> are several reasons for this:
> - This increases memory footprint because ALIGNMENT.
> - This increases the likelihood of kvmalloc allocation failures.
> - Without this it fixes the origin issue of kvmalloc with __GFP_NOFAIL may return NULL.
> Besides if drivers want to vmap huge, user vmalloc_huge instead.

It seems there are two issues you are folding into one patch: one is the
wrong information passed into __vmap_pages_range_noflush(); the other is
that you want to drop VM_ALLOW_HUGE_VMAP from kvmalloc().

About the first one, do you think the draft below is OK? It passes the
fallback order back out of vm_area_alloc_pages() and adjusts the order
and shift for later use, mainly for vmap_pages_range().

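As a rough illustration of the idea only, here is a minimal userspace sketch in plain C, not kernel code. The names sketch_alloc_pages(), sketch_page_shift() and SKETCH_PAGE_SHIFT are hypothetical, and the 4 KiB base page size is an assumption taken from the page_shift = 21 for order 9 in the scenario above. It only models the allocator reporting the order it really used, so that the caller can derive a matching shift.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB base pages (shift 12), so order 9 gives shift 21. */
#define SKETCH_PAGE_SHIFT 12u

/*
 * Hypothetical stand-in for vm_area_alloc_pages(): count pages at
 * *page_order; if the high-order attempt is not possible, fall back to
 * order 0 and report the order actually used back through page_order.
 */
static unsigned int sketch_alloc_pages(unsigned int *page_order,
				       unsigned int nr_pages,
				       bool high_order_available)
{
	unsigned int order = *page_order;
	unsigned int nr_allocated = 0;

	if (order && !high_order_available)
		order = 0;	/* fallback: pages are order-0 from here on */

	while (nr_allocated < nr_pages)
		nr_allocated += 1u << order;

	*page_order = order;	/* tell the caller what really happened */
	return nr_allocated;
}

/* The mapping granularity has to follow the order that was really used. */
static unsigned int sketch_page_shift(unsigned int page_order)
{
	return page_order + SKETCH_PAGE_SHIFT;
}
```

With order 9 requested and the high-order attempt failing, the caller sees order 0 and derives a shift of 12 rather than 21, so it would not attempt a PMD-sized mapping over order-0 pages.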
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 260897b21b11..5ee9ae518f3d 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3508,9 +3508,9 @@ EXPORT_SYMBOL_GPL(vmap_pfn);

 static inline unsigned int
 vm_area_alloc_pages(gfp_t gfp, int nid,
-		unsigned int order, unsigned int nr_pages, struct page **pages)
+		unsigned int *page_order, unsigned int nr_pages, struct page **pages)
 {
-	unsigned int nr_allocated = 0;
+	unsigned int nr_allocated = 0, order = *page_order;
 	gfp_t alloc_gfp = gfp;
 	bool nofail = gfp & __GFP_NOFAIL;
 	struct page *page;
@@ -3611,6 +3611,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		cond_resched();
 		nr_allocated += 1U << order;
 	}
+	*page_order = order;

 	return nr_allocated;
 }
@@ -3654,7 +3655,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	page_order = vm_area_page_order(area);

 	area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
-		node, page_order, nr_small_pages, area->pages);
+		node, &page_order, nr_small_pages, area->pages);

 	atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
 	if (gfp_mask & __GFP_ACCOUNT) {
@@ -3686,6 +3687,10 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		goto fail;
 	}

+
+	set_vm_area_page_order(area, page_order);
+	page_shift = page_order + PAGE_SHIFT;
+
 	/*
 	 * page tables allocations ignore external gfp mask, enforce it
 	 * by the scope API