From: Gao Xiang
Date: Mon, 6 Mar 2023 16:03:39 +0800
Subject: Re: [PATCH] mm/page_alloc: avoid high-order page allocation warn with __GFP_NOFAIL
To: Michal Hocko
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mel Gorman, Vlastimil Babka, Baoquan He, Christoph Hellwig, Uladzislau Rezki
References: <20230305053035.1911-1-hsiangkao@linux.alibaba.com>

On 2023/3/6 15:51, Michal Hocko wrote:
> [Cc couple of more people recently involved with vmalloc code]
>
> On Sun 05-03-23 13:30:35, Gao Xiang wrote:
>> My knowledge of this is somewhat limited, however, since vmalloc already
>> supported __GFP_NOFAIL in commit 9376130c390a ("mm/vmalloc: add
>> support for __GFP_NOFAIL").
>> __GFP_NOFAIL could trigger the following
>> stack and allocate high-order pages when CONFIG_HAVE_ARCH_HUGE_VMALLOC
>> is enabled:
>>
>>   __alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
>>   alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286
>>   vm_area_alloc_pages mm/vmalloc.c:2989 [inline]
>>
>>   __vmalloc_area_node mm/vmalloc.c:3057 [inline]
>>   __vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227
>>   kvmalloc_node+0x156/0x1a0 mm/util.c:606
>>   kvmalloc include/linux/slab.h:737 [inline]
>>   kvmalloc_array include/linux/slab.h:755 [inline]
>>   kvcalloc include/linux/slab.h:760 [inline]
>>   (codebase: Linux 6.2-rc2)
>>
>> Don't warn such cases since high-order pages with __GFP_NOFAIL is
>> somewhat legal.
>
> OK, this is definitely a bug and it seems my 9376130c390a was
> incomplete because it hasn't covered the high order case. Not sure how
> that happened but removing the warning is not the right thing to do
> here. The higher order allocation is an optimization rather than a must.
> So it is perfectly fine to fail that allocation and retry rather than
> go into a very expensive and potentially impossible higher order
> allocation that must not fail.
>
> The proper fix should look like this unless I am missing something. I
> would appreciate another pair of eyes on this because I am not fully
> familiar with the high order optimization part much.

I'm fine with the fix. Although I'm not familiar with this vmalloc
allocation path, I thought about this possibility as well.

The original issue was:
https://lore.kernel.org/r/0000000000007796bd05f1852ec2@google.com

where I used kvcalloc() with __GFP_NOFAIL and it warned. I then made a
fix (which now seems wrong) to use kcalloc() instead, but that triggers
the same warning:
https://lore.kernel.org/r/00000000000072eb6505f376dd4b@google.com

I then realized it's a bug in kvmalloc() with __GFP_NOFAIL...

Thanks,
Gao Xiang

>
> Thanks!
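For readers following the thread, the fallback idea described above (treat the
high-order attempt as an optimization that may fail, and insist on success
only for order-0 pages) can be modeled in plain userspace C. This is a rough
sketch, not the kernel code: model_alloc_pages(), model_fill() and the
"fragmented" knob are hypothetical names, and gating the fallback on the
caller's no-fail request is a simplifying assumption made here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical knob: when true, every high-order request fails. */
static bool fragmented = true;

/* Hypothetical stand-in for alloc_pages(); only reports success or failure.
 * Order-0 requests always succeed in this model. */
static bool model_alloc_pages(unsigned int order)
{
	return order == 0 || !fragmented;
}

/*
 * Model of the allocation loop: returns how many order-0 pages were obtained.
 * nr_pages is assumed to be a multiple of (1 << order).
 */
static unsigned int model_fill(unsigned int order, unsigned int nr_pages,
			       bool caller_nofail)
{
	unsigned int nr_allocated = 0;
	/* Assumption: fall back only if the caller asked for "must not fail". */
	bool fallback = caller_nofail && order > 0;

	assert(order < 32); /* keep the shift below well defined */

	while (nr_allocated < nr_pages) {
		if (!model_alloc_pages(order)) {
			if (!fallback)
				break;
			/* High order was only an optimization: retry with order 0. */
			order = 0;
			fallback = false;
			continue;
		}
		nr_allocated += 1u << order;
	}
	return nr_allocated;
}
```

With fragmented set, model_fill(2, 8, true) still ends up with all 8 pages by
dropping to order 0, while the same request without the no-fail flag gives up
after the first failed high-order attempt.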
> ---
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index ef910bf349e1..a8aa2765618a 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2883,6 +2883,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  		unsigned int order, unsigned int nr_pages, struct page **pages)
>  {
>  	unsigned int nr_allocated = 0;
> +	gfp_t alloc_gfp = gfp;
> +	bool nofail = false;
>  	struct page *page;
>  	int i;
>  
> @@ -2931,20 +2933,30 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  			if (nr != nr_pages_request)
>  				break;
>  		}
> +	} else {
> +		alloc_gfp &= ~__GFP_NOFAIL;
> +		nofail = true;
>  	}
>  
>  	/* High-order pages or fallback path if "bulk" fails. */
> -
>  	while (nr_allocated < nr_pages) {
>  		if (fatal_signal_pending(current))
>  			break;
>  
>  		if (nid == NUMA_NO_NODE)
> -			page = alloc_pages(gfp, order);
> +			page = alloc_pages(alloc_gfp, order);
>  		else
> -			page = alloc_pages_node(nid, gfp, order);
> -		if (unlikely(!page))
> -			break;
> +			page = alloc_pages_node(nid, alloc_gfp, order);
> +		if (unlikely(!page)) {
> +			if (!nofail)
> +				break;
> +
> +			/* fall back to the zero order allocations */
> +			alloc_gfp |= __GFP_NOFAIL;
> +			order = 0;
> +			continue;
> +		}
> +
>  		/*
>  		 * Higher order allocations must be able to be treated as
>  		 * indepdenent small pages by callers (as they can with