From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4A79C021A9 for ; Tue, 18 Feb 2025 09:21:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6A3F9280108; Tue, 18 Feb 2025 04:21:39 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 65351280102; Tue, 18 Feb 2025 04:21:39 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 51AA4280108; Tue, 18 Feb 2025 04:21:39 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 340E5280102 for ; Tue, 18 Feb 2025 04:21:39 -0500 (EST) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id A9C371A0B71 for ; Tue, 18 Feb 2025 09:21:38 +0000 (UTC) X-FDA: 83132522676.14.4CF377F Received: from szxga07-in.huawei.com (szxga07-in.huawei.com [45.249.212.35]) by imf30.hostedemail.com (Postfix) with ESMTP id 1E41780008 for ; Tue, 18 Feb 2025 09:21:35 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=none; spf=pass (imf30.hostedemail.com: domain of linyunsheng@huawei.com designates 45.249.212.35 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com; dmarc=pass (policy=quarantine) header.from=huawei.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739870496; a=rsa-sha256; cv=none; b=Zu9JJv8qqxkI/1nEeSNQ178Kzh4T2SRuA51+2OlYl4xOSj4g27aKl07d6g6PInCehBP7Tz j8TIwX4X3WnEfmi0w82+0WDI+NFVG5ueVCHRPcmx+pvrR3YdzVU71VZxIJOeFEppvAg6+g BjVz/soiO6ftrghMxnYAtQt+PBFPS5k= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=none; spf=pass (imf30.hostedemail.com: domain of linyunsheng@huawei.com designates 45.249.212.35 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com; dmarc=pass (policy=quarantine) header.from=huawei.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739870496; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZCtAjWsH8jT9hnaqtoB68T1TFs75Fw+Uj5uC9nRt6n0=; b=L1BG2xjWvtkA2WOi3sa2lGg7STR8xNUsIcum4UiV6Xbp1jzc56PdKsYenUKrhKhHmq2rb1 Mtv7p+C/DJplS0aJa5lyrPSkEzLQZ+afxzL5QeKy8c/YsFCAAyVEmkyZw+q8HfvJLUYDH2 fH9fF7DmE+P4NfZfBfv8phlBNstmUqw= Received: from mail.maildlp.com (unknown [172.19.88.214]) by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4Yxv7P5668z1wn7M; Tue, 18 Feb 2025 17:17:37 +0800 (CST) Received: from dggpemf200006.china.huawei.com (unknown [7.185.36.61]) by mail.maildlp.com (Postfix) with ESMTPS id B723E1A016C; Tue, 18 Feb 2025 17:21:31 +0800 (CST) Received: from [10.67.120.129] (10.67.120.129) by dggpemf200006.china.huawei.com (7.185.36.61) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Tue, 18 Feb 2025 17:21:27 +0800 Message-ID: Date: Tue, 18 Feb 2025 17:21:27 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC] mm: alloc_pages_bulk: remove assumption of populating only NULL elements To: Dave Chinner CC: Yishai Hadas , Jason Gunthorpe , Shameer Kolothum , Kevin Tian , Alex Williamson , Chris Mason , Josef Bacik , David Sterba , Gao Xiang , Chao Yu , Yue Hu , Jeffle Xu , Sandeep Dhavale , Carlos Maiolino , "Darrick J. Wong" , Andrew Morton , Jesper Dangaard Brouer , Ilias Apalodimas , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Trond Myklebust , Anna Schumaker , Chuck Lever , Jeff Layton , Neil Brown , Olga Kornievskaia , Dai Ngo , Tom Talpey , Luiz Capitulino , Mel Gorman , , , , , , , , , References: <20250217123127.3674033-1-linyunsheng@huawei.com> Content-Language: en-US From: Yunsheng Lin In-Reply-To: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.67.120.129] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpemf200006.china.huawei.com (7.185.36.61) X-Stat-Signature: 1cgh4ux4xyd6obqur3ot41383h38ymaa X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 1E41780008 X-Rspam-User: X-HE-Tag: 1739870495-438451 X-HE-Meta: U2FsdGVkX18vxdqMxcW7MlmxSqj+aU0/+KMNFebDx1kluA3KAlTAF5Ii5uuU/u8xlEnTbznyaoh+VVmewyWctzlI2IVLDjSVBzfSsvI82G3wg9zf/NJAAzB5dmd/tYEyAmvszEXcXDnhRGxTLcfQkx+o8bVMejWNgrgnCpz7a6821+E/84VaxUXjl3aYR2S3UrvR6qgX7IwXQc0Sn/DeNt5BBi5DTzSo0stuUBlqOC5l+Nbt/VnTsSdT/I8YV1/FlVVuc38MgMqC7wUwpDibOp731T889bWMN2mHoufhOYnK99o0NYXWqUNbyFcdNipEcLIrMyTyK78LPqhFnC6MWegGU/sxz9Sxngy3wxxYrquTYImgTU+7wnCXxy69FybznFjHYgVVA3U1cYz/VneOipR8TZJHQvPOqJ1azVgj8AE/vg4xbUmw/kQwJowv1wQrtRnwwVlmekgza2nFAIGf6PpugEcEJny0hzTevEIrcaLahnZFQb+7peI01e5mNU4ssUH0VENSfV2BzdpahXXZfhphiijs/mxpJo1juQHKNmL9gvy8KPPzd67zXViUUCsbf+DQlLbrZMJ2gRhw0dqxG/dQWzytb7Fwr4rNo7H8PQOCyWY8F4SW0fNKl3gHtXTgPJ/yL7fgrIFViq4jl+yzCm8y5kyh822E7T2BOMlA928M+vhVQJ7bXB8Q3WpXXgLmhZNlp+t5LgFONgHotgXayaR0TrAhkflVtZ/f4Jk6VZbBjTD5Exf96Yj4KFK0h4vnHTx/yfsAwqPeGmYqIyp4GG7pcu2sRQQJpuo6ymKvUlKwxc6LI933IckFtUiZaJYuy3fEHT+yc+/4mnPWE7zM66TzrNahmpRisqb78I67nmj19IKPRVO17s06wS3XCnKQU/2Ok9VqSY7Fnhz2OyPoql16mBTZE8sf+rEoFgsMYYKr0ett0ehvUeSEuRvDksuXJtBmnq4PoqBpkinKH1U AdJXKzHO m2lNQNLaskBtvXqbJWtnnRFA3LNkvDYIhLy1Tvq8QMUfwND6hzy9KX/WPA4CZtMFxV3C7PlosL58TtfxdGrfjGfl2V3L9C8FeNz95F8dXuFJvF2u92hF4/5Gk1RGNjsxVyHuZ3IxhRR3q/XPJTic0ToLIMYJAi1et0Ej9DCzqmB/aXHqM9oRg3rITSdnFZR0BwOndSI0JG3j3hGhuHYqY57Mr+DMvxr8sRTte9GfeYALoYaCNHf5Qmsd0C5T0AHqViy4TnPA/z37iXLRXsKHH5ix+7Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2025/2/18 5:31, Dave Chinner wrote: ... > ..... > >> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c >> index 15bb790359f8..9e1ce0ab9c35 100644 >> --- a/fs/xfs/xfs_buf.c >> +++ b/fs/xfs/xfs_buf.c >> @@ -377,16 +377,17 @@ xfs_buf_alloc_pages( >> * least one extra page. >> */ >> for (;;) { >> - long last = filled; >> + long alloc; >> >> - filled = alloc_pages_bulk(gfp_mask, bp->b_page_count, >> - bp->b_pages); >> + alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - refill, >> + bp->b_pages + refill); >> + refill += alloc; >> if (filled == bp->b_page_count) { >> XFS_STATS_INC(bp->b_mount, xb_page_found); >> break; >> } >> >> - if (filled != last) >> + if (alloc) >> continue; > > You didn't even compile this code - refill is not defined > anywhere. > > Even if it did complile, you clearly didn't test it. The logic is > broken (what updates filled?) and will result in the first > allocation attempt succeeding and then falling into an endless retry > loop. Ah, the 'refill' is a typo, it should be 'filled' instead of 'refill'. The below should fix the compile error: --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -379,9 +379,9 @@ xfs_buf_alloc_pages( for (;;) { long alloc; - alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - refill, - bp->b_pages + refill); - refill += alloc; + alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - filled, + bp->b_pages + filled); + filled += alloc; if (filled == bp->b_page_count) { XFS_STATS_INC(bp->b_mount, xb_page_found); break; > > i.e. you stepped on the API landmine of your own creation where > it is impossible to tell the difference between alloc_pages_bulk() > returning "memory allocation failed, you need to retry" and > it returning "array is full, nothing more to allocate". Both these > cases now return 0. As my understanding, alloc_pages_bulk() will not be called when "array is full" as the above 'filled == bp->b_page_count' checking has ensured that if the array is not passed in with holes in the middle for xfs. > > The existing code returns nr_populated in both cases, so it doesn't > matter why alloc_pages_bulk() returns with nr_populated != full, it > is very clear that we still need to allocate more memory to fill it. I am not sure if the array will be passed in with holes in the middle for the xfs fs as mentioned above, if not, it seems to be a typical use case like the one in mempolicy.c as below: https://elixir.bootlin.com/linux/v6.14-rc1/source/mm/mempolicy.c#L2525 > > The whole point of the existing API is to prevent callers from > making stupid, hard to spot logic mistakes like this. Forcing > callers to track both empty slots and how full the array is itself, > whilst also constraining where in the array empty slots can occur > greatly reduces both the safety and functionality that > alloc_pages_bulk() provides. Anyone that has code that wants to > steal a random page from the array and then refill it now has a heap > more complex code to add to their allocator wrapper. Yes, I am agreed that it might be better to provide a common API or wrapper if there is some clear use case that need to pass in an array with holes in the middle by adding a new API like refill_pages_bulk() as below. > > IOWs, you just demonstrated why the existing API is more desirable > than a highly constrained, slightly faster API that requires callers > to get every detail right. i.e. it's hard to get it wrong with the > existing API, yet it's so easy to make mistakes with the proposed > API that the patch proposing the change has serious bugs in it. IMHO, if the API is about refilling pages for the only NULL elements, it seems better to add a API like refill_pages_bulk() for that, as the current API seems to be prone to error of not initializing the array to zero before calling alloc_pages_bulk(). > > -Dave.