From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 4 Mar 2025 19:18:33 +1100
From: Dave Chinner <david@fromorbit.com>
To: Yunsheng Lin
Cc: Yishai Hadas, Jason Gunthorpe, Shameer Kolothum, Kevin Tian,
	Alex Williamson, Chris Mason, Josef Bacik, David Sterba,
	Gao Xiang, Chao Yu, Yue Hu, Jeffle Xu, Sandeep Dhavale,
	Carlos Maiolino, "Darrick J. Wong", Andrew Morton,
	Jesper Dangaard Brouer, Ilias Apalodimas,
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Trond Myklebust , Anna Schumaker , Chuck Lever , Jeff Layton , Neil Brown , Olga Kornievskaia , Dai Ngo , Tom Talpey , Luiz Capitulino , Mel Gorman , kvm@vger.kernel.org, virtualization@lists.linux.dev, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, netdev@vger.kernel.org, linux-nfs@vger.kernel.org Subject: Re: [PATCH v2] mm: alloc_pages_bulk: remove assumption of populating only NULL elements Message-ID: References: <20250228094424.757465-1-linyunsheng@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250228094424.757465-1-linyunsheng@huawei.com> X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 94441C0009 X-Stat-Signature: uf63ufxssrxs8dkbjg4f96bhus4bgfio X-Rspam-User: X-HE-Tag: 1741076318-335001 X-HE-Meta: U2FsdGVkX1+VXbiohu3Xj0y1zWzfXqYbGe/LB6cLOhb5YZ/2W1wSjECU2SXRPSMy+egvrwGkso6Q9DTgY3eKl7+bCCHoRXEnRakrCsfb3YCR3lxA/HXdb38BIFVtpXcYBJpiKleeqw0Bj4WVXk7TwJKYwztXSNJOFvOrdJ0FZiGI+qaswyfyyu2ABLxSvhY/1EJHxz1r0XzysLlMBtdX6u+hxLm1Dr1JU9f/eXWkwu0vlqZ6Nj31anxweXFQbmMmg67ufwCZ4Aph5yC4y4GBDWa7oIbzZZyHZrvE9840t+DJTJgI5MSassC3iTw5bR/PA4A98pa9uphWubiDgcqd6F1YvxjXP0cXHi/JyImepZL0g624f5v7fON9AFNitZ4yaE14Y8urBK6EXA+DACV4n3gJIUBWKYWDzmKkEzuDGsarau7J23P888xwze149uWIxWkwq/OWDJ+Ff+ssLX0fIhynwOzPzTmr+lPAIa7yiGImchOMQYfT3bkCnB2m6x+Im/UuElnkRdvKjHBV8Y55EcGbsVGzvCrhQpaViY7JeEPS7iBm4VSdNzwdq9/d4utMnhUZ1khcJSczbh3ZH4/19RIOjeJZ0E5KSvPBWOm9uMwKvkR4CEgIOZa5VBVg41ZTfrPsK4OSkkkMdPFB+AP+yuEO/BQsy46D61DQG/2n1pFaRP3SLSSaccNO69Uu0wZJ9ECz54DslSkiYSjRiOZMKj6S8EnbnBWoitT0PQtO6urCvIvCO9gZjn9P3TlYF/rLG6JF4tu+SQ+6kaA6CBNX0naGFB6Uxfh7zQ6JlrttfWppitd3iBbBfS1ajbvlvU4tOjNfOhkcTFz3Fx/m68agIFv92GeenTJ3j2ZdAUeVWXu6BmKs7FEczF43Xwab8YsM3M2AUNCYW8AzkAcaf8EKFAJXpQYXp02s5rR5mwpifRPbdM2O5np2lVM7d7NPuH1c1KW5rPFx/zpSxhsiRp1 BkkVLOUx p7hgxbYo6auVFYh22ggpDV7APKtUmlXJ7zzkeuhooFC8akJnGGzI5duSu149HFQUaRXg7F3bAuBJLv32r/7TOnDpXj+FrGycJbK4NJigOVEA6VvqKVgwDBKefSh/CyM5qQlGWlpnKj51OidLjG5sYb+2eRU+POg0848DGXKdiRQLjQch7TXNNdSpxvQwzedR0XLLWNjL8A4r1LBMBgY/ixocbUnt619IPbxY0glqwN0S4wnGh8rbhwaoIGV0Pffd7gulWVyMtI5NotI5mLeTrzULWiPb4Ev920F+8zQJp8OMjNXPdMfPQvwVDHn95JzD6Bb4A9mXL3zC+iE9SQ84aoZupZScHc8L+LI5KhD/j9ZxqfiqAFQ+4heWq6kktxKs5Pza/yWNnTd5TBfP8EtVTzjynNvCgJayIvIxx481N3VC/1vF2iQT6ra7scGyhyhOHh/+jKfeWNAMkN8jmpXQ7h+2eYPAbhrjG2q2coCFw7zkT+cDZtrt6NpeEqZna/h93zzHuIcBRdhNnuUxvnlrODo5ZGmNz1UILy/r0AQRq7oMnwcrIXBHp1Ho9eWAUC01BT2qyVV6sPzyZ60Kjawz2qbFSV/X4WuhdyIRZauy/PHwBljc= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Feb 28, 2025 at 05:44:20PM +0800, Yunsheng Lin wrote: > As mentioned in [1], it seems odd to check NULL elements in > the middle of page bulk allocating, and it seems caller can > do a better job of bulk allocating pages into a whole array > sequentially without checking NULL elements first before > doing the page bulk allocation for most of existing users. 
> 
> Through analysis of the bulk allocation API usage in fs, it
> seems that the callers in fs/btrfs/extent_io.c and
> net/sunrpc/svc_xprt.c depend on the assumption of populating
> only NULL elements, while erofs and xfs don't, see:
> commit 91d6ac1d62c3 ("btrfs: allocate page arrays using bulk page allocator")
> commit d6db47e571dc ("erofs: do not use pagepool in z_erofs_gbuf_growsize()")
> commit c9fa563072e1 ("xfs: use alloc_pages_bulk_array() for buffers")
> commit f6e70aab9dfe ("SUNRPC: refresh rq_pages using a bulk page allocator")
> 
> Change SUNRPC and btrfs to not depend on the assumption.
> Other existing callers seem to be passing all NULL elements
> via memset, kzalloc, etc.
> 
> Remove the assumption of populating only NULL elements and treat
> page_array as an output parameter like kmem_cache_alloc_bulk().
> Removing the above assumption also enables the caller to not
> zero the array before calling the page bulk allocating API. This
> gives about a 1~2 ns performance improvement for the
> time_bench_page_pool03_slow() test case for page_pool on an x86
> VM, which reduces some of the performance impact of fixing the
> DMA API misuse problem in [2]: performance improves from
> 87.886 ns to 86.429 ns.

How much slower did you make btrfs and sunrpc by adding all the
defragmenting code there?

> 
> 1. https://lore.kernel.org/all/bd8c2f5c-464d-44ab-b607-390a87ea4cd5@huawei.com/
> 2. https://lore.kernel.org/all/20250212092552.1779679-1-linyunsheng@huawei.com/
> CC: Jesper Dangaard Brouer
> CC: Luiz Capitulino
> CC: Mel Gorman
> CC: Dave Chinner
> CC: Chuck Lever
> Signed-off-by: Yunsheng Lin
> Acked-by: Jeff Layton
> ---
> V2:
> 1. Drop the RFC tag and rebase on the latest linux-next.
> 2. Fix a compile error for xfs.

And you still haven't tested the code changes to XFS, because this
patch is also broken.

> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 5d560e9073f4..b4e95b2dd0f0 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -319,16 +319,17 @@ xfs_buf_alloc_pages(
>  	 * least one extra page.
>  	 */
>  	for (;;) {
> -		long	last = filled;
> +		long	alloc;
>  
> -		filled = alloc_pages_bulk(gfp_mask, bp->b_page_count,
> -					  bp->b_pages);
> +		alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - filled,
> +					 bp->b_pages + filled);
> +		filled += alloc;
>  		if (filled == bp->b_page_count) {
>  			XFS_STATS_INC(bp->b_mount, xb_page_found);
>  			break;
>  		}
>  
> -		if (filled != last)
> +		if (alloc)
>  			continue;

alloc_pages_bulk() now returns the number of pages allocated in the
array. So if we ask for 4 pages, then get 2, filled is now 2. Then we
loop, ask for another 2 pages, get those two pages and it returns 4.
Now filled is 6, and we continue. Now we ask alloc_pages_bulk() for
-2 pages, which returns 4 pages...

Worse behaviour: second time around, no page allocation succeeds so
it returns 2 pages. Filled is now 4, which is the number of pages we
need, so we break out of the loop with only 2 pages allocated. Kernel
crashes are about to occur.....

Once is a mistake, twice is completely unacceptable. When XFS stops
using alloc_pages_bulk (probably 6.15) I won't care anymore. But
until then, please stop trying to change this code.

NACK.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
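
To make the return-value contract at the heart of the objection concrete,
here is a minimal, self-contained userspace C sketch of the accumulation
loop pattern discussed above. It assumes the callee returns only the
number of elements it filled in this call; fake_bulk_alloc() and its
"two per call" behaviour are hypothetical stand-ins, not the kernel API.

	#include <stdio.h>

	/* Hypothetical allocator: fills at most 'want' slots starting at
	 * 'slots' and returns ONLY the number filled by this call. */
	static int fake_bulk_alloc(int want, int *slots)
	{
		int got = want < 2 ? want : 2;	/* pretend we can only get 2 */
		int i;

		for (i = 0; i < got; i++)
			slots[i] = 1;
		return got;
	}

	int main(void)
	{
		int pages[4] = { 0 };
		int count = 4, filled = 0;

		for (;;) {
			/* Add only what this call produced to the total. */
			int alloc = fake_bulk_alloc(count - filled,
						    pages + filled);

			filled += alloc;
			if (filled == count)
				break;
			if (!alloc)
				break;	/* real code would back off/retry */
		}
		printf("filled %d of %d slots\n", filled, count);
		return 0;
	}

If the callee instead reported the total number of populated elements in
the array (the older alloc_pages_bulk_array() style), the "filled += alloc"
accounting above would double-count, producing exactly the over-fill and
short-fill sequences walked through in the review.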