From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7E4A2C021AA for ; Tue, 18 Feb 2025 21:14:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F2FCC2801A0; Tue, 18 Feb 2025 16:14:36 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EDEED28019B; Tue, 18 Feb 2025 16:14:36 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DB1F42801A0; Tue, 18 Feb 2025 16:14:36 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id BAD8328019B for ; Tue, 18 Feb 2025 16:14:36 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 4F88A1A062E for ; Tue, 18 Feb 2025 21:14:36 +0000 (UTC) X-FDA: 83134319352.29.93F701C Received: from mail-pl1-f174.google.com (mail-pl1-f174.google.com [209.85.214.174]) by imf24.hostedemail.com (Postfix) with ESMTP id 512BA18000D for ; Tue, 18 Feb 2025 21:14:34 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=fromorbit-com.20230601.gappssmtp.com header.s=20230601 header.b=UKkw7ZaF; spf=pass (imf24.hostedemail.com: domain of david@fromorbit.com designates 209.85.214.174 as permitted sender) smtp.mailfrom=david@fromorbit.com; dmarc=pass (policy=quarantine) header.from=fromorbit.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739913274; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=i/TFjjFVKbDA3nMhW4Nd4g/ebT5hNi/7fITiKNmhUzM=; b=BbJCQQig5RCBkxHul2PKrU3os2oYZLZihxbyh1DTXvj/1hj7tUWdr1XY8WpJ2MumTXaU5q MID+Qyws4HcMbbqxEn+R3OkqyMcnyRqSt8LbszgZrh1MP6sJVb4DPEmYUsRN03dbG4JzYn au53xqS7NakGSe6mYbQoisrKTT0IMXA= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=fromorbit-com.20230601.gappssmtp.com header.s=20230601 header.b=UKkw7ZaF; spf=pass (imf24.hostedemail.com: domain of david@fromorbit.com designates 209.85.214.174 as permitted sender) smtp.mailfrom=david@fromorbit.com; dmarc=pass (policy=quarantine) header.from=fromorbit.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739913274; a=rsa-sha256; cv=none; b=vE/jP2wiVXyYZVdlworrCBQXZaGZssXWf0OqgCHb87P2455nBvjSNzS0g5qz8v0GeGS5v8 hQNg+H0lzJPOEQyw4itYc6m2x3KmG5WcNghQJ3v5Djft8e8bQmXQSSQeErb3nAnEKGdnWg 9Bl2PkY9R443vXjlClRpK6mJ4Q0mTXs= Received: by mail-pl1-f174.google.com with SMTP id d9443c01a7336-220c8eb195aso122494445ad.0 for ; Tue, 18 Feb 2025 13:14:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20230601.gappssmtp.com; s=20230601; t=1739913273; x=1740518073; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=i/TFjjFVKbDA3nMhW4Nd4g/ebT5hNi/7fITiKNmhUzM=; b=UKkw7ZaFmTN171d0flUbQLYTPATZw1DWw34tkY4wfS+rPtMzW/3M66ApgwjW/6DYlf daPZS1Kp/RgCzSQ2FR6KLI3+f3EUXGEJe6JJiaYcW1hYTYYB7w5jjFHHPweDIEkfQ8+R Tn/VhqR8apfh4lOAZ0Ixo8FWUvFB8BAaV8fZEL8zg5zRZy2SytMghphwBPhGQ85fLlRJ CmTDE2NvdF1Bcm33i0k+qeaOrKNn0BYkZHO9OtGUXa22pL0COD8gRCLfYGygpCl0twsA UI9WdxDOiqGjbKjE+8ihAo04N6jnF301v3mzUwaXPAqPUkb9OV+JC/iloHLQLqJ93KjF QvwA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1739913273; x=1740518073; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=i/TFjjFVKbDA3nMhW4Nd4g/ebT5hNi/7fITiKNmhUzM=; b=dUJ/jWyj8ZJ39aElirH1bpECJHoTxfl9QzUNC4mU3z3GraDidqY4m4pBpW8eD3B0Wf fcya5/mkc0D5vrBS+6B5OAoSzAbGH8KZAjK0VAlftwdPnHSgXLWoDXuAISbo139HZpKe VXvdlsJlc4PHmP5oGrfy+Gq8kTuZNuImvF/S0OKoMSidvZfNhhJoNd5uHaGcPgaPeg2i bIwq0zzZUYIotS8zBPqrIIVZCq7a3AGmNFig0VQt10HDxVn1e8OFUMJdaU8BBM0Pzoih IksTesVVrYrlxudwVdEGXNf52dWpSK82WQ09hBPu5faPlk8iLnH43PvrEBAVp6Kj5S6v Ca+w== X-Forwarded-Encrypted: i=1; AJvYcCUN9Jt/Wt4Me6fynUFZaPjlCEfOoolz00ABnydOlC0YLVIImjNhY7am87Ymu8t/05XZpmKBkinhPg==@kvack.org X-Gm-Message-State: AOJu0YwNlBW9lolWX70s10iOWpg3fIeQvziJKnsPgvhNxoc6aR8JVZPf JGRUtarpOKmtZKOgT/kUx70k9uINDKNRP5lN3PTLgQenBF6V8MlgEPQkE0xxEAs= X-Gm-Gg: ASbGncseKFwwTa9ePHJqT/k+HZ+eEXJD+epJbN92HWusf4DZq8q9DAstJDgGmNJSnXh NkpyhszGUX6+0PN1+yO6QUj9nk427maFFSxgtYDgA6FRyIoc18tsdbvp+B/juhip4Xj9mYZGYXG arJfj5+/svC+AMynB90ivKJTFhpGkXlYODnqzW6/xUfEyjhizcE3TkAGysMKHypP0FY021J5Jwj gVLC6OVJ9KaD+0RIIHDWbZGkcP/68PQ80A82X4rqpWtYT8rQ8XyFxJoExnJi5DdYUdWV23UdEJB LrULqjie/JXN6FNDU+jVkO0bDgDw3ySQ4coLI83RQWitR17s3WXbWW6PhgLbxyt5Q00= X-Google-Smtp-Source: AGHT+IE4cq05OkjYUSAgnZzqAzKf40dxIlYKr7v+HOFdczvTt57phu0weZ5JY1SWVHQCE16yV/OTUA== X-Received: by 2002:a17:902:d50e:b0:221:337:4862 with SMTP id d9443c01a7336-2217086de01mr12977035ad.15.1739913273049; Tue, 18 Feb 2025 13:14:33 -0800 (PST) Received: from dread.disaster.area (pa49-186-89-135.pa.vic.optusnet.com.au. [49.186.89.135]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-220d5348f34sm93829725ad.10.2025.02.18.13.14.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Feb 2025 13:14:32 -0800 (PST) Received: from dave by dread.disaster.area with local (Exim 4.98) (envelope-from ) id 1tkUvF-00000002xpK-2NTA; Wed, 19 Feb 2025 08:14:29 +1100 Date: Wed, 19 Feb 2025 08:14:29 +1100 From: Dave Chinner To: Yunsheng Lin Cc: Yishai Hadas , Jason Gunthorpe , Shameer Kolothum , Kevin Tian , Alex Williamson , Chris Mason , Josef Bacik , David Sterba , Gao Xiang , Chao Yu , Yue Hu , Jeffle Xu , Sandeep Dhavale , Carlos Maiolino , "Darrick J. Wong" , Andrew Morton , Jesper Dangaard Brouer , Ilias Apalodimas , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Trond Myklebust , Anna Schumaker , Chuck Lever , Jeff Layton , Neil Brown , Olga Kornievskaia , Dai Ngo , Tom Talpey , Luiz Capitulino , Mel Gorman , kvm@vger.kernel.org, virtualization@lists.linux.dev, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, netdev@vger.kernel.org, linux-nfs@vger.kernel.org Subject: Re: [RFC] mm: alloc_pages_bulk: remove assumption of populating only NULL elements Message-ID: References: <20250217123127.3674033-1-linyunsheng@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspam-User: X-Stat-Signature: hbmpjnujmjrem95ho5p7y655j4drx3s7 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 512BA18000D X-HE-Tag: 1739913274-382719 X-HE-Meta: U2FsdGVkX19UPxWvbqqcEyqtJeRfEmCJqCQLhDQYGosmDroV8IHVheOXP131dPQBd61yPX/MXLos06yUnaD0CvGzpYqTcme0iCohgkv8TeIpCp5TCHN0K71BZ2t9Txbt2eV6pwY/svbSdH/jJ29+xIceCsxcr4kgq3Txm+tiKwVeOu4EusXnBbLFa3ecJiAQqMJjvBO/LRRpvi/3gngtDyn78JTRmSpiFeQF8Os/wAsmTrSRabLjKauyoFIkawSLw41/7EXd9RXRl9Goc8Ex9C0QiENBocdlxO7ZKvOWyjNRAzvZwH94JxDuTx1jlAFgn7r94Zs/yv34WKxtfSbEzGg8o0UcX4ac0XZdRPUxmHzPGxs04T9DJgMkxERTjwO6TvUSkq7FC2qzQKwTwQ7AQAGpaC/P8Dzhp03Vos53DacpPXTNqF5fY/PqSoJfztEQfrsBMnrgUTi3dgia0jtgA1LA/lQNJQCkxX/FJe5PGkjUQ716i+w6IPv35IabeV5G6mDu2tUw6DTvZIHrLbHn5hbmjuFhrKha2l8cb09u3rwl5M6J0e3HY51nV5njcd4Wozhe0zm7STcPGjryE/ntNmNZpZncScVVx2nEPnWouamePcnm7PzdCnlZ0cF/lahWUcJMrx4wnXb0EB9yW5kbcpF1zzdfM0sRh8L8E/ozomvA7ZAERfw2rOnj0O4MqUkPHTKqkPOtwOBIERw/5GkaG2DyoHWVcoiTuHC0SrmLsOJuXRwCtvkGMkyOVeYD+B5DiS8Z8uujrui7hpOZQ0fFod7ibWbxUlfWovvd/MPdkXEKwVTdTkWInP+HJZ+c2oRDPMwuIHuujOTcQjwLpMQ1pnAWy9bke436uiqJqdiKy6QEQ/8TsWsGZEBOZRF2TOYQF90KXLKJj+7/DGFhQMDV1UTwQ4vrqm/JfcDb6o11iG7zGerJKjR1nBr3rmsDDjSVTVbp83PRSFM1m5yVWjk 1I8NMl0J pGf3tiW0zIyl6tDznxRk7LnTH0+Im57gCbKuPls+EEEsbc1JUW/68nvI8COwiJqFjG9D8MbQ+odSgyS+TO9CDz2pSHCO7ntn78Yt6jYdLXSO7jdrEU1cxWDJzVasjLwtCQdm6Gb4qrnd99xcGFKO5E0oCG4QwNBXAMTwKS2X65RI/5nASqJQSzYeY36/2daX5Ie1FMJNhYOj30DcTn5aoNCZkxyLRM5W305JrHajaKJFixMYQbuCUFkenv+QaAYrENE7gXGyRVfooKabf2D0jYkLVZO/GjYMGhCFJhxqJATN75agWiAVP0zhsjTNjf03whcfSz1zeltxRYmZkHqb+GzQmmxBJ6RAt142TvA71PU/QMhIpx0FYPMUKEW2x2DTwElsFMf5jix9HKLnVe1Ymz2mr8E0USrwZpBCoy/kEjLKSCSIxBCPwXvhNzNmC5HRKjerqMqCO7w4fE5CIq3AUTbNe4/i9X8wPcG2pFm3zdfuMq7Pzot4fTgWW3gsHrIr3BpDZHcNatlMjVXE= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Feb 18, 2025 at 05:21:27PM +0800, Yunsheng Lin wrote: > On 2025/2/18 5:31, Dave Chinner wrote: > > ... > > > ..... > > > >> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > >> index 15bb790359f8..9e1ce0ab9c35 100644 > >> --- a/fs/xfs/xfs_buf.c > >> +++ b/fs/xfs/xfs_buf.c > >> @@ -377,16 +377,17 @@ xfs_buf_alloc_pages( > >> * least one extra page. > >> */ > >> for (;;) { > >> - long last = filled; > >> + long alloc; > >> > >> - filled = alloc_pages_bulk(gfp_mask, bp->b_page_count, > >> - bp->b_pages); > >> + alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - refill, > >> + bp->b_pages + refill); > >> + refill += alloc; > >> if (filled == bp->b_page_count) { > >> XFS_STATS_INC(bp->b_mount, xb_page_found); > >> break; > >> } > >> > >> - if (filled != last) > >> + if (alloc) > >> continue; > > > > You didn't even compile this code - refill is not defined > > anywhere. > > > > Even if it did complile, you clearly didn't test it. The logic is > > broken (what updates filled?) and will result in the first > > allocation attempt succeeding and then falling into an endless retry > > loop. > > Ah, the 'refill' is a typo, it should be 'filled' instead of 'refill'. > The below should fix the compile error: > --- a/fs/xfs/xfs_buf.c > +++ b/fs/xfs/xfs_buf.c > @@ -379,9 +379,9 @@ xfs_buf_alloc_pages( > for (;;) { > long alloc; > > - alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - refill, > - bp->b_pages + refill); > - refill += alloc; > + alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - filled, > + bp->b_pages + filled); > + filled += alloc; > if (filled == bp->b_page_count) { > XFS_STATS_INC(bp->b_mount, xb_page_found); > break; > > > > > i.e. you stepped on the API landmine of your own creation where > > it is impossible to tell the difference between alloc_pages_bulk() > > returning "memory allocation failed, you need to retry" and > > it returning "array is full, nothing more to allocate". Both these > > cases now return 0. > > As my understanding, alloc_pages_bulk() will not be called when > "array is full" as the above 'filled == bp->b_page_count' checking > has ensured that if the array is not passed in with holes in the > middle for xfs. You miss the point entirely. Previously, alloc_pages_bulk() would return a value that would tell us the array is full, even if we call it with a full array to begin with. Now it fails to tell us that the array is full, and we have to track that precisely ourselves - it is impossible to tell the difference between "array is full" and "allocation failed". Not being able to determine from the allocation return value whether the array is ready for use or whether another go-around to fill it is needed is a very poor API choice, regardless of anything else. You've already demonstrated this: tracking array usage in every caller is error-prone and much harder to get right than just having alloc_pages_bulk() do everything for us. > > The existing code returns nr_populated in both cases, so it doesn't > > matter why alloc_pages_bulk() returns with nr_populated != full, it > > is very clear that we still need to allocate more memory to fill it. > > I am not sure if the array will be passed in with holes in the > middle for the xfs fs as mentioned above, if not, it seems to be > a typical use case like the one in mempolicy.c as below: > > https://elixir.bootlin.com/linux/v6.14-rc1/source/mm/mempolicy.c#L2525 That's not "typical" usage. That is implementing "try alloc" fast path that avoids memory reclaim with a slow path fallback to fill the rest of the array when the fast path fails. No other users of alloc_pages_bulk() is trying to do this. Indeed, it looks somewhat pointless to do this here (i.e. premature optimisation!), because the only caller of alloc_pages_bulk_mempolicy_noprof() has it's own fallback slowpath for when alloc_pages_bulk() can't fill the entire request. > > IOWs, you just demonstrated why the existing API is more desirable > > than a highly constrained, slightly faster API that requires callers > > to get every detail right. i.e. it's hard to get it wrong with the > > existing API, yet it's so easy to make mistakes with the proposed > > API that the patch proposing the change has serious bugs in it. > > IMHO, if the API is about refilling pages for the only NULL elements, > it seems better to add a API like refill_pages_bulk() for that, as > the current API seems to be prone to error of not initializing the > array to zero before calling alloc_pages_bulk(). How is requiring a well defined initial state for API parameters "error prone"? What code is failing to do the well known, defined initialisation before calling alloc_pages_bulk()? Allowing uninitialised structures in an API (i.e. unknown initial conditions) means we cannot make assumptions about the structure contents within the API implementation. We cannot assume that all variables are zero on the first use, nor can we assume that anything that is zero has a valid state. Again, this is poor API design - structures passed to interfaces -should- have a well defined initial state, either set by a *_init() function or by defining the initial state to be all zeros (i.e. via memset, kzalloc, etc). Performance and speed is not an excuse for writing fragile, easy to break code and APIs. -Dave. -- Dave Chinner david@fromorbit.com