From: Yunsheng Lin
Date: Fri, 21 Feb 2025 17:34:22 +0800
Subject: Re: [RFC] mm: alloc_pages_bulk: remove assumption of populating only NULL elements
To: Chuck Lever, Yishai Hadas, Jason Gunthorpe, Shameer Kolothum,
 Kevin Tian, Alex Williamson, Chris Mason, Josef Bacik, David Sterba,
 Gao Xiang, Chao Yu, Yue Hu, Jeffle Xu, Sandeep Dhavale, Carlos Maiolino,
 "Darrick J. Wong", Andrew Morton, Jesper Dangaard Brouer,
 Ilias Apalodimas, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Simon Horman, Trond Myklebust, Anna Schumaker, Jeff Layton,
 Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey
CC: Luiz Capitulino, Mel Gorman
References: <20250217123127.3674033-1-linyunsheng@huawei.com>
 <7b7492c0-a3a7-470b-b7aa-697ac790a94b@oracle.com>
In-Reply-To: <7b7492c0-a3a7-470b-b7aa-697ac790a94b@oracle.com>

On 2025/2/18 22:17, Chuck Lever wrote:
> On 2/18/25 4:16 AM, Yunsheng Lin wrote:
>> On 2025/2/17 22:20, Chuck Lever wrote:
>>> On 2/17/25 7:31 AM, Yunsheng Lin wrote:
>>>> As mentioned in [1], it seems odd to check NULL elements in
>>>> the middle of page bulk allocating,
>>>
>>> I think I requested that check to be added to the bulk page allocator.
>>>
>>> When sending an RPC reply, NFSD might release pages in the middle of
>>
>> It seems there is no usage of the page bulk allocation API in fs/nfsd/
>> or fs/nfs/, which specific fs the above 'NFSD' is referring to?
>
> NFSD is in fs/nfsd/, and it is the major consumer of
> net/sunrpc/svc_xprt.c.
>
>
>>> the rq_pages array, marking each of those array entries with a NULL
>>> pointer. We want to ensure that the array is refilled completely in this
>>> case.
>>>
>>
>> I did some researching, it seems you requested that in [1]?
>> It seems the 'holes are always at the start' for the case in that
>> discussion too, I am not sure if the case is referring to the caller
>> in net/sunrpc/svc_xprt.c? If yes, it seems caller can do a better
>> job of bulk allocating pages into a whole array sequentially without
>> checking NULL elements first before doing the page bulk allocation
>> as something below:
>>
>> +++ b/net/sunrpc/svc_xprt.c
>> @@ -663,9 +663,10 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
>>                 pages = RPCSVC_MAXPAGES;
>>         }
>>
>> -       for (filled = 0; filled < pages; filled = ret) {
>> -               ret = alloc_pages_bulk(GFP_KERNEL, pages, rqstp->rq_pages);
>> -               if (ret > filled)
>> +       for (filled = 0; filled < pages; filled += ret) {
>> +               ret = alloc_pages_bulk(GFP_KERNEL, pages - filled,
>> +                                      rqstp->rq_pages + filled);
>> +               if (ret)
>>                         /* Made progress, don't sleep yet */
>>                         continue;
>>
>> @@ -674,7 +675,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
>>                         set_current_state(TASK_RUNNING);
>>                         return false;
>>                 }
>> -               trace_svc_alloc_arg_err(pages, ret);
>> +               trace_svc_alloc_arg_err(pages, filled);
>>                 memalloc_retry_wait(GFP_KERNEL);
>>         }
>>         rqstp->rq_page_end = &rqstp->rq_pages[pages];
>>
>>
>> 1. https://lkml.iu.edu/hypermail/linux/kernel/2103.2/09060.html
>
> I still don't see what is broken about the current API.

As mentioned in [1], the page bulk alloc API before this patch may have
some room for improvement in terms of performance and ease of use, as
most existing callers of the page bulk alloc API are trying to bulk
allocate pages for the whole array sequentially.

1. https://lore.kernel.org/all/c9950a79-7bcb-41c2-a59e-af315dc6d7ff@huawei.com/

>
> Anyway, any changes in svc_alloc_arg() will need to be run through the
> upstream NFSD CI suite before they are merged.

Is there any web link pointing to the above NFSD CI suite, so that I can
test whether removing the assumption of populating only NULL elements is
indeed possible?

Looking more closely, it seems svc_rqst_release_pages()/
svc_rdma_save_io_pages() does set rqstp->rq_respages[i] to NULL based on
rqstp->rq_next_page, and the original code before using the page bulk
alloc API does seem to allocate pages only for NULL elements, as can be
seen from the patch below:
https://lore.kernel.org/all/20210325114228.27719-8-mgorman@techsingularity.net/T/#u

The clearing of rqstp->rq_respages[] to NULL does seem to be sequential.
Is it possible to pass only the NULL elements in rqstp->rq_respages[] to
alloc_pages_bulk(), so that the bulk alloc API does not have to do the
NULL checking and can use the array only as an output parameter?
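
To make the question above concrete, here is a rough userspace sketch of
the idea, not kernel code. mock_bulk_alloc() and refill_null_run() are
hypothetical names made up for illustration; mock_bulk_alloc() only models
an allocator that treats the array as an output parameter, and the sketch
assumes the NULL entries form one contiguous run, which is how the
clearing looks to me but is not confirmed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a bulk allocator that uses the array purely
 * as output: it writes up to nr entries and never reads the old contents.
 * This is NOT the kernel's alloc_pages_bulk().
 */
static size_t mock_bulk_alloc(size_t nr, void **array)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		array[i] = malloc(1);
		if (!array[i])
			break;
	}
	return i;
}

/*
 * Refill only the run of NULL entries in pages[0..nr).
 * Assumption: all NULL entries are contiguous (a single run).
 * Returns the number of entries that were filled.
 */
static size_t refill_null_run(void **pages, size_t nr)
{
	size_t start = 0, end, filled = 0, i;

	/* Find the first NULL entry, then the end of the NULL run. */
	while (start < nr && pages[start])
		start++;
	end = start;
	while (end < nr && !pages[end])
		end++;

	/* Check the single-run assumption: no holes after the run. */
	for (i = end; i < nr; i++)
		assert(pages[i]);

	/* Hand the allocator only the NULL span; retry on partial success. */
	while (start + filled < end) {
		size_t ret = mock_bulk_alloc(end - start - filled,
					     pages + start + filled);
		if (!ret)
			break;
		filled += ret;
	}
	return filled;
}
```

With this shape the caller does the NULL scan once, and the allocator
itself never needs to look at existing array contents. If the holes can
turn out to be non-contiguous, the caller would have to loop over each
run instead.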