From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF242E7716C for ; Thu, 5 Dec 2024 15:22:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9C5236B00E5; Thu, 5 Dec 2024 10:19:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 558676B012A; Thu, 5 Dec 2024 10:19:16 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A0FCE6B00CE; Thu, 5 Dec 2024 10:19:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 862916B0085 for ; Sat, 19 Oct 2024 04:33:11 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 5F956C08F4 for ; Sat, 19 Oct 2024 08:32:57 +0000 (UTC) X-FDA: 82689686856.14.FC06AE2 Received: from mail-pf1-f195.google.com (mail-pf1-f195.google.com [209.85.210.195]) by imf21.hostedemail.com (Postfix) with ESMTP id 0EFC61C0002 for ; Sat, 19 Oct 2024 08:32:45 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=WtWxHl8d; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf21.hostedemail.com: domain of yunshenglin0825@gmail.com designates 209.85.210.195 as permitted sender) smtp.mailfrom=yunshenglin0825@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729326669; a=rsa-sha256; cv=none; b=l35jRzHR173cUFKNIUQiySG95KHKwVbUSysCCaDGHiNK1u0qLKejLUdBD3vbCF8DHSd6qN AQu6znqcIKPNeeI0AwdZyIdeue8+CEFJhpcq8Rhs4gjZmDmOYwcpW9yM2wSlMEpBpGrsu6 riqYDKiEUe/dhr/ZpddXrnV6N1+lahc= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=WtWxHl8d; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf21.hostedemail.com: domain of yunshenglin0825@gmail.com designates 209.85.210.195 as permitted sender) smtp.mailfrom=yunshenglin0825@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729326669; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=hjVx+evV8y7AenmUm89rFPBklsBKvjFgkjmcEucF22g=; b=7IFXA8B5xgHy9C0Sc75FMV5zQtzdR+RhQzzfLBBy5D6U4Nx0gTtjVcoVM49G2uTYkEnQQV junrRIumEkU2J6rDCy/fLSESGUkRhhlDLS7f2wW2pyvrrul2aaDET/D8X6fD/kpf4g4zeg Fj3UW7kaem/6f/Nt9wtOuTs/1UCVWG0= Received: by mail-pf1-f195.google.com with SMTP id d2e1a72fcca58-71e681bc315so1932270b3a.0 for ; Sat, 19 Oct 2024 01:33:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1729326788; x=1729931588; darn=kvack.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=hjVx+evV8y7AenmUm89rFPBklsBKvjFgkjmcEucF22g=; b=WtWxHl8dDWiRcrJhadDIE5VuJPCFHOiLtIz4HbRMw9AjjKqLFQcTyfJOPlbibWgqBP eh2w0VaIzoWmK/tBTZ7S45/gb5/fjVzibKwLSv6B3su3dEfuntgCkSnJUqavYlIx7EGF IXAptnKiFbRELP6N/aD9ZsdAoMTVJ92PhV589PCoS6sLXBGlNK2wjswOzmGbqm7+Aq9G p6QaGZgt1GUer2jYqcl8HAaaa14m7HYTdDb/PJBqahn3x9Q+y9BySNnhSwP+Kxy6tfcO 1CLGb9qcjtzUTxPqhM1Qv9bDH5bf/mcFBW4BoKH2iYM2lBzb4TmPUEmkesL82Aw9vaTc hl9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729326788; x=1729931588; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=hjVx+evV8y7AenmUm89rFPBklsBKvjFgkjmcEucF22g=; b=M+E3vDopTIk7SIREXyhlWoax9boDHJara7fhCch0imaIrMubjdg1vSJN15igt4AxZS lrxIejRUfTufHzx4OZdVP1qlfhTtT8KIMWgDGqO7xFV0weE9twJxfMkBSm9NQNNs8tCj C7g2rJQBn1VcITgyPiDgJgxJE/GtPeN30HzjHlxMfS5VN/e+ftksIl9kX5y8HevxfuGy 6nL0dt1RVnUKHWWTM4TUeeumKWvzvadp01z2gKHr1g2k5wirbr7p0nlv25Pth7+xbFzF 9duuGcn0a/SEOsEG0D6dZv2zDcZGTz0tmhaf6XaUGq3Wr0iVJF2z/pnLoGPRe1UkWSoP tQbw== X-Forwarded-Encrypted: i=1; AJvYcCU9Uqx9ie2u150QVPQzAOgSlxltU94MlGipm6LwBNhHUwcE5rxp0qnug0NL00JCxPfajG4cXtUfXA==@kvack.org X-Gm-Message-State: AOJu0YwGaEGmBYgfh/VVEYiflLa0l5PRdgJKzkQvCl7YiuhCoDeeY3y7 0RQzmicazdreNVipaFOCE+zSzjFmcpyIzkUwuzN8D3wNwfSQ3KLV X-Google-Smtp-Source: AGHT+IGR7/pTVY7VLBIdcpKR+r1g6aaL1O3zf4FQoQqCaxl64SDDx4a+9FugYaswfjVMXgOFeXKLNw== X-Received: by 2002:a05:6a00:2e18:b0:71e:6a57:7288 with SMTP id d2e1a72fcca58-71ea416b4b8mr7893903b3a.5.1729326787962; Sat, 19 Oct 2024 01:33:07 -0700 (PDT) Received: from ?IPV6:2409:8a55:301b:e120:79c0:453d:47b6:bbf5? ([2409:8a55:301b:e120:79c0:453d:47b6:bbf5]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-7eacc23a10csm2529267a12.43.2024.10.19.01.33.05 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 19 Oct 2024 01:33:07 -0700 (PDT) Message-ID: Date: Sat, 19 Oct 2024 16:33:00 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH net-next v22 10/14] mm: page_frag: introduce prepare/probe/commit API To: Alexander Duyck , Yunsheng Lin Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton , linux-mm@kvack.org References: <20241018105351.1960345-1-linyunsheng@huawei.com> <20241018105351.1960345-11-linyunsheng@huawei.com> Content-Language: en-US From: Yunsheng Lin In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 0EFC61C0002 X-Stat-Signature: zhz6qho6pkt3a3h8rjh8r4mtj957ucxf X-Rspam-User: X-HE-Tag: 1729326765-977027 X-HE-Meta: U2FsdGVkX1/rMaurOHTu40i2wDYe6T8VKrZTShVvAS7A7rCJTDnNggm1mOVDMq5oOidfBkJqmqIYPfQPw7wxqthII6asv23JxTV9L/7BCqnJclJ5xXL5bZkVr4WoFCv4BcadYFpico9hqIqcp1OcG09CAqEIsmOgN94zRaibFscKOld3VGGSPrAOVcD+Wvj7zLRpyG03FjI7GZobi6oCbwhExzZPBuoGNZDrvNG3ooaL/Fa6iThIvGtc9joZw5ANKVdEgpiZaJDAo/O4kWFcwHuX3hghh0jYI1fU5TWGNvmX2Ccofo2LWV4Upd918qeh9OFfdqmNT+gGOW71awTNVBHL+6WpUES0P/jNhDs0a1QmHivdxKpvL16jcQB2lutrBjVytTU1f1+pd2nKJs//pyIDSWKG21cUPyZjTi5mQkHOg4ftaG3tcDYVWM9J5DLf1B2qzGQTiSAVcbQnAyKt0HOwWQ+gAGdKBMp9o8zTY/oHlEBcoixIGCHHQgpMfy+l5nlkr4O9Y02ByyNa59dkqCfTegx9aUDkgH5dBLG5Db5KJn+Lq+LmqW7U8jKTV2zEG3iVlH88arMUSh0dcodTOkxpJRToPxCyThMfbBuZjB1MslIOHbaInn7VRnvuA9HLZKf1BaG6OngSgC4k0J0A6kJA/SOfNrPlW4TWcx5qq7rp6DApbpPlJgsK2IXkQlhoNJCRR5wqV51J+z9UBsE2p/glqQxlukuAoYMdJNc5ZxqI0B+ySR+ksTSlMjUYcDIJMCRYsdcWCLPnef39xmVxXrtNYUdlei4V8uRZiHAGbV4BF1PXg1kJj5kRP7tR1tj5XL9+KgpBVVbwqf9eIb06YlaY4RqaIGRg94UKUAuGw7XvJnj/Nl1YJ53A2/PYmdAJVLnmWYj4/deUhdLW/4R/KuEIHGfNH6bplNQYl0WsZfPwlkToND/qaAy47lOA7UMSnC6PJpGL/DLBC2c9bkX npUiqIRM ROllbkfH8gfG1Ox25K/0OQMEJVzMxuNA8gQkQfWSUCeIW5gObnVf8Zifz2L6gNH961B+KAszwOaVs0whiwECQrL9duQRisguYinIy//CwzaRPq8VNUUQWQuUXwE2rtnw0gTjeeWabPu+gnz91RWsh5qidBiCWE7qn44MOnQzsu8TbihTISSb7JGyyloCeImiTPD5rsPOmF6+sqCOF0NL9hqhd8kD1dZx06uUArkDx8m6s0kwJ2e3c0sPv8vIMK5+z4twM7e1fPwSb622AY8q5sdSnUikcLPQjz+KG+cvdAe0SWH39GD+DCKFDI76MZT/riPEXGEoUre0v3pA1qTtluMPxBNkepy6v4G2GERwldkgZNNLFMI7y4povUFxUioBCLl7bOp6RGxslAMG3Ot5dFWAjUW03RKR7dhS/E6NBcJMYZ51O8/dSIEHSfZeNW0OhQ7G+vllFymwX7esysfaSRLIACYDPifcza97wfPBwimkk9P5Rl3mI4+IMdG1pmWy7DEo0gdgmx1KEIF9W1Z4qYVceVQwUNr10f+3NJ322uM0pXsiGY27hUu5OJa28t4mnStHhy8ESvQGN0jCffDozy92xk23fDbd4lshsDT7K2WRdTp+zg/TKpdaULRNd9QQrWwx1Pjv35m3LdwXImP5MsxSyCWI7lSipurmUBWSRlfYE7dg= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 10/19/2024 2:03 AM, Alexander Duyck wrote: > > Not a huge fan of introducing a ton of new API calls and then having > to have them all applied at once in the follow-on patches. Ideally the > functions and the header documentation for them would be introduced in > the same patch as well as examples on how it would be used. > > I really think we should break these up as some are used in one case, > and others in another and it is a pain to have a pile of abstractions > that are all using these functions in different ways. I am guessing this patch may be split into three parts to make it more reviewable and easier to discuss here: 1. Prepare & commit related API, which is still the large one. 2. Probe API related API. 3. Abort API. And it is worthing mentioning that even if this patch is split into more patches, it seems impossible to break patch 12 up as almost everything related to changing "page_frag" to "page_frag_cache" need to be one patch to avoid compile error. > >> +static inline void page_frag_alloc_abort(struct page_frag_cache *nc, >> + unsigned int fragsz) >> +{ >> + VM_BUG_ON(fragsz > nc->offset); >> + >> + nc->pagecnt_bias++; >> + nc->offset -= fragsz; >> +} >> + > > We should probably have the same checks here you had on the earlier > commit. We should not be allowing blind changes. If we are using the > commit or abort interfaces we should be verifying a page frag with > them to verify that the request to modify this is legitimate. As an example in 'Preparation & committing API' section of patch 13, the abort API is used to abort the operation of page_frag_alloc_*() related API, so 'page_frag' is not available for doing those checking like the commit API. For some case without the needing of complicated prepare & commit API like tun_build_skb(), the abort API can be used to abort the operation of page_frag_alloc_*() related API when bpf_prog_run_xdp() returns XDP_DROP knowing that no one else is taking extra reference to the just allocated fragment. +Allocation & freeing API +------------------------ + +.. code-block:: c + + void *va; + + va = page_frag_alloc_align(nc, size, gfp, align); + if (!va) + goto do_error; + + err = do_something(va, size); + if (err) { + page_frag_alloc_abort(nc, size); + goto do_error; + } + + ... + + page_frag_free(va); If there is a need to abort the commit API operation, we probably call it something like page_frag_commit_abort()? > >> void page_frag_free(void *addr); >> >> #endif >> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c >> index f55d34cf7d43..5ea4b663ab8e 100644 >> --- a/mm/page_frag_cache.c >> +++ b/mm/page_frag_cache.c >> @@ -112,6 +112,27 @@ unsigned int __page_frag_cache_commit_noref(struct page_frag_cache *nc, >> } >> EXPORT_SYMBOL(__page_frag_cache_commit_noref); >> >> +void *__page_frag_alloc_refill_probe_align(struct page_frag_cache *nc, >> + unsigned int fragsz, >> + struct page_frag *pfrag, >> + unsigned int align_mask) >> +{ >> + unsigned long encoded_page = nc->encoded_page; >> + unsigned int size, offset; >> + >> + size = PAGE_SIZE << encoded_page_decode_order(encoded_page); >> + offset = __ALIGN_KERNEL_MASK(nc->offset, ~align_mask); >> + if (unlikely(!encoded_page || offset + fragsz > size)) >> + return NULL; >> + >> + pfrag->page = encoded_page_decode_page(encoded_page); >> + pfrag->size = size - offset; >> + pfrag->offset = offset; >> + >> + return encoded_page_decode_virt(encoded_page) + offset; >> +} >> +EXPORT_SYMBOL(__page_frag_alloc_refill_probe_align); >> + > > If I am not mistaken this would be the equivalent of allocating a size > 0 fragment right? The only difference is that you are copying out the > "remaining" size, but we could get that from the offset if we knew the > size couldn't we? Would it maybe make sense to look at limiting this > to PAGE_SIZE instead of passing the size of the actual fragment? I am not sure if I understand what does "limiting this to PAGE_SIZE" mean here. I probably should mention the usecase of probe API here. For the usecase of mptcp_sendmsg(), the minimum size of a fragment can be smaller when the new fragment can be coalesced to previous fragment as there is an extra memory needed for some header if the fragment can not be coalesced to previous fragment. The probe API is mainly used to see if there is any memory left in the 'page_frag_cache' that can be coalesced to previous fragment. > >> void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz, >> struct page_frag *pfrag, gfp_t gfp_mask, >> unsigned int align_mask) >> -- >> 2.33.0 >> >