From: Yunsheng Lin <yunshenglin0825@gmail.com>
Date: Sat, 19 Oct 2024 16:29:55 +0800
Subject: Re: [PATCH net-next v22 07/14] mm: page_frag: some minor refactoring before adding new API
To: Alexander Duyck, Yunsheng Lin
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton, linux-mm@kvack.org
References: <20241018105351.1960345-1-linyunsheng@huawei.com> <20241018105351.1960345-8-linyunsheng@huawei.com>
On 10/19/2024 1:26 AM, Alexander Duyck wrote:

...

>> +static inline void *__page_frag_alloc_align(struct page_frag_cache *nc,
>> +                                            unsigned int fragsz, gfp_t gfp_mask,
>> +                                            unsigned int align_mask)
>> +{
>> +        struct page_frag page_frag;
>> +        void *va;
>> +
>> +        va = __page_frag_cache_prepare(nc, fragsz, &page_frag, gfp_mask,
>> +                                       align_mask);
>> +        if (unlikely(!va))
>> +                return NULL;
>> +
>> +        __page_frag_cache_commit(nc, &page_frag, fragsz);
>
> Minor nit here. Rather than if (!va) return I think it might be better
> to just go with if (likely(va)) __page_frag_cache_commit.

Ack.
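Something like the sketch below, based on the hunk quoted above (helper
names as used in this series):

        static inline void *__page_frag_alloc_align(struct page_frag_cache *nc,
                                                    unsigned int fragsz, gfp_t gfp_mask,
                                                    unsigned int align_mask)
        {
                struct page_frag page_frag;
                void *va;

                va = __page_frag_cache_prepare(nc, fragsz, &page_frag, gfp_mask,
                                               align_mask);
                /* only commit when prepare actually produced a fragment */
                if (likely(va))
                        __page_frag_cache_commit(nc, &page_frag, fragsz);

                return va;
        }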
>
>> +
>> +        return va;
>> +}
>>
>>  static inline void *page_frag_alloc_align(struct page_frag_cache *nc,
>>                                            unsigned int fragsz, gfp_t gfp_mask,
>> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
>> index a36fd09bf275..a852523bc8ca 100644
>> --- a/mm/page_frag_cache.c
>> +++ b/mm/page_frag_cache.c
>> @@ -90,9 +90,31 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
>>  }
>>  EXPORT_SYMBOL(__page_frag_cache_drain);
>>
>> -void *__page_frag_alloc_align(struct page_frag_cache *nc,
>> -                              unsigned int fragsz, gfp_t gfp_mask,
>> -                              unsigned int align_mask)
>> +unsigned int __page_frag_cache_commit_noref(struct page_frag_cache *nc,
>> +                                            struct page_frag *pfrag,
>> +                                            unsigned int used_sz)
>> +{
>> +        unsigned int orig_offset;
>> +
>> +        VM_BUG_ON(used_sz > pfrag->size);
>> +        VM_BUG_ON(pfrag->page != encoded_page_decode_page(nc->encoded_page));
>> +        VM_BUG_ON(pfrag->offset + pfrag->size >
>> +                  (PAGE_SIZE << encoded_page_decode_order(nc->encoded_page)));
>> +
>> +        /* pfrag->offset might be bigger than the nc->offset due to alignment */
>> +        VM_BUG_ON(nc->offset > pfrag->offset);
>> +
>> +        orig_offset = nc->offset;
>> +        nc->offset = pfrag->offset + used_sz;
>> +
>> +        /* Return true size back to caller considering the offset alignment */
>> +        return nc->offset - orig_offset;
>> +}
>> +EXPORT_SYMBOL(__page_frag_cache_commit_noref);
>> +
>
> I have a question. How often is it that we are committing versus just
> dropping the fragment? It seems like this approach is designed around
> optimizing for not committing the page as we are having to take an
> extra function call to commit the change every time. Would it make
> more sense to have an abort versus a commit?

Before this patch, the page_frag_alloc() related APIs seemed to be used
mostly for skb data or rx frags, see napi_alloc_skb() or drivers like
e1000; but with more drivers using page_pool for skb rx frags, skb data
for tx now seems to be the main use case. The prepare and commit APIs
added in this patchset are mainly used for tx skb frags, except for
af_packet.

It is not really clear which one will end up being used most; if I had
to guess, the prepare and commit APIs are the more likely candidate,
since skb frags generally need more memory than skb data.

>
>> +void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz,
>> +                                struct page_frag *pfrag, gfp_t gfp_mask,
>> +                                unsigned int align_mask)
>>  {
>>          unsigned long encoded_page = nc->encoded_page;
>>          unsigned int size, offset;
>> @@ -114,6 +136,8 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>                  /* reset page count bias and offset to start of new frag */
>>                  nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
>>                  nc->offset = 0;
>> +        } else {
>> +                page = encoded_page_decode_page(encoded_page);
>>          }
>>
>>          size = PAGE_SIZE << encoded_page_decode_order(encoded_page);
>
> This makes no sense to me. Seems like there are scenarios where you
> are grabbing the page even if you aren't going to use it? Why?
>
> I think you would be better off just waiting to the end and then
> fetching it instead of trying to grab it and potentially throw it away
> if there is no space left in the page. Otherwise what you might do is
> something along the lines of:
> pfrag->page = page ? : encoded_page_decode_page(encoded_page);

But doesn't that mean an additional check is needed to decide whether we
need to grab the page? That said, './scripts/bloat-o-meter' does show
some binary size shrinking with the above.
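If I am reading the suggestion right, the tail of the function would
then look roughly like the sketch below (a sketch only, assuming 'page'
starts out as NULL and is only set on the refill path):

        void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz,
                                        struct page_frag *pfrag, gfp_t gfp_mask,
                                        unsigned int align_mask)
        {
                unsigned long encoded_page = nc->encoded_page;
                struct page *page = NULL;       /* only set when refilling */
                ...
                /* no 'else' branch decoding the page up front */
                ...
                /* decode lazily, once the fragment is known to fit */
                pfrag->page = page ?: encoded_page_decode_page(encoded_page);
                pfrag->offset = offset;
                pfrag->size = size - offset;
                ...
        }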
>
>
>> @@ -132,8 +156,6 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>                  return NULL;
>>          }
>>
>> -        page = encoded_page_decode_page(encoded_page);
>> -
>>          if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
>>                  goto refill;
>>
>> @@ -148,15 +170,17 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>
>>          /* reset page count bias and offset to start of new frag */
>>          nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
>> +        nc->offset = 0;
>>          offset = 0;
>>  }
>>
>> -        nc->pagecnt_bias--;
>> -        nc->offset = offset + fragsz;
>> +        pfrag->page = page;
>> +        pfrag->offset = offset;
>> +        pfrag->size = size - offset;
>
> I really think we should still be moving the nc->offset forward at
> least with each allocation. It seems like you end up doing two flavors
> of commit, one with and one without the decrement of the bias. So I
> would be okay with that being pulled out into some separate logic to
> avoid the extra increment in the case of merging the pages. However in
> both cases you need to move the offset, so I would recommend keeping
> that bit there as it would allow us to essentially call this multiple
> times without having to do a commit in between to keep the offset
> correct. With that your commit logic only has to verify nothing
> changes out from underneath us and then update the pagecnt_bias if
> needed.

The problem is that we don't really know how far nc->offset needs to be
moved forward, and the caller needs the original offset for the
skb_fill_page_desc() related call when the prepare API is used, as in
the example from the 'Preparation & committing API' section of patch 13:

+Preparation & committing API
+----------------------------
+
+.. code-block:: c
+
+    struct page_frag page_frag, *pfrag;
+    bool merge = true;
+    void *va;
+
+    pfrag = &page_frag;
+    va = page_frag_alloc_refill_prepare(nc, 32U, pfrag, GFP_KERNEL);
+    if (!va)
+        goto wait_for_space;
+
+    copy = min_t(unsigned int, copy, pfrag->size);
+    if (!skb_can_coalesce(skb, i, pfrag->page, pfrag->offset)) {
+        if (i >= max_skb_frags)
+            goto new_segment;
+
+        merge = false;
+    }
+
+    copy = mem_schedule(copy);
+    if (!copy)
+        goto wait_for_space;
+
+    err = copy_from_iter_full_nocache(va, copy, iter);
+    if (err)
+        goto do_error;
+
+    if (merge) {
+        skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+        page_frag_commit_noref(nc, pfrag, copy);
+    } else {
+        skb_fill_page_desc(skb, i, pfrag->page, pfrag->offset, copy);
+        page_frag_commit(nc, pfrag, copy);
+    }
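For completeness, the difference between the two commit flavors used in
the example is, as far as I can tell from the hunks above, that both
advance nc->offset by the aligned size actually used, while only the
plain commit consumes a page reference, matching skb_fill_page_desc()
taking a new reference on the page; the _noref flavor is for coalescing
into an existing frag, which reuses the reference that frag already
holds. Roughly (a sketch inferred from the hunks, not copied from the
series):

        static inline unsigned int page_frag_commit(struct page_frag_cache *nc,
                                                    struct page_frag *pfrag,
                                                    unsigned int used_sz)
        {
                /* consume one of the references pre-taken via pagecnt_bias */
                nc->pagecnt_bias--;
                return __page_frag_cache_commit_noref(nc, pfrag, used_sz);
        }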