From: Mina Almasry
Date: Tue, 12 Dec 2023 17:49:33 -0800
Subject: Re: [PATCH net-next v9 4/4] skbuff: Optimization of SKB coalescing for page pool
To: Liang Chen
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, hawk@kernel.org, ilias.apalodimas@linaro.org, linyunsheng@huawei.com, netdev@vger.kernel.org, linux-mm@kvack.org, jasowang@redhat.com
References: <20231212044614.42733-1-liangchen.linux@gmail.com> <20231212044614.42733-5-liangchen.linux@gmail.com>
In-Reply-To: <20231212044614.42733-5-liangchen.linux@gmail.com>

On Mon, Dec 11, 2023 at 8:47 PM Liang Chen wrote:
>
> In order to address the issues encountered with commit 1effe8ca4e34
> ("skbuff: fix coalescing for page_pool fragment recycling"), the
> combination of the following conditions was excluded from skb coalescing:
>
> from->pp_recycle = 1
> from->cloned = 1
> to->pp_recycle = 1
>
> However, with page pool environments, the aforementioned combination can
> be quite common (e.g., NetworkManager may lead to the additional
> packet_type being registered, thus the cloning). In scenarios with a
> higher number of small packets, it can significantly affect the success
> rate of coalescing. For example, considering packets of 256 bytes size,
> our comparison of coalescing success rate is as follows:
>
> Without page pool: 70%
> With page pool: 13%
>
> Consequently, this has an impact on performance:
>
> Without page pool: 2.57 Gbits/sec
> With page pool: 2.26 Gbits/sec
>
> Therefore, it seems worthwhile to optimize this scenario and enable
> coalescing of this particular combination. To achieve this, we need to
> ensure the correct increment of the "from" SKB page's page pool
> reference count (pp_ref_count).
>
> Following this optimization, the success rate of coalescing measured in
> our environment has improved as follows:
>
> With page pool: 60%
>
> This success rate is approaching the rate achieved without using page
> pool, and the performance has also been improved:
>
> With page pool: 2.52 Gbits/sec
>
> Below is the performance comparison for small packets before and after
> this optimization. We observe no impact to packets larger than 4K.
>
> packet size     before       after        improved
> (bytes)         (Gbits/sec)  (Gbits/sec)
> 128             1.19         1.27         7.13%
> 256             2.26         2.52         11.75%
> 512             4.13         4.81         16.50%
> 1024            6.17         6.73         9.05%
> 2048            14.54        15.47        6.45%
> 4096            25.44        27.87        9.52%
>
> Signed-off-by: Liang Chen
> Reviewed-by: Yunsheng Lin
> Suggested-by: Jason Wang
> ---
>  include/net/page_pool/helpers.h |  5 ++++
>  net/core/skbuff.c               | 43 ++++++++++++++++++++++++---------
>  2 files changed, 36 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index d0c5e7e6857a..0dc8fab43bef 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -281,6 +281,11 @@ static inline long page_pool_unref_page(struct page *page, long nr)
>  	return ret;
>  }
>
> +static inline void page_pool_ref_page(struct page *page)
> +{
> +	atomic_long_inc(&page->pp_ref_count);
> +}
> +
>  static inline bool page_pool_is_last_ref(struct page *page)
>  {
>  	/* If page_pool_unref_page() returns 0, we were the last user */
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7e26b56cda38..783a04733109 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -947,6 +947,26 @@ static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe)
>  	return napi_pp_put_page(virt_to_page(data), napi_safe);
>  }
>
> +/**
> + * skb_pp_frag_ref() - Increase fragment reference count of a page
> + * @page:	page of the fragment on which to increase a reference
> + *
> + * Increase the fragment reference count (pp_ref_count) of a page. This is
> + * intended to gain a fragment reference only for page pool aware skbs,
> + * i.e. when skb->pp_recycle is true, and not for fragments in a
> + * non-pp-recycling skb. It has a fallback to increase a reference on a
> + * normal page, as page pool aware skbs may also have normal page fragments.
> + */
> +static void skb_pp_frag_ref(struct page *page)
> +{
> +	struct page *head_page = compound_head(page);
> +

Feel free to not delay this patch series further based on this
comment/question, but...

I'm a bit confused about the need for compound_head() here, since
skb_frag_ref() doesn't first obtain the compound_head(). Is there a
page_pool specific reason why skb_frag_ref() can get_page() directly but
this helper needs to grab the compound_head() first?

> +	if (likely(is_pp_page(head_page)))
> +		page_pool_ref_page(head_page);
> +	else
> +		page_ref_inc(head_page);

Any reason not to use get_page() here?

> +}
> +
>  static void skb_kfree_head(void *head, unsigned int end_offset)
>  {
>  	if (end_offset == SKB_SMALL_HEAD_HEADROOM)
> @@ -5769,17 +5789,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  		return false;
>
>  	/* In general, avoid mixing page_pool and non-page_pool allocated
> -	 * pages within the same SKB. Additionally avoid dealing with clones
> -	 * with page_pool pages, in case the SKB is using page_pool fragment
> -	 * references (page_pool_alloc_frag()). Since we only take full page
> -	 * references for cloned SKBs at the moment that would result in
> -	 * inconsistent reference counts.
> -	 * In theory we could take full references if @from is cloned and
> -	 * !@to->pp_recycle but its tricky (due to potential race with
> -	 * the clone disappearing) and rare, so not worth dealing with.
> +	 * pages within the same SKB. In theory we could take full
> +	 * references if @from is cloned and !@to->pp_recycle but its
> +	 * tricky (due to potential race with the clone disappearing) and
> +	 * rare, so not worth dealing with.
>  	 */
> -	if (to->pp_recycle != from->pp_recycle ||
> -	    (from->pp_recycle && skb_cloned(from)))
> +	if (to->pp_recycle != from->pp_recycle)
>  		return false;
>
>  	if (len <= skb_tailroom(to)) {
> @@ -5836,8 +5851,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  	/* if the skb is not cloned this does nothing
>  	 * since we set nr_frags to 0.
>  	 */
> -	for (i = 0; i < from_shinfo->nr_frags; i++)
> -		__skb_frag_ref(&from_shinfo->frags[i]);
> +	if (from->pp_recycle)
> +		for (i = 0; i < from_shinfo->nr_frags; i++)
> +			skb_pp_frag_ref(skb_frag_page(&from_shinfo->frags[i]));
> +	else
> +		for (i = 0; i < from_shinfo->nr_frags; i++)
> +			__skb_frag_ref(&from_shinfo->frags[i]);
>
>  	to->truesize += delta;
>  	to->len += len;
> --
> 2.31.1
>

-- 
Thanks,
Mina
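
For anyone following the thread, here is a rough user-space sketch of the
branching that skb_pp_frag_ref() does in the quoted hunk. It is only a toy
model: struct toy_page, toy_compound_head() and toy_pp_frag_ref() are
hypothetical names, plain integer fields stand in for the kernel's atomic
counters, and the is_pp flag stands in for is_pp_page(). It does not answer
the compound_head() question above; it only shows which counter ends up
incremented when the reference is resolved through the head page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel definition. */
struct toy_page {
	bool is_pp;            /* models the result of is_pp_page() */
	long pp_ref_count;     /* models page->pp_ref_count */
	int refcount;          /* models the ordinary page refcount */
	struct toy_page *head; /* models compound_head(); a head page points to itself */
};

/* Hypothetical helper: resolve any page to its head page. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	assert(page != NULL && page->head != NULL);
	return page->head;
}

/*
 * Same shape as skb_pp_frag_ref() in the patch: look at the head page,
 * take a page pool fragment reference if it is a page pool page, and
 * otherwise fall back to a normal page reference.
 */
static void toy_pp_frag_ref(struct toy_page *page)
{
	struct toy_page *head = toy_compound_head(page);

	if (head->is_pp)
		head->pp_ref_count++;
	else
		head->refcount++;
}
```

Calling toy_pp_frag_ref() on a tail page whose head is a page pool page
bumps the head's pp_ref_count and leaves the tail's own fields alone;
calling it on a standalone non-pp page bumps that page's refcount.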