From: Mina Almasry <almasrymina@google.com>
To: Liang Chen <liangchen.linux@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, hawk@kernel.org, ilias.apalodimas@linaro.org,
linyunsheng@huawei.com, netdev@vger.kernel.org,
linux-mm@kvack.org, jasowang@redhat.com
Subject: Re: [PATCH net-next v9 4/4] skbuff: Optimization of SKB coalescing for page pool
Date: Tue, 12 Dec 2023 17:49:33 -0800
Message-ID: <CAHS8izPW8dugsbUmXbt8WOFaOLvAaNtW2SwxizVtk4tNm-hFJw@mail.gmail.com>
In-Reply-To: <20231212044614.42733-5-liangchen.linux@gmail.com>

On Mon, Dec 11, 2023 at 8:47 PM Liang Chen <liangchen.linux@gmail.com> wrote:
>
> In order to address the issues encountered with commit 1effe8ca4e34
> ("skbuff: fix coalescing for page_pool fragment recycling"), the
> following combination of conditions was excluded from skb coalescing:
>
> from->pp_recycle = 1
> from->cloned = 1
> to->pp_recycle = 1
>
> However, with page pool environments, the aforementioned combination can
> be quite common (e.g. NetworkManager may lead to an additional
> packet_type being registered, and thus to cloning). In scenarios with a
> higher number of small packets, it can significantly affect the success
> rate of coalescing. For example, considering packets of 256 bytes size,
> our comparison of coalescing success rate is as follows:
>
> Without page pool: 70%
> With page pool: 13%
>
> Consequently, this has an impact on performance:
>
> Without page pool: 2.57 Gbits/sec
> With page pool: 2.26 Gbits/sec
>
> Therefore, it seems worthwhile to optimize this scenario and enable
> coalescing of this particular combination. To achieve this, we need to
> ensure the correct increment of the "from" SKB page's page pool
> reference count (pp_ref_count).
>
> Following this optimization, the success rate of coalescing measured in
> our environment has improved as follows:
>
> With page pool: 60%
>
> This success rate is approaching the rate achieved without using page
> pool, and the performance has also been improved:
>
> With page pool: 2.52 Gbits/sec
>
> Below is the performance comparison for small packets before and after
> this optimization. We observe no impact to packets larger than 4K.
>
> packet size     before        after         improved
> (bytes)         (Gbits/sec)   (Gbits/sec)
> 128             1.19          1.27          7.13%
> 256             2.26          2.52          11.75%
> 512             4.13          4.81          16.50%
> 1024            6.17          6.73          9.05%
> 2048            14.54         15.47         6.45%
> 4096            25.44         27.87         9.52%
>
> Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
> Suggested-by: Jason Wang <jasowang@redhat.com>
> ---
> include/net/page_pool/helpers.h | 5 ++++
> net/core/skbuff.c | 43 ++++++++++++++++++++++++---------
> 2 files changed, 36 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index d0c5e7e6857a..0dc8fab43bef 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -281,6 +281,11 @@ static inline long page_pool_unref_page(struct page *page, long nr)
>  	return ret;
>  }
>
> +static inline void page_pool_ref_page(struct page *page)
> +{
> +	atomic_long_inc(&page->pp_ref_count);
> +}
> +
>  static inline bool page_pool_is_last_ref(struct page *page)
>  {
>  	/* If page_pool_unref_page() returns 0, we were the last user */
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7e26b56cda38..783a04733109 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -947,6 +947,26 @@ static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe)
>  	return napi_pp_put_page(virt_to_page(data), napi_safe);
>  }
>
> +/**
> + * skb_pp_frag_ref() - Increase fragment reference count of a page
> + * @page: page of the fragment on which to increase a reference
> + *
> + * Increase the fragment reference count (pp_ref_count) of a page. This is
> + * intended to gain a fragment reference only for page pool aware skbs,
> + * i.e. when skb->pp_recycle is true, and not for fragments in a
> + * non-pp-recycling skb. It has a fallback to increase a reference on a
> + * normal page, as page pool aware skbs may also have normal page fragments.
> + */
> +static void skb_pp_frag_ref(struct page *page)
> +{
> +	struct page *head_page = compound_head(page);
> +

Feel free not to delay this patch series further over this
comment/question, but...

I'm a bit confused about why compound_head() is needed here, given that
skb_frag_ref() doesn't obtain the compound_head() first. Is there a
page_pool-specific reason why skb_frag_ref() can call get_page() directly
but this helper needs to grab the compound_head() first?

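For reference, a minimal userspace sketch of the distinction this question turns on. It assumes (my reading of mainline, not something stated in the patch) that get_page() resolves the head page internally, whereas page_ref_inc() and the new page_pool_ref_page() increment a counter on whatever struct page they are handed. The model_* names and the struct layout are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a struct page; not the kernel layout. */
struct model_page {
	struct model_page *head;	/* NULL for a head or order-0 page */
	long refcount;			/* stands in for _refcount / pp_ref_count */
};

/* Models compound_head(): map a tail page to its head page. */
static struct model_page *model_compound_head(struct model_page *page)
{
	assert(page != NULL);
	return page->head ? page->head : page;
}

/* Models get_page() under the assumption above: head lookup, then increment. */
static void model_get_page(struct model_page *page)
{
	model_compound_head(page)->refcount++;
}

/* Models a raw increment (page_ref_inc()-style): no head lookup at all. */
static void model_ref_inc_raw(struct model_page *page)
{
	assert(page != NULL);
	page->refcount++;
}
```

In this model, handing a tail page to model_ref_inc_raw() bumps the tail's own counter and leaves the head's untouched, so a helper built on raw increments needs the explicit compound_head() step, while a get_page()-style call does not.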
> +	if (likely(is_pp_page(head_page)))
> +		page_pool_ref_page(head_page);
> +	else
> +		page_ref_inc(head_page);

Any reason not to use get_page() here?

> +}
> +
>  static void skb_kfree_head(void *head, unsigned int end_offset)
>  {
>  	if (end_offset == SKB_SMALL_HEAD_HEADROOM)
> @@ -5769,17 +5789,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  		return false;
>
>  	/* In general, avoid mixing page_pool and non-page_pool allocated
> -	 * pages within the same SKB. Additionally avoid dealing with clones
> -	 * with page_pool pages, in case the SKB is using page_pool fragment
> -	 * references (page_pool_alloc_frag()). Since we only take full page
> -	 * references for cloned SKBs at the moment that would result in
> -	 * inconsistent reference counts.
> -	 * In theory we could take full references if @from is cloned and
> -	 * !@to->pp_recycle but its tricky (due to potential race with
> -	 * the clone disappearing) and rare, so not worth dealing with.
> +	 * pages within the same SKB. In theory we could take full
> +	 * references if @from is cloned and !@to->pp_recycle but its
> +	 * tricky (due to potential race with the clone disappearing) and
> +	 * rare, so not worth dealing with.
>  	 */
> -	if (to->pp_recycle != from->pp_recycle ||
> -	    (from->pp_recycle && skb_cloned(from)))
> +	if (to->pp_recycle != from->pp_recycle)
>  		return false;
>
>  	if (len <= skb_tailroom(to)) {
> @@ -5836,8 +5851,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  	/* if the skb is not cloned this does nothing
>  	 * since we set nr_frags to 0.
>  	 */
> -	for (i = 0; i < from_shinfo->nr_frags; i++)
> -		__skb_frag_ref(&from_shinfo->frags[i]);
> +	if (from->pp_recycle)
> +		for (i = 0; i < from_shinfo->nr_frags; i++)
> +			skb_pp_frag_ref(skb_frag_page(&from_shinfo->frags[i]));
> +	else
> +		for (i = 0; i < from_shinfo->nr_frags; i++)
> +			__skb_frag_ref(&from_shinfo->frags[i]);
>
>  	to->truesize += delta;
>  	to->len += len;
> --
> 2.31.1
>
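
As a side note, the change to the early pp_recycle check in skb_try_coalesce() quoted above can be summarised with a small userspace sketch. It models only that one check; a "not rejected" result does not mean coalescing succeeds, since the real function has further conditions. struct model_skb and both function names are hypothetical, and the cloned field stands in for skb_cloned(from).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two sk_buff properties the check reads. */
struct model_skb {
	bool pp_recycle;	/* models skb->pp_recycle */
	bool cloned;		/* models the result of skb_cloned(skb) */
};

/* Check before the patch: a cloned page_pool @from is also rejected. */
static bool coalesce_rejected_old(const struct model_skb *to,
				  const struct model_skb *from)
{
	assert(to != NULL && from != NULL);
	return to->pp_recycle != from->pp_recycle ||
	       (from->pp_recycle && from->cloned);
}

/* Check after the patch: only mixed page_pool/non-page_pool is rejected. */
static bool coalesce_rejected_new(const struct model_skb *to,
				  const struct model_skb *from)
{
	assert(to != NULL && from != NULL);
	return to->pp_recycle != from->pp_recycle;
}
```

The only input on which the two versions differ is the combination listed in the commit message: from->pp_recycle = 1, from->cloned = 1, to->pp_recycle = 1.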
--
Thanks,
Mina
Thread overview: 14+ messages
2023-12-12 4:46 [PATCH net-next v9 0/4] skbuff: Optimize " Liang Chen
2023-12-12 4:46 ` [PATCH net-next v9 1/4] page_pool: transition to reference count management after page draining Liang Chen
2023-12-13 1:53 ` Mina Almasry
2023-12-12 4:46 ` [PATCH net-next v9 2/4] page_pool: halve BIAS_MAX for multiple user references of a fragment Liang Chen
2023-12-13 1:51 ` Mina Almasry
2023-12-13 11:38 ` Ilias Apalodimas
2023-12-14 3:42 ` Liang Chen
2023-12-12 4:46 ` [PATCH net-next v9 3/4] skbuff: Add a function to check if a page belongs to page_pool Liang Chen
2023-12-13 1:50 ` Mina Almasry
2023-12-12 4:46 ` [PATCH net-next v9 4/4] skbuff: Optimization of SKB coalescing for page pool Liang Chen
2023-12-13 1:49 ` Mina Almasry [this message]
2023-12-13 2:37 ` Liang Chen
2023-12-13 2:49 ` Mina Almasry
2023-12-14 3:00 ` [PATCH net-next v9 0/4] skbuff: Optimize " patchwork-bot+netdevbpf