From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20231211035243.15774-1-liangchen.linux@gmail.com> <20231211035243.15774-5-liangchen.linux@gmail.com>
In-Reply-To: <20231211035243.15774-5-liangchen.linux@gmail.com>
From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: Mon, 11 Dec 2023 09:46:55 +0200
Message-ID: <CAC_iWjJX3ixPevJAVpszx7nVMb99EtmEeeQcoqxd0GWocK0zkw@mail.gmail.com>
Subject: Re: [PATCH net-next v8 4/4] skbuff: Optimization of SKB coalescing
 for page pool
To: Liang Chen <liangchen.linux@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, 
	pabeni@redhat.com, hawk@kernel.org, linyunsheng@huawei.com, 
	netdev@vger.kernel.org, linux-mm@kvack.org, jasowang@redhat.com, 
	almasrymina@google.com
Content-Type: text/plain; charset="UTF-8"
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>
List-Subscribe: <mailto:majordomo@kvack.org>
List-Unsubscribe: <mailto:majordomo@kvack.org>

Hi Liang,

On Mon, 11 Dec 2023 at 05:53, Liang Chen <liangchen.linux@gmail.com> wrote:
>
> In order to address the issues encountered with commit 1effe8ca4e34
> ("skbuff: fix coalescing for page_pool fragment recycling"), the
> following combination of conditions was excluded from skb coalescing:
>
> from->pp_recycle = 1
> from->cloned = 1
> to->pp_recycle = 1
>
> However, in page pool environments, the aforementioned combination can
> be quite common (e.g. NetworkManager may cause an additional
> packet_type to be registered, which leads to cloning). In scenarios
> with a higher number of small packets, it can significantly affect the
> success rate of coalescing. For example, for 256-byte packets, our
> comparison of coalescing success rates is as follows:
>
> Without page pool: 70%
> With page pool: 13%
>
> Consequently, this has an impact on performance:
>
> Without page pool: 2.57 Gbits/sec
> With page pool: 2.26 Gbits/sec
>
> Therefore, it seems worthwhile to optimize this scenario and enable
> coalescing of this particular combination. To achieve this, we need to
> ensure the correct increment of the "from" SKB page's page pool
> reference count (pp_ref_count).
>
> Following this optimization, the success rate of coalescing measured in
> our environment has improved as follows:
>
> With page pool: 60%
>
> This success rate is approaching the rate achieved without using page
> pool, and the performance has also been improved:
>
> With page pool: 2.52 Gbits/sec
>
> Below is the performance comparison for small packets before and after
> this optimization. We observe no impact on packets larger than 4K.
>
> packet size     before      after       improved
> (bytes)         (Gbits/sec) (Gbits/sec)
> 128             1.19        1.27        7.13%
> 256             2.26        2.52        11.75%
> 512             4.13        4.81        16.50%
> 1024            6.17        6.73        9.05%
> 2048            14.54       15.47       6.45%
> 4096            25.44       27.87       9.52%
>
> Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
> Suggested-by: Jason Wang <jasowang@redhat.com>
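
For readers following along, here is a minimal user-space sketch of how the eligibility check changes. It is not kernel code: struct toy_skb and the may_coalesce_*() helpers are hypothetical names that model only the two flags the check in skb_try_coalesce() looks at.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two sk_buff properties the check reads. */
struct toy_skb {
	bool pp_recycle;	/* frags are backed by page pool pages */
	bool cloned;		/* models skb_cloned() being true */
};

/* The check as it reads on the lines removed by the patch. */
static bool may_coalesce_before(const struct toy_skb *to,
				const struct toy_skb *from)
{
	if (to->pp_recycle != from->pp_recycle ||
	    (from->pp_recycle && from->cloned))
		return false;
	return true;
}

/* The check as it reads on the line added by the patch. */
static bool may_coalesce_after(const struct toy_skb *to,
			       const struct toy_skb *from)
{
	return to->pp_recycle == from->pp_recycle;
}

/* The combination from the commit message goes from rejected to accepted. */
void toy_coalesce_demo(void)
{
	struct toy_skb to = { .pp_recycle = true, .cloned = false };
	struct toy_skb from = { .pp_recycle = true, .cloned = true };

	assert(!may_coalesce_before(&to, &from));
	assert(may_coalesce_after(&to, &from));
}
```

Mixing page pool and non-page-pool skbs is still rejected by both versions; only the cloned page pool case changes.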

As I said in the past, the patch looks correct. I don't like the fact
that more pp internals creep into the default network stack, but
perhaps this is fine given the wider adoption?
Jakub, any thoughts/objections?

Thanks
/Ilias
> ---
>  include/net/page_pool/helpers.h |  5 ++++
>  net/core/skbuff.c               | 41 +++++++++++++++++++++++----------
>  2 files changed, 34 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index d0c5e7e6857a..0dc8fab43bef 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -281,6 +281,11 @@ static inline long page_pool_unref_page(struct page *page, long nr)
>         return ret;
>  }
>
> +static inline void page_pool_ref_page(struct page *page)
> +{
> +       atomic_long_inc(&page->pp_ref_count);
> +}
> +
>  static inline bool page_pool_is_last_ref(struct page *page)
>  {
>         /* If page_pool_unref_page() returns 0, we were the last user */
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7e26b56cda38..3c2515a29376 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -947,6 +947,24 @@ static bool skb_pp_recycle(struct sk_buff *skb, void *data, bool napi_safe)
>         return napi_pp_put_page(virt_to_page(data), napi_safe);
>  }
>
> +/**
> + * skb_pp_frag_ref() - Increase fragment reference count of a page
> + * @page:      page of the fragment on which to increase a reference
> + *
> + * Increase fragment reference count (pp_ref_count) on a page, but if it is
> + * not a page pool page, fallback to increase a reference(_refcount) on a
> + * normal page.
> + */
> +static void skb_pp_frag_ref(struct page *page)
> +{
> +       struct page *head_page = compound_head(page);
> +
> +       if (likely(is_pp_page(head_page)))
> +               page_pool_ref_page(head_page);
> +       else
> +               page_ref_inc(head_page);
> +}
> +
>  static void skb_kfree_head(void *head, unsigned int end_offset)
>  {
>         if (end_offset == SKB_SMALL_HEAD_HEADROOM)
> @@ -5769,17 +5787,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>                 return false;
>
>         /* In general, avoid mixing page_pool and non-page_pool allocated
> -        * pages within the same SKB. Additionally avoid dealing with clones
> -        * with page_pool pages, in case the SKB is using page_pool fragment
> -        * references (page_pool_alloc_frag()). Since we only take full page
> -        * references for cloned SKBs at the moment that would result in
> -        * inconsistent reference counts.
> -        * In theory we could take full references if @from is cloned and
> -        * !@to->pp_recycle but its tricky (due to potential race with
> -        * the clone disappearing) and rare, so not worth dealing with.
> +        * pages within the same SKB. In theory we could take full
> +        * references if @from is cloned and !@to->pp_recycle but its
> +        * tricky (due to potential race with the clone disappearing) and
> +        * rare, so not worth dealing with.
>          */
> -       if (to->pp_recycle != from->pp_recycle ||
> -           (from->pp_recycle && skb_cloned(from)))
> +       if (to->pp_recycle != from->pp_recycle)
>                 return false;
>
>         if (len <= skb_tailroom(to)) {
> @@ -5836,8 +5849,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>         /* if the skb is not cloned this does nothing
>          * since we set nr_frags to 0.
>          */
> -       for (i = 0; i < from_shinfo->nr_frags; i++)
> -               __skb_frag_ref(&from_shinfo->frags[i]);
> +       if (from->pp_recycle)
> +               for (i = 0; i < from_shinfo->nr_frags; i++)
> +                       skb_pp_frag_ref(skb_frag_page(&from_shinfo->frags[i]));
> +       else
> +               for (i = 0; i < from_shinfo->nr_frags; i++)
> +                       __skb_frag_ref(&from_shinfo->frags[i]);
>
>         to->truesize += delta;
>         to->len += len;
> --
> 2.31.1
>
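
To illustrate why the reference has to land on pp_ref_count rather than on the ordinary page refcount, here is a small user-space model. It is only a sketch under simplifying assumptions: struct toy_page and the toy_*() helpers are hypothetical, the counters are plain integers instead of atomics, and compound_head() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a page; this is not the kernel's struct page. */
struct toy_page {
	bool is_pp;		/* models is_pp_page() */
	long pp_ref_count;	/* page pool fragment references */
	int refcount;		/* models the ordinary page _refcount */
};

/* Same shape as skb_pp_frag_ref() in the patch, minus compound_head(). */
static void toy_pp_frag_ref(struct toy_page *page)
{
	if (page->is_pp)
		page->pp_ref_count++;
	else
		page->refcount++;
}

/* Drop one fragment user; returns true when that was the last one. */
static bool toy_pp_frag_unref(struct toy_page *page)
{
	assert(page->pp_ref_count > 0);
	page->pp_ref_count--;
	return page->pp_ref_count == 0;
}

/* A cloned "from" frag is shared with "to", then both users go away. */
void toy_refcount_demo(void)
{
	struct toy_page page = { .is_pp = true, .pp_ref_count = 1, .refcount = 1 };
	bool last;

	toy_pp_frag_ref(&page);		/* "to" now also uses the frag */
	assert(page.pp_ref_count == 2);
	assert(page.refcount == 1);	/* ordinary refcount is untouched */

	last = toy_pp_frag_unref(&page);
	assert(!last);			/* one user left, page stays live */

	last = toy_pp_frag_unref(&page);
	assert(last);			/* last user gone, page may be recycled */
}
```

In this model each user holds exactly one fragment reference, so the page becomes recyclable once and only once, after the last user drops it.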