linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Shuah Khan <skhan@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH net-next v2 00/10] Replace page_frag with page_frag_cache (Part-2)
Date: Mon, 9 Dec 2024 08:03:51 -0800	[thread overview]
Message-ID: <CAKgT0Uc5A_mtN_qxR6w5zqDbx87SUdCTFOBxVWCarnryRvhqHA@mail.gmail.com> (raw)
In-Reply-To: <3de1b8a3-ae4f-492f-969d-bc6f2c145d09@huawei.com>

On Mon, Dec 9, 2024 at 3:42 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2024/12/9 5:34, Alexander Duyck wrote:
>
> ...
>
> >>
> >> Performance validation for part2:
> >> 1. Using micro-benchmark ko added in patch 1 to test aligned and
> >>    non-aligned API performance impact for the existing users, there
> >>    seems to be about 20% performance degradation for refactoring
> >>    page_frag to support the new API, which seems to nullify most of
> >>    the performance gain in [3] of part1.
> >
> > So if I am understanding correctly then this is showing a 20%
> > performance degradation with this patchset. I would argue that it is
> > significant enough that it would be a blocking factor for this patch
> > set. I would suggest bisecting the patch set to identify where the
> > performance degradation has been added and see what we can do to
> > resolve it, and if nothing else document it in that patch so we can
> > identify the root cause for the slowdown.
>
> The only patch in this patchset affecting the performance of existing API
> seems to be patch 1, only including patch 1 does show ~20% performance
> degradation as including the whole patchset does:
> mm: page_frag: some minor refactoring before adding new API
>
> And the cause seems to be about the binary increasing as below, as the
> performance degradation didn't seems to change much when I tried inlining
> the __page_frag_cache_commit_noref() by moving it to the header file:
>
> ./scripts/bloat-o-meter vmlinux_orig vmlinux
> add/remove: 3/2 grow/shrink: 5/0 up/down: 920/-500 (420)
> Function                                     old     new   delta
> __page_frag_cache_prepare                      -     500    +500
> __napi_alloc_frag_align                       68     180    +112
> __netdev_alloc_skb                           488     596    +108
> napi_alloc_skb                               556     624     +68
> __netdev_alloc_frag_align                    196     252     +56
> svc_tcp_sendmsg                              340     376     +36
> __page_frag_cache_commit_noref                 -      32     +32
> e843419@09a6_0000bd47_30                       -       8      +8
> e843419@0369_000044ee_684                      8       -      -8
> __page_frag_alloc_align                      492       -    -492
> Total: Before=34719207, After=34719627, chg +0.00%
>
> ./scripts/bloat-o-meter page_frag_test_orig.ko page_frag_test.ko
> add/remove: 0/0 grow/shrink: 2/0 up/down: 78/0 (78)
> Function                                     old     new   delta
> page_frag_push_thread                        508     580     +72
> __UNIQUE_ID_vermagic367                       67      73      +6
> Total: Before=4582, After=4660, chg +1.70%

Other than code size have you tried using perf to profile the
benchmark before and after. I suspect that would be telling about
which code changes are the most likely to be causing the issues.
Overall I don't think the size has increased all that much. I suspect
most of this is the fact that you are inlining more of the
functionality.

> Patch 1 is about refactoring common codes from __page_frag_alloc_va_align()
> to __page_frag_cache_prepare() and __page_frag_cache_commit(), so that the
> new API can make use of them as much as possible.
>
> Any better idea to reuse common codes as much as possible while avoiding
> the performance degradation as much as possible?
>
> >
> >> 2. Use the below netcat test case, there seems to be some minor
> >>    performance gain for replacing 'page_frag' with 'page_frag_cache'
> >>    using the new page_frag API after this patchset.
> >>    server: taskset -c 32 nc -l -k 1234 > /dev/null
> >>    client: perf stat -r 200 -- taskset -c 0 head -c 20G /dev/zero | taskset -c 1 nc 127.0.0.1 1234
> >
> > This test would barely touch the page pool. The fact is most of the
>
> I am guessing you meant page_frag here?
>
> > overhead for this would likely be things like TCP latency and data
> > copy much more than the page allocation. As such fluctuations here are
> > likely not related to your changes.
>
> But it does tell us something that the replacing does not seems to
> cause obvious regression, right?

Not really. The fragment allocator is such a small portion of this
test that we could probably double the cost for it and it would still
be negligible.

> I tried using a smaller MTU to amplify the impact of page allocation,
> it seemed to have a similar result.

Not surprising. However the network is likely only a small part of
this. I suspect if you ran a profile it would likely show the same.


  reply	other threads:[~2024-12-09 16:04 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-06 12:25 Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 01/10] mm: page_frag: some minor refactoring before adding new API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 02/10] net: rename skb_copy_to_page_nocache() helper Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 03/10] mm: page_frag: update documentation for page_frag Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 04/10] mm: page_frag: introduce page_frag_alloc_abort() related API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 05/10] mm: page_frag: introduce refill prepare & commit API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 06/10] mm: page_frag: introduce alloc_refill " Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 07/10] mm: page_frag: introduce probe related API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 08/10] mm: page_frag: add testing for the newly added API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 09/10] net: replace page_frag with page_frag_cache Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 10/10] mm: page_frag: add an entry in MAINTAINERS for page_frag Yunsheng Lin
2024-12-08 21:34 ` [PATCH net-next v2 00/10] Replace page_frag with page_frag_cache (Part-2) Alexander Duyck
2024-12-09 11:42   ` Yunsheng Lin
2024-12-09 16:03     ` Alexander Duyck [this message]
2024-12-10 12:27       ` Yunsheng Lin
2024-12-10 15:58         ` Alexander Duyck
2024-12-11 12:52           ` Yunsheng Lin
2024-12-13 12:09             ` Yunsheng Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKgT0Uc5A_mtN_qxR6w5zqDbx87SUdCTFOBxVWCarnryRvhqHA@mail.gmail.com \
    --to=alexander.duyck@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linyunsheng@huawei.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=skhan@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox