linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH net-next v2 00/10] Replace page_frag with page_frag_cache (Part-2)
Date: Mon, 9 Dec 2024 19:42:51 +0800	[thread overview]
Message-ID: <3de1b8a3-ae4f-492f-969d-bc6f2c145d09@huawei.com> (raw)
In-Reply-To: <CAKgT0UeXcsB-HOyeA7kYKHmEUM+d_mbTQJRhXfaiFBg_HcWV0w@mail.gmail.com>

On 2024/12/9 5:34, Alexander Duyck wrote:

...

>>
>> Performance validation for part2:
>> 1. Using micro-benchmark ko added in patch 1 to test aligned and
>>    non-aligned API performance impact for the existing users, there
>>    seems to be about 20% performance degradation for refactoring
>>    page_frag to support the new API, which seems to nullify most of
>>    the performance gain in [3] of part1.
> 
> So if I am understanding correctly then this is showing a 20%
> performance degradation with this patchset. I would argue that it is
> significant enough that it would be a blocking factor for this patch
> set. I would suggest bisecting the patch set to identify where the
> performance degradation has been added and see what we can do to
> resolve it, and if nothing else document it in that patch so we can
> identify the root cause for the slowdown.

The only patch in this patchset affecting the performance of existing API
seems to be patch 1, only including patch 1 does show ~20% performance
degradation as including the whole patchset does:
mm: page_frag: some minor refactoring before adding new API

And the cause seems to be about the binary increasing as below, as the
performance degradation didn't seems to change much when I tried inlining
the __page_frag_cache_commit_noref() by moving it to the header file:

./scripts/bloat-o-meter vmlinux_orig vmlinux
add/remove: 3/2 grow/shrink: 5/0 up/down: 920/-500 (420)
Function                                     old     new   delta
__page_frag_cache_prepare                      -     500    +500
__napi_alloc_frag_align                       68     180    +112
__netdev_alloc_skb                           488     596    +108
napi_alloc_skb                               556     624     +68
__netdev_alloc_frag_align                    196     252     +56
svc_tcp_sendmsg                              340     376     +36
__page_frag_cache_commit_noref                 -      32     +32
e843419@09a6_0000bd47_30                       -       8      +8
e843419@0369_000044ee_684                      8       -      -8
__page_frag_alloc_align                      492       -    -492
Total: Before=34719207, After=34719627, chg +0.00%

./scripts/bloat-o-meter page_frag_test_orig.ko page_frag_test.ko
add/remove: 0/0 grow/shrink: 2/0 up/down: 78/0 (78)
Function                                     old     new   delta
page_frag_push_thread                        508     580     +72
__UNIQUE_ID_vermagic367                       67      73      +6
Total: Before=4582, After=4660, chg +1.70%

Patch 1 is about refactoring common codes from __page_frag_alloc_va_align()
to __page_frag_cache_prepare() and __page_frag_cache_commit(), so that the
new API can make use of them as much as possible.

Any better idea to reuse common codes as much as possible while avoiding
the performance degradation as much as possible?

> 
>> 2. Use the below netcat test case, there seems to be some minor
>>    performance gain for replacing 'page_frag' with 'page_frag_cache'
>>    using the new page_frag API after this patchset.
>>    server: taskset -c 32 nc -l -k 1234 > /dev/null
>>    client: perf stat -r 200 -- taskset -c 0 head -c 20G /dev/zero | taskset -c 1 nc 127.0.0.1 1234
> 
> This test would barely touch the page pool. The fact is most of the

I am guessing you meant page_frag here?

> overhead for this would likely be things like TCP latency and data
> copy much more than the page allocation. As such fluctuations here are
> likely not related to your changes.

But it does tell us something that the replacing does not seems to
cause obvious regression, right?

I tried using a smaller MTU to amplify the impact of page allocation,
it seemed to have a similar result.


  reply	other threads:[~2024-12-09 11:42 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-06 12:25 Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 01/10] mm: page_frag: some minor refactoring before adding new API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 02/10] net: rename skb_copy_to_page_nocache() helper Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 03/10] mm: page_frag: update documentation for page_frag Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 04/10] mm: page_frag: introduce page_frag_alloc_abort() related API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 05/10] mm: page_frag: introduce refill prepare & commit API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 06/10] mm: page_frag: introduce alloc_refill " Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 07/10] mm: page_frag: introduce probe related API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 08/10] mm: page_frag: add testing for the newly added API Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 09/10] net: replace page_frag with page_frag_cache Yunsheng Lin
2024-12-06 12:25 ` [PATCH net-next v2 10/10] mm: page_frag: add an entry in MAINTAINERS for page_frag Yunsheng Lin
2024-12-08 21:34 ` [PATCH net-next v2 00/10] Replace page_frag with page_frag_cache (Part-2) Alexander Duyck
2024-12-09 11:42   ` Yunsheng Lin [this message]
2024-12-09 16:03     ` Alexander Duyck
2024-12-10 12:27       ` Yunsheng Lin
2024-12-10 15:58         ` Alexander Duyck
2024-12-11 12:52           ` Yunsheng Lin
2024-12-13 12:09             ` Yunsheng Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3de1b8a3-ae4f-492f-969d-bc6f2c145d09@huawei.com \
    --to=linyunsheng@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.duyck@gmail.com \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=skhan@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox