From: Li Qiang <liqiang01@kylinos.cn>
To: akpm@linux-foundation.org, david@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com
Subject: [PATCH] mm: memory: Force-inline PTE/PMD zapping functions for performance
Date: Tue, 5 Aug 2025 20:04:35 +0800
Message-ID: <20250805120435.1142283-1-liqiang01@kylinos.cn>
In-Reply-To: <74580442-2a9a-4055-b92d-23f5e5664878@redhat.com>
> Ah, missed it after the performance numbers. As Vlastimil mentioned, I
> would have expected a bloat-o-meter output.
>
> My 2 cents is that usually it may be better to understand why it is
> not inlined and address that (e.g., likely() hints or something else)
> instead of blindly putting __always_inline. The __always_inline might
> stay there for no reason after some code changes and therefore become
> a maintenance burden. Concretely, in this case, where there is a single
> caller, one can expect the compiler to really prefer to inline the
> callees.
>
> Agreed, although the compiler is sometimes hard to convince to do the
> right thing when dealing with rather large+complicated code in my
> experience.
Question 1: Will this patch increase the vmlinux size?
Reply:
No. On x86_64 the overall vmlinux size actually shrinks slightly:
[root@localhost linux_old1]# ./scripts/bloat-o-meter before.vmlinux after.vmlinux
add/remove: 6/0 grow/shrink: 0/1 up/down: 4569/-4747 (-178)
Function                                     old     new   delta
zap_present_ptes.constprop                     -    2696   +2696
zap_pte_range                                  -    1236   +1236
zap_pmd_range.isra                             -     589    +589
__pfx_zap_pte_range                            -      16     +16
__pfx_zap_present_ptes.constprop               -      16     +16
__pfx_zap_pmd_range.isra                       -      16     +16
unmap_page_range                            5765    1018   -4747
Total: Before=35379786, After=35379608, chg -0.00%
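As a quick sanity check on the numbers above, the small C sketch below
sums the per-function deltas and compares the result with the totals
that bloat-o-meter reports. The struct and function names are
hypothetical and exist only for this illustration; they are not part of
the patch or of bloat-o-meter. A "-" in the old column is treated as 0.

```c
#include <stddef.h>

/* One row of the bloat-o-meter table; "-" in the old column is stored as 0. */
struct bloat_row {
	const char *name;
	long old_size;
	long new_size;
};

/* Values copied from the bloat-o-meter output quoted in this mail. */
static const struct bloat_row bloat_rows[] = {
	{ "zap_present_ptes.constprop",          0, 2696 },
	{ "zap_pte_range",                       0, 1236 },
	{ "zap_pmd_range.isra",                  0,  589 },
	{ "__pfx_zap_pte_range",                 0,   16 },
	{ "__pfx_zap_present_ptes.constprop",    0,   16 },
	{ "__pfx_zap_pmd_range.isra",            0,   16 },
	{ "unmap_page_range",                 5765, 1018 },
};

static const size_t bloat_row_count =
	sizeof(bloat_rows) / sizeof(bloat_rows[0]);

/* Sum of (new - old) over all rows: the net size change in bytes. */
long net_delta(const struct bloat_row *rows, size_t count)
{
	long delta = 0;

	for (size_t i = 0; i < count; i++)
		delta += rows[i].new_size - rows[i].old_size;
	return delta;
}
```

The six added symbols total 4569 bytes and unmap_page_range changes by
4747 bytes in the other direction, so the net change is -178 bytes,
which matches Before=35379786 and After=35379608.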
Question 2: Why doesn't GCC inline these functions by default? Are there any side effects of forced inlining?
Reply:
1) The default value of GCC's max-inline-insns-single parameter keeps it
   from inlining these functions automatically. However, since these are
   leaf functions, inlining them not only improves performance but also
   reduces code size. Could we consider relaxing the
   max-inline-insns-single restriction in this case?
2) The functions inlined by this patch lie on a single call path and all
   end up inlined into unmap_page_range. Only unmap_page_range itself
   grows, and since unmap_page_range is not inlined any further, the
   impact is well contained.
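To illustrate the shape described in 2), here is a minimal user-space
sketch; it is not the actual mm/memory.c code. The *_sketch function
names and the toy "present" array are assumptions made up for this
example. The macro mirrors what the kernel's __always_inline expands to
for GCC and Clang, and is guarded because libc headers may already
define it.

```c
#include <stddef.h>

/* Same attribute the kernel's __always_inline uses (GCC/Clang only). */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

/* Hypothetical innermost helper: counts the "present" entries in a range. */
static __always_inline size_t zap_pte_level_sketch(const unsigned char *present,
						   size_t start, size_t end)
{
	size_t zapped = 0;

	for (size_t i = start; i < end; i++)
		if (present[i])
			zapped++;
	return zapped;
}

/* Hypothetical middle level: walks the range in chunks of four entries. */
static __always_inline size_t zap_pmd_level_sketch(const unsigned char *present,
						   size_t start, size_t end)
{
	size_t zapped = 0;

	for (size_t i = start; i < end; i += 4) {
		size_t next = (end - i < 4) ? end : i + 4;

		zapped += zap_pte_level_sketch(present, i, next);
	}
	return zapped;
}

/* The single out-of-line entry point; both helpers are folded into it. */
__attribute__((noinline))
size_t unmap_range_sketch(const unsigned char *present, size_t n)
{
	return zap_pmd_level_sketch(present, 0, n);
}
```

Because each helper has exactly one caller, forcing the inlining should
leave unmap_range_sketch as the only function emitted for this chain,
which is the same effect the patch aims for with unmap_page_range.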
Question 3: Does this inlining modification affect code maintainability?
Reply: The force-inlined functions are reached only through
unmap_page_range and form a single call path, so the change should not
add maintenance complexity.
Question 4: Have you performed performance testing on other platforms? Have you tested other scenarios?
Reply:
1) I tested with the same GCC version on arm64. Even without this patch,
   these functions are inlined into unmap_page_range automatically. This
   appears to be due to architecture-specific differences in GCC's
   default value for max-inline-insns-single.
2) I believe UnixBench is a reasonably representative server benchmark.
   In theory, this patch improves performance by removing several levels
   of function-call overhead. That said, I would appreciate your guidance
   on which additional benchmarks or test scenarios would better
   demonstrate the improvement.
--
Cheers,
Li Qiang