* [PATCH][next] mm/mincore: improve performance by adding an unlikely hint
From: Colin Ian King @ 2025-02-17 17:09 UTC
To: Andrew Morton, linux-mm; +Cc: kernel-janitors, linux-kernel
Adding an unlikely() hint on the masked start comparison error
return path improves run-time performance of the mincore system call.

Benchmarking on an i9-12900 shows an improvement of 7ns on mincore calls
on a 256KB mmap'd region where 50% of the pages were resident.

Results based on running 20 tests with turbo disabled (to reduce
clock freq turbo changes), with 10 second run per test and comparing
the number of mincore calls per second. The % standard deviation of
the 20 tests was ~0.10%, so results are reliable.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
---
 mm/mincore.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mincore.c b/mm/mincore.c
index d6bd19e520fc..832f29f46767 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -239,7 +239,7 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
 	start = untagged_addr(start);

 	/* Check the start address: needs to be page-aligned.. */
-	if (start & ~PAGE_MASK)
+	if (unlikely(start & ~PAGE_MASK))
 		return -EINVAL;

 	/* ..and we need to be passed a valid user-space range */
--
2.47.2
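
For readers unfamiliar with the hint used in the patch, the following is a minimal user-space sketch of what unlikely() does. The macro has the same shape as the kernel's definition (a wrapper around the GCC/Clang __builtin_expect builtin). Everything else is an assumption made for illustration: SKETCH_PAGE_SIZE, SKETCH_PAGE_MASK and check_start() are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <errno.h>

/* Same shape as the kernel's hint; relies on a GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two");

/* Hypothetical stand-in for the start-address check in mincore(). */
static int check_start(unsigned long start)
{
	/* Any low bit set means the address is not page-aligned. */
	if (unlikely(start & ~SKETCH_PAGE_MASK))
		return -EINVAL;
	return 0;
}
```

The hint never changes the result of the test. It only tells the compiler which branch to treat as the cold path when laying out code, so whether it has any effect depends on the compiler and optimisation level.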
* Re: [PATCH][next] mm/mincore: improve performance by adding an unlikely hint
From: Matthew Wilcox @ 2025-02-17 17:58 UTC
To: Colin Ian King; +Cc: Andrew Morton, linux-mm, kernel-janitors, linux-kernel
On Mon, Feb 17, 2025 at 05:09:34PM +0000, Colin Ian King wrote:
> Adding an unlikely() hint on the masked start comparison error
> return path improves run-time performance of the mincore system call.
>
> Benchmarking on an i9-12900 shows an improvement of 7ns on mincore calls
> on a 256KB mmap'd region where 50% of the pages were resident.
>
> Results based on running 20 tests with turbo disabled (to reduce
> clock freq turbo changes), with 10 second run per test and comparing
> the number of mincore calls per second. The % standard deviation of
> the 20 tests was ~0.10%, so results are reliable.
I think you've elided _just_ enough information here that nobody can
judge whether your stats skills are any good ;-) You've told us 7ns
(per call, presumably) and you've told us 0.10% standard deviation,
but you haven't told us how long the syscall takes, so nobody can tell
whether 7ns is within 0.10% or not ;-)
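
The per-call latency being asked about could be measured from user space with something like the sketch below. It is not the benchmark used for the patch: mincore_ns_per_call() is a hypothetical helper, it times a fixed number of calls rather than counting calls over a 10 second run, and it approximates "50% resident" by touching every other page of an anonymous mapping.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Average cost of one mincore() call in nanoseconds over `calls`
 * iterations, on an anonymous mapping with every other page touched.
 * Returns -1.0 on any failure.
 */
static double mincore_ns_per_call(size_t region_bytes, unsigned long calls)
{
	long page = sysconf(_SC_PAGESIZE);
	assert(page > 0 && region_bytes > 0 && calls > 0);

	size_t pages = (region_bytes + (size_t)page - 1) / (size_t)page;
	size_t len = pages * (size_t)page;
	unsigned char *vec = malloc(pages);
	if (!vec)
		return -1.0;

	unsigned char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED) {
		free(vec);
		return -1.0;
	}

	/* Touch every other page so roughly half the region is resident. */
	for (size_t i = 0; i < pages; i += 2)
		map[i * (size_t)page] = 1;

	struct timespec t0, t1;
	double result = -1.0;
	int ok = 1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (unsigned long i = 0; i < calls; i++)
		ok &= (mincore(map, len, vec) == 0);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	if (ok) {
		double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9 +
			    (double)(t1.tv_nsec - t0.tv_nsec);
		result = ns / (double)calls;
	}

	munmap(map, len);
	free(vec);
	return result;
}
```

Calling mincore_ns_per_call(256 * 1024, 1000000) on kernels built with and without the patch, repeated over several runs, would give the absolute per-call figures that the percentage standard deviation has to be compared against.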
> Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
> ---
> mm/mincore.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/mincore.c b/mm/mincore.c
> index d6bd19e520fc..832f29f46767 100644
> --- a/mm/mincore.c
> +++ b/mm/mincore.c
> @@ -239,7 +239,7 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
> start = untagged_addr(start);
>
> /* Check the start address: needs to be page-aligned.. */
> - if (start & ~PAGE_MASK)
> + if (unlikely(start & ~PAGE_MASK))
> return -EINVAL;
We might get even more advantage by moving the EINVAL test before
untagged_addr() since we know that the tags are all in the high bits and
we don't need to have the test be dependent on the previous arithmetic.
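
The reasoning here can be illustrated with a small user-space sketch. It assumes a hypothetical tag layout (tag in the top 8 bits of a 64-bit address) and 4 KiB pages; sketch_untag() is a simplified stand-in and not the kernel's untagged_addr(), whose behaviour is architecture-specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages, tag in the top 8 bits. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_TAG_MASK  (0xffULL << 56)

static_assert((SKETCH_TAG_MASK & ~SKETCH_PAGE_MASK) == 0,
	      "tag bits must not overlap the page offset bits");

/* Hypothetical untagging: clear the tag bits, leave the rest alone. */
static uint64_t sketch_untag(uint64_t addr)
{
	return addr & ~SKETCH_TAG_MASK;
}

/* The alignment test only looks at the low page-offset bits. */
static bool misaligned(uint64_t addr)
{
	return (addr & ~SKETCH_PAGE_MASK) != 0;
}

/* True when the check gives the same answer before and after untagging. */
static bool check_order_irrelevant(uint64_t addr)
{
	return misaligned(addr) == misaligned(sketch_untag(addr));
}
```

Because the tag bits and the page-offset bits do not overlap, the alignment test can run on the raw address, which removes the data dependency on the untagging step.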
* Re: [PATCH][next] mm/mincore: improve performance by adding an unlikely hint
From: Colin King (gmail) @ 2025-02-17 18:00 UTC
To: Matthew Wilcox; +Cc: Andrew Morton, linux-mm, kernel-janitors, linux-kernel
On 17/02/2025 17:58, Matthew Wilcox wrote:
> On Mon, Feb 17, 2025 at 05:09:34PM +0000, Colin Ian King wrote:
>> Adding an unlikely() hint on the masked start comparison error
>> return path improves run-time performance of the mincore system call.
>>
>> Benchmarking on an i9-12900 shows an improvement of 7ns on mincore calls
>> on a 256KB mmap'd region where 50% of the pages were resident.
>>
>> Results based on running 20 tests with turbo disabled (to reduce
>> clock freq turbo changes), with 10 second run per test and comparing
>> the number of mincore calls per second. The % standard deviation of
>> the 20 tests was ~0.10%, so results are reliable.
>
> I think you've elided _just_ enough information here that nobody can
> judge whether your stats skills are any good ;-) You've told us 7ns
> (per call, presumably) and you've told us 0.10% standard deviation,
> but you haven't told us how long the syscall takes, so nobody can tell
> whether 7ns is within 0.10% or not ;-)
Ugh, my bad.
Improvement was from ~970 down to 963 ns, so small ~0.7% improvement.
Colin
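
With these figures the comparison can be worked through explicitly. The sketch below uses two hypothetical helpers and assumes the ~0.10% run-to-run standard deviation reported for calls per second applies equally to the per-call time.

```c
#include <assert.h>

/* Relative change in percent between an old and a new per-call latency. */
static double improvement_percent(double old_ns, double new_ns)
{
	assert(old_ns > 0.0);
	return (old_ns - new_ns) / old_ns * 100.0;
}

/* How many run-to-run standard deviations the change amounts to. */
static double change_in_stddevs(double improvement_pct, double stddev_pct)
{
	assert(stddev_pct > 0.0);
	return improvement_pct / stddev_pct;
}
```

improvement_percent(970.0, 963.0) is about 0.72, and dividing that by the 0.10% standard deviation gives roughly 7, so under the stated assumption the 7ns change is well outside the measurement noise.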
>
>> Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
>> ---
>> mm/mincore.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/mincore.c b/mm/mincore.c
>> index d6bd19e520fc..832f29f46767 100644
>> --- a/mm/mincore.c
>> +++ b/mm/mincore.c
>> @@ -239,7 +239,7 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
>> start = untagged_addr(start);
>>
>> /* Check the start address: needs to be page-aligned.. */
>> - if (start & ~PAGE_MASK)
>> + if (unlikely(start & ~PAGE_MASK))
>> return -EINVAL;
>
> We might get even more advantage by moving the EINVAL test before
> untagged_addr() since we know that the tags are all in the high bits and
> we don't need to have the test be dependent on the previous arithmetic.
* Re: [PATCH][next] mm/mincore: improve performance by adding an unlikely hint
From: Andrew Morton @ 2025-02-18 3:13 UTC
To: Colin King (gmail)
Cc: Matthew Wilcox, linux-mm, kernel-janitors, linux-kernel
On Mon, 17 Feb 2025 18:00:22 +0000 "Colin King (gmail)" <colin.i.king@gmail.com> wrote:
> On 17/02/2025 17:58, Matthew Wilcox wrote:
> > On Mon, Feb 17, 2025 at 05:09:34PM +0000, Colin Ian King wrote:
> >> Adding an unlikely() hint on the masked start comparison error
> >> return path improves run-time performance of the mincore system call.
> >>
> >> Benchmarking on an i9-12900 shows an improvement of 7ns on mincore calls
> >> on a 256KB mmap'd region where 50% of the pages were resident.
> >>
> >> Results based on running 20 tests with turbo disabled (to reduce
> >> clock freq turbo changes), with 10 second run per test and comparing
> >> the number of mincore calls per second. The % standard deviation of
> >> the 20 tests was ~0.10%, so results are reliable.
> >
> > I think you've elided _just_ enough information here that nobody can
> > judge whether your stats skills are any good ;-) You've told us 7ns
> > (per call, presumably) and you've told us 0.10% standard deviation,
> > but you haven't told us how long the syscall takes, so nobody can tell
> > whether 7ns is within 0.10% or not ;-)
>
> Ugh, my bad.
>
> Improvement was from ~970 down to 963 ns, so small ~0.7% improvement.
>
It actually doesn't change the generated code:
hp2:/usr/src/25> diff -u mm/mincore.lst.old mm/mincore.lst
--- mm/mincore.lst.old 2025-02-17 19:11:34.093727411 -0800
+++ mm/mincore.lst 2025-02-17 19:12:59.797009056 -0800
@@ -1563,7 +1563,7 @@
start = untagged_addr(start);
/* Check the start address: needs to be page-aligned.. */
- if (start & ~PAGE_MASK)
+ if (unlikely(start & ~PAGE_MASK))
b27: 31 ff xor %edi,%edi
asm (ALTERNATIVE("",
b29: 90 nop
* Re: [PATCH][next] mm/mincore: improve performance by adding an unlikely hint
From: Colin King (gmail) @ 2025-02-18 14:16 UTC
To: Andrew Morton; +Cc: Matthew Wilcox, linux-mm, kernel-janitors, linux-kernel
On 18/02/2025 03:13, Andrew Morton wrote:
> On Mon, 17 Feb 2025 18:00:22 +0000 "Colin King (gmail)" <colin.i.king@gmail.com> wrote:
>
>> On 17/02/2025 17:58, Matthew Wilcox wrote:
>>> On Mon, Feb 17, 2025 at 05:09:34PM +0000, Colin Ian King wrote:
>>>> Adding an unlikely() hint on the masked start comparison error
>>>> return path improves run-time performance of the mincore system call.
>>>>
>>>> Benchmarking on an i9-12900 shows an improvement of 7ns on mincore calls
>>>> on a 256KB mmap'd region where 50% of the pages were resident.
>>>>
>>>> Results based on running 20 tests with turbo disabled (to reduce
>>>> clock freq turbo changes), with 10 second run per test and comparing
>>>> the number of mincore calls per second. The % standard deviation of
>>>> the 20 tests was ~0.10%, so results are reliable.
>>>
>>> I think you've elided _just_ enough information here that nobody can
>>> judge whether your stats skills are any good ;-) You've told us 7ns
>>> (per call, presumably) and you've told us 0.10% standard deviation,
>>> but you haven't told us how long the syscall takes, so nobody can tell
>>> whether 7ns is within 0.10% or not ;-)
>>
>> Ugh, my bad.
>>
>> Improvement was from ~970 down to 963 ns, so small ~0.7% improvement.
>>
>
> It actually doesn't change the generated code:
I've compared the generated x86 object code using gcc 14.2.1 20240912
(Fedora 41), 14.2.0 (Debian 14.2.0-17) and 14.2.1 20250211 (Clear Linux),
and I get differences in the generated object code between old and new.
The improvement on Clear Linux is more significant too because it uses
-O3. So I'm confident the change is generating improved object code.
>
> hp2:/usr/src/25> diff -u mm/mincore.lst.old mm/mincore.lst
> --- mm/mincore.lst.old 2025-02-17 19:11:34.093727411 -0800
> +++ mm/mincore.lst 2025-02-17 19:12:59.797009056 -0800
> @@ -1563,7 +1563,7 @@
> start = untagged_addr(start);
>
> /* Check the start address: needs to be page-aligned.. */
> - if (start & ~PAGE_MASK)
> + if (unlikely(start & ~PAGE_MASK))
> b27: 31 ff xor %edi,%edi
> asm (ALTERNATIVE("",
> b29: 90 nop
* Re: [PATCH][next] mm/mincore: improve performance by adding an unlikely hint
From: Andrew Morton @ 2025-02-19 0:08 UTC
To: Colin King (gmail)
Cc: Matthew Wilcox, linux-mm, kernel-janitors, linux-kernel
On Tue, 18 Feb 2025 14:16:20 +0000 "Colin King (gmail)" <colin.i.king@gmail.com> wrote:
> >> Improvement was from ~970 down to 963 ns, so small ~0.7% improvement.
> >>
> >
> > It actually doesn't change the generated code:
>
> I've compared the generated x86 object code using gcc 14.2.1 20240912
> (Fedora 41), 14.2.0 (Debian 14.2.0-17) and 14.2.1 20250211 (Clear Linux),
> and I get differences in the generated object code between old and new.
> The improvement on Clear Linux is more significant too because it uses
> -O3. So I'm confident the change is generating improved object code.
I was using gcc-13.2.0.
Please resend, with a Matthew-friendly changelog?