* Build performance regressions originating from min()/max() macros
@ 2024-07-23 21:59 Lorenzo Stoakes
2024-07-24 8:14 ` Jürgen Groß
2024-07-24 8:44 ` Lorenzo Stoakes
0 siblings, 2 replies; 10+ messages in thread
From: Lorenzo Stoakes @ 2024-07-23 21:59 UTC (permalink / raw)
To: Andrew Morton, david.laight
Cc: Arnd Bergmann, willy, torvalds, Jason, hch, andriy.shevchenko,
pedro.falcato, Mateusz Guzik, linux-mm, linux-kernel
Arnd reported a significant build slowdown [0], which was bisected to the
series spanning commit 80fcac55385c ("minmax: relax check to allow
comparison between unsigned arguments and signed constants") to commit
867046cc70277 ("minmax: relax check to allow comparison between unsigned
arguments and signed constants"), originating from the series "minmax:
Relax type checks in min() and max()." [1].
I have reproduced this locally, reverting this series and manually fixing
up all call sites that invoke min()/max() for a simple x86-64 defconfig (+
some other debug flags I use for debug kernels, I can provide the .config
if needed).
Arnd noted that the arch/x86/xen/setup.c file was particularly problematic,
taking 15 (!) seconds to pre-process on his machine, so I also enabled
CONFIG_XEN to test this and obtained performance numbers with this set/not
set.
I was able to reproduce this very significant pre-processor time on this
file, noting that with the series reverted compile time for the file is
0.79s, with it in place, it takes 6.90s for a 873.4% slowdown.
I also checked total build times (32-core intel i9-14900KF box):
## With CONFIG_XEN
### Reverted minmax code
make 1588.46s user 92.33s system 2430% cpu 1:09.16 total
make 1598.57s user 93.49s system 2419% cpu 1:09.94 total
make 1598.99s user 92.49s system 2419% cpu 1:09.91 total
### Not reverted
make 1639.25s user 96.34s system 2433% cpu 1:11.32 total
make 1640.34s user 96.01s system 2427% cpu 1:11.54 total
make 1639.98s user 96.76s system 2436% cpu 1:11.27 total
## Without CONFIG_XEN
### Reverted minmax code
make 1524.97s user 89.84s system 2399% cpu 1:07.31 total
make 1521.01s user 88.99s system 2391% cpu 1:07.32 total
make 1530.75s user 89.65s system 2389% cpu 1:07.83 total
### Not reverted
make 1570.64s user 94.09s system 2398% cpu 1:09.41 total
make 1571.25s user 94.36s system 2401% cpu 1:09.36 total
make 1568.25s user 93.83s system 2396% cpu 1:09.35 total
Which suggests a worryingly significant slowdown of ~45s with CONFIG_XEN
enabled and ~35s even without it.
The underlying problems seems to be very large macro expansions, which Arnd
noted in the xen case originated from the line:
extra_pages = min3(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
extra_pages, max_pages - max_pfn);
And resulted in the generation of 47 MB (!) of pre-processor output.
It seems a lot of code now relies on the relaxed conditions of the newly
changed min/max() macros, so the question is - what can we do to address
these regressions?
[0]:https://social.kernel.org/notice/AkDuGHsn0WuA1g1uD2
[1]:https://lore.kernel.org/all/b97faef60ad24922b530241c5d7c933c@AcuMS.aculab.com/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Build performance regressions originating from min()/max() macros
2024-07-23 21:59 Build performance regressions originating from min()/max() macros Lorenzo Stoakes
@ 2024-07-24 8:14 ` Jürgen Groß
2024-07-24 8:31 ` Lorenzo Stoakes
` (2 more replies)
2024-07-24 8:44 ` Lorenzo Stoakes
1 sibling, 3 replies; 10+ messages in thread
From: Jürgen Groß @ 2024-07-24 8:14 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton, david.laight
Cc: Arnd Bergmann, willy, torvalds, Jason, hch, andriy.shevchenko,
pedro.falcato, Mateusz Guzik, linux-mm, linux-kernel
On 23.07.24 23:59, Lorenzo Stoakes wrote:
> Arnd reported a significant build slowdown [0], which was bisected to the
> series spanning commit 80fcac55385c ("minmax: relax check to allow
> comparison between unsigned arguments and signed constants") to commit
> 867046cc70277 ("minmax: relax check to allow comparison between unsigned
> arguments and signed constants"), originating from the series "minmax:
> Relax type checks in min() and max()." [1].
>
> I have reproduced this locally, reverting this series and manually fixing
> up all call sites that invoke min()/max() for a simple x86-64 defconfig (+
> some other debug flags I use for debug kernels, I can provide the .config
> if needed).
>
> Arnd noted that the arch/x86/xen/setup.c file was particularly problematic,
> taking 15 (!) seconds to pre-process on his machine, so I also enabled
> CONFIG_XEN to test this and obtained performance numbers with this set/not
> set.
>
> I was able to reproduce this very significant pre-processor time on this
> file, noting that with the series reverted compile time for the file is
> 0.79s, with it in place, it takes 6.90s for a 873.4% slowdown.
>
> I also checked total build times (32-core intel i9-14900KF box):
>
> ## With CONFIG_XEN
>
> ### Reverted minmax code
>
> make 1588.46s user 92.33s system 2430% cpu 1:09.16 total
> make 1598.57s user 93.49s system 2419% cpu 1:09.94 total
> make 1598.99s user 92.49s system 2419% cpu 1:09.91 total
>
> ### Not reverted
>
> make 1639.25s user 96.34s system 2433% cpu 1:11.32 total
> make 1640.34s user 96.01s system 2427% cpu 1:11.54 total
> make 1639.98s user 96.76s system 2436% cpu 1:11.27 total
>
> ## Without CONFIG_XEN
>
> ### Reverted minmax code
>
> make 1524.97s user 89.84s system 2399% cpu 1:07.31 total
> make 1521.01s user 88.99s system 2391% cpu 1:07.32 total
> make 1530.75s user 89.65s system 2389% cpu 1:07.83 total
>
> ### Not reverted
>
> make 1570.64s user 94.09s system 2398% cpu 1:09.41 total
> make 1571.25s user 94.36s system 2401% cpu 1:09.36 total
> make 1568.25s user 93.83s system 2396% cpu 1:09.35 total
>
> Which suggests a worryingly significant slowdown of ~45s with CONFIG_XEN
> enabled and ~35s even without it.
>
> The underlying problems seems to be very large macro expansions, which Arnd
> noted in the xen case originated from the line:
>
> extra_pages = min3(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
> extra_pages, max_pages - max_pfn);
>
> And resulted in the generation of 47 MB (!) of pre-processor output.
>
> It seems a lot of code now relies on the relaxed conditions of the newly
> changed min/max() macros, so the question is - what can we do to address
> these regressions?
I can send a patch to simplify the problematic construct, but OTOH this
will avoid only one particularly bad example.
Juergen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Build performance regressions originating from min()/max() macros
2024-07-24 8:14 ` Jürgen Groß
@ 2024-07-24 8:31 ` Lorenzo Stoakes
2024-07-24 9:40 ` Jürgen Groß
2024-07-24 8:34 ` David Laight
2024-07-24 8:50 ` Arnd Bergmann
2 siblings, 1 reply; 10+ messages in thread
From: Lorenzo Stoakes @ 2024-07-24 8:31 UTC (permalink / raw)
To: Jürgen Groß
Cc: Andrew Morton, david.laight, Arnd Bergmann, willy, torvalds,
Jason, hch, andriy.shevchenko, pedro.falcato, Mateusz Guzik,
linux-mm, linux-kernel
On Wed, Jul 24, 2024 at 10:14:12AM GMT, Jürgen Groß wrote:
> On 23.07.24 23:59, Lorenzo Stoakes wrote:
> > Arnd reported a significant build slowdown [0], which was bisected to the
> > series spanning commit 80fcac55385c ("minmax: relax check to allow
> > comparison between unsigned arguments and signed constants") to commit
> > 867046cc70277 ("minmax: relax check to allow comparison between unsigned
> > arguments and signed constants"), originating from the series "minmax:
> > Relax type checks in min() and max()." [1].
[snip]
> I can send a patch to simplify the problematic construct, but OTOH this
> will avoid only one particularly bad example.
Thanks, appreciated but I am a little concerned that we might get stuck in
whack-a-mole here a bit. I'm pretty sure we've had previous patches that
have addressed invocation points, but obviously the underlying issue are
these macros which will keep cropping up again and again.
>
>
> Juergen
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Build performance regressions originating from min()/max() macros
2024-07-24 8:14 ` Jürgen Groß
2024-07-24 8:31 ` Lorenzo Stoakes
@ 2024-07-24 8:34 ` David Laight
2024-07-24 8:50 ` Arnd Bergmann
2 siblings, 0 replies; 10+ messages in thread
From: David Laight @ 2024-07-24 8:34 UTC (permalink / raw)
To: 'Jürgen Groß', Lorenzo Stoakes, Andrew Morton
Cc: Arnd Bergmann, willy, torvalds, Jason, hch, andriy.shevchenko,
pedro.falcato, Mateusz Guzik, linux-mm, linux-kernel
From: Jürgen Groß
> Sent: 24 July 2024 09:14
>
> On 23.07.24 23:59, Lorenzo Stoakes wrote:
> > Arnd reported a significant build slowdown [0], which was bisected to the
> > series spanning commit 80fcac55385c ("minmax: relax check to allow
> > comparison between unsigned arguments and signed constants") to commit
> > 867046cc70277 ("minmax: relax check to allow comparison between unsigned
> > arguments and signed constants"), originating from the series "minmax:
> > Relax type checks in min() and max()." [1].
> >
> > I have reproduced this locally, reverting this series and manually fixing
> > up all call sites that invoke min()/max() for a simple x86-64 defconfig (+
> > some other debug flags I use for debug kernels, I can provide the .config
> > if needed).
> >
> > Arnd noted that the arch/x86/xen/setup.c file was particularly problematic,
> > taking 15 (!) seconds to pre-process on his machine, so I also enabled
> > CONFIG_XEN to test this and obtained performance numbers with this set/not
> > set.
> >
> > I was able to reproduce this very significant pre-processor time on this
> > file, noting that with the series reverted compile time for the file is
> > 0.79s, with it in place, it takes 6.90s for a 873.4% slowdown.
...
> > Which suggests a worryingly significant slowdown of ~45s with CONFIG_XEN
> > enabled and ~35s even without it.
> >
> > The underlying problems seems to be very large macro expansions, which Arnd
> > noted in the xen case originated from the line:
> >
> > extra_pages = min3(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
> > extra_pages, max_pages - max_pfn);
Jeepers, that is definitely going to be big - it never was small.
The expansion of min() itself isn't that bad.
But both arguments get expanded a few times (and a few more than the old code).
So a nested min/max causes an O(n^2) expansion - livable.
But min3(a,b,c) is just min(min(a,b),c) - so a nested call.
(It seems to have been added to avoid the cost of the nested call, but
just implemented a nested call anyway.)
Here one of the arguments to min3() is a min() - and I suspect it goes
into the inner min() so O(n^3) - which is bad news.
> >
> > And resulted in the generation of 47 MB (!) of pre-processor output.
> >
> > It seems a lot of code now relies on the relaxed conditions of the newly
> > changed min/max() macros, so the question is - what can we do to address
> > these regressions?
>
> I can send a patch to simplify the problematic construct, but OTOH this
> will avoid only one particularly bad example.
I suspect that reversing the order of the args to min3() will help this case.
There is an updated patch set I sent in January that reduces the
expansion a bit.
I've a reworked version that removes the last few patches (removing
support for constant output and using MIN() for that instead).
I'll repost it and maybe Arnd will pick it up?
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Build performance regressions originating from min()/max() macros
2024-07-23 21:59 Build performance regressions originating from min()/max() macros Lorenzo Stoakes
2024-07-24 8:14 ` Jürgen Groß
@ 2024-07-24 8:44 ` Lorenzo Stoakes
2024-07-24 11:05 ` David Laight
1 sibling, 1 reply; 10+ messages in thread
From: Lorenzo Stoakes @ 2024-07-24 8:44 UTC (permalink / raw)
To: Andrew Morton, david.laight
Cc: Arnd Bergmann, willy, torvalds, Jason, hch, andriy.shevchenko,
pedro.falcato, Mateusz Guzik, linux-mm, linux-kernel
On Tue, Jul 23, 2024 at 10:59:15PM GMT, Lorenzo Stoakes wrote:
> Arnd reported a significant build slowdown [0], which was bisected to the
> series spanning commit 80fcac55385c ("minmax: relax check to allow
> comparison between unsigned arguments and signed constants") to commit
> 867046cc70277 ("minmax: relax check to allow comparison between unsigned
> arguments and signed constants"), originating from the series "minmax:
> Relax type checks in min() and max()." [1].
>
> I have reproduced this locally, reverting this series and manually fixing
> up all call sites that invoke min()/max() for a simple x86-64 defconfig (+
> some other debug flags I use for debug kernels, I can provide the .config
> if needed).
>
> Arnd noted that the arch/x86/xen/setup.c file was particularly problematic,
> taking 15 (!) seconds to pre-process on his machine, so I also enabled
> CONFIG_XEN to test this and obtained performance numbers with this set/not
> set.
>
> I was able to reproduce this very significant pre-processor time on this
> file, noting that with the series reverted compile time for the file is
> 0.79s, with it in place, it takes 6.90s for a 873.4% slowdown.
>
> I also checked total build times (32-core intel i9-14900KF box):
>
> ## With CONFIG_XEN
>
> ### Reverted minmax code
>
> make 1588.46s user 92.33s system 2430% cpu 1:09.16 total
> make 1598.57s user 93.49s system 2419% cpu 1:09.94 total
> make 1598.99s user 92.49s system 2419% cpu 1:09.91 total
>
> ### Not reverted
>
> make 1639.25s user 96.34s system 2433% cpu 1:11.32 total
> make 1640.34s user 96.01s system 2427% cpu 1:11.54 total
> make 1639.98s user 96.76s system 2436% cpu 1:11.27 total
>
> ## Without CONFIG_XEN
>
> ### Reverted minmax code
>
> make 1524.97s user 89.84s system 2399% cpu 1:07.31 total
> make 1521.01s user 88.99s system 2391% cpu 1:07.32 total
> make 1530.75s user 89.65s system 2389% cpu 1:07.83 total
>
> ### Not reverted
>
> make 1570.64s user 94.09s system 2398% cpu 1:09.41 total
> make 1571.25s user 94.36s system 2401% cpu 1:09.36 total
> make 1568.25s user 93.83s system 2396% cpu 1:09.35 total
>
> Which suggests a worryingly significant slowdown of ~45s with CONFIG_XEN
> enabled and ~35s even without it.
>
> The underlying problems seems to be very large macro expansions, which Arnd
> noted in the xen case originated from the line:
>
> extra_pages = min3(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
> extra_pages, max_pages - max_pfn);
>
> And resulted in the generation of 47 MB (!) of pre-processor output.
>
> It seems a lot of code now relies on the relaxed conditions of the newly
> changed min/max() macros, so the question is - what can we do to address
> these regressions?
>
> [0]:https://social.kernel.org/notice/AkDuGHsn0WuA1g1uD2
> [1]:https://lore.kernel.org/all/b97faef60ad24922b530241c5d7c933c@AcuMS.aculab.com/
It seems that (again, all credit to Arnd for his thorough analysis here) a
lot of the underyling issue revolves around the macros need to function
both in scenarios where we absolutely must have a constant value (for
instance, array size) as well as in scenarios where this is not the case.
Arnd also discovered [0] there _relatively_ few call sites that require
this, so maybe a way forward might be to create specific min()/max() macros
for the strictly const case and to fix up the core ones to reduce expansion
at the cost of not being able to use these in these scenarios?
Does this seem viable? Perhaps David you might have thoughts on this?
[0]:https://social.kernel.org/notice/AkFSGxnKlr6Fc5vpuS
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Build performance regressions originating from min()/max() macros
2024-07-24 8:14 ` Jürgen Groß
2024-07-24 8:31 ` Lorenzo Stoakes
2024-07-24 8:34 ` David Laight
@ 2024-07-24 8:50 ` Arnd Bergmann
2 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2024-07-24 8:50 UTC (permalink / raw)
To: Juergen Gross, Lorenzo Stoakes, Andrew Morton, David Laight
Cc: Matthew Wilcox, Linus Torvalds, Jason A . Donenfeld,
Christoph Hellwig, Andy Shevchenko, pedro.falcato, Mateusz Guzik,
linux-mm, linux-kernel
On Wed, Jul 24, 2024, at 10:14, Jürgen Groß wrote:
> On 23.07.24 23:59, Lorenzo Stoakes wrote:
>>
>> And resulted in the generation of 47 MB (!) of pre-processor output.
>>
>> It seems a lot of code now relies on the relaxed conditions of the newly
>> changed min/max() macros, so the question is - what can we do to address
>> these regressions?
>
> I can send a patch to simplify the problematic construct, but OTOH this
> will avoid only one particularly bad example.
It's probably a good idea do change the xen/setup.c file anyway,
as I haven't found any other file that had a regression this bad,
and it only needs a single temporary variable for a 1000x speedup.
For the overall kernel, I see at best a 2.3% speedup (20 second
CPU time) by replacing the current min()/max() macros with a version
that drops both the constant expression output feature and the
assertion, measuring an x86 defconfig build, which has xen
disabled. On a defconfig+xen kernel, that difference increases
to 4.4% or 37 seconds.
Removing only the constexpr side requires a handful of fixups
for x86 allmodconfig to replace min()/max() with something else in
drivers/edac/sb_edac.c
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
drivers/gpu/drm/drm_color_mgmt.c
drivers/input/touchscreen/cyttsp4_core.c
drivers/md/dm-integrity.c
drivers/net/can/usb/etas_es58x/es58x_devlink.c
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
fs/btrfs/tree-checker.c
lib/vsprintf.c
net/ipv4/proc.c
net/ipv6/proc.c
This gives about half the speed difference, the other
half comes from removing the assertion, but that is not
a good idea unless we can replace it with an equivalent
assertion that works on the unique_x/unique_y variables
instead of expanding the arguments.
Arnd
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Build performance regressions originating from min()/max() macros
2024-07-24 8:31 ` Lorenzo Stoakes
@ 2024-07-24 9:40 ` Jürgen Groß
2024-07-24 9:43 ` Lorenzo Stoakes
2024-07-24 14:47 ` David Laight
0 siblings, 2 replies; 10+ messages in thread
From: Jürgen Groß @ 2024-07-24 9:40 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, david.laight, Arnd Bergmann, willy, torvalds,
Jason, hch, andriy.shevchenko, pedro.falcato, Mateusz Guzik,
linux-mm, linux-kernel
On 24.07.24 10:31, Lorenzo Stoakes wrote:
> On Wed, Jul 24, 2024 at 10:14:12AM GMT, Jürgen Groß wrote:
>> On 23.07.24 23:59, Lorenzo Stoakes wrote:
>>> Arnd reported a significant build slowdown [0], which was bisected to the
>>> series spanning commit 80fcac55385c ("minmax: relax check to allow
>>> comparison between unsigned arguments and signed constants") to commit
>>> 867046cc70277 ("minmax: relax check to allow comparison between unsigned
>>> arguments and signed constants"), originating from the series "minmax:
>>> Relax type checks in min() and max()." [1].
>
> [snip]
>
>> I can send a patch to simplify the problematic construct, but OTOH this
>> will avoid only one particularly bad example.
>
> Thanks, appreciated but I am a little concerned that we might get stuck in
> whack-a-mole here a bit. I'm pretty sure we've had previous patches that
> have addressed invocation points, but obviously the underlying issue are
> these macros which will keep cropping up again and again.
The xen example seems to be one of the worst due to nesting of min3() and
min(), so being de facto a min4().
I think drivers/firmware/sysfb_simplefb.c has a similar problem, as it is
nesting max() with max3(). Same applies to arch/x86/kernel/cpu/cacheinfo.c
and multiple times to fs/xfs/libxfs/xfs_trans_resv.c.
There are probably more such extreme cases.
Juergen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Build performance regressions originating from min()/max() macros
2024-07-24 9:40 ` Jürgen Groß
@ 2024-07-24 9:43 ` Lorenzo Stoakes
2024-07-24 14:47 ` David Laight
1 sibling, 0 replies; 10+ messages in thread
From: Lorenzo Stoakes @ 2024-07-24 9:43 UTC (permalink / raw)
To: Jürgen Groß
Cc: Andrew Morton, david.laight, Arnd Bergmann, willy, torvalds,
Jason, hch, andriy.shevchenko, pedro.falcato, Mateusz Guzik,
linux-mm, linux-kernel
On Wed, Jul 24, 2024 at 11:40:07AM GMT, Jürgen Groß wrote:
> On 24.07.24 10:31, Lorenzo Stoakes wrote:
> > On Wed, Jul 24, 2024 at 10:14:12AM GMT, Jürgen Groß wrote:
> > > On 23.07.24 23:59, Lorenzo Stoakes wrote:
> > > > Arnd reported a significant build slowdown [0], which was bisected to the
> > > > series spanning commit 80fcac55385c ("minmax: relax check to allow
> > > > comparison between unsigned arguments and signed constants") to commit
> > > > 867046cc70277 ("minmax: relax check to allow comparison between unsigned
> > > > arguments and signed constants"), originating from the series "minmax:
> > > > Relax type checks in min() and max()." [1].
> >
> > [snip]
> >
> > > I can send a patch to simplify the problematic construct, but OTOH this
> > > will avoid only one particularly bad example.
> >
> > Thanks, appreciated but I am a little concerned that we might get stuck in
> > whack-a-mole here a bit. I'm pretty sure we've had previous patches that
> > have addressed invocation points, but obviously the underlying issue are
> > these macros which will keep cropping up again and again.
>
> The xen example seems to be one of the worst due to nesting of min3() and
> min(), so being de facto a min4().
>
> I think drivers/firmware/sysfb_simplefb.c has a similar problem, as it is
> nesting max() with max3(). Same applies to arch/x86/kernel/cpu/cacheinfo.c
> and multiple times to fs/xfs/libxfs/xfs_trans_resv.c.
>
> There are probably more such extreme cases.
>
Yeah to be clear, I am not opposed to these patches, I just don't want us to
lose sight of the need to fix the underlying problem if possible.
It feels like we are leaving the worst kind of landmine - a construct that you
simply wouldn't expect to cause massive build perf degradation - for others to
step on.
I suspect there are probably a few specific O(n^3) cases (as David pointed out)
that account for most of the problem and a bunch of other less problematic ones
that hit perhaps O(n^2) cases that add up.
>
> Juergen
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Build performance regressions originating from min()/max() macros
2024-07-24 8:44 ` Lorenzo Stoakes
@ 2024-07-24 11:05 ` David Laight
0 siblings, 0 replies; 10+ messages in thread
From: David Laight @ 2024-07-24 11:05 UTC (permalink / raw)
To: 'Lorenzo Stoakes', Andrew Morton
Cc: Arnd Bergmann, willy, torvalds, Jason, hch, andriy.shevchenko,
pedro.falcato, Mateusz Guzik, linux-mm, linux-kernel
From: Lorenzo Stoakes
> Sent: 24 July 2024 09:45
..
> It seems that (again, all credit to Arnd for his thorough analysis here) a
> lot of the underyling issue revolves around the macros need to function
> both in scenarios where we absolutely must have a constant value (for
> instance, array size) as well as in scenarios where this is not the case.
>
> Arnd also discovered [0] there _relatively_ few call sites that require
> this, so maybe a way forward might be to create specific min()/max() macros
> for the strictly const case and to fix up the core ones to reduce expansion
> at the cost of not being able to use these in these scenarios?
>
> Does this seem viable? Perhaps David you might have thoughts on this?
Indeed.
MIN() and MAX() could be used for compile-time constants (and require constants).
They would then be usable for static initialisers.
Before that can be done all the places that locally define MIN() and MAX()
need changing to either use min()/max() or surround the local definition
with an #ifdef until the global definition is added.
But there are some optimisations that can be done first.
The significant one is implement min3() directly (and probably without
the 'can be constant' logic.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Build performance regressions originating from min()/max() macros
2024-07-24 9:40 ` Jürgen Groß
2024-07-24 9:43 ` Lorenzo Stoakes
@ 2024-07-24 14:47 ` David Laight
1 sibling, 0 replies; 10+ messages in thread
From: David Laight @ 2024-07-24 14:47 UTC (permalink / raw)
To: 'Jürgen Groß', Lorenzo Stoakes
Cc: Andrew Morton, Arnd Bergmann, willy, torvalds, Jason, hch,
andriy.shevchenko, pedro.falcato, Mateusz Guzik, linux-mm,
linux-kernel
From: Jürgen Groß
> Sent: 24 July 2024 10:40
>
> On 24.07.24 10:31, Lorenzo Stoakes wrote:
> > On Wed, Jul 24, 2024 at 10:14:12AM GMT, Jürgen Groß wrote:
> >> On 23.07.24 23:59, Lorenzo Stoakes wrote:
> >>> Arnd reported a significant build slowdown [0], which was bisected to the
> >>> series spanning commit 80fcac55385c ("minmax: relax check to allow
> >>> comparison between unsigned arguments and signed constants") to commit
> >>> 867046cc70277 ("minmax: relax check to allow comparison between unsigned
> >>> arguments and signed constants"), originating from the series "minmax:
> >>> Relax type checks in min() and max()." [1].
> >
> > [snip]
> >
> >> I can send a patch to simplify the problematic construct, but OTOH this
> >> will avoid only one particularly bad example.
> >
> > Thanks, appreciated but I am a little concerned that we might get stuck in
> > whack-a-mole here a bit. I'm pretty sure we've had previous patches that
> > have addressed invocation points, but obviously the underlying issue are
> > these macros which will keep cropping up again and again.
>
> The xen example seems to be one of the worst due to nesting of min3() and
> min(), so being de facto a min4().
>
> I think drivers/firmware/sysfb_simplefb.c has a similar problem, as it is
> nesting max() with max3(). Same applies to arch/x86/kernel/cpu/cacheinfo.c
> and multiple times to fs/xfs/libxfs/xfs_trans_resv.c.
>
> There are probably more such extreme cases.
I've just sent in a 7-part patch series that slightly reduces the complexity
and directly implements min3() and max3().
The latter should help.
This is based on part of a series I send months ago that will have 'got lost'.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-07-24 14:48 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-23 21:59 Build performance regressions originating from min()/max() macros Lorenzo Stoakes
2024-07-24 8:14 ` Jürgen Groß
2024-07-24 8:31 ` Lorenzo Stoakes
2024-07-24 9:40 ` Jürgen Groß
2024-07-24 9:43 ` Lorenzo Stoakes
2024-07-24 14:47 ` David Laight
2024-07-24 8:34 ` David Laight
2024-07-24 8:50 ` Arnd Bergmann
2024-07-24 8:44 ` Lorenzo Stoakes
2024-07-24 11:05 ` David Laight
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox