* Re: [PATCH mm] mm/page_alloc: Avoid second trylock of zone->lock
From: Sebastian Andrzej Siewior @ 2025-03-31 8:11 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Linus Torvalds, bpf, daniel, andrii, martin.lau, akpm, peterz,
vbabka, rostedt, shakeel.butt, mhocko, linux-mm, linux-kernel
On 2025-03-30 17:28:09 [-0700], Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> spin_trylock followed by spin_lock will cause an extra write cache
> access. If the lock is contended, it may cause unnecessary cache
> line bouncing and will execute a redundant irq restore/save pair.
> Therefore, check alloc/fpi_flags first and use either spin_trylock or
> spin_lock.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
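The shape of the change, as a minimal before/after sketch (simplified
from the patch under discussion; "try_only" stands in for the
FPI_TRYLOCK/ALLOC_TRYLOCK tests, and the bail-out action varies per
call site):

    /* Before: always trylock first, then check why it failed. On
     * contention the failed trylock has already written to the lock
     * cacheline before spin_lock touches it again.
     */
    if (!spin_trylock_irqsave(&zone->lock, flags)) {
            if (unlikely(try_only))
                    return;         /* opportunistic caller bails out */
            spin_lock_irqsave(&zone->lock, flags);
    }

    /* After: pick the locking strategy from the flags up front, so the
     * common (non-trylock) path does a single lock acquisition.
     */
    if (unlikely(try_only)) {
            if (!spin_trylock_irqsave(&zone->lock, flags))
                    return;         /* opportunistic caller bails out */
    } else {
            spin_lock_irqsave(&zone->lock, flags);
    }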
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Sebastian
* Re: [PATCH mm] mm/page_alloc: Avoid second trylock of zone->lock
From: Michal Hocko @ 2025-03-31 10:52 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Linus Torvalds, bpf, daniel, andrii, martin.lau, akpm, peterz,
vbabka, bigeasy, rostedt, shakeel.butt, linux-mm, linux-kernel
On Sun 30-03-25 17:28:09, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> spin_trylock followed by spin_lock will cause an extra write cache
> access. If the lock is contended, it may cause unnecessary cache
> line bouncing and will execute a redundant irq restore/save pair.
> Therefore, check alloc/fpi_flags first and use either spin_trylock or
> spin_lock.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Makes sense. The Fixes tag is probably overreaching, but whatever.
Acked-by: Michal Hocko <mhocko@suse.com>
Thanks!
> ---
> mm/page_alloc.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e3ea5bf5c459..ffbb5678bc2f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1268,11 +1268,12 @@ static void free_one_page(struct zone *zone, struct page *page,
> struct llist_head *llhead;
> unsigned long flags;
>
> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
> - if (unlikely(fpi_flags & FPI_TRYLOCK)) {
> + if (unlikely(fpi_flags & FPI_TRYLOCK)) {
> + if (!spin_trylock_irqsave(&zone->lock, flags)) {
> add_page_to_zone_llist(zone, page, order);
> return;
> }
> + } else {
> spin_lock_irqsave(&zone->lock, flags);
> }
>
> @@ -2341,9 +2342,10 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
> unsigned long flags;
> int i;
>
> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
> - if (unlikely(alloc_flags & ALLOC_TRYLOCK))
> + if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
> + if (!spin_trylock_irqsave(&zone->lock, flags))
> return 0;
> + } else {
> spin_lock_irqsave(&zone->lock, flags);
> }
> for (i = 0; i < count; ++i) {
> @@ -2964,9 +2966,10 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
>
> do {
> page = NULL;
> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
> - if (unlikely(alloc_flags & ALLOC_TRYLOCK))
> + if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
> + if (!spin_trylock_irqsave(&zone->lock, flags))
> return NULL;
> + } else {
> spin_lock_irqsave(&zone->lock, flags);
> }
> if (alloc_flags & ALLOC_HIGHATOMIC)
> --
> 2.47.1
--
Michal Hocko
SUSE Labs
* Re: [PATCH mm] mm/page_alloc: Avoid second trylock of zone->lock
From: Vlastimil Babka @ 2025-03-31 12:17 UTC (permalink / raw)
To: Michal Hocko, Alexei Starovoitov
Cc: Linus Torvalds, bpf, daniel, andrii, martin.lau, akpm, peterz,
bigeasy, rostedt, shakeel.butt, linux-mm, linux-kernel
On 3/31/25 12:52, Michal Hocko wrote:
> On Sun 30-03-25 17:28:09, Alexei Starovoitov wrote:
>> From: Alexei Starovoitov <ast@kernel.org>
>>
>> spin_trylock followed by spin_lock will cause an extra write cache
>> access. If the lock is contended, it may cause unnecessary cache
>> line bouncing
Right.
>> and will execute a redundant irq restore/save pair.
Maybe that part matters less if we're likely to have to spin anyway; at
least it doesn't affect other CPUs, unlike the bouncing.
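On a contended lock, the old path roughly expands to the following (a
simplified expansion of the non-RT spinlock macros, with lockdep and
preempt accounting omitted; illustrative, not the verbatim kernel
code):

    local_irq_save(flags);                  /* save and disable irqs */
    if (!do_raw_spin_trylock(&zone->lock)) {
            local_irq_restore(flags);       /* trylock failed: restore irqs... */
            /* ...TRYLOCK callers bail out here; everyone else: */
            local_irq_save(flags);          /* ...save and disable again */
            do_raw_spin_lock(&zone->lock);  /* and spin for the lock */
    }

which is the redundant restore/save pair the commit message refers to.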
>> Therefore, check alloc/fpi_flags first and use either spin_trylock or
>> spin_lock.
Yeah, this should still be OK for the zone lock, as the fast paths use
the pcplists, so we still shouldn't be making the page allocator slower
due to the try_alloc addition.
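Roughly, the order-0 allocation path looks like this (a heavily
simplified sketch; try_pcplist() and try_buddy() are illustrative
stand-ins, not the real mm/page_alloc.c helpers):

    static struct page *rmqueue_sketch(struct zone *zone, unsigned int order,
                                       unsigned int alloc_flags)
    {
            struct page *page;

            /* fast path: per-cpu page lists, zone->lock is not taken */
            page = try_pcplist(zone, order, alloc_flags);
            if (page)
                    return page;

            /* slow path: refill from the buddy freelists under zone->lock;
             * this is where the patched trylock-vs-lock choice sits */
            return try_buddy(zone, order, alloc_flags);
    }

so the added flags check only runs once the pcplists are exhausted.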
>> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
>> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>
> Makes sense. The Fixes tag is probably overreaching, but whatever.
It's fixing 6.15-rc1 code, so there are no possible stable implications anyway.
> Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Thanks!
>
>> ---
>> mm/page_alloc.c | 15 +++++++++------
>> 1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index e3ea5bf5c459..ffbb5678bc2f 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1268,11 +1268,12 @@ static void free_one_page(struct zone *zone, struct page *page,
>> struct llist_head *llhead;
>> unsigned long flags;
>>
>> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
>> - if (unlikely(fpi_flags & FPI_TRYLOCK)) {
>> + if (unlikely(fpi_flags & FPI_TRYLOCK)) {
>> + if (!spin_trylock_irqsave(&zone->lock, flags)) {
>> add_page_to_zone_llist(zone, page, order);
>> return;
>> }
>> + } else {
>> spin_lock_irqsave(&zone->lock, flags);
>> }
>>
>> @@ -2341,9 +2342,10 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
>> unsigned long flags;
>> int i;
>>
>> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
>> - if (unlikely(alloc_flags & ALLOC_TRYLOCK))
>> + if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
>> + if (!spin_trylock_irqsave(&zone->lock, flags))
>> return 0;
>> + } else {
>> spin_lock_irqsave(&zone->lock, flags);
>> }
>> for (i = 0; i < count; ++i) {
>> @@ -2964,9 +2966,10 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
>>
>> do {
>> page = NULL;
>> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
>> - if (unlikely(alloc_flags & ALLOC_TRYLOCK))
>> + if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
>> + if (!spin_trylock_irqsave(&zone->lock, flags))
>> return NULL;
>> + } else {
>> spin_lock_irqsave(&zone->lock, flags);
>> }
>> if (alloc_flags & ALLOC_HIGHATOMIC)
>> --
>> 2.47.1
>
* Re: [PATCH mm] mm/page_alloc: Avoid second trylock of zone->lock
From: Alexei Starovoitov @ 2025-03-31 16:59 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Michal Hocko, Linus Torvalds, bpf, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Andrew Morton, Peter Zijlstra,
Sebastian Sewior, Steven Rostedt, Shakeel Butt, linux-mm, LKML
On Mon, Mar 31, 2025 at 5:17 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> >> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> >> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> >
> > Makes sense. The Fixes tag is probably overreaching, but whatever.
>
> It's fixing 6.15-rc1 code, so there are no possible stable implications anyway.
All true. I added the Fixes tag only because, if I didn't, somebody
would question why the tag is missing :)
I often look at "Fixes:" as "Strongly-related-to:".
We might backport these patches to older kernels well before 6.15
is released, so having a documented way to strongly connect patches
is a good thing.
Thanks for the reviews everyone.
* Re: [PATCH mm] mm/page_alloc: Avoid second trylock of zone->lock
From: Harry Yoo @ 2025-04-01 3:28 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Linus Torvalds, bpf, daniel, andrii, martin.lau, akpm, peterz,
vbabka, bigeasy, rostedt, shakeel.butt, mhocko, linux-mm,
linux-kernel
On Sun, Mar 30, 2025 at 05:28:09PM -0700, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> spin_trylock followed by spin_lock will cause an extra write cache
> access. If the lock is contended, it may cause unnecessary cache
> line bouncing and will execute a redundant irq restore/save pair.
> Therefore, check alloc/fpi_flags first and use either spin_trylock or
> spin_lock.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
Looks good to me,
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> mm/page_alloc.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e3ea5bf5c459..ffbb5678bc2f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1268,11 +1268,12 @@ static void free_one_page(struct zone *zone, struct page *page,
> struct llist_head *llhead;
> unsigned long flags;
>
> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
> - if (unlikely(fpi_flags & FPI_TRYLOCK)) {
> + if (unlikely(fpi_flags & FPI_TRYLOCK)) {
> + if (!spin_trylock_irqsave(&zone->lock, flags)) {
> add_page_to_zone_llist(zone, page, order);
> return;
> }
> + } else {
> spin_lock_irqsave(&zone->lock, flags);
> }
>
> @@ -2341,9 +2342,10 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
> unsigned long flags;
> int i;
>
> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
> - if (unlikely(alloc_flags & ALLOC_TRYLOCK))
> + if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
> + if (!spin_trylock_irqsave(&zone->lock, flags))
> return 0;
> + } else {
> spin_lock_irqsave(&zone->lock, flags);
> }
> for (i = 0; i < count; ++i) {
> @@ -2964,9 +2966,10 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
>
> do {
> page = NULL;
> - if (!spin_trylock_irqsave(&zone->lock, flags)) {
> - if (unlikely(alloc_flags & ALLOC_TRYLOCK))
> + if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
> + if (!spin_trylock_irqsave(&zone->lock, flags))
> return NULL;
> + } else {
> spin_lock_irqsave(&zone->lock, flags);
> }
> if (alloc_flags & ALLOC_HIGHATOMIC)
> --
> 2.47.1
>
>
--
Cheers,
Harry (formerly known as Hyeonggon)
* Re: [PATCH mm] mm/page_alloc: Avoid second trylock of zone->lock
From: Shakeel Butt @ 2025-04-02 4:29 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Linus Torvalds, bpf, daniel, andrii, martin.lau, akpm, peterz,
vbabka, bigeasy, rostedt, mhocko, linux-mm, linux-kernel
On Sun, Mar 30, 2025 at 05:28:09PM -0700, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> spin_trylock followed by spin_lock will cause an extra write cache
> access. If the lock is contended, it may cause unnecessary cache
> line bouncing and will execute a redundant irq restore/save pair.
> Therefore, check alloc/fpi_flags first and use either spin_trylock or
> spin_lock.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>