From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf@vger.kernel.org, andrii@kernel.org, memxor@gmail.com,
akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz,
rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
tglx@linutronix.de, jannh@google.com, tj@kernel.org,
linux-mm@kvack.org, kernel-team@fb.com
Subject: Re: [PATCH bpf-next v5 2/7] mm, bpf: Introduce free_pages_nolock()
Date: Fri, 17 Jan 2025 19:20:55 +0100
Message-ID: <20250117182055._8lyYECm@linutronix.de>
In-Reply-To: <20250115021746.34691-3-alexei.starovoitov@gmail.com>
On 2025-01-14 18:17:41 [-0800], Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> Introduce free_pages_nolock() that can free pages without taking locks.
> It relies on trylock and can be called from any context.
> Since spin_trylock() cannot be used on PREEMPT_RT from hard IRQ or NMI
> context, it uses a lockless linked list (llist) to stash the pages, which
> will be freed by a subsequent free_pages() call from a good context.
>
> Do not use llist unconditionally. BPF maps continuously
> allocate/free, so we cannot unconditionally delay the freeing to
> llist. When memory becomes free, make it available to the
> kernel and BPF users right away if possible, and fall back to
> llist as a last resort.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
…
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 74c2a7af1a77..a9c639e3db91 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1247,13 +1250,44 @@ static void split_large_buddy(struct zone *zone, struct page *page,
…
>  static void free_one_page(struct zone *zone, struct page *page,
>  			  unsigned long pfn, unsigned int order,
>  			  fpi_t fpi_flags)
>  {
> +	struct llist_head *llhead;
>  	unsigned long flags;
>
> -	spin_lock_irqsave(&zone->lock, flags);
> +	if (!spin_trylock_irqsave(&zone->lock, flags)) {
> +		if (unlikely(fpi_flags & FPI_TRYLOCK)) {
> +			add_page_to_zone_llist(zone, page, order);
> +			return;
> +		}
> +		spin_lock_irqsave(&zone->lock, flags);
> +	}
> +
> +	/* The lock succeeded. Process deferred pages. */
> +	llhead = &zone->trylock_free_pages;
> +	if (unlikely(!llist_empty(llhead) && !(fpi_flags & FPI_TRYLOCK))) {

Thank you.

> +		struct llist_node *llnode;
> +		struct page *p, *tmp;
> +
> +		llnode = llist_del_all(llhead);
> +		llist_for_each_entry_safe(p, tmp, llnode, pcp_llist) {
> +			unsigned int p_order = p->order;
> +
> +			split_large_buddy(zone, p, page_to_pfn(p), p_order, fpi_flags);
> +			__count_vm_events(PGFREE, 1 << p_order);
> +		}
> +	}

Sebastian
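
For readers following the hunk above, the trylock-or-defer pattern it uses
can be sketched in userspace C. This is only an illustration under stated
assumptions, not the kernel code: struct pool, struct item, pool_free() and
the defer_*() helpers are hypothetical names, a pthread mutex stands in for
zone->lock, and a C11 atomic pointer stands in for the llist. It does not
model IRQ/NMI context or the PREEMPT_RT restriction on spin_trylock().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; "next" plays the role of pcp_llist. */
struct item {
	struct item *next;
	unsigned int order;
};

/* Hypothetical stand-in for struct zone. */
struct pool {
	pthread_mutex_t lock;            /* plays the role of zone->lock */
	_Atomic(struct item *) deferred; /* plays the role of trylock_free_pages */
	unsigned long freed;             /* units freed, protected by lock */
};

#define POOL_INIT { PTHREAD_MUTEX_INITIALIZER, NULL, 0 }

/* Lock-free push onto the deferred list (similar in spirit to llist_add()). */
static void defer_push(struct pool *p, struct item *it)
{
	struct item *head = atomic_load_explicit(&p->deferred,
						 memory_order_relaxed);

	do {
		it->next = head;
	} while (!atomic_compare_exchange_weak_explicit(&p->deferred, &head, it,
							memory_order_release,
							memory_order_relaxed));
}

/* Detach the whole deferred list (similar in spirit to llist_del_all()). */
static struct item *defer_take_all(struct pool *p)
{
	return atomic_exchange_explicit(&p->deferred, (struct item *)NULL,
					memory_order_acquire);
}

/* Account one item as freed. Must be called with p->lock held. */
static void free_locked(struct pool *p, struct item *it)
{
	assert(it != NULL);
	p->freed += 1UL << it->order;
}

/*
 * Returns true if the item was freed right away, false if it was deferred.
 * With trylock_only set the caller must not block, so on contention the
 * item is stashed on the lock-free list instead.
 */
static bool pool_free(struct pool *p, struct item *it, bool trylock_only)
{
	if (pthread_mutex_trylock(&p->lock) != 0) {
		if (trylock_only) {
			defer_push(p, it);
			return false;
		}
		pthread_mutex_lock(&p->lock);
	}

	/* The lock is held. Only a caller that may block drains the backlog. */
	if (!trylock_only) {
		struct item *d = defer_take_all(p);

		while (d) {
			struct item *next = d->next;

			free_locked(p, d);
			d = next;
		}
	}

	free_locked(p, it);
	pthread_mutex_unlock(&p->lock);
	return true;
}
```

In this sketch, as in the hunk, a trylock caller that wins the lock still
frees its own item immediately, so deferral stays the last resort, and only
a regular (blocking) caller pays the cost of draining the deferred list.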