From: Uladzislau Rezki <urezki@gmail.com>
To: Baoquan He <bhe@redhat.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 6/8] mm/vmalloc: Defer freeing partly initialized vm_struct
Date: Mon, 18 Aug 2025 15:02:57 +0200
Message-ID: <aKMkgbZqOqyGVF1C@pc636>
In-Reply-To: <aKKqOzepmIkOJi3i@MiWiFi-R3L-srv>
On Mon, Aug 18, 2025 at 12:21:15PM +0800, Baoquan He wrote:
> On 08/07/25 at 09:58am, Uladzislau Rezki (Sony) wrote:
> > __vmalloc_area_node() may call free_vmap_area() or vfree() on
> > error paths, both of which can sleep. This becomes problematic
> > if the function is invoked from an atomic context, such as when
> > GFP_ATOMIC or GFP_NOWAIT is passed via gfp_mask.
> >
> > To fix this, unify error paths and defer the cleanup of partly
> > initialized vm_struct objects to a workqueue. This ensures that
> > freeing happens in a process context and avoids invalid sleeps
> > in atomic regions.
> >
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > ---
> > include/linux/vmalloc.h | 6 +++++-
> > mm/vmalloc.c | 34 +++++++++++++++++++++++++++++++---
> > 2 files changed, 36 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> > index fdc9aeb74a44..b1425fae8cbf 100644
> > --- a/include/linux/vmalloc.h
> > +++ b/include/linux/vmalloc.h
> > @@ -50,7 +50,11 @@ struct iov_iter; /* in uio.h */
> >  #endif
> >
> >  struct vm_struct {
> > -	struct vm_struct *next;
> > +	union {
> > +		struct vm_struct *next; /* Early registration of vm_areas. */
> > +		struct llist_node llnode; /* Asynchronous freeing on error paths. */
> > +	};
> > +
> >  	void *addr;
> >  	unsigned long size;
> >  	unsigned long flags;
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 7f48a54ec108..2424f80d524a 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3680,6 +3680,35 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> >  	return nr_allocated;
> >  }
> >
> > +static LLIST_HEAD(pending_vm_area_cleanup);
> > +static void cleanup_vm_area_work(struct work_struct *work)
> > +{
> > +	struct vm_struct *area, *tmp;
> > +	struct llist_node *head;
> > +
> > +	head = llist_del_all(&pending_vm_area_cleanup);
> > +	if (!head)
> > +		return;
> > +
> > +	llist_for_each_entry_safe(area, tmp, head, llnode) {
> > +		if (!area->pages)
> > +			free_vm_area(area);
> > +		else
> > +			vfree(area->addr);
> > +	}
> > +}
> > +
> > +/*
> > + * Helper for __vmalloc_area_node() to defer cleanup
> > + * of partially initialized vm_struct in error paths.
> > + */
> > +static DECLARE_WORK(cleanup_vm_area, cleanup_vm_area_work);
> > +static void defer_vm_area_cleanup(struct vm_struct *area)
> > +{
> > +	if (llist_add(&area->llnode, &pending_vm_area_cleanup))
> > +		schedule_work(&cleanup_vm_area);
> > +}
>
> Wondering why we need to call schedule_work() here when
> pending_vm_area_cleanup was empty before adding the new entry. Shouldn't
> it be as below to schedule the job? Not sure if I am missing anything.
>
> 	if (!llist_add(&area->llnode, &pending_vm_area_cleanup))
> 		schedule_work(&cleanup_vm_area);
>
> =====
> /**
>  * llist_add - add a new entry
>  * @new: new entry to be added
>  * @head: the head for your lock-less list
>  *
>  * Returns true if the list was empty prior to adding this entry.
>  */
> static inline bool llist_add(struct llist_node *new, struct llist_head *head)
> {
> 	return llist_add_batch(new, new, head);
> }
> =====
>
But then the work would not be scheduled: if the list is empty and we add
one element, llist_add() returns true, but your condition expects false.

How it works:

If someone keeps adding to the llist while it is not empty, we should not
trigger new work, because the current work is still in flight (it will
cover the newcomers), i.e. it has been scheduled but has not yet completed
llist_del_all() on the head.

Once that is done, a newcomer will trigger the work again only if it sees
NULL, i.e. when the list is empty.
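
To illustrate the rule above, here is a minimal userspace sketch, assuming
C11 atomics as a stand-in for the kernel llist. It is not the kernel
implementation: list_add(), list_del_all(), defer_cleanup(), run_work() and
the work_scheduled counter are hypothetical names used only for this
example, and the counter merely stands in for schedule_work().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct llist_node. */
struct node {
	struct node *next;
};

/* Hypothetical stand-in for the pending_vm_area_cleanup list head. */
static _Atomic(struct node *) pending = NULL;

/* Counts how often the (simulated) work item would be scheduled. */
static int work_scheduled;

/* Push one node; return true if the list was empty before the push. */
static bool list_add(struct node *n)
{
	struct node *first = atomic_load(&pending);

	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak(&pending, &first, n));

	return first == NULL;
}

/* Detach the whole list in one step and return its old head. */
static struct node *list_del_all(void)
{
	return atomic_exchange(&pending, NULL);
}

/* Only the adder that saw an empty list kicks the work. */
static void defer_cleanup(struct node *n)
{
	if (list_add(n))
		work_scheduled++;	/* stands in for schedule_work() */
}

/* Simulated work function: drain everything queued so far. */
static int run_work(void)
{
	int handled = 0;

	for (struct node *p = list_del_all(); p; p = p->next)
		handled++;

	return handled;
}
```

Calling defer_cleanup() three times before run_work() bumps the counter
once, and the single run handles all three entries. The next
defer_cleanup() after the drain sees an empty list and bumps it again.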
--
Uladzislau Rezki