From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Michael Fara <mjfara@gmail.com>,
senozhatsky@chromium.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
Minchan Kim <minchan@kernel.org>,
Brian Geffon <bgeffon@google.com>
Subject: Re: [PATCH] mm/zsmalloc: fix NULL pointer dereference in get_next_zpdesc
Date: Wed, 18 Feb 2026 14:56:37 +0900
Message-ID: <kqx5soah7aalionfss6kwht76xbp4ajwwyxawixkolmumj4m3i@cgdoxu25wp63>
In-Reply-To: <20260209225018.1541260-1-joshua.hahnjy@gmail.com>
* Recovered from SPAM folder *
On (26/02/09 14:50), Joshua Hahn wrote:
> On Mon, 9 Feb 2026 19:32:57 +0000 Michael Fara <mjfara@gmail.com> wrote:
>
> Hello Michael,
>
> I hope you are doing well! Thank you for the patch. I'm not entirely sure if
> the race condition that you note here is correct, and also if this is the
> right fix.
Thanks for taking a look!
> > get_next_zpdesc() calls get_zspage() which unconditionally dereferences
> > zpdesc->zspage without a NULL check. This causes a kernel oops when
> > zpdesc->zspage has been set to NULL by reset_zpdesc() during a race
> > between zspage destruction and page compaction/migration.
> >
> > The race window is documented in a TODO comment in zs_page_migrate():
> >
> > "nothing prevents a zspage from getting destroyed while it is
> > isolated for migration, as the page lock is temporarily dropped
> > after zs_page_isolate() succeeded"
>
> I'm taking a look at zsmalloc these days, and I was confused by this comment
> and remember looking into what was going on here as well :-) My thoughts
> were that it should be safe in every other case, though.
>
> > The sequence is:
> > 1. Compaction calls zs_page_isolate() on a zpdesc, then drops its
> > page lock.
> > 2. Concurrently, async_free_zspage() or free_zspage() destroys the
> > zspage, calling reset_zpdesc() which sets zpdesc->zspage = NULL.
>
> async_free_zspage isolates the zspage first by taking the class lock and
> then splicing the zspage out. free_zspage is always called with the
> class lock held, and it likewise removes itself from the fullness list.
>
> zs_free -> free_zspage -> trylock_zspage holds the class->lock
> In zs_page_migrate, we call replace_sub_page to remove the zpdesc from the
> chain before calling reset_zpdesc, so I don't think there would be a race there
> either.
Right, I'm having the same thoughts. I need to look closer, but I'd
expect class->lock to have kept us safe here.
> > 3. A subsequent zs_free() path calls trylock_zspage(), which iterates
> > zpdescs via get_next_zpdesc(). get_zspage() dereferences the now-
> > NULL backpointer, causing:
>
> So I don't see how a subsequent zs_free could race with reset_zpdesc on
> the same zpdesc, maybe I'm missing something? :-)
>
> > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > RIP: 0010:free_zspage+0x26/0x100
> > Call Trace:
> > zs_free+0xf4/0x110
> > zswap_entry_free+0x7e/0x160
>
> Is this from a real crash? A more detailed crash dump would be helpful to
> see what else was happening on the other threads!
>
> > The migration side already has a NULL guard (zs_page_migrate line 1675:
> > "if (!zpdesc->zspage) return 0;"), but get_next_zpdesc() lacks the same
> > protection.
> >
> > Fix this by reading zpdesc->zspage directly in get_next_zpdesc()
> > instead of going through get_zspage(), and returning NULL when the
> > backpointer is NULL. This stops iteration safely — the caller treats
> > it as the end of the page chain.
>
> If we return NULL early, what happens to the remaining zpdescs in the
> zspage? In trylock_zspage for instance, we might exit early and think we
> successfully locked all zpdescs, when that isn't the case. It would eventually
> lead to a VM_BUG_ON_PAGE when trying to free the zspage later anyways.
Agreed.
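
To make the early-exit concern concrete, here is a minimal userspace
sketch. It is not the kernel code: zspage_model, zpdesc_model,
next_patched(), trylock_chain(), build_chain() and count_locked() are
hypothetical stand-ins, and the unlock-on-failure path of the real
trylock_zspage() is left out. It only models a walk that stops as soon
as it sees a NULL backpointer, as the proposed get_next_zpdesc() would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct zspage_model {
	int id;
};

struct zpdesc_model {
	struct zspage_model *zspage;	/* backpointer, NULL after a reset */
	struct zpdesc_model *next;	/* next sub-page in the chain */
	bool locked;
};

/* Models the proposed change: a NULL backpointer ends the walk. */
static struct zpdesc_model *next_patched(struct zpdesc_model *zpdesc)
{
	if (!zpdesc->zspage)
		return NULL;
	return zpdesc->next;
}

/*
 * Models the trylock_zspage() loop. Returns true when the walk runs to
 * completion. The real function unlocks what it took on failure; that
 * path is omitted here.
 */
static bool trylock_chain(struct zpdesc_model *first)
{
	struct zpdesc_model *cursor;

	for (cursor = first; cursor != NULL; cursor = next_patched(cursor)) {
		if (cursor->locked)
			return false;
		cursor->locked = true;
	}
	return true;
}

/* Link n descriptors into one chain owned by zspage; returns the head. */
static struct zpdesc_model *build_chain(struct zpdesc_model *pages, size_t n,
					struct zspage_model *zspage)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		pages[i].zspage = zspage;
		pages[i].next = (i + 1 < n) ? &pages[i + 1] : NULL;
		pages[i].locked = false;
	}
	return &pages[0];
}

/* Count how many descriptors in the array ended up locked. */
static size_t count_locked(const struct zpdesc_model *pages, size_t n)
{
	size_t locked = 0;

	for (size_t i = 0; i < n; i++)
		if (pages[i].locked)
			locked++;
	return locked;
}
```

With a three-entry chain whose middle entry has its backpointer cleared,
trylock_chain() reports success while the last entry is never locked,
which is the situation described above: the caller believes it holds
every zpdesc when it does not.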