From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Michael Fara <mjfara@gmail.com>, senozhatsky@chromium.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, Minchan Kim <minchan@kernel.org>,
	Brian Geffon <bgeffon@google.com>
Subject: Re: [PATCH] mm/zsmalloc: fix NULL pointer dereference in get_next_zpdesc
Date: Wed, 18 Feb 2026 14:56:37 +0900
Message-ID: <kqx5soah7aalionfss6kwht76xbp4ajwwyxawixkolmumj4m3i@cgdoxu25wp63>
In-Reply-To: <20260209225018.1541260-1-joshua.hahnjy@gmail.com>

* Recovered from SPAM folder *

On (26/02/09 14:50), Joshua Hahn wrote:
> On Mon,  9 Feb 2026 19:32:57 +0000 Michael Fara <mjfara@gmail.com> wrote:
> 
> Hello Michael,
> 
> I hope you are doing well! Thank you for the patch. I'm not entirely sure
> that the race condition you note here is real, or that this is the right
> fix.

Thanks for taking a look!

> > get_next_zpdesc() calls get_zspage() which unconditionally dereferences
> > zpdesc->zspage without a NULL check. This causes a kernel oops when
> > zpdesc->zspage has been set to NULL by reset_zpdesc() during a race
> > between zspage destruction and page compaction/migration.
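
For reference, the helpers in question look roughly like the below. I'm
paraphrasing mm/zsmalloc.c from memory here, so the details (the ZSPAGE_MAGIC
check, the huge-page special case, field names) are approximate:

static struct zspage *get_zspage(struct zpdesc *zpdesc)
{
	struct zspage *zspage = zpdesc->zspage;

	/* with a NULL ->zspage this is where we blow up */
	BUG_ON(zspage->magic != ZSPAGE_MAGIC);
	return zspage;
}

static struct zpdesc *get_next_zpdesc(struct zpdesc *zpdesc)
{
	struct zspage *zspage = get_zspage(zpdesc);

	if (unlikely(ZsHugePage(zspage)))
		return NULL;
	return zpdesc->next;
}
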
> > 
> > The race window is documented in a TODO comment in zs_page_migrate():
> > 
> >     "nothing prevents a zspage from getting destroyed while it is
> >     isolated for migration, as the page lock is temporarily dropped
> >     after zs_page_isolate() succeeded"
> 
> I'm taking a look at zsmalloc these days; I was also confused by this comment
> and remember looking into what was going on here :-) My impression was that
> it should be safe in every other case, though.
> 
> > The sequence is:
> >   1. Compaction calls zs_page_isolate() on a zpdesc, then drops its
> >      page lock.
> >   2. Concurrently, async_free_zspage() or free_zspage() destroys the
> >      zspage, calling reset_zpdesc() which sets zpdesc->zspage = NULL.
> 
> async_free_zspage() isolates the zspage first by taking the class lock and
> then splicing the zspage out. free_zspage() is always called with the class
> lock held, and it likewise removes the zspage from its fullness list.
> 
> zs_free -> free_zspage -> trylock_zspage runs with class->lock held.
> In zs_page_migrate(), we call replace_sub_page() to remove the zpdesc from
> the chain before calling reset_zpdesc(), so I don't think there would be a
> race there either.

Right, I'm having the same thoughts.  I need to look closer, but I sort
of expect that class->lock should have kept us safe here.
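
FWIW, my recollection of the migration side (abridged and possibly a bit
stale, the names like new_zpdesc are mine, so please don't take this as the
actual code) is that the old zpdesc gets unlinked before its backpointer is
ever cleared:

	/* zs_page_migrate(), heavily abridged */
	spin_lock(&class->lock);
	/* ... object data is copied over to the new zpdesc ... */
	replace_sub_page(class, zspage, new_zpdesc, zpdesc);	/* old zpdesc unlinked */
	spin_unlock(&class->lock);

	reset_zpdesc(zpdesc);	/* ->zspage cleared only after the unlink */

So a zs_free() that walks the chain under class->lock should either see the
old zpdesc still fully linked, or not see it at all.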

> >   3. A subsequent zs_free() path calls trylock_zspage(), which iterates
> >      zpdescs via get_next_zpdesc(). get_zspage() dereferences the now-
> >      NULL backpointer, causing:
> 
> So I don't see how a subsequent zs_free() could race with reset_zpdesc() on
> the same zpdesc; maybe I'm missing something? :-)
> 
> >        BUG: kernel NULL pointer dereference, address: 0000000000000000
> >        RIP: 0010:free_zspage+0x26/0x100
> >        Call Trace:
> >         zs_free+0xf4/0x110
> >         zswap_entry_free+0x7e/0x160
> 
> Is this from a real crash? A more detailed crash dump would be helpful to
> see what else was happening on the other threads!
> 
> > The migration side already has a NULL guard (zs_page_migrate line 1675:
> > "if (!zpdesc->zspage) return 0;"), but get_next_zpdesc() lacks the same
> > protection.
> > 
> > Fix this by reading zpdesc->zspage directly in get_next_zpdesc()
> > instead of going through get_zspage(), and returning NULL when the
> > backpointer is NULL. This stops iteration safely — the caller treats
> > it as the end of the page chain.
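
If I read the description right, the change boils down to something like
this (my paraphrase rather than the actual hunk, with the same caveats as
above about the huge-page check):

static struct zpdesc *get_next_zpdesc(struct zpdesc *zpdesc)
{
	struct zspage *zspage = zpdesc->zspage;	/* no get_zspage(), no magic check */

	if (!zspage)
		return NULL;	/* the caller sees "end of chain" */
	if (unlikely(ZsHugePage(zspage)))
		return NULL;
	return zpdesc->next;
}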
> 
> If we return NULL early, what happens to the remaining zpdescs in the
> zspage? In trylock_zspage() for instance, we might exit early and think we
> successfully locked all zpdescs when that isn't the case. That would
> eventually lead to a VM_BUG_ON_PAGE when we try to free the zspage later
> anyway.

Agreed.
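
To spell out why: trylock_zspage() uses get_next_zpdesc() as its loop
cursor, roughly like this (again from memory, helper names such as
get_first_zpdesc/zpdesc_trylock/zpdesc_unlock are approximate):

static int trylock_zspage(struct zspage *zspage)
{
	struct zpdesc *cursor, *fail;

	for (cursor = get_first_zpdesc(zspage); cursor != NULL;
	     cursor = get_next_zpdesc(cursor)) {
		if (!zpdesc_trylock(cursor)) {
			fail = cursor;
			goto unlock;
		}
	}
	/* a premature NULL from get_next_zpdesc() also lands here */
	return 1;

unlock:
	for (cursor = get_first_zpdesc(zspage); cursor != fail;
	     cursor = get_next_zpdesc(cursor))
		zpdesc_unlock(cursor);
	return 0;
}

A NULL returned mid-chain is indistinguishable from a successful walk, so
the not-yet-locked tail zpdescs get freed as if they were locked, which is
presumably where the VM_BUG_ON_PAGE you mention comes from.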


Thread overview: 8+ messages
2026-02-09 19:32 Michael Fara
2026-02-09 22:50 ` Joshua Hahn
2026-02-09 23:16   ` Joshua Hahn
2026-02-18  5:56   ` Sergey Senozhatsky [this message]
2026-02-09 19:36 Michael Fara
2026-02-09 19:37 Michael Fara
2026-02-18  5:01 ` Sergey Senozhatsky
2026-02-18  5:46 ` Sergey Senozhatsky
