From: Nhat Pham <nphamcs@gmail.com>
To: Pedro Falcato <pedro.falcato@gmail.com>
Cc: mm <linux-mm@kvack.org>,
Linux Regressions <regressions@lists.linux.dev>,
Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosryahmed@google.com>,
Chengming Zhou <chengming.zhou@linux.dev>,
Christian Heusel <christian@heusel.eu>
Subject: Re: zswap_writeback_entry crashes in 6.9.5
Date: Sun, 30 Jun 2024 18:07:47 -0700
Message-ID: <CAKEwX=NkWzyhb6dZeV1=sGKwurwhk7wMcH_RtwJ9ztgABjMyfQ@mail.gmail.com>
In-Reply-To: <CAKbZUD1-kqfuV0U+KDKPkQbm=RwzD_A1H3qk_c+bw92CqtMbuw@mail.gmail.com>
On Sun, Jun 30, 2024 at 10:58 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> Hi everyone,
Hi Pedro,
Thanks for the bug report! Taking a look now - some preliminary
questions to narrow down the suspects and aid the debugging process:
a) Do you observe this bug in 6.8? 6.10?
b) Have you run the faddr2line script to verify that the line that
triggers the crash is count_objcg_event(entry->objcg, ZSWPWB);?
c) Do you have a full dmesg log? Or maybe some other reproduction instructions?
If entry->objcg is garbage, then this smells like a lifetime/reference
counting issue. Either:
a) The zswap entry itself is garbage. Not impossible, but seems
unlikely. In 6.9, we effectively isolate the entry first through the
swap cache, then check and remove it from the zswap tree (under the
tree's lock). The former locks out concurrent accessors, and the
latter should have taken care of invalidated entries (and prevents
future invalidation attempts). Furthermore, after this, if the entry
is somehow garbage (i.e., freed and recycled), it should also be
possible to blow up in the decompression step first, by feeding a
garbage handle to zsmalloc and crashing the kernel at that point. IOW,
we should also see zsmalloc crashes in addition to this particular
crash, no? I cannot think of any protection mechanism that applies to
the decompression step and not to count_objcg_event().
b) entry->objcg has been freed/recycled under us. This is much
trickier, as the culprit could be any holder of the objcg reference
who accidentally double-released the reference it held. That said, if
it only happens on the zswap shrinker path, then maybe there is
something to this...
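To make scenario b) a bit more concrete, here is a userspace toy model
of how one holder's extra put can leave a different holder (e.g. a
zswap entry) with a dangling objcg pointer. This is only a sketch, not
kernel code: every name in it (toy_objcg, toy_entry, and so on) is made
up for illustration, and the "released" flag merely stands in for the
object being freed.

```c
/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_objcg {
	int refcount;   /* number of outstanding references */
	bool released;  /* set when refcount hits zero (stands in for a free) */
};

struct toy_entry {
	struct toy_objcg *objcg;  /* the entry is supposed to own one reference */
};

static void toy_objcg_get(struct toy_objcg *o)
{
	o->refcount++;
}

static void toy_objcg_put(struct toy_objcg *o)
{
	assert(o->refcount > 0);     /* guard against underflow in the model */
	if (--o->refcount == 0)
		o->released = true;  /* real code would free the object here */
}

/* Returns true if the entry's objcg is still safe to dereference. */
static bool toy_entry_objcg_valid(const struct toy_entry *e)
{
	return e->objcg != NULL && !e->objcg->released;
}

/* Scenario: a holder unrelated to the entry drops its reference twice. */
bool toy_double_put_scenario(void)
{
	struct toy_objcg o = { .refcount = 1, .released = false }; /* creator */
	struct toy_entry e = { .objcg = &o };

	toy_objcg_get(&o);  /* entry's reference:        refcount == 2 */
	toy_objcg_get(&o);  /* other holder's reference: refcount == 3 */

	toy_objcg_put(&o);  /* creator drops its ref:    refcount == 2 */
	toy_objcg_put(&o);  /* other holder drops:       refcount == 1 */
	toy_objcg_put(&o);  /* other holder drops AGAIN: refcount == 0 */

	/* The entry never dropped its reference, yet the object is gone. */
	return toy_entry_objcg_valid(&e);
}
```

toy_double_put_scenario() returns false: the entry did nothing wrong,
but the reference it still believes it holds no longer keeps the object
alive. That is why the culprit in b) could be far away from the code
that actually crashes.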
Let me muse on this a bit more. Please let us know if you have other
clues, traces, hints, or observations - they will help the
investigation a lot!