linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Linux regressions mailing list <regressions@lists.linux.dev>,
	Piotr Oniszczuk <piotr.oniszczuk@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Yosry Ahmed <yosryahmed@google.com>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
Date: Fri, 23 Aug 2024 15:47:24 +0100	[thread overview]
Message-ID: <Zsig_AZDT5zOO1Wg@casper.infradead.org> (raw)
In-Reply-To: <CAKEwX=PEye=VcXF=r-A9B47VsNtpLLxz5cJiswzuQXBio8rizA@mail.gmail.com>

On Fri, Aug 23, 2024 at 10:35:19AM -0400, Nhat Pham wrote:
> On Fri, Aug 23, 2024 at 9:13 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> >
> > That said, zswap could handle this better.  There's no need to panic the
> > entire machine over being unable to read a page from swap.  Killing just
> > the process that needed this page is sufficient.
> 
> Agree 100%. It is silly to kill the entire host for a swap read error,
> and extra silly to kill the process because we fail to writeback - for
> all we know that page might never be needed by the process again!!!
> 
> >
> > Suggested patch at end after the oops.
> >
> > @@ -1601,6 +1613,7 @@ bool zswap_load(struct folio *folio)
> >         bool swapcache = folio_test_swapcache(folio);
> >         struct xarray *tree = swap_zswap_tree(swp);
> >         struct zswap_entry *entry;
> > +       int err;
> >
> >         VM_WARN_ON_ONCE(!folio_test_locked(folio));
> >
> > @@ -1638,10 +1651,13 @@ bool zswap_load(struct folio *folio)
> >         if (!entry)
> >                 return false;
> >
> > -       if (entry->length)
> > -               zswap_decompress(entry, folio);
> > -       else
> > +       if (entry->length) {
> > +               err = zswap_decompress(entry, folio);
> > +               if (err)
> > +                       return false;
> 
> Here, if zswap decompression fails and zswap load returns false, the
> page_io logic will proceed as if zswap does not have the page and
> reads garbage from the backing device instead. This could potentially
> lead to silent data/memory corruption right? Or am I missing something
> :) Maybe we could be extra careful here and treat it as if there is a
> bio read error in the case zswap owns the page, but cannot decompress
> it?

Ah; you know more about how zswap works than I do.  So it's not a
write-through cache?  I guess we need to go a bit further then and
return an errno from zswap_load -- EIO/ENOENT/0 and handle that
appropriately.

> The rest seems solid to me :)


  reply	other threads:[~2024-08-23 14:47 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <BD22A15A-9216-4FA0-82DF-C7BBF8EE642E@gmail.com>
2024-08-23 11:51 ` Linux regression tracking (Thorsten Leemhuis)
2024-08-23 12:12   ` Piotr Oniszczuk
2024-08-23 13:13   ` Matthew Wilcox
2024-08-23 14:35     ` Nhat Pham
2024-08-23 14:47       ` Matthew Wilcox [this message]
2024-08-23 16:07         ` Yosry Ahmed
2024-08-23 15:06     ` Piotr Oniszczuk
2024-08-23 16:16       ` Nhat Pham
2024-08-23 17:24         ` Piotr Oniszczuk
2024-08-23 18:06           ` Nhat Pham
2024-08-24 10:50             ` Piotr Oniszczuk
2024-08-25  5:55         ` Piotr Oniszczuk
2024-08-25 15:05           ` Pedro Falcato
2024-08-25 16:24             ` Piotr Oniszczuk
2024-08-27 18:48               ` Yosry Ahmed
2024-08-29 15:50                 ` Piotr Oniszczuk
2024-08-29 21:54                   ` Yosry Ahmed
2024-08-29 22:29                     ` Matthew Wilcox
2024-08-29 22:53                       ` Yosry Ahmed
2024-08-31  9:41                     ` Piotr Oniszczuk
2024-08-31 17:23                       ` Yosry Ahmed
2024-09-02  8:57                         ` Piotr Oniszczuk
2024-09-03 17:49                           ` Yosry Ahmed
2024-09-03 22:43                             ` Nhat Pham
2024-09-04 23:36                               ` Yosry Ahmed
2024-09-13  9:03                         ` Tomáš Trnka
2024-09-13 17:39                           ` Yosry Ahmed
     [not found]           ` <27594ee6-41dd-4951-b4cc-31577c9466db@amd.com>
2024-09-03 17:52             ` Yosry Ahmed
2024-08-23 18:42       ` Takero Funaki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zsig_AZDT5zOO1Wg@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=piotr.oniszczuk@gmail.com \
    --cc=regressions@lists.linux.dev \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox