From: Kairui Song <ryncsn@gmail.com>
To: robin.kuo@mediatek.com
Cc: Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
Matthias Brugger <matthias.bgg@gmail.com>,
AngeloGioacchino Del Regno
<angelogioacchino.delregno@collabora.com>,
wsd_upstream@mediatek.com, casper.li@mediatek.com,
chinwen.chang@mediatek.com, Andrew.Yang@mediatek.com,
Qun-wei.Lin@mediatek.com, oliver.sang@intel.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-mediatek@lists.infradead.org
Subject: Re: [PATCH 1/1] Restore swap_space attr aviod krn panic
Date: Thu, 15 Jan 2026 16:04:58 +0800 [thread overview]
Message-ID: <CAMgjq7CrKv7QJNFO_vg1_nXA+RSW83mu8X4cMSVd2tfMA==Wew@mail.gmail.com> (raw)
In-Reply-To: <20260115001405.3513440-1-robin.kuo@mediatek.com>
On Thu, Jan 15, 2026 at 8:14 AM <robin.kuo@mediatek.com> wrote:
>
> From: "robin.kuo" <robin.kuo@mediatek.com>
>
> Restore swap_space attr avoid krn panic
>
> Commit 8b47299a411a ('mm, swap: mark swap address space ro and add
> context debug check') made the swap address space read-only.
> It may lead to kernel panic if arch_prepare_to_swap returns a failure
> under heavy memory pressure as follows,
>
> el1_abort+0x40/0x64
> el1h_64_sync_handler+0x48/0xcc
> el1h_64_sync+0x84/0x88
> errseq_set+0x4c/0xb8 (P)
> __filemap_set_wb_err+0x20/0xd0
> shrink_folio_list+0xc20/0x11cc
> evict_folios+0x1520/0x1be4
> try_to_shrink_lruvec+0x27c/0x3dc
> shrink_one+0x9c/0x228
> shrink_node+0xb3c/0xeac
> do_try_to_free_pages+0x170/0x4f0
> try_to_free_pages+0x334/0x534
> __alloc_pages_direct_reclaim+0x90/0x158
> __alloc_pages_slowpath+0x334/0x588
> __alloc_frozen_pages_noprof+0x224/0x2fc
> __folio_alloc_noprof+0x14/0x64
> vma_alloc_zeroed_movable_folio+0x34/0x44
> do_pte_missing+0xad4/0x1040
> handle_mm_fault+0x4a4/0x790
> do_page_fault+0x288/0x5f8
> do_translation_fault+0x38/0x54
> do_mem_abort+0x54/0xa8
>
> Restore swap address space as not ro to avoid the panic.
Hi Robin
Thanks very much for the patch! I'm fine with it with some suggestions:
Marking it as RO was intentional to avoid silent unexpected usages of
the swap space, the swap space is hollow now, as all content is now in
the swap table.
For example the one you posted, calling filemap_set_wb_err for the
swap space is meaningless, the error info will be ignored without any
notice.
In the comment for mapping_set_error (wrapper for filemap_set_wb_err):
* When a writeback error occurs, most filesystems will want to call
* mapping_set_error to record the error in the mapping so that it can be
* reported when the application calls fsync(2).
So far there is no fsync for swap or other ways to report mapping
errors to userspace. For example, __end_swap_bio_write will just print
an error log if writetback failed. I went through the code, currently
only arch_prepare_to_swap will cause a call to filemap_set_wb_err.
Swap is not a persistent storage like normal FS, so simply bouncing
the writeback folio back as active is good enough for almost all
cases, like the one in end_swap_bio_write.
Catching potential issues with RO/panic is overkill indeed. So I'm
fine with dropping the RO by default. But perhaps we can keep the RO
attribute at least for VM_DEBUG build? Like this:
diff --git a/mm/swap.h b/mm/swap.h
index eba4c04ad039..307cc71d3635 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -220,7 +220,12 @@ int swap_writeout(struct folio *folio, struct
swap_iocb **swap_plug);
void __swap_writepage(struct folio *folio, struct swap_iocb **swap_plug);
/* linux/mm/swap_state.c */
-extern struct address_space swap_space __ro_after_init;
+#ifdef CONFIG_DEBUG_VM
+#define __swap_space_attr __ro_after_init
+#else
+#define __swap_space_attr __read_mostly
+#endif
+extern struct address_space swap_space __swap_space_attr;
static inline struct address_space *swap_address_space(swp_entry_t entry)
{
return &swap_space;
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3916d144eddc..11273e0ec811 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -38,7 +38,7 @@ static const struct address_space_operations swap_aops = {
};
/* Set swap_space as read only as swap cache is handled by swap table */
-struct address_space swap_space __ro_after_init = {
+struct address_space swap_space __swap_space_attr = {
.a_ops = &swap_aops,
};
---
And as for this particular error marking issue you posted, maybe we
should just drop the handle_write_error I think. That helper was
useful when the filesystems are still using the .writepage interface,
but now all filesystem have switched to .writepages and only swap may
call that helper, which is no longer helpful, as the error info is
completely ignored with no report back to userspace. Maybe we need
something like this:
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 39f0336d9c29..62fe433aac0b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -477,27 +477,6 @@ static int reclaimer_offset(struct scan_control *sc)
return PGSTEAL_DIRECT - PGSTEAL_KSWAPD;
}
-/*
- * We detected a synchronous write error writing a folio out. Probably
- * -ENOSPC. We need to propagate that into the address_space for a subsequent
- * fsync(), msync() or close().
- *
- * The tricky part is that after writepage we cannot touch the mapping: nothing
- * prevents it from being freed up. But we have a ref on the folio and once
- * that folio is locked, the mapping is pinned.
- *
- * We're allowed to run sleeping folio_lock() here because we know
the caller has
- * __GFP_FS.
- */
-static void handle_write_error(struct address_space *mapping,
- struct folio *folio, int error)
-{
- folio_lock(folio);
- if (folio_mapping(folio) == mapping)
- mapping_set_error(mapping, error);
- folio_unlock(folio);
-}
-
static bool skip_throttle_noprogress(pg_data_t *pgdat)
{
int reclaimable = 0, write_pending = 0;
@@ -650,9 +629,10 @@ static pageout_t writeout(struct folio *folio,
struct address_space *mapping,
else
res = swap_writeout(folio, plug);
- if (res < 0)
- handle_write_error(mapping, folio, res);
- if (res == AOP_WRITEPAGE_ACTIVATE) {
+ if (res < 0) {
+ pr_alert_ratelimited("Swap failure of folio (order:
%d, error: %d)\n",
+ folio_order(folio), res);
+ } else if (res == AOP_WRITEPAGE_ACTIVATE) {
folio_clear_reclaim(folio);
return PAGE_ACTIVATE;
}
next prev parent reply other threads:[~2026-01-15 8:05 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-15 0:13 robin.kuo
2026-01-15 1:27 ` Andrew Morton
2026-01-15 8:04 ` Kairui Song [this message]
2026-01-15 9:24 ` Kairui Song
[not found] ` <20260116062535.306453-1-robin.kuo@mediatek.com>
2026-01-16 6:25 ` [PATCH v2 1/1] mm, swap: " robin.kuo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAMgjq7CrKv7QJNFO_vg1_nXA+RSW83mu8X4cMSVd2tfMA==Wew@mail.gmail.com' \
--to=ryncsn@gmail.com \
--cc=Andrew.Yang@mediatek.com \
--cc=Qun-wei.Lin@mediatek.com \
--cc=akpm@linux-foundation.org \
--cc=angelogioacchino.delregno@collabora.com \
--cc=baohua@kernel.org \
--cc=bhe@redhat.com \
--cc=casper.li@mediatek.com \
--cc=chinwen.chang@mediatek.com \
--cc=chrisl@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mediatek@lists.infradead.org \
--cc=linux-mm@kvack.org \
--cc=matthias.bgg@gmail.com \
--cc=nphamcs@gmail.com \
--cc=oliver.sang@intel.com \
--cc=robin.kuo@mediatek.com \
--cc=shikemeng@huaweicloud.com \
--cc=wsd_upstream@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox