linux-mm.kvack.org archive mirror
From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: Linux Memory Management List <linux-mm@kvack.org>,
	 Linux List Kernel Mailing <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	chrisl@kernel.org,  kasong@tencent.com,
	Hugh Dickins <hughd@google.com>
Subject: [RFC PATCH] mm/page_alloc: fix use-after-free in swap due to stale page data after split_page()
Date: Fri, 30 Jan 2026 18:49:00 +0500
Message-ID: <CABXGCsNqk6pOkocJ0ctcHssCvke2kqhzoR2BGf_Hh1hWPZATuA@mail.gmail.com>

Hi,

I've been debugging a use-after-free bug in the swap subsystem that manifests
as a crash in free_swap_count_continuations() during swapoff on zram devices.

== Problem ==

KASAN reports wild-memory-access at address 0xdead000000000100 (LIST_POISON1):

  Oops: general protection fault, probably for non-canonical address 0xfbd59c0000000020
  KASAN: maybe wild-memory-access in range [0xdead000000000100-0xdead000000000107]
  RIP: 0010:__do_sys_swapoff+0x1151/0x1860

  RBP: dead0000000000f8
  R13: dead000000000100

The crash occurs when free_swap_count_continuations() iterates over a
list_head containing LIST_POISON values from a previous list_del().
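
For readers less familiar with list poisoning, here is a minimal user-space
sketch of what list_del() leaves behind. It is not kernel code: the list
helpers are simplified stand-ins, list_is_poisoned() is a hypothetical helper,
and the poison constants assume the x86-64 values seen in the trace above.
RBP (dead0000000000f8) sits 8 bytes below R13, which would be consistent with
a container_of() on a poisoned lru pointer if lru is at offset 8 in struct
page; that reading is my inference, not something the trace states.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Assumed x86-64 poison values, matching the addresses in the KASAN report. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

static inline void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static inline void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry and poison its pointers, as the kernel's list_del() does. */
static inline void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/* Hypothetical helper: does this node still carry list_del() poison? */
static inline int list_is_poisoned(const struct list_head *entry)
{
	return entry->next == LIST_POISON1 || entry->prev == LIST_POISON2;
}
```

Any later walk that starts from such a node without re-running
INIT_LIST_HEAD() dereferences LIST_POISON1 on its first step.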

== Root Cause ==

The swap subsystem uses vmalloc_to_page() to get struct page pointers for
the swap_map array, then uses page->private and page->lru for swap count
continuation lists.

When vmalloc allocates high-order pages without __GFP_COMP and splits them
via split_page(), the resulting pages may contain stale data:

1. post_alloc_hook() only clears page->private for the head page (page[0])
2. For tail pages, split_page() only calls set_page_refcounted(); it does
   not reset page->private or page->lru
3. Tail pages retain whatever was in page->private and page->lru from
   previous use - including LIST_POISON values from prior list_del() calls

In add_swap_count_continuation() (mm/swapfile.c):

    if (!page_private(head)) {
        INIT_LIST_HEAD(&head->lru);
        set_page_private(head, SWP_CONTINUED);
    }

If head (the swap_map page returned by vmalloc_to_page()) was a tail page of
a split high-order allocation and still holds a stale non-zero page->private,
INIT_LIST_HEAD() is skipped and page->lru keeps its poison values. When
free_swap_count_continuations() later walks this list, it crashes.
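
The skipped initialization can be modeled in user space. This is only a
sketch: struct fake_page, fake_add_continuation(), lru_is_valid_empty_list()
and the FAKE_* constants are hypothetical stand-ins, and FAKE_SWP_CONTINUED
is a placeholder value rather than the kernel's flag.

```c
#include <assert.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, heavily reduced stand-in for struct page. */
struct fake_page {
	unsigned long private;
	struct list_head lru;
};

/* Placeholder marker; NOT the kernel's SWP_CONTINUED value. */
#define FAKE_SWP_CONTINUED 1UL

/* Assumed x86-64 poison values, as seen in the KASAN report. */
#define FAKE_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define FAKE_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

/* Mirrors the quoted check: lru is initialized only when private is zero. */
static inline void fake_add_continuation(struct fake_page *head)
{
	if (!head->private) {
		head->lru.next = &head->lru;
		head->lru.prev = &head->lru;
		head->private = FAKE_SWP_CONTINUED;
	}
}

/* A freshly initialized list head points at itself in both directions. */
static inline int lru_is_valid_empty_list(const struct fake_page *page)
{
	return page->lru.next == &page->lru && page->lru.prev == &page->lru;
}
```

With private == 0 the model ends up with a valid empty list; with any stale
non-zero private the poisoned lru pointers are left untouched.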

The comment at line 3862 of mm/swapfile.c says "Page allocation does not
initialize the page's lru field, but it does always reset its private
field". That assumption does not hold for vmalloc pages obtained via
split_page().

== Proposed Fix ==

Initialize page->private and page->lru for all pages in split_page().
This matches the documented expectation in mm/vmalloc.c:

  "High-order allocations must be able to be treated as independent
   small pages by callers... Some drivers do their own refcounting
   on vmalloc_to_page() pages, some use page->mapping, page->lru, etc."

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3122,6 +3122,16 @@ void split_page(struct page *page, unsigned int order)
        VM_BUG_ON_PAGE(PageCompound(page), page);
        VM_BUG_ON_PAGE(!page_count(page), page);

+       /*
+        * Split pages may contain stale data from previous use. Initialize
+        * page->private and page->lru which may have LIST_POISON values.
+        */
+       INIT_LIST_HEAD(&page->lru);
+       for (i = 1; i < (1 << order); i++) {
+               set_page_private(page + i, 0);
+               INIT_LIST_HEAD(&page[i].lru);
+       }
+
        for (i = 1; i < (1 << order); i++)
                set_page_refcounted(page + i);
        split_page_owner(page, order, 0);
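
As a rough illustration of what the hunk above is meant to achieve, here is a
user-space model of the added initialization. It is a sketch only:
struct fake_page, fake_mark_stale() and fake_split_page_init() are
hypothetical names, and the model assumes (as described above) that the head
page's private field has already been cleared by the allocator.

```c
#include <assert.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, heavily reduced stand-in for struct page. */
struct fake_page {
	unsigned long private;
	struct list_head lru;
};

/* Assumed x86-64 poison values, as seen in the KASAN report. */
#define FAKE_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define FAKE_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

/* Simulate leftovers from a previous user of the page. */
static inline void fake_mark_stale(struct fake_page *page)
{
	page->private = 0xabcdUL;
	page->lru.next = FAKE_POISON1;
	page->lru.prev = FAKE_POISON2;
}

static inline void fake_init_lru(struct fake_page *page)
{
	page->lru.next = &page->lru;
	page->lru.prev = &page->lru;
}

/*
 * Models the proposed hunk: reset lru on every page of the block and
 * private on the tail pages. The head's private is left alone, as in
 * the patch.
 */
static inline void fake_split_page_init(struct fake_page *page,
					unsigned int order)
{
	unsigned int i;

	fake_init_lru(page);
	for (i = 1; i < (1u << order); i++) {
		page[i].private = 0;
		fake_init_lru(&page[i]);
	}
}
```

After fake_split_page_init() every page in the block has a self-pointing lru
and every tail has private == 0, which is the state the swap code's
!page_private(head) check relies on.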

== Testing ==

Reproduced with a stress test cycling swapon/swapoff on 8GB zram under
memory pressure:
  - Without patch: crash within ~50 iterations
  - With patch: 1154+ iterations, no crash

The bug was originally discovered on Fedora 44 with kernel 6.19.0-rc7
during normal system shutdown after extended use.

== Questions ==

1. Is split_page() the right place for this fix, or should the swap code
   be more defensive about uninitialized vmalloc pages?

2. Should prep_new_page()/post_alloc_hook() initialize all pages in
   high-order allocations, not just the head?

3. Are there other fields besides page->private and page->lru that
   callers of split_page() might expect to be initialized?

Thoughts?

-- 
Best Regards,
Mike Gavrilov.

