linux-mm.kvack.org archive mirror
From: Roman Gushchin <roman.gushchin@linux.dev>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, Hugh Dickins <hughd@google.com>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH] mm: page_alloc: move mlocked flag clearance into free_pages_prepare()
Date: Mon, 21 Oct 2024 17:17:53 +0000
Message-ID: <ZxaMwfShUXDzQMwQ@google.com>
In-Reply-To: <c5cd0ad5-9d9d-4df3-ab20-c5de2a380894@suse.cz>

On Mon, Oct 21, 2024 at 07:01:59PM +0200, Vlastimil Babka wrote:
> On 10/21/24 18:48, Roman Gushchin wrote:
> > Syzbot reported [1] a bad page state problem caused by a page
> > being freed using free_page() still having a mlocked flag at
> > free_pages_prepare() stage:
> > 
> >   BUG: Bad page state in process syz.0.15  pfn:1137bb
> >   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff8881137bb870 pfn:0x1137bb
> >   flags: 0x400000000080000(mlocked|node=0|zone=1)
> >   raw: 0400000000080000 0000000000000000 dead000000000122 0000000000000000
> >   raw: ffff8881137bb870 0000000000000000 00000000ffffffff 0000000000000000
> >   page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> >   page_owner tracks the page as allocated
> >   page last allocated via order 0, migratetype Unmovable, gfp_mask
> >   0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), pid 3005, tgid
> >   3004 (syz.0.15), ts 61546608067, free_ts 61390082085
> >    set_page_owner include/linux/page_owner.h:32 [inline]
> >    post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
> >    prep_new_page mm/page_alloc.c:1545 [inline]
> >    get_page_from_freelist+0x3008/0x31f0 mm/page_alloc.c:3457
> >    __alloc_pages_noprof+0x292/0x7b0 mm/page_alloc.c:4733
> >    alloc_pages_mpol_noprof+0x3e8/0x630 mm/mempolicy.c:2265
> >    kvm_coalesced_mmio_init+0x1f/0xf0 virt/kvm/coalesced_mmio.c:99
> >    kvm_create_vm virt/kvm/kvm_main.c:1235 [inline]
> >    kvm_dev_ioctl_create_vm virt/kvm/kvm_main.c:5500 [inline]
> >    kvm_dev_ioctl+0x13bb/0x2320 virt/kvm/kvm_main.c:5542
> >    vfs_ioctl fs/ioctl.c:51 [inline]
> >    __do_sys_ioctl fs/ioctl.c:907 [inline]
> >    __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
> >    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >    do_syscall_64+0x69/0x110 arch/x86/entry/common.c:83
> >    entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >   page last free pid 951 tgid 951 stack trace:
> >    reset_page_owner include/linux/page_owner.h:25 [inline]
> >    free_pages_prepare mm/page_alloc.c:1108 [inline]
> >    free_unref_page+0xcb1/0xf00 mm/page_alloc.c:2638
> >    vfree+0x181/0x2e0 mm/vmalloc.c:3361
> >    delayed_vfree_work+0x56/0x80 mm/vmalloc.c:3282
> >    process_one_work kernel/workqueue.c:3229 [inline]
> >    process_scheduled_works+0xa5c/0x17a0 kernel/workqueue.c:3310
> >    worker_thread+0xa2b/0xf70 kernel/workqueue.c:3391
> >    kthread+0x2df/0x370 kernel/kthread.c:389
> >    ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> >    ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> > 
> > The problem was originally introduced by
> > commit b109b87050df ("mm/munlock: replace clear_page_mlock() by final
> > clearance"): it was focused on handling pagecache
> > and anonymous memory and wasn't suitable for lower level
> > get_page()/free_page() APIs used for example by KVM, as with
> > this reproducer.
> 
> Does that mean KVM is mlocking pages that are not pagecache nor anonymous,
> thus not LRU? How and why (and since when) is that done?

KVM allows userspace to mmap and mlock several pages that it allocates
directly. Please take a look at the reproducer:
https://syzkaller.appspot.com/x/repro.c?x=1437939f980000
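
A rough userspace sketch of that pattern follows. It is not taken from the
reproducer; it assumes the page in question is the coalesced MMIO ring (the
allocation trace above points at kvm_coalesced_mmio_init()) and that the ring
is mapped through the vcpu fd at the page offset returned by
KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO). The function name
mlock_coalesced_mmio_page() is made up for illustration, and whether this
alone triggers the warning on a given kernel has not been verified.

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the coalesced MMIO ring page of a fresh vcpu
 * and mlock() it. Returns 0 if the page was mapped and locked, -1 if any
 * step failed (no /dev/kvm, no permission, RLIMIT_MEMLOCK, ...).
 */
int mlock_coalesced_mmio_page(void)
{
	int ret = -1, vm = -1, vcpu = -1, pgoff;
	long page_size = sysconf(_SC_PAGESIZE);
	void *ring;
	int kvm = open("/dev/kvm", O_RDWR);

	if (kvm < 0)
		return -1;

	vm = ioctl(kvm, KVM_CREATE_VM, 0UL);	/* 0: default machine type */
	if (vm < 0)
		goto out;

	vcpu = ioctl(vm, KVM_CREATE_VCPU, 0UL);	/* vcpu id 0 */
	if (vcpu < 0)
		goto out;

	/* Assumption: a positive return value is the ring's page offset. */
	pgoff = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_COALESCED_MMIO);
	if (pgoff <= 0)
		goto out;

	ring = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
		    vcpu, (off_t)pgoff * page_size);
	if (ring == MAP_FAILED)
		goto out;

	/* This page is neither pagecache nor anonymous memory. */
	if (mlock(ring, page_size) == 0)
		ret = 0;

	munmap(ring, page_size);
out:
	if (vcpu >= 0)
		close(vcpu);
	if (vm >= 0)
		close(vm);
	close(kvm);
	return ret;
}
```

Closing the vm fd at the end is what eventually lets the kernel free the
ring page through the plain page allocator path.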

> 
> > Fix it by moving the mlocked flag clearance down to
> > free_pages_prepare().
> > 
> > The bug itself is fairly old and harmless (aside from generating these
> > warnings), so the stable backport is likely not justified.
> 
> But since there's a Cc: stable below, it will be backported :)

My bad, I changed my mind at the last minute and added Cc: stable but
forgot to drop this sentence.

> 
> > Closes: https://syzkaller.appspot.com/x/report.txt?x=169a47d0580000
> > Fixes: b109b87050df ("mm/munlock: replace clear_page_mlock() by final clearance")
> > Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
> > Cc: <stable@vger.kernel.org>
> > Cc: Hugh Dickins <hughd@google.com>
> > Cc: Matthew Wilcox <willy@infradead.org>
> > ---
> >  mm/page_alloc.c |  9 +++++++++
> >  mm/swap.c       | 14 --------------
> >  2 files changed, 9 insertions(+), 14 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index bc55d39eb372..24200651ad92 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1044,6 +1044,7 @@ __always_inline bool free_pages_prepare(struct page *page,
> >  	bool skip_kasan_poison = should_skip_kasan_poison(page);
> >  	bool init = want_init_on_free();
> >  	bool compound = PageCompound(page);
> > +	struct folio *folio = page_folio(page);
> >  
> >  	VM_BUG_ON_PAGE(PageTail(page), page);
> >  
> > @@ -1053,6 +1054,14 @@ __always_inline bool free_pages_prepare(struct page *page,
> >  	if (memcg_kmem_online() && PageMemcgKmem(page))
> >  		__memcg_kmem_uncharge_page(page, order);
> >  
> > +	if (unlikely(folio_test_mlocked(folio))) {
> > +		long nr_pages = folio_nr_pages(folio);
> > +
> > +		__folio_clear_mlocked(folio);
> > +		zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
> > +		count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages);
> > +	}
> 
> Why drop the useful comment?

Agreed. Sounds like I need to restore the comment, drop the no-stable-backport
recommendation, and send v2.
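
For readers following along, the effect of moving the clearance can be
reduced to a userspace toy model. Everything prefixed with toy_/TOY_ is
invented for this illustration and is not kernel code; the model only shows
why clearing the flag inside the common free path covers callers that bypass
the folio release helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PG_mlocked and PAGE_FLAGS_CHECK_AT_FREE. */
#define TOY_PG_MLOCKED			(1UL << 0)
#define TOY_FLAGS_CHECK_AT_FREE		TOY_PG_MLOCKED

struct toy_page {
	unsigned long flags;
	long nr_pages;
};

long toy_nr_mlock;		/* models the NR_MLOCK counter */
long toy_unevictable_pgcleared;	/* models UNEVICTABLE_PGCLEARED */

/* Models the check that reports "Bad page state" at free time. */
static bool toy_page_is_bad(const struct toy_page *page)
{
	return (page->flags & TOY_FLAGS_CHECK_AT_FREE) != 0;
}

/*
 * Models the free path. With clear_mlocked_here == false the function
 * behaves like the old code, which relied on a higher layer having
 * cleared the flag already. With true it behaves like the patch.
 * Returns true if the page passes the bad-state check.
 */
bool toy_free_pages_prepare(struct toy_page *page, bool clear_mlocked_here)
{
	assert(page->nr_pages > 0);

	if (clear_mlocked_here && (page->flags & TOY_PG_MLOCKED)) {
		page->flags &= ~TOY_PG_MLOCKED;
		toy_nr_mlock -= page->nr_pages;
		toy_unevictable_pgcleared += page->nr_pages;
	}

	return !toy_page_is_bad(page);
}
```

A page that reaches toy_free_pages_prepare() with TOY_PG_MLOCKED still set
fails the check when clear_mlocked_here is false and passes, with the
counters adjusted, when it is true.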

Thank you for taking a look!

