From: Peter Xu <peterx@redhat.com>
To: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH 1/5] mm: Free non-hugetlb large folios in a batch
Date: Wed, 24 Apr 2024 11:20:28 -0400 [thread overview]
Message-ID: <ZikjPB0Dt5HA8-uL@x1n> (raw)
In-Reply-To: <20240405153228.2563754-2-willy@infradead.org>
On Fri, Apr 05, 2024 at 04:32:23PM +0100, Matthew Wilcox (Oracle) wrote:
> free_unref_folios() can now handle non-hugetlb large folios, so keep
> normal large folios in the batch. hugetlb folios still need to be
> handled specially. I believe that folios freed using put_pages_list()
> cannot be accounted to a memcg (or the small folios would trip the "page
> still charged to cgroup" warning), but put an assertion in to check that.
There's such user, iommu uses put_pages_list() to free IOMMU pgtables, and
they can be memcg accounted; since 2023 iommu_map switched to use
GFP_KERNEL_ACCOUNT.
I hit below panic when testing my local branch over mm-everthing when
running some VFIO workloads.
For this specific vfio use case, see 160912fc3d4a ("vfio/type1: account
iommu allocations").
I think we should remove the VM_BUG_ON_FOLIO() line, as the memcg will then
be properly taken care of later in free_pages_prepare(). Fixup attached at
the end that will fix this crash for me.
Thanks,
[ 10.092411] kernel BUG at mm/swap.c:152!
[ 10.092686] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 10.093034] CPU: 3 PID: 634 Comm: vfio-pci-mmap-t Tainted: G W 6.9.0-rc4-peterx+ #2
[ 10.093628] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 10.094361] RIP: 0010:put_pages_list+0x12b/0x150
[ 10.094675] Code: 6d 08 48 81 c4 00 01 00 00 5b 5d c3 cc cc cc cc 48 c7 c6 f0 fd 9f 82 e8 63 e8 03 00 0f 0b 48 c7 c6 48 00 a0 82 e8 55 e8 03 00 <0f> 0b 48 c7 c6 28 fe 9f 82 e8 47f
[ 10.095896] RSP: 0018:ffffc9000221bc50 EFLAGS: 00010282
[ 10.096242] RAX: 0000000000000038 RBX: ffffea00042695c0 RCX: 0000000000000000
[ 10.096707] RDX: 0000000000000001 RSI: 0000000000000027 RDI: 00000000ffffffff
[ 10.097177] RBP: ffffc9000221bd68 R08: 0000000000000000 R09: 0000000000000003
[ 10.097642] R10: ffffc9000221bb08 R11: ffffffff8335db48 R12: ffff8881070172c0
[ 10.098113] R13: ffff888102fd0000 R14: ffff888107017210 R15: ffff888110a6c7c0
[ 10.098586] FS: 0000000000000000(0000) GS:ffff888276a00000(0000) knlGS:0000000000000000
[ 10.099117] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 10.099494] CR2: 00007f1910000000 CR3: 000000000323c006 CR4: 0000000000770ef0
[ 10.099972] PKRU: 55555554
[ 10.100154] Call Trace:
[ 10.100321] <TASK>
[ 10.100466] ? die+0x32/0x80
[ 10.100666] ? do_trap+0xd9/0x100
[ 10.100897] ? put_pages_list+0x12b/0x150
[ 10.101168] ? put_pages_list+0x12b/0x150
[ 10.101434] ? do_error_trap+0x81/0x110
[ 10.101688] ? put_pages_list+0x12b/0x150
[ 10.101957] ? exc_invalid_op+0x4c/0x60
[ 10.102216] ? put_pages_list+0x12b/0x150
[ 10.102484] ? asm_exc_invalid_op+0x16/0x20
[ 10.102771] ? put_pages_list+0x12b/0x150
[ 10.103026] ? 0xffffffff81000000
[ 10.103246] ? dma_pte_list_pagetables.isra.0+0x38/0xa0
[ 10.103592] ? dma_pte_list_pagetables.isra.0+0x9b/0xa0
[ 10.103933] ? dma_pte_clear_level+0x18c/0x1a0
[ 10.104228] ? domain_unmap+0x65/0x130
[ 10.104481] ? domain_unmap+0xe6/0x130
[ 10.104735] domain_exit+0x47/0x80
[ 10.104968] vfio_iommu_type1_detach_group+0x3f1/0x5f0
[ 10.105308] ? vfio_group_detach_container+0x3c/0x1a0
[ 10.105644] vfio_group_detach_container+0x60/0x1a0
[ 10.105977] vfio_group_fops_release+0x46/0x80
[ 10.106274] __fput+0x9a/0x2d0
[ 10.106479] task_work_run+0x55/0x90
[ 10.106717] do_exit+0x32f/0xb70
[ 10.106945] ? _raw_spin_unlock_irq+0x24/0x50
[ 10.107237] do_group_exit+0x32/0xa0
[ 10.107481] __x64_sys_exit_group+0x14/0x20
[ 10.107760] do_syscall_64+0x75/0x190
[ 10.108007] entry_SYSCALL_64_after_hwframe+0x76/0x7e
==================================
diff --git a/mm/swap.c b/mm/swap.c
index f0d478eee292..8ae5cd4ed180 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -149,7 +149,6 @@ void put_pages_list(struct list_head *pages)
free_huge_folio(folio);
continue;
}
- VM_BUG_ON_FOLIO(folio_memcg(folio), folio);
/* LRU flag must be clear because it's passed using the lru */
if (folio_batch_add(&fbatch, folio) > 0)
continue;
--
Peter Xu
next prev parent reply other threads:[~2024-04-24 15:20 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-05 15:32 [PATCH 0/5] Clean up __folio_put() Matthew Wilcox (Oracle)
2024-04-05 15:32 ` [PATCH 1/5] mm: Free non-hugetlb large folios in a batch Matthew Wilcox (Oracle)
2024-04-24 15:20 ` Peter Xu [this message]
2024-04-25 3:39 ` Matthew Wilcox
2024-04-25 15:00 ` Peter Xu
2024-04-25 20:54 ` Andrew Morton
2024-04-05 15:32 ` [PATCH 2/5] mm: Combine free_the_page() and free_unref_page() Matthew Wilcox (Oracle)
2024-04-05 15:32 ` [PATCH 3/5] mm: Inline destroy_large_folio() into __folio_put_large() Matthew Wilcox (Oracle)
2024-04-05 15:32 ` [PATCH 4/5] mm: Combine __folio_put_small, __folio_put_large and __folio_put Matthew Wilcox (Oracle)
2024-04-05 15:32 ` [PATCH 5/5] mm: Convert free_zone_device_page to free_zone_device_folio Matthew Wilcox (Oracle)
2024-04-05 16:34 ` [PATCH 0/5] Clean up __folio_put() Zi Yan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZikjPB0Dt5HA8-uL@x1n \
--to=peterx@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=alex.williamson@redhat.com \
--cc=linux-mm@kvack.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox