From: Ryan Roberts <ryan.roberts@arm.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-mm@kvack.org
Subject: Re: [RFC PATCH 00/14] Rearrange batched folio freeing
Date: Wed, 6 Sep 2023 11:23:03 +0100
Message-ID: <85c76471-81d8-47e7-a03f-968a248dfe9e@arm.com>
In-Reply-To: <ZPf2fKkPbVSjCaYK@casper.infradead.org>
On 06/09/2023 04:48, Matthew Wilcox wrote:
> On Tue, Sep 05, 2023 at 03:00:51PM +0100, Matthew Wilcox wrote:
>> On Tue, Sep 05, 2023 at 02:26:54PM +0100, Ryan Roberts wrote:
>>> On 05/09/2023 14:15, Matthew Wilcox wrote:
>>>> On Mon, Sep 04, 2023 at 02:25:41PM +0100, Ryan Roberts wrote:
>>>>> I've been doing some benchmarking of this series, as promised, but have hit an oops. It doesn't appear to be easily reproducible, and I'm struggling to figure out the root cause, so thought I would share in case you have any ideas?
>>>>
>>>> I didn't hit that with my testing. Admittedly I was using xfs rather
>>>> than ext4, but ...
>>>
>>> I've only seen it once.
>>>
>>> I have a bit of a hybrid setup - my rootfs is xfs (and using large folios), but
>>> the linux tree (which is being built during the benchmark) is on an ext4
>>> partition. Large anon folios is enabled in this config, so there will be plenty
>>> of large folios in the system.
>>>
>>> I'm not sure if the fact that this fired from the ext4 path is too relevant -
>>> the page with the dodgy index is already on the PCP list so may or may not be large.
>>
>> Indeed. I have a suspicion that this may be more common, but if pages
>> are commonly freed to and allocated from the PCP list without ever being
>> transferred to the free list, we'll never see it. Perhaps adding a
>> check when pages are added to the PCP list that page->index is less
>> than 8 would catch the miscreant relatively quickly?
>
> Somehow my qemu setup started working again. This stuff is black magic.
>
> Anyway, I did this:
>
> +++ b/mm/page_alloc.c
> @@ -2405,6 +2405,7 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
>
> __count_vm_events(PGFREE, 1 << order);
> pindex = order_to_pindex(migratetype, order);
> + VM_BUG_ON_PAGE(page->index > 7, page);
> list_add(&page->pcp_list, &pcp->lists[pindex]);
> pcp->count += 1 << order;
>
>
> but I haven't caught a wascally wabbit yet after an hour of running
> xfstests. I think that's the only place we add a page to the
> pcp->lists.
I added a smattering of VM_BUG_ON_PAGE(page->index > 5, page) calls to the places where the page is added to and removed from the pcp lists, and one triggered on removing the page from the list (the same place I saw the UBSAN oops previously). But there is no page info dumped! I've enabled CONFIG_DEBUG_VM (and friends), and I can't see how it's possible to get the BUG report but not the dump_page() output - what am I doing wrong?
Anyway, the fact that it did not trigger on insertion into the list suggests this is a corruption issue? I'll keep trying...
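For reference, the check that fired is the removal-side one, roughly placed like this (illustrative placement only - a sketch against free_pcppages_bulk() in mm/page_alloc.c; the exact line in my tree, with your series applied, may differ slightly):

+++ b/mm/page_alloc.c
@@ static void free_pcppages_bulk(struct zone *zone, int count, struct per_cpu_pages *pcp, int pindex)

 		page = list_last_entry(list, struct page, pcp_list);
+		/* Catch a corrupt index as the page leaves the pcp list. */
+		VM_BUG_ON_PAGE(page->index > 5, page);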
[ 334.307831] kernel BUG at mm/page_alloc.c:1217!
[ 334.312351] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[ 334.318433] Modules linked in: nfs lockd grace sunrpc fscache netfs nls_iso8859_1 scsi_dh_rdac scsi_dh_emc scsi_dh_alua drm xfs btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_uverbs ib_core mlx5_core mlxfw pci_hyperv_intf crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce tls nvme psample nvme_core aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 334.359704] CPU: 26 PID: 260858 Comm: git Not tainted 6.5.0-rc4-ryarob01-all-debug #1
[ 334.367521] Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18
[ 334.380285] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 334.387233] pc : free_pcppages_bulk+0x1b0/0x2d0
[ 334.391753] lr : free_pcppages_bulk+0x1b0/0x2d0
[ 334.396270] sp : ffff80008fe9b810
[ 334.399571] x29: ffff80008fe9b810 x28: 0000000000000001 x27: 00000000000000e7
[ 334.406694] x26: fffffc2007e07780 x25: ffff08181ed8f840 x24: ffff08181ed8f868
[ 334.413817] x23: 0000000000000000 x22: ffff0818836caf80 x21: ffff800081bbf008
[ 334.420939] x20: 0000000000000001 x19: ffff08181ed8f850 x18: 0000000000000000
[ 334.428061] x17: 6666373066666666 x16: 2066666666666666 x15: 6632303030303030
[ 334.435184] x14: 0000000000000000 x13: 2935203e20746d28 x12: 454741505f4e4f5f
[ 334.442306] x11: 4755425f4d56203a x10: 6573756163656220 x9 : ffff80008014ef40
[ 334.449429] x8 : 5f4d56203a657375 x7 : 6163656220646570 x6 : ffff08181dee0000
[ 334.456551] x5 : ffff08181ed78d88 x4 : 0000000000000000 x3 : ffff80008fe9b538
[ 334.463674] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000002b
[ 334.470796] Call trace:
[ 334.473230] free_pcppages_bulk+0x1b0/0x2d0
[ 334.477401] free_unref_page_commit+0x124/0x2a8
[ 334.481918] free_unref_folios+0x3b4/0x4e8
[ 334.486003] release_unref_folios+0xac/0xf8
[ 334.490175] folios_put+0x100/0x228
[ 334.493651] __folio_batch_release+0x34/0x88
[ 334.497908] truncate_inode_pages_range+0x168/0x690
[ 334.502773] truncate_inode_pages_final+0x58/0x90
[ 334.507464] ext4_evict_inode+0x164/0x900
[ 334.511463] evict+0xac/0x160
[ 334.514419] iput+0x170/0x228
[ 334.517375] do_unlinkat+0x1d0/0x290
[ 334.520938] __arm64_sys_unlinkat+0x48/0x98
[ 334.525108] invoke_syscall+0x74/0xf8
[ 334.528758] el0_svc_common.constprop.0+0x58/0x130
[ 334.533536] do_el0_svc+0x40/0xa8
[ 334.536837] el0_svc+0x2c/0xb8
[ 334.539881] el0t_64_sync_handler+0xc0/0xc8
[ 334.544052] el0t_64_sync+0x1a8/0x1b0
[ 334.547703] Code: aa1a03e0 90009dc1 91072021 97ff1097 (d4210000)
[ 334.553783] ---[ end trace 0000000000000000 ]---