From: Antal Nemes <antal.nemes@hycu.com>
To: Antal Nemes <antal.nemes@hycu.com>
Cc: Dave Chinner <david@fromorbit.com>,
Matthew Wilcox <willy@infradead.org>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
Daniel Dao <dqminh@cloudflare.com>
Subject: Re: [BUG] soft lockup in filemap_get_read_batch
Date: Wed, 11 Oct 2023 15:20:52 +0200
Message-ID: <5b9ab143618b2d5bfeeed4619a34c7a8@hycu.com>
In-Reply-To: <53bb6e7a159cef2942e0e4cd9509847a@hycu.com>

On Wed, Oct 04, 2023 at 10:36:33AM +0200, Antal Nemes wrote:
> On Wed, Oct 04, 2023 at 09:58:04AM +1100, Dave Chinner wrote:
> > On Tue, Oct 03, 2023 at 03:48:14PM +0200, antal.nemes@hycu.com wrote:
> > > Hi Matthew,
> > >
> > > We have observed intermittent soft lockups on at least seven different hosts:
> > > - six hosts ran 6.2.8.fc37-200
> > > - one host ran 6.0.13.fc37-200
> > >
> > > The list of affected hosts is growing.
> > >
> > > Stack traces are all similar:
> > >
> > > emerg kern kernel - - watchdog: BUG: soft lockup - CPU#7 stuck for 17117s! [postmaster:2238460]
> > > warning kern kernel - - Modules linked in: target_core_user uio target_core_pscsi target_core_file target_core_iblock nbd loop nls_utf8 cifs cifs_arc4 cifs_md4 dns_resolver fscache netfs veth iscsi_tcp libiscsi_tcp libiscsi iscsi_target_mod target_core_mod scsi_transport_iscsi nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua bochs drm_vram_helper drm_ttm_helper ttm crct10dif_pclmul i2c_piix4 crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 virtio_balloon joydev pcspkr xfs crc32c_intel virtio_net serio_raw ata_generic net_failover failover virtio_scsi pata_acpi qemu_fw_cfg fuse [last unloaded: nbd]
> > > warning kern kernel - - CPU: 7 PID: 2238460 Comm: postmaster Kdump: loaded Tainted: G L 6.2.8-200.fc37.x86_64 #1
> > > warning kern kernel - - Hardware name: Nutanix AHV, BIOS 1.11.0-2.el7 04/01/2014
> > > warning kern kernel - - RIP: 0010:xas_descend+0x28/0x70
> > > warning kern kernel - - Code: 90 90 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 75 08 <48> 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc 48 c1 e8 02 89 c2
> > > warning kern kernel - - RSP: 0018:ffffab66c9f4bb98 EFLAGS: 00000246
> > > warning kern kernel - - RAX: 00000000000000c2 RBX: ffffab66c9f4bbb8 RCX: 0000000000000002
> > > warning kern kernel - - RDX: 0000000000000032 RSI: ffff89cd6c8cd6d0 RDI: ffffab66c9f4bbb8
> > > warning kern kernel - - RBP: ffff89cd6c8cd6d0 R08: ffffab66c9f4be20 R09: 0000000000000000
> > > warning kern kernel - - R10: 0000000000000001 R11: 0000000000000100 R12: 00000000000000b3
> > > warning kern kernel - - R13: 00000000000000b2 R14: 00000000000000b2 R15: ffffab66c9f4be48
> > > warning kern kernel - - FS: 00007ff1e8bfb540(0000) GS:ffff89d35fbc0000(0000) knlGS:0000000000000000
> > > warning kern kernel - - CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > warning kern kernel - - CR2: 00007ff1e8af0768 CR3: 000000016fdde001 CR4: 00000000003706e0
> > > warning kern kernel - - Call Trace:
> > > warning kern kernel - - <TASK>
> > > warning kern kernel - - xas_load+0x3d/0x50
> > > warning kern kernel - - filemap_get_read_batch+0x179/0x270
> > > warning kern kernel - - filemap_get_pages+0xa9/0x690
> > > warning kern kernel - - ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> > > warning kern kernel - - filemap_read+0xd2/0x340
> > > warning kern kernel - - ? filemap_read+0x32f/0x340
> > > warning kern kernel - - xfs_file_buffered_read+0x4f/0xd0 [xfs]
> > > warning kern kernel - - xfs_file_read_iter+0x70/0xe0 [xfs]
> > > warning kern kernel - - vfs_read+0x23c/0x310
> > > warning kern kernel - - ksys_read+0x6b/0xf0
> > > warning kern kernel - - do_syscall_64+0x5b/0x80
> > > warning kern kernel - - ? syscall_exit_to_user_mode+0x17/0x40
> > > warning kern kernel - - ? do_syscall_64+0x67/0x80
> > > warning kern kernel - - ? do_syscall_64+0x67/0x80
> > > warning kern kernel - - ? __irq_exit_rcu+0x3d/0x140
> > > warning kern kernel - - entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >
> > Fixed by commit cbc02854331e ("XArray: Do not return sibling entries
> > from xa_load()").
> >
> > Should already be backported to the latest stable kernels.
>
> The commit seems to be the same as the patch referenced in
> https://bugzilla.kernel.org/show_bug.cgi?id=216646#c31
>
> We have been running 6.2.8 with this patch, but the soft lockup still occurred.
>
> From https://lore.kernel.org/linux-fsdevel/CA+wXwBRGab3UqbLqsr8xG=ZL2u9bgyDNNea4RGfTDjqB=J3geQ@mail.gmail.com/
> it looks like there could be a different issue at play (locked folio with null
> mapping)?
>
Daniel successfully worked around this issue by reverting commit
6795801366da0cd3d99e27c37f020a8f16714886 ("xfs: Support large folios").
We will follow suit for the time being.