Date: Fri, 24 Jan 2025 17:59:39 +0000
To: David Howells
From: Ihor Solodrai
Cc: Christian Brauner, Steve French, Matthew Wilcox, Jeff Layton,
    Gao Xiang, Dominique Martinet, Marc Dionne, Paulo Alcantara,
    Shyam Prasad N, Tom Talpey, Eric Van Hensbergen, Ilya Dryomov,
    netfs@lists.linux.dev, linux-afs@lists.infradead.org,
    linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org,
    ceph-devel@vger.kernel.org, v9fs@lists.linux.dev,
    linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, bpf
Subject: Re: [PATCH v5 27/32] netfs: Change the read result collector to only use one work item
In-Reply-To: <20241216204124.3752367-28-dhowells@redhat.com>
References: <20241216204124.3752367-1-dhowells@redhat.com>
 <20241216204124.3752367-28-dhowells@redhat.com>

On Monday, December 16th, 2024 at 12:41 PM, David Howells wrote:

> Change the way netfslib collects read results to do all the collection for
> a particular read request using a single work item that walks along the
> subrequest queue as subrequests make progress or complete, unlocking folios
> progressively rather than doing the unlock in parallel as parallel requests
> come in.
>
> The code is remodelled to be more like the write-side code, though only
> using a single stream.  This makes it more directly comparable and thus
> easier to duplicate fixes between the two sides.
>
> This has a number of advantages:
>
>  (1) It's simpler.  There doesn't need to be a complex donation mechanism
>      to handle mismatches between the size and alignment of subrequests
>      and folios.  The collector unlocks folios as the subrequests covering
>      each complete.
>
>  (2) It should cause less scheduler overhead as there's a single work item
>      in play unlocking pages in parallel when a read gets split up into a
>      lot of subrequests instead of one per subrequest.
>
>      Whilst the parallelism is nice in theory, in practice, the vast
>      majority of loads are sequential reads of the whole file, so
>      committing a bunch of threads to unlocking folios out of order
>      doesn't help in those cases.
>
>  (3) It should make it easier to implement content decryption.  A folio
>      cannot be decrypted until all the requests that contribute to it have
>      completed - and, again, most loads are sequential and so, most of the
>      time, we want to begin decryption sequentially (though it's great if
>      the decryption can happen in parallel).
>
> There is a disadvantage in that we're losing the ability to decrypt and
> unlock things on an as-things-arrive basis which may affect some
> applications.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/9p/vfs_addr.c             |   3 +-
>  fs/afs/dir.c                 |   8 +-
>  fs/ceph/addr.c               |   9 +-
>  fs/netfs/buffered_read.c     | 160 ++++----
>  fs/netfs/direct_read.c       |  60 +--
>  fs/netfs/internal.h          |  21 +-
>  fs/netfs/main.c              |   2 +-
>  fs/netfs/objects.c           |  34 +-
>  fs/netfs/read_collect.c      | 716 ++++++++++++++++++++---------------
>  fs/netfs/read_pgpriv2.c      | 203 ++++------
>  fs/netfs/read_retry.c        | 207 +++++-----
>  fs/netfs/read_single.c       |  37 +-
>  fs/netfs/write_collect.c     |   4 +-
>  fs/netfs/write_issue.c       |   2 +-
>  fs/netfs/write_retry.c       |  14 +-
>  fs/smb/client/cifssmb.c      |   2 +
>  fs/smb/client/smb2pdu.c      |   5 +-
>  include/linux/netfs.h        |  16 +-
>  include/trace/events/netfs.h |  79 +---
>  19 files changed, 819 insertions(+), 763 deletions(-)
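(An aside for the BPF folks cc'd here who haven't followed the series:
the collection model described above boils down to a single walker that
advances along the subrequest queue and unlocks everything behind the
first incomplete subrequest. A self-contained C sketch of that pattern
follows; all names are made up for illustration, this is not the actual
netfslib code.)

/*
 * Sketch only, not netfslib: a single collector walks the subrequest
 * queue in submission order and "unlocks" the folios covered by the
 * completed contiguous prefix.
 */
#include <stdbool.h>
#include <stdio.h>

struct subreq {
        unsigned long start, len;       /* byte range covered */
        bool done;                      /* set by the I/O completion path */
};

/* Advance over the completed contiguous prefix, unlocking as we go. */
static unsigned long collect(struct subreq *q, int nr, unsigned long collected)
{
        for (int i = 0; i < nr; i++) {
                if (q[i].start + q[i].len <= collected)
                        continue;       /* already collected earlier */
                if (!q[i].done)
                        break;          /* in-order collection stops here */
                collected = q[i].start + q[i].len;
                printf("unlock folios up to byte %lu\n", collected);
        }
        return collected;
}

int main(void)
{
        struct subreq q[] = {
                { 0,    4096, true  },
                { 4096, 4096, false },  /* still in flight */
                { 8192, 4096, true  },  /* completed out of order */
        };
        unsigned long collected = collect(q, 3, 0);

        /* The middle subrequest completes later; the same single work
         * item resumes from where it left off, rather than one worker
         * per completing subrequest racing to unlock folios. */
        q[1].done = true;
        collect(q, 3, collected);
        return 0;
}

(Point (2) above is essentially that only this one walker ever runs,
however many subrequests the read was split into.)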
Hello David.

After a recent merge from upstream, BPF CI started consistently failing
with a task hanging in v9fs_evict_inode. I bisected the failure to
commit e2d46f2ec332, which points to this patch. Reverting the patch
seems to have helped:
https://github.com/kernel-patches/vmtest/actions/runs/12952856569

Could you please investigate?

Examples of failed jobs:

  * https://github.com/kernel-patches/bpf/actions/runs/12941732247
  * https://github.com/kernel-patches/bpf/actions/runs/12933849075

A log snippet:

2025-01-24T02:15:03.9009694Z [  246.932163] INFO: task ip:1055 blocked for more than 122 seconds.
2025-01-24T02:15:03.9013633Z [  246.932709]       Tainted: G           OE      6.13.0-g2bcb9cf535b8-dirty #149
2025-01-24T02:15:03.9018791Z [  246.933249] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2025-01-24T02:15:03.9025896Z [  246.933802] task:ip              state:D stack:0     pid:1055  tgid:1055  ppid:1054   flags:0x00004002
2025-01-24T02:15:03.9028228Z [  246.934564] Call Trace:
2025-01-24T02:15:03.9029758Z [  246.934764]  <TASK>
2025-01-24T02:15:03.9032572Z [  246.934937]  __schedule+0xa91/0xe80
2025-01-24T02:15:03.9035126Z [  246.935224]  schedule+0x41/0xb0
2025-01-24T02:15:03.9037992Z [  246.935459]  v9fs_evict_inode+0xfe/0x170
2025-01-24T02:15:03.9041469Z [  246.935748]  ? __pfx_var_wake_function+0x10/0x10
2025-01-24T02:15:03.9043837Z [  246.936101]  evict+0x1ef/0x360
2025-01-24T02:15:03.9046624Z [  246.936340]  __dentry_kill+0xb0/0x220
2025-01-24T02:15:03.9048855Z [  246.936610]  ? dput+0x3a/0x1d0
2025-01-24T02:15:03.9051128Z [  246.936838]  dput+0x114/0x1d0
2025-01-24T02:15:03.9053548Z [  246.937069]  __fput+0x136/0x2b0
2025-01-24T02:15:03.9056154Z [  246.937305]  task_work_run+0x89/0xc0
2025-01-24T02:15:03.9058593Z [  246.937571]  do_exit+0x2c6/0x9c0
2025-01-24T02:15:03.9061349Z [  246.937816]  do_group_exit+0xa4/0xb0
2025-01-24T02:15:03.9064401Z [  246.938090]  __x64_sys_exit_group+0x17/0x20
2025-01-24T02:15:03.9067235Z [  246.938390]  x64_sys_call+0x21a0/0x21a0
2025-01-24T02:15:03.9069924Z [  246.938672]  do_syscall_64+0x79/0x120
2025-01-24T02:15:03.9072746Z [  246.938941]  ? clear_bhb_loop+0x25/0x80
2025-01-24T02:15:03.9075581Z [  246.939230]  ? clear_bhb_loop+0x25/0x80
2025-01-24T02:15:03.9079275Z [  246.939510]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
2025-01-24T02:15:03.9081976Z [  246.939875] RIP: 0033:0x7fb86f66f21d
2025-01-24T02:15:03.9087533Z [  246.940153] RSP: 002b:00007ffdb3cf93f8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
2025-01-24T02:15:03.9092590Z [  246.940689] RAX: ffffffffffffffda RBX: 00007fb86f785fa8 RCX: 00007fb86f66f21d
2025-01-24T02:15:03.9097722Z [  246.941201] RDX: 00000000000000e7 RSI: ffffffffffffff80 RDI: 0000000000000000
2025-01-24T02:15:03.9102762Z [  246.941705] RBP: 00007ffdb3cf9450 R08: 00007ffdb3cf93a0 R09: 0000000000000000
2025-01-24T02:15:03.9107940Z [  246.942215] R10: 00007ffdb3cf92ff R11: 0000000000000202 R12: 0000000000000001
2025-01-24T02:15:03.9113002Z [  246.942723] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fb86f785fc0
2025-01-24T02:15:03.9114614Z [  246.943244]  </TASK>
2025-01-24T02:15:03.9115895Z [  246.943415]
2025-01-24T02:15:03.9119326Z [  246.943415] Showing all locks held in the system:
2025-01-24T02:15:03.9122278Z [  246.943865] 1 lock held by khungtaskd/32:
2025-01-24T02:15:03.9128640Z [  246.944162]  #0: ffffffffa9195d90 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180
2025-01-24T02:15:03.9131426Z [  246.944792] 2 locks held by kworker/0:2/86:
2025-01-24T02:15:03.9132752Z [  246.945102]
2025-01-24T02:15:03.9136561Z [  246.945222] =============================================
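If it helps with triage: the (possibly stale, "?"-marked)
__pfx_var_wake_function frame under v9fs_evict_inode suggests the task is
parked in a wait_var_event()-style wait for outstanding I/O to drain, and
the matching wake_up_var() never arrives. Below is a userland analogue of
that suspected failure mode, with made-up names and pthreads standing in
for kernel waitqueues; it is an illustration, not the 9p/netfs code.

/*
 * Hypothetical sketch: eviction waits for an in-flight I/O counter to
 * drain, and sleeps forever if a completion path forgets the wake-up.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t drained = PTHREAD_COND_INITIALIZER;
static int io_count = 1;                /* one read still in flight */

static void *evict_inode(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&lock);
        while (io_count != 0)           /* ~ wait_var_event(&io_count, ...) */
                pthread_cond_wait(&drained, &lock);
        pthread_mutex_unlock(&lock);
        puts("inode evicted");
        return NULL;
}

static void io_complete(bool buggy)
{
        pthread_mutex_lock(&lock);
        io_count--;
        if (!buggy)                     /* ~ wake_up_var(&io_count) */
                pthread_cond_signal(&drained);
        pthread_mutex_unlock(&lock);
}

int main(void)
{
        pthread_t t;

        pthread_create(&t, NULL, evict_inode, NULL);
        io_complete(false);     /* pass true and the join below never
                                 * returns, like the task stuck in
                                 * state D in the trace above */
        pthread_join(t, NULL);
        return 0;
}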
It's worth noting that the hang does not happen on *every* test run,
but often enough to fail the CI pipeline.

You may try reproducing with a container I used for bisection:

    docker pull ghcr.io/theihor/bpf:v9fs_evict_inode-repro
    docker run -d --privileged --device=/dev/kvm --cap-add ALL \
        -v /path/to/your/kernel/source:/ci/workspace \
        ghcr.io/theihor/bpf:v9fs_evict_inode-repro
    docker exec -it <container> /bin/bash
    /ci/run.sh   # in the container shell

Note that inside the container it's an "ubuntu" user, and you might have
to run `chown -R ubuntu:ubuntu /ci/workspace` first, or switch to root.

> [...]