From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F4E9C7EE2C for ; Fri, 25 Aug 2023 22:16:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 74EF0680017; Fri, 25 Aug 2023 18:16:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6FDAF68000E; Fri, 25 Aug 2023 18:16:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 57773680017; Fri, 25 Aug 2023 18:16:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 4512D68000E for ; Fri, 25 Aug 2023 18:16:22 -0400 (EDT) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 0F4241C9B50 for ; Fri, 25 Aug 2023 22:16:22 +0000 (UTC) X-FDA: 81164036604.07.301F90E Received: from mail-pl1-f180.google.com (mail-pl1-f180.google.com [209.85.214.180]) by imf17.hostedemail.com (Postfix) with ESMTP id 216EF40012 for ; Fri, 25 Aug 2023 22:16:19 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=fromorbit-com.20221208.gappssmtp.com header.s=20221208 header.b=oGxnlMh0; dmarc=pass (policy=quarantine) header.from=fromorbit.com; spf=pass (imf17.hostedemail.com: domain of david@fromorbit.com designates 209.85.214.180 as permitted sender) smtp.mailfrom=david@fromorbit.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1693001780; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ppeQOWgbWz5gqWZs/rbytV4GhpZ82FSsmBh/giRYIfc=; b=EVQ1OU/2hUH68ouzh+rHyAF0lhWqBcMAxJ4Kv8leidUrG+dmJrpoY+cKhWTtIUUwMylWCM xHJfwVrNCsPqqTTFneHPgyn2VGEVOVT9pZWUpqVL/rM7mc2CE569rUN+fhUkvkS0SjpXop 4csVkD0NmKHKhVKinq1Fc0kpbZ38SLk= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=fromorbit-com.20221208.gappssmtp.com header.s=20221208 header.b=oGxnlMh0; dmarc=pass (policy=quarantine) header.from=fromorbit.com; spf=pass (imf17.hostedemail.com: domain of david@fromorbit.com designates 209.85.214.180 as permitted sender) smtp.mailfrom=david@fromorbit.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1693001780; a=rsa-sha256; cv=none; b=68O44rUo3qmE4CjN1/ko9cE88TSENy3/LTkknKuzvV2U5VDwPf4B0DC9iWz8XuI6J9SQKS Zj98Mj8u3IMRCq+uNfRacbW5RoSmbhLYTjVJ0o63vE9L6AJY4f6P4zX57/YmifZXJ7yVb8 v1lj4oHl0ecL+R72mdA8h7ixkiKXwNg= Received: by mail-pl1-f180.google.com with SMTP id d9443c01a7336-1c09673b006so10133195ad.1 for ; Fri, 25 Aug 2023 15:16:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20221208.gappssmtp.com; s=20221208; t=1693001779; x=1693606579; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=ppeQOWgbWz5gqWZs/rbytV4GhpZ82FSsmBh/giRYIfc=; b=oGxnlMh0S/WiVEBqRgXg0XkOHzHqDpTBvnGnagMxXiK0WZOe1Z4r+2p1ptsCZGDnWB pXEktmY/ad0PuwAxCgT0YcHOMRjDsuwyrAXCFtdwEmAj+muyxYQXIE+r39jHMvXhT5B0 o+SPYmvdCE0AaqEX44rBk89wjuy74zXAp3Vwzk0u1JPoPtSJh1DrCjw1RqWqiFdJpqby tMNly65LV0QvaLstk47azhvzcHLQyEtc8q2u/G93EQr7wDi/F72WQh2TPKv7ILrD/tve HoezaGEwtih61yu0zT7giO0Mqh1qaOyneAEL9vdEIb12mBB0wZbgWFOwNbvY6RzLwUwH OW2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693001779; x=1693606579; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=ppeQOWgbWz5gqWZs/rbytV4GhpZ82FSsmBh/giRYIfc=; b=cASbKqxjfrQNi9jWBvpUpuTezuD9hCpEf+5BtIccGIcERoyPC+fJbpJycBJvTOQ1vZ pLEuqhpIybllkr8eLUA0fMBPGgD3+BT5ux8oxZop7qm7+mRwu3pIywoVhaD1jqUgzMVw jbUedhH3Peptk7rzzSXrZiRRJHdlVTrneV0P9KGdBtWZ7lPM2h12xIgosHVRveDvpeQS cqGv9vkwj6paktUZ0gsNwKq9x92s5w7Ax8gepZt/fAuqcYV6nMwx4P7ROg31cv/17493 NU1fD+NAtD7Atf2LK+ZGGFEvsyEEdnKUQWGPtq3owqAzfwCoqnoNHpskvphe0qZU9FRF D/Aw== X-Gm-Message-State: AOJu0YysBvvF9KNUhDLqLLdPAYIMwOr3tZrrBAD9tlp5QBGQaSBAKgJp 3AeruMih1rzs6cNNWw/FriMISw== X-Google-Smtp-Source: AGHT+IEjt82Q7/ukix1Bwi0HcqOCAUCGzooSpz1T14B8KRhD3u7v0lApveU1bnKvxuVgpN4iNhxk8w== X-Received: by 2002:a17:903:264e:b0:1bd:c338:ae14 with SMTP id je14-20020a170903264e00b001bdc338ae14mr16243879plb.12.1693001778913; Fri, 25 Aug 2023 15:16:18 -0700 (PDT) Received: from dread.disaster.area (pa49-195-66-88.pa.nsw.optusnet.com.au. [49.195.66.88]) by smtp.gmail.com with ESMTPSA id f8-20020a170902684800b001c0bf60ba5csm2276046pln.272.2023.08.25.15.16.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 25 Aug 2023 15:16:18 -0700 (PDT) Received: from dave by dread.disaster.area with local (Exim 4.96) (envelope-from ) id 1qZf6F-006Va3-0j; Sat, 26 Aug 2023 08:16:15 +1000 Date: Sat, 26 Aug 2023 08:16:15 +1000 From: Dave Chinner To: Hao Xu Cc: io-uring@vger.kernel.org, Jens Axboe , Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , "Darrick J . Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li Subject: Re: [PATCH 25/29] xfs: support nowait for xfs_buf_item_init() Message-ID: References: <20230825135431.1317785-1-hao.xu@linux.dev> <20230825135431.1317785-26-hao.xu@linux.dev> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230825135431.1317785-26-hao.xu@linux.dev> X-Rspamd-Queue-Id: 216EF40012 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: o9wn365o49ohmgunwkqyodb8ug9miz8z X-HE-Tag: 1693001779-763799 X-HE-Meta: U2FsdGVkX1+pFlixuaPQ955ylOfxenZcZrM1RhuWZ7yfxhCKVtD90rJqBcM718BJ4AmanrtW5JV1jx/7gN5U8kEWJl0m4K7oc1IofIroPonk2pm4uztsY5dC7W/+OUzmOFEyInZjji80a0VGkUxKWaYhfcwhb8kvZ5oajpmnABQ9WUUnM14QaV5hk0tnLBIkwTx5wnS6x8nGhgl5DiExTN2OsKTFcAzgaIkXm+DzhCMFbz2Fj+Jazmwbbgq82RB5bt5Z9chb0GpAgy+f/nMpNbOz0NDJUa9AI4kPgtG2aXxLJLLsKYo7g0r0vG9qD29xP+2Blj/OH8G6hHu86cgPCVuX6uqvtQjcRgy69QE4souBPOOMX4QKlEx0E62n9viQ36dclhcjzdDunQKST26U9J2rGEYQgpaW7NhgMxBJplQ8vu0nIAegqOoR2z92DjoV+Fq7Opt3DqolmVo+ZKx4unM3yuh/02FJfbztIPJY3Pe7dW7hKM9sPwvmv2F1JijLGQp/9RaVlpiQnp25Locw6QxG5rwgMz9fWf/pRKfe1R9/cqFwC4x93NBiozTe8nPhngrN+Ov/iB+NTimuoxEKvSDkRNgEYk5i6hTcMqTRLgN5SaRJc9iLvlXPw7oIoZbrxGVBHd/jOCjxOK0IzTtczTGnZHusbLllQdThZqdf/DlTRijvT5cvpdGgd+raT4zk1KsF8w+dUJcOruO/M8BzBvJbTCB32LRFQeESNsuBxdIdWdw5Mo399JVpqkVF/q9c7SlVmHJHulQWcCmbhggIOGAWXN/MnIjCSwOpaTRZD7DUcLLRHAo3N818WWf18gcsHUvVHJqzy6gTdzrKAF0rHSpJUy5oF0uMP+JxHDX2No909aoYvpgHjppkQtz8WFPkf0aoVYQgtV2OncAv6+8ld01LOu1Y/A2tqRaSqszNp4Uh6C8Ik2Iwwui8mFAOy9jFvjyo9RuTXyniMe8SxWM pYUEu4cD z4xIxiWWysP2DXubXOTIPJ0xtl6ReIG139JCB+J5rZ+VVGeJt+6LMda0uJ13j7tX4gFyGV764Wbqi8xApOuIe9lJdhNh2FiKmHFqBh1H4w2jYqDAmXJN3Ipf5tGYb7p8EZtlPxjxxJ46ka/PncM66EemtBdUKrprXzsWwS3fZwDIvbRm387otMtrEzb6mki3tdV2qvA45GN1h0fdFwFo3zRP4WSOjowjLfsNiCRsZl0MUJXD38yEDP9wQhqKwfsHQxQOpvkeVOOZmocjOUuuhtPEsZ3Gpsjx9E5f25lG9aYWn7NX5USSQrLaumbbjeQqQPVpx3ty23saaxM9w9iSVltk48L9ZiAEIGBMQwDfpI100hr7X7PPp3YNt4IjdpCcsSQNWhaSFkvGsttjYo9g1inVUGlYHhA59P01b371LZGJqHLKFIGLFsY6hVKDDPdCL3LGVH4xTE6xvF8vh4PMwihyFjH12psyvVZ99KmjT5wf2w9z1YytwlEGk3CBlwFANYJnYcHUAg1bJSFxqiqbquINS3j2C7U3faDdrvDzL+wx5KgFSyM78CvwCh0C4hZzm5Cjf9QdPD8MjYcdBcXB/h6/E37NN5YGUaaF5Z98oCDvNGoU5jcmeKJBEQg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Aug 25, 2023 at 09:54:27PM +0800, Hao Xu wrote: > From: Hao Xu > > support nowait for xfs_buf_item_init() and error out -EAGAIN to > _xfs_trans_bjoin() when it would block. > > Signed-off-by: Hao Xu > --- > fs/xfs/xfs_buf_item.c | 9 +++++++-- > fs/xfs/xfs_buf_item.h | 2 +- > fs/xfs/xfs_buf_item_recover.c | 2 +- > fs/xfs/xfs_trans_buf.c | 16 +++++++++++++--- > 4 files changed, 22 insertions(+), 7 deletions(-) > > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c > index 023d4e0385dd..b1e63137d65b 100644 > --- a/fs/xfs/xfs_buf_item.c > +++ b/fs/xfs/xfs_buf_item.c > @@ -827,7 +827,8 @@ xfs_buf_item_free_format( > int > xfs_buf_item_init( > struct xfs_buf *bp, > - struct xfs_mount *mp) > + struct xfs_mount *mp, > + bool nowait) > { > struct xfs_buf_log_item *bip = bp->b_log_item; > int chunks; > @@ -847,7 +848,11 @@ xfs_buf_item_init( > return 0; > } > > - bip = kmem_cache_zalloc(xfs_buf_item_cache, GFP_KERNEL | __GFP_NOFAIL); > + bip = kmem_cache_zalloc(xfs_buf_item_cache, > + GFP_KERNEL | (nowait ? 0 : __GFP_NOFAIL)); > + if (!bip) > + return -EAGAIN; > + > xfs_log_item_init(mp, &bip->bli_item, XFS_LI_BUF, &xfs_buf_item_ops); > bip->bli_buf = bp; I see filesystem shutdowns.... > diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c > index 016371f58f26..a1e4f2e8629a 100644 > --- a/fs/xfs/xfs_trans_buf.c > +++ b/fs/xfs/xfs_trans_buf.c > @@ -57,13 +57,14 @@ xfs_trans_buf_item_match( > * If the buffer does not yet have a buf log item associated with it, > * then allocate one for it. Then add the buf item to the transaction. > */ > -STATIC void > +STATIC int > _xfs_trans_bjoin( > struct xfs_trans *tp, > struct xfs_buf *bp, > int reset_recur) > { > struct xfs_buf_log_item *bip; > + int ret; > > ASSERT(bp->b_transp == NULL); > > @@ -72,7 +73,11 @@ _xfs_trans_bjoin( > * it doesn't have one yet, then allocate one and initialize it. > * The checks to see if one is there are in xfs_buf_item_init(). > */ > - xfs_buf_item_init(bp, tp->t_mountp); > + ret = xfs_buf_item_init(bp, tp->t_mountp, > + tp->t_flags & XFS_TRANS_NOWAIT); > + if (ret < 0) > + return ret; > + > bip = bp->b_log_item; > ASSERT(!(bip->bli_flags & XFS_BLI_STALE)); > ASSERT(!(bip->__bli_format.blf_flags & XFS_BLF_CANCEL)); > @@ -92,6 +97,7 @@ _xfs_trans_bjoin( > xfs_trans_add_item(tp, &bip->bli_item); > bp->b_transp = tp; > > + return 0; > } > > void > @@ -309,7 +315,11 @@ xfs_trans_read_buf_map( > } > > if (tp) { > - _xfs_trans_bjoin(tp, bp, 1); > + error = _xfs_trans_bjoin(tp, bp, 1); > + if (error) { > + xfs_buf_relse(bp); > + return error; > + } > trace_xfs_trans_read_buf(bp->b_log_item); So what happens at the callers when we have a dirty transaction and joining a buffer fails with -EAGAIN? Apart from the fact this may well propagate -EAGAIN up to userspace, cancelling a dirty transaction at this point will result in a filesystem shutdown.... Indeed, this can happen in the "simple" timestamp update case that this "nowait" semantic is being aimed at. We log the inode in the timestamp update, which dirties the log item and registers a precommit operation to be run. We commit the transaction, which then runs xfs_inode_item_precommit() and that may need to attach the inode to the inode cluster buffer. This results in: xfs_inode_item_precommit xfs_imap_to_bp xfs_trans_read_buf_map _xfs_trans_bjoin xfs_buf_item_init(XFS_TRANS_NOWAIT) kmem_cache_zalloc(GFP_NOFS) gets -EAGAIN error propagates -EAGAIN fails due to -EAGAIN And now xfs_trans_commit() fails with a dirty transaction and the filesystem shuts down. IOWs, XFS_TRANS_NOWAIT as it stands is fundamentally broken. Once we dirty an item in a transaction, we *cannot* back out of the transaction. We *must block* in every place that could fail - locking, memory allocation and/or IO - until the transaction completes because we cannot undo the changes we've already made to the dirty items in the transaction.... It's even worse than that - once we have committed intents, the whole chain of intent processing must be run to completionr. Hence we can't tolerate backing out of that defered processing chain half way through because we might have to block. Until we can roll back partial dirty transactions and partially completed defered intent chains at any random point of completion, XFS_TRANS_NOWAIT will not work. -Dave. -- Dave Chinner david@fromorbit.com