linux-mm.kvack.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Trond Myklebust <trond.myklebust@hammerspace.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Chuck Lever <chuck.lever@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>,
	Christoph Hellwig <hch@infradead.org>,
	David Howells <dhowells@redhat.com>
Cc: linux-nfs@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 10/18] NFS: swap IO handling is slightly different for O_DIRECT IO
Date: Fri, 17 Dec 2021 10:48:23 +1100
Message-ID: <163969850314.20885.13214679186436457787.stgit@noble.brown>
In-Reply-To: <163969801519.20885.3977673503103544412.stgit@noble.brown>

1/ Taking the i_rwsem for swap IO triggers lockdep warnings regarding
   possible deadlocks with "fs_reclaim".  These deadlocks could, I believe,
   eventuate if a buffered read on the swapfile were attempted.

   We don't need coherence with the page cache for a swap file, and
   buffered writes are forbidden anyway.  There is no other need for
   i_rwsem during direct IO.  So never take it for swap_rw().

2/ generic_write_checks() explicitly forbids writes to a swapfile
   (it fails with -ETXTBSY), and performs other checks that are not
   needed for swap.  So bypass it for swap_rw().
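
For readers who want to see the two points in isolation, here is a
minimal userspace sketch of the control flow.  It is only a model, not
the kernel code: struct model_inode and all model_* helpers are
hypothetical names invented for illustration.  The lock is reduced to a
counter standing in for i_rwsem, and model_write_checks() imitates only
the swapfile rejection in generic_write_checks().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for an inode; NOT the kernel's struct inode. */
struct model_inode {
	bool is_swapfile;        /* models IS_SWAPFILE(inode) */
	int lock_depth;          /* models currently holding i_rwsem */
	int lock_acquisitions;   /* how many times the lock was taken */
};

/* Models nfs_start_io_direct(): take the inode lock. */
static void model_start_io_direct(struct model_inode *inode)
{
	inode->lock_depth++;
	inode->lock_acquisitions++;
}

/* Models nfs_end_io_direct(): drop the inode lock. */
static void model_end_io_direct(struct model_inode *inode)
{
	assert(inode->lock_depth > 0);
	inode->lock_depth--;
}

/* Models only the swapfile rejection done by generic_write_checks(). */
static ssize_t model_write_checks(const struct model_inode *inode,
				  size_t count)
{
	if (inode->is_swapfile)
		return -ETXTBSY;
	return (ssize_t)count;
}

/* Models the shape of nfs_file_direct_write() after this patch. */
static ssize_t model_direct_write(struct model_inode *inode, size_t count,
				  bool swap)
{
	ssize_t result;

	if (!swap)
		result = model_write_checks(inode, count);
	else
		result = (ssize_t)count;	/* bypass generic checks */
	if (result <= 0)
		return result;

	if (!swap)
		model_start_io_direct(inode);

	/* The write would be scheduled here. */

	if (!swap)
		model_end_io_direct(inode);

	return result;
}
```

In this model, calling model_direct_write() on a swapfile with
swap == false fails with -ETXTBSY, while the same call with
swap == true returns the byte count and never touches the lock counter.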

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfs/direct.c        |   30 +++++++++++++++++++++---------
 fs/nfs/file.c          |    4 ++--
 include/linux/nfs_fs.h |    4 ++--
 3 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index f1e169f3050a..eeff1b4e1a7c 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -165,9 +165,9 @@ int nfs_swap_rw(struct kiocb *iocb, struct iov_iter *iter)
 	VM_BUG_ON(iov_iter_count(iter) != PAGE_SIZE);
 
 	if (iov_iter_rw(iter) == READ)
-		ret = nfs_file_direct_read(iocb, iter);
+		ret = nfs_file_direct_read(iocb, iter, true);
 	else
-		ret = nfs_file_direct_write(iocb, iter);
+		ret = nfs_file_direct_write(iocb, iter, true);
 	if (ret < 0)
 		return ret;
 	return 0;
@@ -421,6 +421,7 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
  * nfs_file_direct_read - file direct read operation for NFS files
  * @iocb: target I/O control block
  * @iter: vector of user buffers into which to read data
+ * @swap: flag indicating this is swap IO, not O_DIRECT IO
  *
  * We use this function for direct reads instead of calling
  * generic_file_aio_read() in order to avoid gfar's check to see if
@@ -436,7 +437,8 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
  * client must read the updated atime from the server back into its
  * cache.
  */
-ssize_t nfs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter)
+ssize_t nfs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter,
+			     bool swap)
 {
 	struct file *file = iocb->ki_filp;
 	struct address_space *mapping = file->f_mapping;
@@ -478,12 +480,14 @@ ssize_t nfs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter)
 	if (iter_is_iovec(iter))
 		dreq->flags = NFS_ODIRECT_SHOULD_DIRTY;
 
-	nfs_start_io_direct(inode);
+	if (!swap)
+		nfs_start_io_direct(inode);
 
 	NFS_I(inode)->read_io += count;
 	requested = nfs_direct_read_schedule_iovec(dreq, iter, iocb->ki_pos);
 
-	nfs_end_io_direct(inode);
+	if (!swap)
+		nfs_end_io_direct(inode);
 
 	if (requested > 0) {
 		result = nfs_direct_wait(dreq);
@@ -872,6 +876,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
  * nfs_file_direct_write - file direct write operation for NFS files
  * @iocb: target I/O control block
  * @iter: vector of user buffers from which to write data
+ * @swap: flag indicating this is swap IO, not O_DIRECT IO
  *
  * We use this function for direct writes instead of calling
  * generic_file_aio_write() in order to avoid taking the inode
@@ -888,7 +893,8 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
  * Note that O_APPEND is not supported for NFS direct writes, as there
  * is no atomic O_APPEND write facility in the NFS protocol.
  */
-ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
+ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter,
+			      bool swap)
 {
 	ssize_t result, requested;
 	size_t count;
@@ -902,7 +908,11 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
 	dfprintk(FILE, "NFS: direct write(%pD2, %zd@%Ld)\n",
 		file, iov_iter_count(iter), (long long) iocb->ki_pos);
 
-	result = generic_write_checks(iocb, iter);
+	if (!swap)
+		result = generic_write_checks(iocb, iter);
+	else
+		/* bypass generic checks */
+		result = iov_iter_count(iter);
 	if (result <= 0)
 		return result;
 	count = result;
@@ -933,7 +943,8 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
 		dreq->iocb = iocb;
 	pnfs_init_ds_commit_info_ops(&dreq->ds_cinfo, inode);
 
-	nfs_start_io_direct(inode);
+	if (!swap)
+		nfs_start_io_direct(inode);
 
 	requested = nfs_direct_write_schedule_iovec(dreq, iter, pos);
 
@@ -942,7 +953,8 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
 					      pos >> PAGE_SHIFT, end);
 	}
 
-	nfs_end_io_direct(inode);
+	if (!swap)
+		nfs_end_io_direct(inode);
 
 	if (requested > 0) {
 		result = nfs_direct_wait(dreq);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index b620fe697158..996dfb3c74b2 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -161,7 +161,7 @@ nfs_file_read(struct kiocb *iocb, struct iov_iter *to)
 	ssize_t result;
 
 	if (iocb->ki_flags & IOCB_DIRECT)
-		return nfs_file_direct_read(iocb, to);
+		return nfs_file_direct_read(iocb, to, false);
 
 	dprintk("NFS: read(%pD2, %zu@%lu)\n",
 		iocb->ki_filp,
@@ -625,7 +625,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
 		return result;
 
 	if (iocb->ki_flags & IOCB_DIRECT)
-		return nfs_file_direct_write(iocb, from);
+		return nfs_file_direct_write(iocb, from, false);
 
 	dprintk("NFS: write(%pD2, %zu@%Ld)\n",
 		file, iov_iter_count(from), (long long) iocb->ki_pos);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 6329e6958718..3a210478f665 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -512,9 +512,9 @@ static inline const struct cred *nfs_file_cred(struct file *file)
  */
 extern int nfs_swap_rw(struct kiocb *, struct iov_iter *);
 extern ssize_t nfs_file_direct_read(struct kiocb *iocb,
-			struct iov_iter *iter);
+				    struct iov_iter *iter, bool swap);
 extern ssize_t nfs_file_direct_write(struct kiocb *iocb,
-			struct iov_iter *iter);
+				     struct iov_iter *iter, bool swap);
 
 /*
  * linux/fs/nfs/dir.c




