linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "NeilBrown" <neilb@suse.de>
To: "Chuck Lever" <cel@kernel.org>
Cc: , "Chuck Lever" <chuck.lever@oracle.com>,
	"David Howells" <dhowells@redhat.com>,
	"Jeff Layton" <jlayton@kernel.org>,
	"Hugh Dickins" <hughd@google.com>, "Jens Axboe" <axboe@kernel.dk>,
	"Matthew Wilcox" <willy@infradead.org>,
	linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] nfsd: Fix reading via splice
Date: Sat, 29 Jul 2023 09:54:58 +1000	[thread overview]
Message-ID: <169058849828.32308.14965537137761913794@noble.neil.brown.name> (raw)
In-Reply-To: <169054754615.3783.11682801287165281930.stgit@klimt.1015granger.net>

On Fri, 28 Jul 2023, Chuck Lever wrote:
> From: David Howells <dhowells@redhat.com>
> 
> nfsd_splice_actor() has a clause in its loop that chops up a compound page
> into individual pages such that if the same page is seen twice in a row, it
> is discarded the second time.  This is a problem with the advent of
> shmem_splice_read() as that inserts zero_pages into the pipe in lieu of
> pages that aren't present in the pagecache.
> 
> Fix this by assuming that the last page is being extended only if the
> currently stored length + starting offset is not currently on a page
> boundary.
> 
> This can be tested by NFS-exporting a tmpfs filesystem on the test machine
> and truncating it to more than a page in size (eg. truncate -s 8192) and
> then reading it by NFS.  The first page will be all zeros, but thereafter
> garbage will be read.
> 
> Note: I wonder if we can ever get a situation now where we get a splice
> that gives us contiguous parts of a page in separate actor calls.  As NFSD
> can only be splicing from a file (I think), there are only three sources of
> the page: copy_splice_read(), shmem_splice_read() and file_splice_read().
> The first allocates pages for the data it reads, so the problem cannot
> occur; the second should never see a partial page; and the third waits for
> each page to become available before we're allowed to read from it.
> 
> Fixes: bd194b187115 ("shmem: Implement splice-read")
> Reported-by: Chuck Lever <chuck.lever@oracle.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> cc: Hugh Dickins <hughd@google.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-nfs@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/vfs.c |    9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 59b7d60ae33e..ee3bbaa79478 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -956,10 +956,13 @@ nfsd_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
>  	last_page = page + (offset + sd->len - 1) / PAGE_SIZE;
>  	for (page += offset / PAGE_SIZE; page <= last_page; page++) {
>  		/*
> -		 * Skip page replacement when extending the contents
> -		 * of the current page.
> +		 * Skip page replacement when extending the contents of the
> +		 * current page.  But note that we may get two zero_pages in a
> +		 * row from shmem.
>  		 */
> -		if (page == *(rqstp->rq_next_page - 1))
> +		if (page == *(rqstp->rq_next_page - 1) &&
> +		    offset_in_page(rqstp->rq_res.page_base +
> +				   rqstp->rq_res.page_len))

This seems fragile in that it makes assumptions about the pages being
sent and their alignment.
Given that it was broken by the splice-read change, that confirms it is
fragile.  Maybe we could make the code a bit more explicit about what is
expected.

Also, I don't think this test can ever be relevant after the first time
through the loop.  So I think it would be clearest to have the
interesting case outside the loop.

 page += offset / PAGE_SIZE;
 if (rqstp->rq_res.pages_len > 0) {
      /* appending to page list - check alignment */
      if (offset % PAGE_SIZE != (rqstp->rq_res.page_base +
                                 rqstp-.rq_res.page_len) % PAGE_SIZE)
	  return -EIO;
      if (offset % PAGE_SIZE != 0) {
           /* continuing previous page */
           if (page != rqstp->rq_next_page[-1])
               return -EIO;
	   page += 1;
      }
 } else
      /* Starting new page list */
      rqstp->rq_res.page_base = offset % PAGE_SIZE;

 for ( ; page <= last_page ; page++)
       if (unlikely(!svc_rqst_replace_page(rqstp, page)))
           return -EIO;

 rqstp->rq_res.page_len += sd->len;
 return sd->len;


Also, the name "svc_rqst_replace_page" doesn't give any hint that the
next_page pointer is advanced.  Maybe svc_rqst_add_page() ???  Not great
I admit.

NeilBrown

   

>  			continue;
>  		if (unlikely(!svc_rqst_replace_page(rqstp, page)))
>  			return -EIO;
> 
> 
> 



  reply	other threads:[~2023-07-28 23:55 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-28 12:32 Chuck Lever
2023-07-28 23:54 ` NeilBrown [this message]
2023-07-30 15:29   ` Chuck Lever
2023-07-30 16:50     ` Hugh Dickins
2023-07-30 16:55       ` Chuck Lever
2023-07-30 22:02     ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=169058849828.32308.14965537137761913794@noble.neil.brown.name \
    --to=neilb@suse.de \
    --cc=axboe@kernel.dk \
    --cc=cel@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=dhowells@redhat.com \
    --cc=hughd@google.com \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox