From: Steve French <smfrench@gmail.com>
To: David Howells <dhowells@redhat.com>
Cc: kernel test robot <oliver.sang@intel.com>,
oe-lkp@lists.linux.dev, lkp@intel.com,
Linux Memory Management List <linux-mm@kvack.org>,
Christian Brauner <brauner@kernel.org>,
Jeff Layton <jlayton@kernel.org>,
netfs@lists.linux.dev, linux-fsdevel@vger.kernel.org
Subject: Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
Date: Tue, 24 Sep 2024 18:19:29 -0500 [thread overview]
Message-ID: <CAH2r5msFnqU-wCn4QRv1Tjh4dtomA6QbtAGmONhx5C0mduxLVg@mail.gmail.com> (raw)
In-Reply-To: <1191933.1727214450@warthog.procyon.org.uk>
Yes - I can confirm that this fixes the recent netfs regression in generic/075
http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/3/builds/239
You can add:
Tested-by: Steve French <stfrench@microsoft.com>
On Tue, Sep 24, 2024 at 4:47 PM David Howells <dhowells@redhat.com> wrote:
>
> Does the attached fix the problem?
>
> David
> ---
> netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs)
>
> In netfslib, a buffered writeback operation has a 'write queue' of folios
> that are being written, held in a linear sequence of folio_queue structs.
> The 'issuer' adds new folio_queues on the leading edge of the queue and
> populates each one progressively; the 'collector' pops them off the
> trailing edge and discards them and the folios they point to as they are
> consumed.
>
> The queue is required to always retain at least one folio_queue structure.
> This allows the queue to be accessed without locking and with just a bit of
> barriering.
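> 
> For a rough mental model, here is a much-simplified sketch of the shape of
> one segment of the queue (illustration only - the real struct folio_queue
> carries more state than shown and a different slot count):
> 
>         /* Simplified model of one segment of the rolling queue. */
>         struct folioq_sketch {
>                 struct folio          *slots[16];  /* folios held by this segment */
>                 struct folioq_sketch  *next;       /* towards the leading edge (issuer) */
>                 struct folioq_sketch  *prev;       /* towards the trailing edge (collector) */
>         };
> 
>         /* The issuer appends segments at rreq->buffer_tail and fills their
>          * slots; the collector consumes from rreq->buffer and frees each
>          * segment once all of its slots have been collected - except that
>          * the last remaining segment is kept as a placeholder.
>          */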
>
> When a new subrequest is prepared, its ->io_iter iterator is pointed at the
> current end of the write queue and then the iterator is extended as more
> data is added to the queue until the subrequest is committed.
>
> Now, the problem is that the folio_queue at the leading edge of the write
> queue when a subrequest is prepared might have been entirely consumed - but
> not yet removed from the queue as it is the only remaining one and is
> preventing the queue from collapsing.
>
> So, what happens is that subreq->io_iter is pointed at the spent
> folio_queue, then a new folio_queue is added, and, at that point, the
> collector is entirely at liberty to immediately delete the spent
> folio_queue.
>
> This leaves the subreq->io_iter pointing at a freed object. If the system
> is lucky, iterate_folioq() reads the as-yet uncorrupted freed object
> through ->io_iter and advances to the next folio_queue in the queue.
>
> In the case seen, however, the freed object gets recycled and put back onto
> the queue at the tail and filled to the end. This confuses
> iterate_folioq() and it tries to step ->next, which may be NULL - resulting
> in an oops.
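> 
> As an illustration (pseudocode only, not part of the patch), the problematic
> interleaving between the issuer and the collector looks roughly like this:
> 
>         /* Issuer (preparing a subrequest)        Collector
>          *
>          * subreq->io_iter = wreq->io_iter;
>          *   (parked on spent folio_queue A,
>          *    slot index == nr_slots)
>          * allocate folio_queue B;
>          * A->next = B;
>          *                                        sees A->next set and A fully
>          *                                        consumed, so frees A
>          * walk subreq->io_iter for the write
>          *   -> starts from freed A: use-after-free,
>          *      or a NULL ->next oops if A has been
>          *      recycled and refilled as the new tail
>          */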
>
> Fix this by the following means:
>
> (1) When preparing a write subrequest, make sure there's a folio_queue
> struct with space in it at the leading edge of the queue. A function
> to make space is split out of the function to append a folio so that
> it can be called for this purpose.
>
> (2) If the request struct iterator is pointing to a completely spent
> folio_queue when we make space, then advance the iterator to the newly
> allocated folio_queue. The subrequest's iterator will then be set
> from this.
>
> Whilst we're at it, also split out the function to allocate a folio_queue,
> initialise it and do the accounting.
>
> The oops could be triggered using the generic/346 xfstest with a filesystem
> on 9P over TCP with cache=loose. The oops looked something like:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> ...
> RIP: 0010:_copy_from_iter+0x2db/0x530
> ...
> Call Trace:
> <TASK>
> ...
> p9pdu_vwritef+0x3d8/0x5d0
> p9_client_prepare_req+0xa8/0x140
> p9_client_rpc+0x81/0x280
> p9_client_write+0xcf/0x1c0
> v9fs_issue_write+0x87/0xc0
> netfs_advance_write+0xa0/0xb0
> netfs_write_folio.isra.0+0x42d/0x500
> netfs_writepages+0x15a/0x1f0
> do_writepages+0xd1/0x220
> filemap_fdatawrite_wbc+0x5c/0x80
> v9fs_mmap_vm_close+0x7d/0xb0
> remove_vma+0x35/0x70
> vms_complete_munmap_vmas+0x11a/0x170
> do_vmi_align_munmap+0x17d/0x1c0
> do_vmi_munmap+0x13e/0x150
> __vm_munmap+0x92/0xd0
> __x64_sys_munmap+0x17/0x20
> do_syscall_64+0x80/0xe0
> entry_SYSCALL_64_after_hwframe+0x71/0x79
>
> This may also fix a similar-looking issue with cifs and generic/074.
>
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Eric Van Hensbergen <ericvh@kernel.org>
> cc: Latchesar Ionkov <lucho@ionkov.net>
> cc: Dominique Martinet <asmadeus@codewreck.org>
> cc: Christian Schoenebeck <linux_oss@crudebyte.com>
> cc: Steve French <sfrench@samba.org>
> cc: Paulo Alcantara <pc@manguebit.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: v9fs@lists.linux.dev
> cc: linux-cifs@vger.kernel.org
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/netfs/internal.h    |  2 +
>  fs/netfs/misc.c        | 72 ++++++++++++++++++++++++++++++++++---------------
>  fs/netfs/objects.c     | 12 ++++++++
>  fs/netfs/write_issue.c | 12 +++++++-
>  4 files changed, 76 insertions(+), 22 deletions(-)
>
> diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
> index c7f23dd3556a..79c0ad89affb 100644
> --- a/fs/netfs/internal.h
> +++ b/fs/netfs/internal.h
> @@ -58,6 +58,7 @@ static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {}
>  /*
>   * misc.c
>   */
> +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq);
>  int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
>                                bool needs_put);
>  struct folio_queue *netfs_delete_buffer_head(struct netfs_io_request *wreq);
> @@ -76,6 +77,7 @@ void netfs_clear_subrequests(struct netfs_io_request *rreq, bool was_async);
>  void netfs_put_request(struct netfs_io_request *rreq, bool was_async,
>                         enum netfs_rreq_ref_trace what);
>  struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
> +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp);
> 
>  static inline void netfs_see_request(struct netfs_io_request *rreq,
>                                       enum netfs_rreq_ref_trace what)
> diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
> index 0ad0982ce0e2..a743e8963247 100644
> --- a/fs/netfs/misc.c
> +++ b/fs/netfs/misc.c
> @@ -9,34 +9,64 @@
>  #include "internal.h"
> 
>  /*
> - * Append a folio to the rolling queue.
> + * Make sure there's space in the rolling queue.
>   */
> -int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
> -                              bool needs_put)
> +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq)
>  {
> -        struct folio_queue *tail = rreq->buffer_tail;
> -        unsigned int slot, order = folio_order(folio);
> +        struct folio_queue *tail = rreq->buffer_tail, *prev;
> +        unsigned int prev_nr_slots = 0;
> 
>          if (WARN_ON_ONCE(!rreq->buffer && tail) ||
>              WARN_ON_ONCE(rreq->buffer && !tail))
> -                return -EIO;
> -
> -        if (!tail || folioq_full(tail)) {
> -                tail = kmalloc(sizeof(*tail), GFP_NOFS);
> -                if (!tail)
> -                        return -ENOMEM;
> -                netfs_stat(&netfs_n_folioq);
> -                folioq_init(tail);
> -                tail->prev = rreq->buffer_tail;
> -                if (tail->prev)
> -                        tail->prev->next = tail;
> -                rreq->buffer_tail = tail;
> -                if (!rreq->buffer) {
> -                        rreq->buffer = tail;
> -                        iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
> +                return ERR_PTR(-EIO);
> +
> +        prev = tail;
> +        if (prev) {
> +                if (!folioq_full(tail))
> +                        return tail;
> +                prev_nr_slots = folioq_nr_slots(tail);
> +        }
> +
> +        tail = netfs_folioq_alloc(rreq, GFP_NOFS);
> +        if (!tail)
> +                return ERR_PTR(-ENOMEM);
> +        tail->prev = prev;
> +        if (prev)
> +                /* [!] NOTE: After we set prev->next, the consumer is entirely
> +                 * at liberty to delete prev.
> +                 */
> +                WRITE_ONCE(prev->next, tail);
> +
> +        rreq->buffer_tail = tail;
> +        if (!rreq->buffer) {
> +                rreq->buffer = tail;
> +                iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
> +        } else {
> +                /* Make sure we don't leave the master iterator pointing to a
> +                 * block that might get immediately consumed.
> +                 */
> +                if (rreq->io_iter.folioq == prev &&
> +                    rreq->io_iter.folioq_slot == prev_nr_slots) {
> +                        rreq->io_iter.folioq = tail;
> +                        rreq->io_iter.folioq_slot = 0;
>                  }
> -                rreq->buffer_tail_slot = 0;
>          }
> +        rreq->buffer_tail_slot = 0;
> +        return tail;
> +}
> +
> +/*
> + * Append a folio to the rolling queue.
> + */
> +int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
> +                              bool needs_put)
> +{
> +        struct folio_queue *tail;
> +        unsigned int slot, order = folio_order(folio);
> +
> +        tail = netfs_buffer_make_space(rreq);
> +        if (IS_ERR(tail))
> +                return PTR_ERR(tail);
> 
>          rreq->io_iter.count += PAGE_SIZE << order;
> 
> diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
> index d32964e8ca5d..dd8241bc996b 100644
> --- a/fs/netfs/objects.c
> +++ b/fs/netfs/objects.c
> @@ -250,3 +250,15 @@ void netfs_put_subrequest(struct netfs_io_subrequest *subreq, bool was_async,
>          if (dead)
>                  netfs_free_subrequest(subreq, was_async);
>  }
> +
> +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp)
> +{
> +        struct folio_queue *fq;
> +
> +        fq = kmalloc(sizeof(*fq), gfp);
> +        if (fq) {
> +                netfs_stat(&netfs_n_folioq);
> +                folioq_init(fq);
> +        }
> +        return fq;
> +}
> diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
> index 04e66d587f77..0929d9fd4ce7 100644
> --- a/fs/netfs/write_issue.c
> +++ b/fs/netfs/write_issue.c
> @@ -153,12 +153,22 @@ static void netfs_prepare_write(struct netfs_io_request *wreq,
>                                  loff_t start)
>  {
>          struct netfs_io_subrequest *subreq;
> +        struct iov_iter *wreq_iter = &wreq->io_iter;
> +
> +        /* Make sure we don't point the iterator at a used-up folio_queue
> +         * struct being used as a placeholder to prevent the queue from
> +         * collapsing. In such a case, extend the queue.
> +         */
> +        if (iov_iter_is_folioq(wreq_iter) &&
> +            wreq_iter->folioq_slot >= folioq_nr_slots(wreq_iter->folioq)) {
> +                netfs_buffer_make_space(wreq);
> +        }
> 
>          subreq = netfs_alloc_subrequest(wreq);
>          subreq->source = stream->source;
>          subreq->start = start;
>          subreq->stream_nr = stream->stream_nr;
> -        subreq->io_iter = wreq->io_iter;
> +        subreq->io_iter = *wreq_iter;
> 
>          _enter("R=%x[%x]", wreq->debug_id, subreq->debug_index);
> 
>
--
Thanks,
Steve