From: Bernard Metzler <bernard.metzler@linux.dev>
To: Pedro Falcato <pfalcato@suse.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>, Leon Romanovsky <leon@kernel.org>,
Vlastimil Babka <vbabka@suse.cz>,
Jakub Kicinski <kuba@kernel.org>,
David Howells <dhowells@redhat.com>, Tom Talpey <tom@talpey.com>,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, torvalds@linux-foundation.org,
stable@vger.kernel.org, kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH v2] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
Date: Thu, 31 Jul 2025 22:19:57 +0200
Message-ID: <631a1251-5bbc-484d-9bd9-167c5e7cb69f@linux.dev>
In-Reply-To: <x43xlqzuher54k3j4iwkos36jz5qkhtgxw4zh52j5cz6l2spzw@yips5h4liqbi>

On 30.07.2025 11:26, Pedro Falcato wrote:
> On Tue, Jul 29, 2025 at 08:53:02PM +0200, Bernard Metzler wrote:
>> On 29.07.2025 14:03, Pedro Falcato wrote:
>>> Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
>>> we have been doing this:
>>>
>>> static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
>>>                              size_t size)
>>> [...]
>>>         /* Calculate the number of bytes we need to push, for this page
>>>          * specifically */
>>>         size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
>>>         /* If we can't splice it, then copy it in, as normal */
>>>         if (!sendpage_ok(page[i]))
>>>                 msg.msg_flags &= ~MSG_SPLICE_PAGES;
>>>         /* Set the bvec pointing to the page, with len $bytes */
>>>         bvec_set_page(&bvec, page[i], bytes, offset);
>>>         /* Set the iter to $size, aka the size of the whole sendpages (!!!) */
>>>         iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
>>> try_page_again:
>>>         lock_sock(sk);
>>>         /* Sendmsg with $size size (!!!) */
>>>         rv = tcp_sendmsg_locked(sk, &msg, size);
>>>
>>> This means we've been sending oversized iov_iters and tcp_sendmsg calls
>>> for a while. This has been a benign bug because sendpage_ok() always
>>> returned true. With the recent slab allocator changes being slowly
>>> introduced into next (that disallow sendpage on large kmalloc
>>> allocations), we have recently hit out-of-bounds crashes, due to slight
>>> differences in iov_iter behavior between the MSG_SPLICE_PAGES and
>>> "regular" copy paths:
>>>
>>> (MSG_SPLICE_PAGES)
>>> skb_splice_from_iter
>>>   iov_iter_extract_pages
>>>     iov_iter_extract_bvec_pages
>>>       uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
>>> skb_splice_from_iter gets a "short" read
>>>
>>> (!MSG_SPLICE_PAGES)
>>> skb_copy_to_page_nocache copy=iov_iter_count
>>> [...]
>>> copy_from_iter
>>>   /* this doesn't help */
>>>   if (unlikely(iter->count < len))
>>>     len = iter->count;
>>>   iterate_bvec
>>>     ... and we run off the bvecs
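>>>
>>> To make the mismatch concrete, here is a minimal sketch of the
>>> pre-fix setup (illustrative only, assuming offset == 0 and a
>>> three-page request; not the literal driver code):
>>>
>>> struct bio_vec bvec;
>>> struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT };
>>> size_t size = 3 * PAGE_SIZE;  /* bytes left in the whole request */
>>> size_t bytes = PAGE_SIZE;     /* bytes this single bvec covers */
>>>
>>> bvec_set_page(&bvec, page[0], bytes, 0);
>>> /* The iter now claims 3 pages of payload behind a 1-entry bvec: */
>>> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
>>> /*
>>>  * Splice path: iov_iter_extract_bvec_pages() stops at nr_segs == 1,
>>>  * so the send is merely short. Copy path: copy_from_iter() trusts
>>>  * iter->count == size and walks off the end of the bvec array.
>>>  */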
>>>
>>> Fix this by properly setting the iov_iter's byte count, plus sending the
>>> correct byte count to tcp_sendmsg_locked.
>>>
>>> Cc: stable@vger.kernel.org
>>> Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
>>> Reported-by: kernel test robot <oliver.sang@intel.com>
>>> Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
>>> Reviewed-by: David Howells <dhowells@redhat.com>
>>> Signed-off-by: Pedro Falcato <pfalcato@suse.de>
>>> ---
>>>
>>> v2:
>>> - Add David Howells's Rb on the original patch
>>> - Remove the offset increment, since it's dead code
>>>
>>> drivers/infiniband/sw/siw/siw_qp_tx.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
>>> index 3a08f57d2211..f7dd32c6e5ba 100644
>>> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
>>> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
>>> @@ -340,18 +340,17 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
>>>  		if (!sendpage_ok(page[i]))
>>>  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
>>>  		bvec_set_page(&bvec, page[i], bytes, offset);
>>> -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
>>> +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
>>>  try_page_again:
>>>  		lock_sock(sk);
>>> -		rv = tcp_sendmsg_locked(sk, &msg, size);
>>> +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
>>>  		release_sock(sk);
>>>  		if (rv > 0) {
>>>  			size -= rv;
>>>  			sent += rv;
>>>  			if (rv != bytes) {
>>> -				offset += rv;
>>>  				bytes -= rv;
>>>  				goto try_page_again;
>>>  			}
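>>>
>>> For reference, the resulting loop body after this change reads
>>> roughly as follows (a condensed sketch assembled from the hunk
>>> above, eliding the surrounding setup and error paths):
>>>
>>> if (!sendpage_ok(page[i]))
>>>         msg.msg_flags &= ~MSG_SPLICE_PAGES;
>>> bvec_set_page(&bvec, page[i], bytes, offset);
>>> /* iter length now matches the one bvec entry exactly */
>>> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
>>> try_page_again:
>>> lock_sock(sk);
>>> rv = tcp_sendmsg_locked(sk, &msg, bytes); /* push this page's bytes only */
>>> release_sock(sk);
>>> if (rv > 0) {
>>>         size -= rv;
>>>         sent += rv;
>>>         if (rv != bytes) {
>>>                 /* partial send: msg.msg_iter has already advanced by rv,
>>>                  * so no offset bump is needed; just retry the remainder */
>>>                 bytes -= rv;
>>>                 goto try_page_again;
>>>         }
>>> }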
>> Acked-by: Bernard Metzler <bernard.metzler@linux.dev>
>
> Thanks!
>
> Do you want to take the fix through your tree? Otherwise I suspect Vlastimil
> could simply take it (and possibly resubmit the SLAB PR, which hasn't been
> merged yet).
>
Thanks Pedro. Having Vlastimil take care of it sounds good to me.
I am currently without development infrastructure (small village
in the mountains thing). And fixing the SLAB PR in the
first place would be even better.
Best,
Bernard.