From: David Howells <dhowells@redhat.com>
To: Matthew Wilcox <willy@infradead.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: David Howells <dhowells@redhat.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>,
Jens Axboe <axboe@kernel.dk>, Jeff Layton <jlayton@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH 12/28] af_unix: Support MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 15:26:02 +0000 [thread overview]
Message-ID: <20230316152618.711970-13-dhowells@redhat.com> (raw)
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
Make AF_UNIX sendmsg() support MSG_SPLICE_PAGES, splicing in pages from the
source iterator if given and if ITER_BVEC and copying the data in
otherwise.
This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
net/unix/af_unix.c | 84 +++++++++++++++++++++++++++++++++++++---------
1 file changed, 68 insertions(+), 16 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 347122c3575e..6f3454db9c53 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2151,6 +2151,44 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other
}
#endif
+/*
+ * Extract pages from a BVEC-type iterator and add them to the socket buffer.
+ */
+static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb,
+ struct iov_iter *iter, ssize_t maxsize)
+{
+ const struct bio_vec *bv = iter->bvec;
+ unsigned long start = iter->iov_offset;
+ unsigned int i;
+ ssize_t ret = 0;
+
+ for (i = 0; i < iter->nr_segs; i++) {
+ size_t off, len;
+
+ len = bv[i].bv_len;
+ if (start >= len) {
+ start -= len;
+ continue;
+ }
+
+ len = min_t(size_t, maxsize, len - start);
+ off = bv[i].bv_offset + start;
+
+ if (skb_append_pagefrags(skb, bv->bv_page, off, len) < 0)
+ break;
+
+ ret += len;
+ maxsize -= len;
+ if (maxsize <= 0)
+ break;
+ start = 0;
+ }
+
+ if (ret > 0)
+ iov_iter_advance(iter, ret);
+ return ret;
+}
+
static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
size_t len)
{
@@ -2194,19 +2232,25 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
while (sent < len) {
size = len - sent;
- /* Keep two messages in the pipe so it schedules better */
- size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64);
+ if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
+ skb = sock_alloc_send_pskb(sk, 0, 0,
+ msg->msg_flags & MSG_DONTWAIT,
+ &err, 0);
+ } else {
+ /* Keep two messages in the pipe so it schedules better */
+ size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64);
- /* allow fallback to order-0 allocations */
- size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ);
+ /* allow fallback to order-0 allocations */
+ size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ);
- data_len = max_t(int, 0, size - SKB_MAX_HEAD(0));
+ data_len = max_t(int, 0, size - SKB_MAX_HEAD(0));
- data_len = min_t(size_t, size, PAGE_ALIGN(data_len));
+ data_len = min_t(size_t, size, PAGE_ALIGN(data_len));
- skb = sock_alloc_send_pskb(sk, size - data_len, data_len,
- msg->msg_flags & MSG_DONTWAIT, &err,
- get_order(UNIX_SKB_FRAGS_SZ));
+ skb = sock_alloc_send_pskb(sk, size - data_len, data_len,
+ msg->msg_flags & MSG_DONTWAIT, &err,
+ get_order(UNIX_SKB_FRAGS_SZ));
+ }
if (!skb)
goto out_err;
@@ -2218,13 +2262,21 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
}
fds_sent = true;
- skb_put(skb, size - data_len);
- skb->data_len = data_len;
- skb->len = size;
- err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
- if (err) {
- kfree_skb(skb);
- goto out_err;
+ if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
+ size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size);
+ skb->data_len += size;
+ skb->len += size;
+ skb->truesize += size;
+ refcount_add(size, &sk->sk_wmem_alloc);
+ } else {
+ skb_put(skb, size - data_len);
+ skb->data_len = data_len;
+ skb->len = size;
+ err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
+ if (err) {
+ kfree_skb(skb);
+ goto out_err;
+ }
}
unix_state_lock(other);
next prev parent reply other threads:[~2023-03-16 15:27 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:25 ` [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
2023-03-16 17:28 ` Matthew Wilcox
2023-03-16 18:00 ` David Howells
2023-03-16 15:25 ` [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES David Howells
2023-03-16 18:37 ` Willem de Bruijn
2023-03-16 18:44 ` David Howells
2023-03-16 19:00 ` Willem de Bruijn
2023-03-21 0:38 ` David Howells
2023-03-21 14:22 ` Willem de Bruijn
2023-03-16 15:25 ` [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:25 ` [RFC PATCH 05/28] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
2023-03-16 15:25 ` [RFC PATCH 06/28] espintcp: Inline do_tcp_sendpages() David Howells
2023-03-16 15:25 ` [RFC PATCH 07/28] tls: " David Howells
2023-03-16 15:25 ` [RFC PATCH 08/28] siw: " David Howells
2023-03-16 15:25 ` [RFC PATCH 09/28] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
2023-03-16 15:26 ` [RFC PATCH 10/28] ip, udp: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 11/28] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` David Howells [this message]
2023-03-16 15:26 ` [RFC PATCH 13/28] crypto: af_alg: Indent the loop in af_alg_sendmsg() David Howells
2023-03-16 15:26 ` [RFC PATCH 14/28] crypto: af_alg: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 15/28] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 16/28] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() David Howells
2023-03-16 15:26 ` [RFC PATCH 17/28] Remove file->f_op->sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit David Howells
2023-03-16 15:26 ` [RFC PATCH 19/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 20/28] iscsi: " David Howells
2023-03-16 15:26 ` [RFC PATCH 21/28] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:26 ` [RFC PATCH 22/28] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() David Howells
2023-03-16 15:26 ` [RFC PATCH 23/28] algif: Remove hash_sendpage*() David Howells
2023-03-16 15:26 ` [RFC PATCH 24/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() David Howells
2023-03-16 15:26 ` [RFC PATCH 25/28] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 26/28] dlm: " David Howells
2023-03-16 15:26 ` [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage David Howells
2023-03-16 16:17 ` Trond Myklebust
2023-03-16 17:10 ` Chuck Lever III
2023-03-16 17:28 ` David Howells
2023-03-16 17:41 ` Chuck Lever III
2023-03-16 21:21 ` David Howells
2023-03-17 15:29 ` Chuck Lever III
2023-03-16 16:24 ` David Howells
2023-03-16 17:23 ` Trond Myklebust
2023-03-16 18:06 ` David Howells
2023-03-16 19:01 ` Trond Myklebust
2023-03-22 13:10 ` David Howells
2023-03-22 18:15 ` [RFC PATCH] iov_iter: Add an iterator-of-iterators David Howells
2023-03-22 18:47 ` Trond Myklebust
2023-03-22 18:49 ` Matthew Wilcox
[not found] ` <20230316152618.711970-29-dhowells@redhat.com>
2023-03-16 15:57 ` [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) Marc Kleine-Budde
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230316152618.711970-13-dhowells@redhat.com \
--to=dhowells@redhat.com \
--cc=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hch@infradead.org \
--cc=jlayton@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox