Date: Thu, 30 Mar 2023 10:28:11 -0400
From: Willem de Bruijn
To: David Howells, Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Chuck Lever III, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn
Message-ID: <64259c7b2b327_21883920818@willemb.c.googlers.com.notmuch>
In-Reply-To: <20230329141354.516864-5-dhowells@redhat.com>
References: <20230329141354.516864-1-dhowells@redhat.com> <20230329141354.516864-5-dhowells@redhat.com>
Subject: RE: [RFC PATCH v2 04/48] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag

David Howells wrote:
> Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a
> network protocol that it should splice pages from the source iterator
> rather than copying the data if it can. This flag is added to a list that
> is cleared by sendmsg and recvmsg syscalls on entry.
>
> This is intended as a replacement for the ->sendpage() op, allowing a way
> to splice in several multipage folios in one go.
>
> Signed-off-by: David Howells
> cc: Willem de Bruijn
> cc: "David S. Miller"
> cc: Eric Dumazet
> cc: Jakub Kicinski
> cc: Paolo Abeni
> cc: Jens Axboe
> cc: Matthew Wilcox
> cc: netdev@vger.kernel.org
> ---
>  include/linux/socket.h | 3 +++
>  net/socket.c | 7 +++++++
>  2 files changed, 10 insertions(+)
>
> diff --git a/include/linux/socket.h b/include/linux/socket.h
> index 13c3a237b9c9..c2fa0f800999 100644
> --- a/include/linux/socket.h
> +++ b/include/linux/socket.h
> @@ -327,6 +327,7 @@ struct ucred {
>   */
>
>  #define MSG_ZEROCOPY 0x4000000 /* Use user data in kernel path */
> +#define MSG_SPLICE_PAGES 0x8000000 /* Splice the pages from the iterator in sendmsg() */
>  #define MSG_FASTOPEN 0x20000000 /* Send data in TCP SYN */
>  #define MSG_CMSG_CLOEXEC 0x40000000 /* Set close_on_exec for file
>  					   descriptor received through
> @@ -337,6 +338,8 @@ struct ucred {
>  #define MSG_CMSG_COMPAT 0 /* We never have 32 bit fixups */
>  #endif
>
> +/* Flags to be cleared on entry by sendmsg, recvmsg, sendmmsg and recvmmsg syscalls */
> +#define MSG_INTERNAL_FLAGS (MSG_SPLICE_PAGES)

This is fine, but there is no real need to cover both send and
receive. The sendpage internal flags only ensure that those flags
cannot enter sendpage code from any unintentional path. Indeed those
"internal" flags can end up in sendmsg, at least for UDP.

Similarly, this flag set only has to protect sendto and sendmsg. That
can simplify the patch a bit.

>  /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */
>  #define SOL_IP 0
> diff --git a/net/socket.c b/net/socket.c
> index 6bae8ce7059e..dfb912bbed62 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2139,6 +2139,7 @@ int __sys_sendto(int fd, void __user *buff, size_t len, unsigned int flags,
>  		msg.msg_name = (struct sockaddr *)&address;
>  		msg.msg_namelen = addr_len;
>  	}
> +	flags &= ~MSG_INTERNAL_FLAGS;
>  	if (sock->file->f_flags & O_NONBLOCK)
>  		flags |= MSG_DONTWAIT;
>  	msg.msg_flags = flags;
> @@ -2192,6 +2193,7 @@ int __sys_recvfrom(int fd, void __user *ubuf, size_t size, unsigned int flags,
>  	if (!sock)
>  		goto out;
>
> +	flags &= ~MSG_INTERNAL_FLAGS;
>  	if (sock->file->f_flags & O_NONBLOCK)
>  		flags |= MSG_DONTWAIT;
>  	err = sock_recvmsg(sock, &msg, flags);
> @@ -2579,6 +2581,7 @@ long __sys_sendmsg(int fd, struct user_msghdr __user *msg, unsigned int flags,
>
>  	if (forbid_cmsg_compat && (flags & MSG_CMSG_COMPAT))
>  		return -EINVAL;
> +	flags &= ~MSG_INTERNAL_FLAGS;
>
>  	sock = sockfd_lookup_light(fd, &err, &fput_needed);
>  	if (!sock)
> @@ -2627,6 +2630,7 @@ int __sys_sendmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
>  	entry = mmsg;
>  	compat_entry = (struct compat_mmsghdr __user *)mmsg;
>  	err = 0;
> +	flags &= ~MSG_INTERNAL_FLAGS;
>  	flags |= MSG_BATCH;
>

No need to modify __sys_sendmmsg explicitly, as it ends up calling
__sys_sendmsg?

Also, sendpage does this flags masking in the internal sock_FUNC
helpers rather than __sys_FUNC. Might be preferable.

>  	while (datagrams < vlen) {
> @@ -2775,6 +2779,7 @@ long __sys_recvmsg_sock(struct socket *sock, struct msghdr *msg,
>  			struct user_msghdr __user *umsg,
>  			struct sockaddr __user *uaddr, unsigned int flags)
>  {
> +	flags &= ~MSG_INTERNAL_FLAGS;
>  	return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0);
>  }
>
> @@ -2787,6 +2792,7 @@ long __sys_recvmsg(int fd, struct user_msghdr __user *msg, unsigned int flags,
>
>  	if (forbid_cmsg_compat && (flags & MSG_CMSG_COMPAT))
>  		return -EINVAL;
> +	flags &= ~MSG_INTERNAL_FLAGS;
>
>  	sock = sockfd_lookup_light(fd, &err, &fput_needed);
>  	if (!sock)
> @@ -2839,6 +2845,7 @@ static int do_recvmmsg(int fd, struct mmsghdr __user *mmsg,
>  			goto out_put;
>  		}
>  	}
> +	flags &= ~MSG_INTERNAL_FLAGS;
>
>  	entry = mmsg;
>  	compat_entry = (struct compat_mmsghdr __user *)mmsg;
>