Date: Tue, 04 Apr 2023 12:58:25 -0400
From: Willem de Bruijn
To: David Howells, Willem de Bruijn
Cc: dhowells@redhat.com, Matthew Wilcox, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig, Jens Axboe,
 Jeff Layton, Christian Brauner, Chuck Lever III, Linus Torvalds,
 netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Message-ID: <642c5731a7cc5_337e2c208b0@willemb.c.googlers.com.notmuch>
In-Reply-To: <2258798.1680559496@warthog.procyon.org.uk>
References: <642ad8b66acfe_302ae1208e7@willemb.c.googlers.com.notmuch>
 <64299af9e8861_2d2a20208e6@willemb.c.googlers.com.notmuch>
 <20230331160914.1608208-1-dhowells@redhat.com>
 <20230331160914.1608208-16-dhowells@redhat.com>
 <1818504.1680515446@warthog.procyon.org.uk>
 <2258798.1680559496@warthog.procyon.org.uk>
Subject: Re: [PATCH v3 15/55] ip, udp: Support MSG_SPLICE_PAGES

David Howells wrote:
> Willem de Bruijn wrote:
>
> > The code already has to avoid allocation in the MSG_ZEROCOPY case. I
> > added alloc_len and paged_len for that purpose.
> >
> > Only the transhdrlen will be copied with getfrag due to
> >
> >   copy = datalen - transhdrlen - fraggap - pagedlen
> >
> > On next iteration in the loop, when remaining data fits in the skb,
> > there are three cases. The first is skipped due to !NETIF_F_SG. The
> > other two are either copy to page frags or zerocopy page frags.
> >
> > I think your code should be able to fit in. Maybe easier if it could
> > reuse the existing alloc_new_skb code to copy the transport header, as
> > MSG_ZEROCOPY does, rather than adding a new __ip_splice_alloc branch
> > that short-circuits that. Then __ip_splice_pages also does not need
> > code to copy the initial header. But this is trickier. It's fine to
> > leave as is.
> >
> > Since your code currently does call continue before executing the rest
> > of that branch, no need to modify any code there? Notably replacing
> > length with initial_length, which itself is initialized to length in
> > all cases except for MSG_SPLICE_PAGES.
>
> Okay. How about the attached? This seems to work. Just setting "paged" to
> true seems to do the right thing in __ip_append_data() when allocating /
> setting up the skbuff, and then __ip_splice_pages() is called to add the
> pages.

If this works, much preferred. Looks great to me.

As said, then __ip_splice_pages() probably no longer needs the
preamble to copy initial header bytes.

> David
> ---
> commit 9ac72c83407c8aef4be0c84513ec27bac9cfbcaa
> Author: David Howells
> Date:   Thu Mar 9 14:27:29 2023 +0000
>
>     ip, udp: Support MSG_SPLICE_PAGES
>
>     Make IP/UDP sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
>     spliced from the source iterator.
>
>     This allows ->sendpage() to be replaced by something that can handle
>     multiple multipage folios in a single transaction.
>
>     Signed-off-by: David Howells
>     cc: Willem de Bruijn
>     cc: "David S. Miller"
>     cc: Eric Dumazet
>     cc: Jakub Kicinski
>     cc: Paolo Abeni
>     cc: Jens Axboe
>     cc: Matthew Wilcox
>     cc: netdev@vger.kernel.org
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 6109a86a8a4b..fe2e48874191 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -956,6 +956,41 @@ csum_page(struct page *page, int offset, int copy)
>  	return csum;
>  }
>
> +/*
> + * Add (or copy) data pages for MSG_SPLICE_PAGES.
> + */
> +static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb,
> +			     void *from, int *pcopy)
> +{
> +	struct msghdr *msg = from;
> +	struct page *page = NULL, **pages = &page;
> +	ssize_t copy = *pcopy;
> +	size_t off;
> +	int err;
> +
> +	copy = iov_iter_extract_pages(&msg->msg_iter, &pages, copy, 1, 0, &off);
> +	if (copy <= 0)
> +		return copy ?: -EIO;
> +
> +	err = skb_append_pagefrags(skb, page, off, copy);
> +	if (err < 0) {
> +		iov_iter_revert(&msg->msg_iter, copy);
> +		return err;
> +	}
> +
> +	if (skb->ip_summed == CHECKSUM_NONE) {
> +		__wsum csum;
> +
> +		csum = csum_page(page, off, copy);
> +		skb->csum = csum_block_add(skb->csum, csum, skb->len);
> +	}
> +
> +	skb_len_add(skb, copy);
> +	refcount_add(copy, &sk->sk_wmem_alloc);
> +	*pcopy = copy;
> +	return 0;
> +}
> +
>  static int __ip_append_data(struct sock *sk,
>  			    struct flowi4 *fl4,
>  			    struct sk_buff_head *queue,
> @@ -1047,6 +1082,15 @@ static int __ip_append_data(struct sock *sk,
>  				skb_zcopy_set(skb, uarg, &extra_uref);
>  			}
>  		}
> +	} else if ((flags & MSG_SPLICE_PAGES) && length) {
> +		if (inet->hdrincl)
> +			return -EPERM;
> +		if (rt->dst.dev->features & NETIF_F_SG) {
> +			/* We need an empty buffer to attach stuff to */
> +			paged = true;
> +		} else {
> +			flags &= ~MSG_SPLICE_PAGES;
> +		}
>  	}
>
>  	cork->length += length;
> @@ -1206,6 +1250,10 @@ static int __ip_append_data(struct sock *sk,
>  				err = -EFAULT;
>  				goto error;
>  			}
> +		} else if (flags & MSG_SPLICE_PAGES) {
> +			err = __ip_splice_pages(sk, skb, from, &copy);
> +			if (err < 0)
> +				goto error;
>  		} else if (!zc) {
>  			int i = skb_shinfo(skb)->nr_frags;
>
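
For readers following the length arithmetic quoted earlier in the thread,
here is a minimal userspace sketch of the expression

  copy = datalen - transhdrlen - fraggap - pagedlen

It is not kernel code. The struct and function names (split, split_lengths)
are hypothetical, and the rule that pagedlen equals datalen - transhdrlen
when "paged" is set is an assumption about the surrounding code in
__ip_append_data(), not something stated in this thread.

```c
#include <stdbool.h>

/* Hypothetical model of the per-skb length split; illustration only. */
struct split {
	int pagedlen;	/* bytes expected to be attached later as page frags */
	int copy;	/* bytes left for getfrag() to copy into the linear area */
};

static struct split split_lengths(int datalen, int transhdrlen,
				  int fraggap, bool paged)
{
	struct split s;

	/*
	 * Assumption: in the paged case the whole payload beyond the
	 * transport header is accounted to pagedlen.
	 */
	s.pagedlen = paged ? datalen - transhdrlen : 0;

	/* The expression quoted in the thread. */
	s.copy = datalen - transhdrlen - fraggap - s.pagedlen;
	return s;
}
```

Under these assumptions, a 1408-byte datalen with an 8-byte transport header
and no fraggap leaves copy at 0 when paged is true, so no payload is copied
at allocation time and the data is left for the later page-frag path. With
paged false, the same inputs give copy = 1400.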