Date: Mon, 03 Apr 2023 09:46:30 -0400
From: Willem de Bruijn
To: David Howells, Willem de Bruijn
Cc: dhowells@redhat.com, Matthew Wilcox, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig, Jens Axboe,
 Jeff Layton, Christian Brauner, Chuck Lever III, Linus Torvalds,
 netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Message-ID: <642ad8b66acfe_302ae1208e7@willemb.c.googlers.com.notmuch>
In-Reply-To: <1818504.1680515446@warthog.procyon.org.uk>
References: <64299af9e8861_2d2a20208e6@willemb.c.googlers.com.notmuch>
 <20230331160914.1608208-1-dhowells@redhat.com>
 <20230331160914.1608208-16-dhowells@redhat.com>
 <1818504.1680515446@warthog.procyon.org.uk>
Subject: Re: [PATCH v3 15/55] ip, udp: Support MSG_SPLICE_PAGES

David Howells wrote:
> Willem de Bruijn wrote:
>
> > > +	} else if ((flags & MSG_SPLICE_PAGES) && length) {
> > > +		if (inet->hdrincl)
> > > +			return -EPERM;
> > > +		if (rt->dst.dev->features & NETIF_F_SG)
> > > +			/* We need an empty buffer to attach stuff to */
> > > +			initial_length = transhdrlen;
> >
> > I still don't entirely understand what initial_length means.
> >
> > More importantly, transhdrlen can be zero. If not called for UDP
> > but for RAW. Or if this is a subsequent call to a packet that is
> > being held with MSG_MORE.
> >
> > This works fine for existing use-cases, which go to alloc_new_skb.
> > Not sure how this case would be different. But the comment alludes
> > that it does.
>
> The problem is that in the non-MSG_ZEROCOPY case, __ip_append_data() assumes
> that it's going to copy the data it is given and will allocate sufficient
> space in the skb in advance to hold it - but I don't want to do that because I
> want to splice in the pages holding the data instead. However, I do need to
> allocate space to hold the transport header.
>
> Maybe I should change 'initial_length' to 'initial_alloc'? It represents the
> amount I think we should allocate. Or maybe I should have a separate
> allocation clause for MSG_SPLICE_PAGES?

The code already has to avoid allocation in the MSG_ZEROCOPY case. I
added alloc_len and paged_len for that purpose.

Only the transhdrlen will be copied with getfrag due to

    copy = datalen - transhdrlen - fraggap - pagedlen

On the next iteration in the loop, when the remaining data fits in the
skb, there are three cases. The first is skipped due to !NETIF_F_SG. The
other two are either copy to page frags or zerocopy page frags.

I think your code should be able to fit in. Maybe easier if it could
reuse the existing alloc_new_skb code to copy the transport header, as
MSG_ZEROCOPY does, rather than adding a new __ip_splice_alloc branch
that short-circuits that. Then __ip_splice_pages also does not need
code to copy the initial header. But this is trickier. It's fine to
leave as is.

Since your code currently does call continue before executing the rest
of that branch, no need to modify any code there? Notably replacing
length with initial_length, which itself is initialized to length in
all cases except for MSG_SPLICE_PAGES.

Just hardcode transhdrlen as the copy argument to __ip_splice_pages.

> I also wonder if __ip_append_data() really needs two places that call
> getfrag().
>
> David
>
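
For illustration only, the length arithmetic referred to in this reply can
be modelled in user space. This is a rough sketch, not the kernel code:
struct frag_lens and zc_frag_lens() are hypothetical names, and the sketch
assumes that on the zerocopy path the linear area holds only the headers,
so that pagedlen = datalen - transhdrlen. That assumption is a reading of
the zerocopy branch and is not stated in this thread.

```c
#include <assert.h>

/* Hypothetical user-space model. The field names mirror the locals
 * discussed in the thread, but this is not the kernel code itself.
 */
struct frag_lens {
	int alloclen;	/* linear bytes to allocate in the skb */
	int pagedlen;	/* bytes expected to arrive as page frags */
	int copy;	/* payload bytes getfrag() would copy linearly */
};

/* Assumption: on the zerocopy path only headers live in the linear
 * area, so the whole payload is accounted for in pagedlen.
 */
static struct frag_lens zc_frag_lens(int fragheaderlen, int transhdrlen,
				     int datalen, int fraggap)
{
	struct frag_lens l;

	assert(transhdrlen >= 0 && fraggap >= 0 && datalen >= transhdrlen);

	l.alloclen = fragheaderlen + transhdrlen;
	l.pagedlen = datalen - transhdrlen;
	/* The expression quoted in the reply above. */
	l.copy = datalen - transhdrlen - fraggap - l.pagedlen;
	return l;
}
```

Under these assumptions, fragheaderlen = 20, transhdrlen = 8,
datalen = 1400 and fraggap = 0 give alloclen = 28, pagedlen = 1392 and
copy = 0. With transhdrlen = 0 (RAW, or a continuation held with
MSG_MORE), alloclen drops to 20 and copy is still 0, so no payload is
copied into the linear area in either case.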