From: David Howells
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
To: Bernard Metzler
Cc: dhowells@redhat.com, Matthew Wilcox, "David S. Miller", Eric Dumazet,
    Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig, Jens Axboe,
    Jeff Layton, Christian Brauner, Chuck Lever III, Linus Torvalds,
    "netdev@vger.kernel.org", "linux-fsdevel@vger.kernel.org",
    "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", Tom Talpey,
    "linux-rdma@vger.kernel.org"
Subject: Re: [RFC PATCH v2 30/48] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit
References: <20230329141354.516864-1-dhowells@redhat.com> <20230329141354.516864-31-dhowells@redhat.com>
Date: Wed, 29 Mar 2023 16:32:48 +0100
Message-ID: <522642.1680103968@warthog.procyon.org.uk>

Bernard Metzler wrote:

> > When transmitting data, call down into TCP using a single sendmsg with
> > MSG_SPLICE_PAGES to indicate that content should be spliced rather than
> > performing several sendmsg and sendpage calls to transmit header, data
> > pages and trailer.
> >
> > To make this work, the data is assembled in a bio_vec array and attached to
> > a BVEC-type iterator.  The header and trailer (if present) are copied into
> > page fragments that can be freed with put_page().
>
> I like it a lot if it still keeps zero copy sendpage() semantics for
> the cases the driver can make use of data transfers w/o copy.
> Is 'msg.msg_flags |= MSG_SPLICE_PAGES' doing that magic?

Yes.  MSG_SPLICE_PAGES indicates that you want the socket to retain your
buffer and pass it directly to the device.  Note that it's just a hint,
however: pages that are unspliceable (e.g. they belong to the slab) will get
copied into a page fragment instead.  Further, if the device cannot support a
vector, then the hint can be ignored and all the data can be copied as normal.

> 'splicing' suggest just merging pages to me.

'splicing' as in what the splice system call does.  Unfortunately,
MSG_ZEROCOPY is already a (different) thing.

> It would simplify the transmit code path substantially, also getting
> rid of kmap_local_page()/kunmap_local() sequences for multi-fragment
> sendmsg()'s.

If the ITER_ITERLIST iterator is accepted, then siw would be able to mix
KVEC and BVEC iterators, e.g. what I did for sunrpc here:

	https://lore.kernel.org/linux-fsdevel/20230329141354.516864-42-dhowells@redhat.com/T/#u

This means that in siw_tx_hdt(), where I made it copy data into page fragments
using page_frag_memdup() and attach that to a bvec:

	hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent;
	h = page_frag_memdup(NULL, hdr, hdr_len, GFP_NOFS, ULONG_MAX);
	if (!h)
		goto done;
	bvec_set_virt(&bvec[0], h, hdr_len);
	seg = 1;

it can just set up a kvec instead.

Unfortunately, it's not so easy to get rid of all of the kmap'ing as we need
to do some of it to do the hashing.

David
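
For illustration only, the "hint with copy fallback" behaviour described
above for MSG_SPLICE_PAGES can be modelled in plain userspace C.  This is a
rough sketch, not kernel code: struct model_seg, struct model_frag,
model_queue() and model_release() are hypothetical names invented for this
example, and the "spliceable" and "dev_supports_vector" booleans merely stand
in for the checks the network stack would make itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_seg {
	const void *data;	/* bytes the caller wants transmitted */
	size_t len;
	bool spliceable;	/* false models e.g. slab-backed memory */
};

struct model_frag {
	const void *data;	/* what the "socket" ends up holding */
	size_t len;
	bool copied;		/* true if data points at a private copy */
};

/*
 * Queue one segment.  If the splice hint is set, the "device" can take a
 * vector and the segment is spliceable, retain the caller's buffer as-is.
 * Otherwise ignore the hint and copy the bytes into a fresh fragment.
 * Returns 0 on success, -1 if the fallback allocation fails.
 */
static int model_queue(struct model_frag *out, const struct model_seg *seg,
		       bool splice_hint, bool dev_supports_vector)
{
	assert(out && seg);

	if (splice_hint && dev_supports_vector && seg->spliceable) {
		out->data = seg->data;
		out->len = seg->len;
		out->copied = false;
		return 0;
	}

	/* Fallback path: allocate at least one byte so malloc(0) is avoided. */
	void *copy = malloc(seg->len ? seg->len : 1);
	if (!copy)
		return -1;
	memcpy(copy, seg->data, seg->len);
	out->data = copy;
	out->len = seg->len;
	out->copied = true;
	return 0;
}

/* Drop the fragment, freeing the private copy if one was made. */
static void model_release(struct model_frag *frag)
{
	if (frag->copied)
		free((void *)frag->data);
	frag->data = NULL;
	frag->len = 0;
	frag->copied = false;
}
```

In this model the caller never learns from the return value whether its
buffer was retained or copied, which is the sense in which the flag is only
a hint.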