From: Bernd Schubert
Date: Wed, 12 Jun 2024 15:53:42 +0200
Subject: Re: [PATCH RFC v2 00/19] fuse: fuse-over-io-uring
To: Kent Overstreet
Cc: Miklos Szeredi, Bernd Schubert, Amir Goldstein, linux-fsdevel@vger.kernel.org, Andrew Morton, linux-mm@kvack.org, Ingo Molnar, Peter Zijlstra, Andrei Vagin, io-uring@vger.kernel.org
References: <20240529-fuse-uring-for-6-9-rfc2-out-v1-0-d149476b1d65@ddn.com> <99d13ae4-8250-4308-b86d-14abd1de2867@fastmail.fm> <62ecc4cf-97c8-43e6-84a1-72feddf07d29@fastmail.fm>

On 6/12/24 01:35, Kent Overstreet wrote:
> On Tue, Jun 11, 2024 at 07:37:30PM GMT, Bernd Schubert wrote:
>>
>>
>> On 6/11/24 17:35, Miklos Szeredi wrote:
>>> On Tue, 11 Jun 2024 at 12:26, Bernd Schubert wrote:
>>>
>>>> Secondly, with IORING_OP_URING_CMD we already have only a single command
>>>> to submit requests and fetch the next one - half of the system calls.
>>>>
>>>> Wouldn't IORING_OP_READV/IORING_OP_WRITEV be just this approach?
>>>> https://github.com/uroni/fuseuring?
>>>> I.e. it hooks into the existing fuse and just changes from read()/write()
>>>> of /dev/fuse to io-uring of /dev/fuse. With the disadvantage of zero
>>>> control which ring/queue and which ring-entry handles the request.
>>>
>>> Unlike system calls, io_uring ops should have very little overhead.
>>> That's one of the main selling points of io_uring (as described in the
>>> io_uring(7) man page).
>>>
>>> So I don't think it matters to performance whether there's a combined
>>> WRITEV + READV (or COMMIT + FETCH) op or separate ops.
>>
>> This has to be performance proven and is by no means what I'm seeing. How
>> should io-uring improve performance if you have the same number of
>> system calls?
>>
>> As I see it (@Jens or @Pavel or anyone else please correct me if I'm
>> wrong), the advantage of io-uring comes when there is no syscall overhead
>> at all - either you have a ring with multiple entries and then one side
>> operates on multiple entries or you have polling and no syscall overhead
>> either. We cannot afford cpu intensive polling - out of the question,
>> besides that I had even tried SQPOLL and it made things worse (that is
>> actually where my idea about application polling comes from).
>> As I see it, for sync blocking calls (like meta operations) with one
>> entry in the queue, you would get no advantage with
>> IORING_OP_READV/IORING_OP_WRITEV - io-uring has to do two system calls -
>> one to submit from kernel to userspace and another from userspace to
>> kernel. Why should io-uring be faster there?
>>
>> And from my testing this is exactly what I had seen - io-uring for meta
>> requests (i.e. without a large request queue and *without* core
>> affinity) makes meta operations even slower than /dev/fuse.
>>
>> For anything that imposes a large ring queue and where either side
>> (kernel or userspace) needs to process multiple ring entries - system
>> call overhead gets reduced by the queue size. Just for DIO or meta
>> operations that is hard to reach.
>>
>> Also, if you are using IORING_OP_READV/IORING_OP_WRITEV, nothing would
>> change in fuse kernel? I.e. IOs would go via fuse_dev_read()?
>> I.e. we would not have encoded in the request which queue it belongs to?
>
> Want to try out my new ringbuffer syscall?
>
> I haven't yet dug far into the fuse protocol or /dev/fuse code, only
> skimmed. But using it to replace the read/write syscall overhead should
> be straightforward; you'll want to spin up a kthread for responding to
> requests.

I will definitely look at it this week. Although I don't like the idea
of having a new kthread. We already have an application thread and the
fuse server thread, why do we need another one?

>
> The next thing I was going to look at is how you guys are using splice,
> we want to get away from that too.

Well, Ming Lei is working on that for ublk_drv and I guess that new
approach could be adapted to the current way of using io-uring as well.
It _probably_ wouldn't work with IORING_OP_READV/IORING_OP_WRITEV.

https://lore.gnuweeb.org/io-uring/20240511001214.173711-6-ming.lei@redhat.com/T/

>
> Brian was also saying the fuse virtio_fs code may be worth
> investigating, maybe that could be adapted?

I need to check, but really, the majority of the new additions are just
for setup, shutdown and sanity checks. Sending requests to the ring and
completing them from it does not add that many new lines.

Thanks,
Bernd
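
As a rough illustration of the batching argument made earlier in this
mail, here is a back-of-the-envelope sketch. It is a deliberately
simplified model, not measured data and not code from the patch series:
the function name and both parameters are hypothetical, and it assumes
that one io_uring_enter() call handles a fixed number of ring entries and
that every ring operation costs the same.

```python
def syscalls_per_request(ops_per_request: int, entries_per_enter: int) -> float:
    """Average number of syscalls per FUSE request in a simplified model.

    ops_per_request:   ring operations needed per request (assumption:
                       2 for separate fetch + reply ops, 1 for a combined
                       commit-and-fetch op).
    entries_per_enter: ring entries handled by one io_uring_enter() call
                       (assumption: constant batch size).
    """
    if ops_per_request < 1 or entries_per_enter < 1:
        raise ValueError("both arguments must be >= 1")
    return ops_per_request / entries_per_enter


if __name__ == "__main__":
    cases = [
        ("separate ops, one entry per enter", 2, 1),
        ("combined op, one entry per enter", 1, 1),
        ("separate ops, 8 entries per enter", 2, 8),
        ("combined op, 8 entries per enter", 1, 8),
    ]
    for label, ops, batch in cases:
        print(f"{label}: {syscalls_per_request(ops, batch)} syscalls/request")
```

In this model a synchronous request with a single ring entry costs as
many syscalls as read()/write() on /dev/fuse unless the two operations are
combined, and the per-request cost only drops below one syscall once
several entries are processed per call.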