From: Kent Overstreet
To: Bernd Schubert
Cc: Bernd Schubert, Miklos Szeredi, Amir Goldstein, "linux-fsdevel@vger.kernel.org", Andrew Morton, "linux-mm@kvack.org", Ingo Molnar, Peter Zijlstra, Andrei Vagin, "io-uring@vger.kernel.org"
Subject: Re: [PATCH RFC v2 00/19] fuse: fuse-over-io-uring
Date: Wed, 12 Jun 2024 12:24:40 -0400
Message-ID: <3bh7pncpg3qpeia5m7kgtolbvxwe2u46uwfixjhb5dcgni5k4m@kqode5qrywls>
References: <99d13ae4-8250-4308-b86d-14abd1de2867@fastmail.fm> <62ecc4cf-97c8-43e6-84a1-72feddf07d29@fastmail.fm> <4e5a84ab-4aa5-4d8b-aa12-625082d92073@ddn.com>

On Wed, Jun 12, 2024 at 06:15:57PM GMT, Bernd Schubert wrote:
>
>
> On 6/12/24 17:55, Kent Overstreet wrote:
> > On Wed, Jun 12, 2024 at 03:40:14PM GMT, Bernd Schubert wrote:
> > > On 6/12/24 16:19, Kent Overstreet wrote:
> > > > On Wed, Jun 12, 2024 at 03:53:42PM GMT, Bernd Schubert wrote:
> > > > > I will definitely look at it this week. Although I don't like the idea
> > > > > to have a new kthread. We already have an application thread and have
> > > > > the fuse server thread, why do we need another one?
> > > >
> > > > Ok, I hadn't found the fuse server thread - that should be fine.
> > > >
> > > > > >
> > > > > > The next thing I was going to look at is how you guys are using splice,
> > > > > > we want to get away from that too.
> > > > >
> > > > > Well, Ming Lei is working on that for ublk_drv and I guess that new approach
> > > > > could be adapted as well onto the current way of io-uring.
> > > > > It _probably_ wouldn't work with IORING_OP_READV/IORING_OP_WRITEV.
> > > > >
> > > > > https://lore.gnuweeb.org/io-uring/20240511001214.173711-6-ming.lei@redhat.com/T/
> > > > >
> > > > > >
> > > > > > Brian was also saying the fuse virtio_fs code may be worth
> > > > > > investigating, maybe that could be adapted?
> > > > >
> > > > > I need to check, but really, the majority of the new additions
> > > > > is just to set up things, shutdown and to have sanity checks.
> > > > > Request sending/completing to/from the ring is not that much new lines.
> > > >
> > > > What I'm wondering is how read/write requests are handled. Are the data
> > > > payloads going in the same ringbuffer as the commands? That could work,
> > > > if the ringbuffer is appropriately sized, but alignment is a an issue.
> > >
> > > That is exactly the big discussion Miklos and I have. Basically in my
> > > series another buffer is vmalloced, mmaped and then assigned to ring entries.
> > > Fuse meta headers and application payload goes into that buffer.
> > > In both kernel/userspace directions. io-uring only allows 80B, so only a
> > > really small request would fit into it.
> >
> > Well, the generic ringbuffer would lift that restriction.
>
> Yeah, kind of. Instead allocating the buffer in fuse, it would be now allocated
> in that code. At least all that setup code would be moved out of fuse. I will
> eventually come to your patches today.
> Now we only need to convince Miklos that your ring is better ;)
>
> >
> > > Legacy /dev/fuse has an alignment issue as payload follows directly as the fuse
> > > header - intrinsically fixed in the ring patches.
> >
> > *nod*
> >
> > That's the big question, put the data inline (with potential alignment
> > hassles) or manage (and map) a separate data structure.
> >
> > Maybe padding could be inserted to solve alignment?
>
> Right now I have this struct:
>
> struct fuse_ring_req {
> 	union {
> 		/* The first 4K are command data */
> 		char ring_header[FUSE_RING_HEADER_BUF_SIZE];
>
> 		struct {
> 			uint64_t flags;
>
> 			/* enum fuse_ring_buf_cmd */
> 			uint32_t in_out_arg_len;
> 			uint32_t padding;
>
> 			/* kernel fills in, reads out */
> 			union {
> 				struct fuse_in_header in;
> 				struct fuse_out_header out;
> 			};
> 		};
> 	};
>
> 	char in_out_arg[];
> };
>
>
> Data go into in_out_arg, i.e. headers are padded by the union.
> I actually wonder if FUSE_RING_HEADER_BUF_SIZE should be page size
> and not a fixed 4K.

I would make the commands variable-sized, so that commands with no data
buffers don't need padding. Then, when you do have a data command, you pad
out only that specific command so that the data buffer starts on a page
boundary.
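
A minimal sketch of that layout, under stated assumptions: the names
ring_cmd_hdr and ring_cmd_layout, the header fields, the fixed 4096-byte
PAGE_SZ, and the 8-byte header alignment are all hypothetical and not taken
from the patch series. It also assumes the ring itself starts on a page
boundary, so that ring offsets and addresses share the same alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumption; real code would use the runtime page size */
#define HDR_ALIGN 8u	/* assumption: every command header starts 8-byte aligned */

/* Hypothetical variable-sized command header (illustration only). */
struct ring_cmd_hdr {
	uint32_t total_len;	/* header + padding + data, before HDR_ALIGN rounding */
	uint32_t data_off;	/* data offset from start of command, 0 if no data */
	uint32_t data_len;	/* payload length in bytes, 0 for header-only commands */
	uint32_t opcode;
};

/* Round v up to the next multiple of a; a must be a power of two. */
static size_t align_up(size_t v, size_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (v + a - 1) & ~(a - 1);
}

/*
 * Lay out one command at ring offset pos and fill in its header.
 * Commands without data take only sizeof(header) bytes; commands with data
 * are padded so the payload starts on a page boundary.
 * Returns the ring offset where the next command starts.
 */
static size_t ring_cmd_layout(size_t pos, uint32_t opcode, uint32_t data_len,
			      struct ring_cmd_hdr *h)
{
	size_t hdr_end = pos + sizeof(*h);

	assert(pos % HDR_ALIGN == 0);

	h->opcode = opcode;
	h->data_len = data_len;

	if (data_len == 0) {
		/* No payload: no padding at all. */
		h->data_off = 0;
		h->total_len = (uint32_t)sizeof(*h);
	} else {
		/* Pad only this command, up to the next page boundary. */
		size_t data_start = align_up(hdr_end, PAGE_SZ);

		h->data_off = (uint32_t)(data_start - pos);
		h->total_len = h->data_off + data_len;
	}

	return pos + align_up(h->total_len, HDR_ALIGN);
}
```

With these assumptions, a header-only command at offset 0 occupies 16 bytes,
and an 8192-byte data command placed right after it gets 4080 bytes between
its start and its payload, so the payload lands at ring offset 4096. The cost
is that a consumer has to read total_len to find the next command instead of
stepping by a fixed entry size.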