From: Ming Lei
Date: Wed, 5 Jun 2024 07:45:01 +0800
Subject: Re: [PATCH RFC v2 00/19] fuse: fuse-over-io-uring
To: Jens Axboe
Cc: Bernd Schubert, Kent Overstreet, Bernd Schubert, Miklos Szeredi, Amir Goldstein, linux-fsdevel@vger.kernel.org, Andrew Morton, linux-mm@kvack.org, Ingo Molnar, Peter Zijlstra, Andrei Vagin, io-uring@vger.kernel.org, Pavel Begunkov, Josef Bacik, Ming Lei
In-Reply-To: <9db5fc0c-cce5-4d01-af60-f28f55c3aa99@kernel.dk>
References: <20240529-fuse-uring-for-6-9-rfc2-out-v1-0-d149476b1d65@ddn.com> <5mimjjxul2sc2g7x6pttnit46pbw3astwj2giqfr4xayp63el2@fb5bgtiavwgv> <8c3548a9-3b15-49c4-9e38-68d81433144a@fastmail.fm> <9db5fc0c-cce5-4d01-af60-f28f55c3aa99@kernel.dk>

On Fri, May 31, 2024 at 12:21 AM Jens Axboe wrote:
>
> On 5/30/24 10:02 AM, Bernd Schubert wrote:
> >
> >
> > On 5/30/24 17:36, Kent Overstreet wrote:
> >> On Wed, May 29, 2024 at 08:00:35PM +0200, Bernd Schubert wrote:
> >>> From: Bernd Schubert
> >>>
> >>> This adds support for uring communication between kernel and
> >>> userspace daemon using the opcode IORING_OP_URING_CMD. The basic
> >>> approach was taken from ublk. The patches are in RFC state,
> >>> some major changes are still to be expected.
> >>>
> >>> Motivation for these patches is all to increase fuse performance.
> >>> In fuse-over-io-uring requests avoid core switching (application
> >>> on core X, processing of fuse server on random core Y) and use
> >>> shared memory between kernel and userspace to transfer data.
> >>> Similar approaches have been taken by ZUFS and FUSE2, though
> >>> not over io-uring, but through ioctl IOs
> >>
> >> What specifically is it about io-uring that's helpful here? Besides the
> >> ringbuffer?
> >>
> >> So the original mess was that because we didn't have a generic
> >> ringbuffer, we had aio, tracing, and god knows what else all
> >> implementing their own special purpose ringbuffers (all with weird
> >> quirks of debatable or no usefulness).
> >>
> >> It seems to me that what fuse (and a lot of other things want) is just a
> >> clean simple easy to use generic ringbuffer for sending what-have-you
> >> back and forth between the kernel and userspace - in this case RPCs from
> >> the kernel to userspace.
> >>
> >> But instead, the solution seems to be just toss everything into a new
> >> giant subsystem?
> >
> >
> > Hmm, initially I had thought about writing my own ring buffer, but then
> > io-uring got IORING_OP_URING_CMD, which seems to have exactly what we
> > need? From interface point of view, io-uring seems easy to use here,
> > has everything we need and kind of the same thing is used for ublk -
> > what speaks against io-uring? And what other suggestion do you have?
> >
> > I guess the same concern would also apply to ublk_drv.
> >
> > Well, decoupling from io-uring might help to get for zero-copy, as there
> > doesn't seem to be an agreement with Ming's approaches (sorry I'm only
> > silently following for now).

We have concluded that pipe & splice isn't a good fit for zero copy, and
io_uring provides zero copy in an async way, which is really nice for
async applications.

>
> If you have an interest in the zero copy, do chime in, it would
> certainly help get some closure on that feature. I don't think anyone
> disagrees it's a useful and needed feature, but there are different view
> points on how it's best solved.

A generic sqe group feature is now being added, and generic zero copy can
be built on top of it easily. Could you or anyone take a look?

https://lore.kernel.org/linux-block/20240511001214.173711-1-ming.lei@redhat.com/

Thanks,
Ming