Date: Sun, 9 Mar 2025 13:03:31 +0100
From: Christian Brauner
To: Pratyush Yadav
Cc: Linus Torvalds, linux-kernel@vger.kernel.org, Jonathan Corbet,
    Eric Biederman, Arnd Bergmann, Greg Kroah-Hartman, Alexander Viro,
    Jan Kara, Hugh Dickins, Alexander Graf, Benjamin Herrenschmidt,
    David Woodhouse, James Gowans, Mike Rapoport, Paolo Bonzini,
    Pasha Tatashin, Anthony Yznaga, Dave Hansen, David Hildenbrand,
    Jason Gunthorpe, Matthew Wilcox, Wei Yang, Andrew Morton,
    linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, kexec@lists.infradead.org
Subject: Re: [RFC PATCH 1/5] misc: introduce FDBox
Message-ID: <20250309-unerwartet-alufolie-96aae4d20e38@brauner>
References: <20250307005830.65293-1-ptyadav@amazon.de>
    <20250307005830.65293-2-ptyadav@amazon.de>
    <20250307-sachte-stolz-18d43ffea782@brauner>

On Sat, Mar 08, 2025 at 12:10:12AM +0000, Pratyush Yadav wrote:
> Hi Christian,
>
> Thanks for the review!

No worries, I'm not trying to be polemic. It's just that this whole
proposed concept is pretty lightweight in terms of thinking about
possible implications.

> > This use-case is covered with systemd's fdstore and it's available to
> > unprivileged userspace. Stashing arbitrary file descriptors in the
> > kernel in this way isn't a good idea.
>
> For one, it can't be arbitrary FDs, but only explicitly enabled ones.
> Beyond that, while not intended, there is no way to stop userspace from
> using it as a stash. Stashing FDs is a needed operation for this to
> work, and there is no way to guarantee in advance that userspace will
> actually use it for KHO, and not just stash it to grab back later.

As written it can't ever function as a generic file descriptor store.

It only allows fully privileged processes to stash file descriptors,
which makes it useless for generic userspace. A generic fdstore should
have a model that makes it usable unprivileged. It probably should also
be multi-instance and work easily with namespaces. This doesn't, and
hitching it on devtmpfs and character devices is guaranteed to not work
well with such use-cases.

It also has big time security issues and implications. Any file you
stash in there will have the credentials of the opener attached to it.
So if someone stashes anything in there you need permission mechanisms
that ensure that Joe Random can't via FDBOX_GET_FD pull out a file for
e.g., someone else's cgroup and happily migrate processes under the
opener's credentials or mess around with some random executing binary.

So you need a model of who is allowed to pull out what file descriptors
from a file descriptor stash. What are the semantics for that? What's
the security model for that? What are possible corner cases?

For systemd's userspace fdstore that's covered by policy: it can
implement quite easily what fds it accepts. For the kernel it's a lot
more complicated.

If someone puts file descriptors for a bunch of files in there that were
opened in different mount namespaces then this will pin said mount
namespaces. If the last process in the mount namespace exits, the mount
namespace would normally be cleaned up, but not anymore. The mount
namespace would stay pinned. Not wrong, but the implications of this
need to be spelled out.

What if someone puts a file descriptor from devtmpfs or for /dev/fdbox
into an fdbox? Even if that's blocked, what happens if someone creates a
detached bind-mount of a /dev/fdbox mount and mounts it into a different
mount namespace and then puts a file descriptor for that mount namespace
into the fdbox?

Tons of other scenarios come to mind, and that's ignoring what happens
when networking is brought into the mix as well.

It's not done by just letting the kernel stash some files and getting
them out later somehow and then seeing whether it's somehow useful in
the future for other stuff. A generic globally usable fdstore is not
happening without a clear and detailed analysis of what the semantics
are going to be.

So either that work is done right from the start or stashing files goes
out the window and instead the KHO part is implemented in a way where
during a KHO dump relevant userspace is notified that it must now
serialize its state into the serialization stash. And no files are
actually kept in there at all.
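
To make the "who is allowed to pull out what" point concrete, here is a
minimal userspace sketch. It is not FDBox code and does not use
FDBOX_GET_FD; it only uses SCM_RIGHTS over a Unix socket as an analogy.
The names send_fd(), recv_fd() and fd_handoff_demo() are made up for
illustration. One process opens a file and unlinks it, and a second
process that never opened it (and can no longer open it by path) reads
it through the handed-over descriptor. Whoever gets the descriptor gets
the access that was granted at open time, which is why a stash needs an
explicit policy for retrieval.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one fd. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: pass one fd over a Unix socket via SCM_RIGHTS. */
static int send_fd(int sock, int fd)
{
	char byte = 'F';
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd; returns the new fd or -1. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Returns 0 if the child could read a file it never opened itself. */
int fd_handoff_demo(void)
{
	static const char secret[] = "opened by the stasher";
	char path[] = "/tmp/fd-handoff-XXXXXX";
	int sv[2], fd, status, ret = -1;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: the file does not exist yet, so nothing is inherited. */
		char buf[sizeof(secret)] = { 0 };

		close(sv[0]);
		fd = recv_fd(sv[1]);
		if (fd < 0)
			_exit(1);
		if (pread(fd, buf, sizeof(buf) - 1, 0) !=
		    (ssize_t)(sizeof(secret) - 1))
			_exit(2);
		_exit(strcmp(buf, secret) == 0 ? 0 : 3);
	}

	/* Parent: create the file after fork() and remove its name. */
	close(sv[1]);
	fd = mkstemp(path);
	if (fd >= 0) {
		unlink(path);
		if (write(fd, secret, sizeof(secret) - 1) ==
		    (ssize_t)(sizeof(secret) - 1))
			ret = send_fd(sv[0], fd);
		close(fd);
	}
	close(sv[0]);

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (ret == 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return 0;
	return -1;
}
```

Calling fd_handoff_demo() from a main() and checking for a 0 return is
enough to try it. In this userspace case the sender decides who receives
the descriptor by choosing the socket peer. A kernel-side stash has no
such implicit policy, so it would have to define one.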