From: Pratyush Yadav
To: Jason Gunthorpe
CC: Christian Brauner, Linus Torvalds, Jonathan Corbet, Eric Biederman,
    Arnd Bergmann, Greg Kroah-Hartman, Alexander Viro, Jan Kara,
    Hugh Dickins, Alexander Graf, Benjamin Herrenschmidt, David Woodhouse,
    James Gowans, Mike Rapoport, Paolo Bonzini, Pasha Tatashin,
    Anthony Yznaga, Dave Hansen, David Hildenbrand, Matthew Wilcox,
    Wei Yang, Andrew Morton
Subject: Re: [RFC PATCH 1/5] misc: introduce FDBox
Date: Tue, 18 Mar 2025 23:02:31 +0000
In-Reply-To: <20250318145707.GX9311@nvidia.com>
References: <20250307005830.65293-1-ptyadav@amazon.de>
    <20250307005830.65293-2-ptyadav@amazon.de>
    <20250307-sachte-stolz-18d43ffea782@brauner>
    <20250309-unerwartet-alufolie-96aae4d20e38@brauner>
    <20250317165905.GN9311@nvidia.com>
    <20250318-toppen-elfmal-968565e93e69@brauner>
    <20250318145707.GX9311@nvidia.com>

On Tue, Mar 18 2025, Jason Gunthorpe wrote:

> On Tue, Mar 18, 2025 at 03:25:25PM +0100, Christian Brauner wrote:
>
>> > It is not really a stash, it is not
>> > keeping files, it is hardwired to
>>
>> Right now as written it is keeping references to files in these fdboxes
>> and thus functioning both as a crippled high-privileged fdstore and a
>> serialization mechanism.
>
> I think Pratyush went a bit overboard on that, I can see it is useful
> for testing, but really the kho control FD should be in either
> serializing or deserializing mode and it should not really act as an
> FD store.
>
> However, edge case handling makes this a bit complicated.
>
> Once a FD is submitted to be serialized that FD has to be frozen and
> can't be allowed to change anymore.
>
> If the kexec process aborts then we need to unwind all of this stuff
> and unfreeze all the FDs.

I do think I might have gone a bit overboard, but this was one of the
reasons for doing so. Keeping the struct file around, with the ability to
map it back in, made recovery from a kexec failure easy and quick.

I suppose we can serialize all FDs when the box is sealed and get rid of
the struct file. If kexec fails, userspace can unseal the box, and the FDs
will be deserialized into new struct files. This way, the behaviour from
the userspace perspective stays the same regardless of whether kexec went
through or not. This also helps tie FDBox closer to KHO.

The downside is that recovery will be slower since the state has to be
deserialized, but kexec failure should not happen too often, so that is
something we can live with. What do you think about doing it this way?

>
> It sure would be nice if the freezing process could be managed
> generically somehow.
>
> One option for freezing would have the kernel enforce that userspace
> has closed and idled the FD everywhere (eg check the struct file
> refcount == 1). If userspace doesn't have access to the FD then it is
> effectively frozen.

Yes, that is what I want to do in the next revision. FDBox itself will
not close the file descriptors when you put a FD in the box.
It will just grab a reference and let userspace close the FD. Then,
when the box is sealed, the operation can be refused if refcount != 1.

>
> In this case the error path would need to bring the FD back out of the
> fdbox.
>
> Jason
>

-- 
Regards,
Pratyush Yadav
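
To make the flow discussed above concrete (take a reference on put, refuse
to seal unless the box holds the only reference, serialize on seal, and
deserialize into fresh files on unseal after a failed kexec), here is a toy
userspace model. It is only a sketch under loud assumptions: the names File,
FDBox, put, seal, unseal, get and userspace_close are hypothetical and are
not the API of the patch, the refcount is a plain integer rather than the
kernel's struct file count, and JSON merely stands in for whatever format
KHO would actually carry.

```python
import json
from dataclasses import dataclass


@dataclass
class File:
    """Hypothetical stand-in for a struct file: a name, state, a refcount."""
    name: str
    state: dict
    refcount: int = 1  # the reference held by the userspace opener


def userspace_close(f: File) -> None:
    """Model userspace closing its FD: drop one reference."""
    f.refcount -= 1


class FDBox:
    """Toy model of the box; not the real FDBox interface."""

    def __init__(self) -> None:
        self.files = {}   # name -> File, live references while unsealed
        self.blobs = {}   # name -> serialized state while sealed
        self.sealed = False

    def put(self, f: File) -> None:
        """Grab a reference; the caller still has to close its own FD."""
        if self.sealed:
            raise RuntimeError("box is sealed")
        f.refcount += 1
        self.files[f.name] = f

    def seal(self) -> None:
        """Refuse unless the box holds the only reference to every file."""
        busy = [n for n, f in self.files.items() if f.refcount != 1]
        if busy:
            raise RuntimeError(f"still referenced elsewhere: {busy}")
        for name, f in self.files.items():
            self.blobs[name] = json.dumps(f.state)  # serialize the state
            f.refcount -= 1                         # drop the live file
        self.files.clear()
        self.sealed = True

    def unseal(self) -> None:
        """Kexec failed (or completed): rebuild files from serialized state."""
        if not self.sealed:
            raise RuntimeError("box is not sealed")
        for name, blob in self.blobs.items():
            self.files[name] = File(name, json.loads(blob))
        self.blobs.clear()
        self.sealed = False

    def get(self, name: str) -> File:
        """Hand a file back to userspace, transferring the box's reference."""
        if self.sealed:
            raise RuntimeError("box is sealed")
        return self.files.pop(name)
```

In this model the file that comes back after unseal() is always a new
object, whether or not the kexec went through, which mirrors the point
about userspace seeing the same behaviour in both cases. The cost is the
extra deserialization on the failure path.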