From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8AD3EEE26C for ; Fri, 13 Sep 2024 11:52:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DE7AA6B00C1; Fri, 13 Sep 2024 07:52:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D97626B00C2; Fri, 13 Sep 2024 07:52:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C10C86B00D2; Fri, 13 Sep 2024 07:52:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 9CDE56B00C1 for ; Fri, 13 Sep 2024 07:52:28 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 272A741C91 for ; Fri, 13 Sep 2024 11:52:28 +0000 (UTC) X-FDA: 82559552376.17.3F39CDE Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) by imf17.hostedemail.com (Postfix) with ESMTP id 9395F4000B for ; Fri, 13 Sep 2024 11:52:24 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf17.hostedemail.com: domain of jonathan.cameron@huawei.com designates 185.176.79.56 as permitted sender) smtp.mailfrom=jonathan.cameron@huawei.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1726228339; a=rsa-sha256; cv=none; b=xvhZBOASmwY+NL16456vzjhW0rGH1UQZQ/h8BXrJkQS8+JgbLGwqRpFjpJbB65GPiz0D5l AZLkQF4tI043bASMDyaJI2Yk305YqSuwWmXbgrz+RA39MOwZlLJor2sD79Y3H9wZwd3Ny4 DxRklsZ5p6VRJ7dPOJ9lzKIaPkDrihY= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf17.hostedemail.com: domain of jonathan.cameron@huawei.com designates 185.176.79.56 as permitted sender) smtp.mailfrom=jonathan.cameron@huawei.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1726228339; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=59FKWksgxK5D9026BB1Vl3V8ivGsT0BrsOeq0VivgmA=; b=R3It0Dd8+Y6jK8vsYsuVoowXkIdqhzKulywD7IZKNhRtyGkJb/yOYxygDSdrS66FvEUA95 zFVck0pDKAs8SpWsjld/73Fm87+Zw7YH8fRNLiBL6S4HiEp1IrwIsxpwJqy4vEbkjCnIev C3r/8Iov8o4i9QIji/AS9CahxtJkLsk= Received: from mail.maildlp.com (unknown [172.18.186.231]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4X4sy90c4fz67pJF; Fri, 13 Sep 2024 19:48:17 +0800 (CST) Received: from frapeml500008.china.huawei.com (unknown [7.182.85.71]) by mail.maildlp.com (Postfix) with ESMTPS id 003E8140519; Fri, 13 Sep 2024 19:52:21 +0800 (CST) Received: from localhost (10.203.177.66) by frapeml500008.china.huawei.com (7.182.85.71) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Fri, 13 Sep 2024 13:52:19 +0200 Date: Fri, 13 Sep 2024 12:52:18 +0100 From: Jonathan Cameron To: Shinas Rasheed CC: "dan.j.williams@intel.com" , "Ravis.OpenSrc@micron.com" , "ajayjoshi@micron.com" , "emirakhur@micron.com" , "john@jagalactic.com" , "linux-cxl@vger.kernel.org" , "linux-mm@kvack.org" , "sthanneeru.opensrc@micron.com" , "sthanneeru@micron.com" , Haseeb Gani , Sathesh B Edara , Srinivasa Rao Jampala , Shariful Alam , Vimlesh Kumar , Dharmesh Pitroda Subject: Re: [RFC PATCH] cxl/pci: Set default timeout for background operations Message-ID: <20240913125218.00003099@Huawei.com> In-Reply-To: References: Organization: Huawei Technologies Research and Development (UK) Ltd. X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32) MIME-Version: 1.0 Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: 8bit X-Originating-IP: [10.203.177.66] X-ClientProxiedBy: lhrpeml100003.china.huawei.com (7.191.160.210) To frapeml500008.china.huawei.com (7.182.85.71) X-Rspam-User: X-Stat-Signature: aro4igp33dh8xte6xyet8zpf6cgzmati X-Rspamd-Queue-Id: 9395F4000B X-Rspamd-Server: rspam02 X-HE-Tag: 1726228344-845687 X-HE-Meta: U2FsdGVkX19oxStqQOGzIfn16YbSdQ4oPk2+rl06z8mmZ2wQG6VOyrhQBSUfjvEKLKX7TE19suspa4BUVMPie30Mn3Zg/l3DsOoupKvaQQboF8oKobay9oqViOXjoctwaeF3AZhJQF2zXMAVHqC9mCCI+le1PhniFnpULHYof7BEQa1gmBwjJblq33nnC2DL8Lzt/2yWBlTSdBgugYL+sOgnzTYn41wzYKT1YoFmO8pnAXKn6ksM15WxvJZ5YrRmmiBmH0PzPDWioY0tJMfSwd4WRV2V9KbtqbEEQ3WRshUMgrPvPp0eTDdoDxCyq8IpdZgZFZ0kYTs2Fs6N3VsW3OqDEY/RFhClyHJWG66G9YgJ2ZZWGx+xiCT5bUwYTSuTXxK74qRWIEbQQGdfTgoGIM7lmIDDJL3Gvoj6wbNPHNas0yDH4dOiSU+Qgxza2e/QlgNR4AOVwzig0mYYsPCAwd/B1jAzQU6Hl3IKh5ZOgX8KGxbSCvjHxvZQMo7DKzvlsmdb0sOsoXzPeSt07IeGLkMzRefTeVG96XmcUYlB98MDtXa/E0CRXvs/5XC1Tg5OT/hMoxruGpCM6ZIk2rF5mSWbpIoOVXKd53jhx6l4Gudp56ZjuMGWCoVPoHESOkoPo6zY9bwmlpal/OUTfGKvJyFpdup15XFkYKBo4wO7e6cdT97DS6H/n3cL9NW+E5sDIcCNjgUNrhyyhNaDh3IaxbzgTHAWPtdmR35f0rbfAq9ZZcdRWVmlclifxdxfw+B1rsfv5By7iQZRyqn1tyVTGmgLCQb7775f9NklSRmB011XQtef8cZDR/Ik4BDQo1s33bwd7SUTJWgFv4yPgfOUmOyfq8dqz9hW4Odk0urg2v6jsCgG2YBdrqDOkuALK7h0VCp1us3LEA53ufza0hXrKfWu8KgnEPQneGsbb8j6/8JxAUKcq8DE7sWvxJAuPwgrcA4eaLJ83iZlH/6Tj8M IZdHUIjA 9W9wKmqGogar10/XqUSERWUW0gpZJI3DCj2k/o2tAT0822sKAqCnRKZwn3BaHRXAr2sK/WpKX+7M2sKZ8AWLSSjaEpFgqjiMn7wHD6zN0PU2fDLPSQkzCPwrWLFBl38xrMF68A0g6hImU0EYD+UHcMhiNcfNHuKMGy3Q7v5YSzubnaQE1pbk071BhB9PBtD3wnYCbelOu1yFkLykRsZRd06VWmepDAisgQugu5cWdHx1x1F/D5/XZ3+7quUerKeplyTw01OfrD6QC+r5begdFnXQH/XVZ4RJEy3d80YE7USF94PgVItbEnFciDqQRyr1nKMKCgxNPu8TQfPh9/6WMea0c6zHdn4ClppT9+dLGUeDASrfr4vS6x2DkpYPTw/ZunFozfZ9QCGm2DC3bclnO//L9W+JsrnKAFzHC4DFnEE3F7jvQ/mRbi7m6sgopUxRTcCEuWQ6caMh/HnCRxMXtUChqxCefhJvoNSJm X-Bogosity: Ham, tests=bogofilter, spamicity=0.000004, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, 2 Sep 2024 06:42:34 +0000 Shinas Rasheed wrote: > Hi, Hi Shinas, Note this is my view. Opinions may differ! > > Trying to revive this discussion about background commands. I'm sorry if more on the topic has been discussed elsewhere, > but this is a patch that I could find which related to the same topic. > > The requirement for default timeouts (or atleast a mechanism to provide timeouts to the command structure) seems to be needed > since the CXL spec doesn't seem to explicitly deny userspace access to background commands. The CXL spec (quite rightly) doesn't speak at all about providing userspace or other access to anything. This is purely an OS implementation question so this is the right place to address it for Linux. It would be perfectly valid for an OS to mediate fully every command including all vendor defined ones. If there is a command it doesn't know about, default to blocking it. There is a path to enable 'some' background commands from userspace but only the ones that support abort. > This seems to constrain a lot of > functionality with respect to Vendor Defined commands as well, if a command needs to be implemented as background. Indeed. If someone has a vendor defined command that locks the device up for a long period, then the kernel may well have a quirk in future to block that command (if we enable vendor commands from userspace in general - see the fwctl discussion). That may be a necessary precaution to avoid the kernel locking up for substantial periods. > > >> sthanneeru.opensrc@ wrote: > >> > From: Srinivasulu Thanneeru > >> > > >> > The CXL 3.0 specification outlines background operations, > >> > and support for handling these operations was added in following patch. > >> > > >> > Link: https://lore.kernel.org/all/20230523170927.20685-5-dave@stgolabs.net/ > >> > > >> > Mailbox commands like ‘Log Populate’ use background operations > >> > to complete the execution of the command. > >> > This can lead to a timeout, since there is currently no option > >> > in the ioctl cxl_send_command structure to specify > >> > a timeout value. The default values being zero can lead > >> > to the driver reporting false negatives to the application. > >> > > >> > This patch aims to establish default values, enabling mailbox commands > >> > that operate in the background to continue functioning even > >> > if a timeout is not set in the userspace application. > >> > >> The reason there are no defaults is because userspace is not allowed to > >> issue background commands. The CXL background command definition is > >> awkward in that it allows a single command to monopolize the mailbox for > >> an indefinite amount of time. > > I understand this is a design adopted by the kernel, and not actually by the CXL spec (at least I don't see it > specifying explicitly that background commands are not allowed in band). (I see you touch on this below) The background commands are 1 at a time. Given some background commands are used by the kernel, the ability to run non background commands at the same time doesn't help us in the general case. Hence proposal is to only allow userspace use of commands that may be cancelled. If the kernel wants to use the mailbox for a background command and one is already being processed by the device, the kernel can just cancel it. > > >> > >> Instead, the approach taken with Firmware Update and Sanitize is that a > >> kernel sysfs ABI mediates access to the mailbox and facilitates bounded > >> timeslices between command submissions. It effectively allows the kernel > >> to manage fairness and more importantly preempt userspace if it needs to > >> issue its own commands. > >> > >> I assume you are only seeing this lack of a default due to building with > >> CONFIG_CXL_MEM_RAW_COMMANDS=y? If yes, "raw" means "raw" and the kernel > >> is mostly taken out of the loop of saving userspace from itself. > > This seems to imply that no vendor defined command can be background commands. Please correct me if > I'm wrong, since Vendor defined commands have to be sent as raw commands, and the absence of default timeouts > indicate that the command will be immediately timed out. The fwctl discussion may provide a path for this, but I think that will still only support commands we can abort. > > Also, I understand the point that background commands (since there can only be one) monopolizes the > mailbox, and hence cannot be exposed to userspace and as I see the background command handling goes synchronously > with the usual command handling in the kernel (ie; foreground command has to wait for the background command to complete or timeout) > but doesn't this defeat the purpose of the spec defined ability to run a foreground command while a background command is > underway? It is a valid optimization to run non background commands. If there is a clear use case then I think we'd be happy to see patches. As you say though, that doesn't enable userspace background comamnds though as some of the commands the kernel uses are background as well so we still need to be able to abort the ones userspace issued. > > >> > >> All that said, ugh, "Log Populate" has no facility to time bound the > >> population of the log. I do not think it is tenable for Linux to > >> surrender mailbox access for an indefinite uninterruptible amount of > >> time... unless you want to handle "Log Populate" like Sanitize where the > >> unbounded background operation is tolerated because the device is taken > >> offline? > > > >It may be pointless to do a component state dump only on an offline device. > >My assumption is this one is hardware debug only. Patches out of > >tree or behind a really scary sounding config variable perhaps? > >Other than vendor log I don't think populate log applies to the other > >log types yet (they don't mention it anyway!) > > > >Jonathan > > Am I wrong to understand that this design decision is still in the open right now? > > Another concern was that right now, a WARN_ONCE is issued when a raw command is ensued. Maybe this belongs > on a different thread, but are raw commands to be depreciated (if then how can Vendor Defined commands be sent?) See fwctl discussion for Vendor Defined command. Right now Dave has focused on Get / Set feature but we'll expand to others. The FWCTL area in general is probably going to be discussed at the various conferences in Vienna next week, so for now this is a 'watch this space'. We did discuss the option of possibly supporting vendor defined commands explicitly if some or all of these conditions were met. 1) There was clear documentation. 2) Possibly a proposal for similar feature in the CXL specification or another industry spec. (to ensure we don't end supporting a never ending sequence of similar vendor specific commands) In a sense (1) means they are a defacto standard so we can be sure of side effects etc and things like how long they take. Jonathan > >