Date: Wed, 11 Dec 2024 18:07:24 -0800
From: Alison Schofield
To: Dan Williams
Cc: Nathan Fontenot, linux-cxl@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] cxl: Update Soft Reserved resources upon region creation
References: <20241202155542.22111-1-nathan.fontenot@amd.com> <675a12a3d09d7_10a083294c0@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <675a12a3d09d7_10a083294c0@dwillia2-xfh.jf.intel.com.notmuch>

On Wed, Dec 11, 2024 at 02:30:59PM -0800, Dan Williams wrote:
> Nathan Fontenot wrote:
> > Update handling of SOFT RESERVE iomem resources that intersect with
> > CXL region resources to remove the intersections from the SOFT RESERVE
> > resources. The current approach of leaving the SOFT RESERVE
> > resource as is can cause failures during hotplug replace of CXL
> > devices because the resource is not available for reuse after
> > teardown of the CXL device.
> >
> > The approach is to trim out any pieces of SOFT RESERVE resources
> > that intersect CXL regions. To do this, first set aside any SOFT RESERVE
> > resources that intersect with a CFMWS into a separate resource tree
> > during e820__reserve_resources_late() that would have been otherwise
> > added to the iomem resource tree.
> >
> > As CXL regions are created the cxl resource created for the new
> > region is used to trim intersections from the SOFT RESERVE
> > resources that were previously set aside.
> >
> > Once CXL device probe has completed any remaining SOFT RESERVE
> > resources are added to the iomem resource tree. As each resource
> > is added to the iomem resource tree a new notifier chain is invoked
> > to notify the dax driver of newly added SOFT RESERVE resources so that
> > the dax driver can consume them.
>
> Hi Nathan, this patch hit on all the mechanisms I would expect, but upon
> reading it there is an opportunity to zoom out and do something blunter
> than the surgical precision of this current proposal.
>
> In other words, I appreciate the consideration of potential corner
> cases, but for overall maintainability this should aim to be an all or
> nothing approach.
>
> Specifically, at the first sign of trouble, any CXL sub-driver probe
> failure or region enumeration timeout, that the entire CXL topology be
> torn down (trigger the equivalent of ->remove() on the ACPI0017 device),
> and the deferred Soft Reserved ranges registered as if cxl_acpi was not
> present (implement a fallback equivalent to hmem_register_devices()).
>
> No need to trim resources as regions arrive, just tear down everything
> setup in the cxl_acpi_probe() path with devres_release_all().
>
> So, I am thinking export a flag from the CXL core that indicates whether
> any conflict with platform-firmware established CXL regions has
> occurred.
>
> Read that flag from a cxl_acpi-driver-launched deferred workqueue that
> is awaiting initial device probing to quiesce. If that flag indicates a
> CXL enumeration failure then trigger devres_release_all() on the
> ACPI0017 platform device and follow that up by walking the deferred Soft
> Reserve resources to register raw (unparented by CXL regions) dax
> devices.
>

This reads like a 'poison pill' case that is in addition to the original
use cases that inspired this patch. I'm not sure I'm getting the 'all or
nothing' language.

The patch was doing cases 1) and 2). This poison-pill approach is 3).

1) SR aligns exactly with a CXL Window.
   This patch removes that SR so the address space is available for
   reuse if an auto-region is torn down.

2) SR is larger than a CXL Window.
   Where the SR aligns with the CXL Window, the SR is handled as in 1).
   The SR leftovers are released to DAX.

3) Failure in auto-region assembly.
   New case: tear down ACPI0017 and release all SRs.

So, after writing the above, maybe I get what the 'all' case is.
Are you suggesting we stop trimming and ignore leftovers and just
consider both 1) and 2) the 'all' case? No SR ever makes it into
the iomem resource tree so all resources are available for reuse
after teardown. And then there is 3), the nothing case! I get that.
It will waste some leftovers that could have been handed to DAX.
I know an SR larger than the CXL Window came in handy for hotplug when
the SRs were not released, but I never understood whether SRs larger
than a CXL Window were intended to serve any purpose.

> Some more comments below:

snip for now
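
For illustration only, the three cases enumerated above can be sketched as a small userspace C classifier. This is not code from the patch and uses no kernel APIs: struct span, enum sr_case, classify_sr() and sr_leftovers() are hypothetical names, ranges are assumed to be inclusive, and the partial-overlap result is an extra bucket the thread does not discuss.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inclusive address range; not the kernel's struct resource. */
struct span {
	uint64_t start;
	uint64_t end;
};

enum sr_case {
	SR_NO_OVERLAP,         /* SR does not touch the CXL Window */
	SR_EXACT_MATCH,        /* case 1: SR aligns exactly with the window */
	SR_LARGER_THAN_WINDOW, /* case 2: SR covers the window, with leftovers */
	SR_PARTIAL_OVERLAP,    /* intersects without covering; not discussed */
	SR_RELEASE_ALL,        /* case 3: auto-region assembly failed */
};

static bool span_intersects(const struct span *a, const struct span *b)
{
	return a->start <= b->end && b->start <= a->end;
}

static bool span_contains(const struct span *outer, const struct span *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/* Map one Soft Reserved (SR) range and one CXL Window onto the cases. */
enum sr_case classify_sr(const struct span *sr, const struct span *win,
			 bool assembly_failed)
{
	if (assembly_failed)
		return SR_RELEASE_ALL;
	if (!span_intersects(sr, win))
		return SR_NO_OVERLAP;
	if (sr->start == win->start && sr->end == win->end)
		return SR_EXACT_MATCH;
	if (span_contains(sr, win))
		return SR_LARGER_THAN_WINDOW;
	return SR_PARTIAL_OVERLAP;
}

/*
 * For case 2, compute the pieces of the SR that fall outside the window
 * (the "leftovers"). Returns how many entries of out[] were filled (0-2).
 */
int sr_leftovers(const struct span *sr, const struct span *win,
		 struct span out[2])
{
	int n = 0;

	if (!span_contains(sr, win))
		return 0;
	if (sr->start < win->start)
		out[n++] = (struct span){ sr->start, win->start - 1 };
	if (sr->end > win->end)
		out[n++] = (struct span){ win->end + 1, sr->end };
	return n;
}
```

Under the 'all' reading asked about above, cases 1) and 2) would be treated the same and the output of sr_leftovers() would simply go unused, which is the waste the reply mentions.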