From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D856FE77180 for ; Fri, 13 Dec 2024 01:02:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5C7046B0089; Thu, 12 Dec 2024 20:02:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 574D26B008A; Thu, 12 Dec 2024 20:02:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 43C676B008C; Thu, 12 Dec 2024 20:02:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 24BCA6B0089 for ; Thu, 12 Dec 2024 20:02:00 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id AF9881202CD for ; Fri, 13 Dec 2024 01:01:59 +0000 (UTC) X-FDA: 82888133664.02.1F15854 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by imf08.hostedemail.com (Postfix) with ESMTP id 5FE7A160002 for ; Fri, 13 Dec 2024 01:01:40 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=eUxbmPZk; spf=pass (imf08.hostedemail.com: domain of alison.schofield@intel.com designates 198.175.65.9 as permitted sender) smtp.mailfrom=alison.schofield@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1734051700; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=OJ+4d7dhLoPSwcm4J0h9IyDQlenmW4Ovq7lT5LYC/lU=; b=NBq72YyUWrsTw3jIQw9xY7/tAizh6Gg8Dv1Er1gFGLSSbne2sabHDVn7nYoUi3LJCZQoUZ 3xcel0aUpu5c4vpoaQhtX0xGxPqN+fgOyFijOdqRcmcMmc2rPsViHQUOOFmc8mVFWZeC41 7eeNTXWwtwKVRbO66U2EuugoO+rea9A= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=eUxbmPZk; spf=pass (imf08.hostedemail.com: domain of alison.schofield@intel.com designates 198.175.65.9 as permitted sender) smtp.mailfrom=alison.schofield@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1734051700; a=rsa-sha256; cv=none; b=LO0XCseiCUBEsXqR0CX00Nf0dBuqikSLI2TZrXLIkMzUbTMFCxVI71FiJTUtaqDpIx8Uj8 Y2tGyCu3BnTmVS5OOFBMuidWMWMkdUzvqx6WslxFHQMNDzD0ZyCacqSCRlpAZeJhvvlSQt E/r45wuKX+epXRsGjN+L1vS1JEdbnLY= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734051717; x=1765587717; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=gE+XMjlbD6wK0m92eYbAc+ZOzvQchaLYpuEwTXR73rg=; b=eUxbmPZkQ1dVftW78a9gTVA8/0psX3shx1viR1vsEZrnYdf/4jGs1pIl mMSAdZAuY3merf4DpJRH1aWM5xfy5+bvTNcqqJocRqEKXSCnfB/+mkM6s cIpTFoc+lS86yrl3dwPPL6uFKHcyi2VKTJeaXZMFTAzxJ+EvMLnA6SMWC 94CbbjXezKXKiCO5eEdKAD8jTwymyFb4fBBgBSgDQD2RbFu8Lg2OxbycQ bFk7QB/ANJGjax1rnDoCFvNQJBsKt1Y/84UFPqCtRSmRvwAeVuIFKjTdE qs2dOi0qtrtCyKvauLHghl8xXA9uGyIBUo4i8/Cj0YSRtmB5vZDuF3rSb w==; X-CSE-ConnectionGUID: iE7eL3xkQDeQ4+FmbaDQtA== X-CSE-MsgGUID: PRhBWhH/SSSkdwwrXPjoUw== X-IronPort-AV: E=McAfee;i="6700,10204,11284"; a="56968439" X-IronPort-AV: E=Sophos;i="6.12,230,1728975600"; d="scan'208";a="56968439" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Dec 2024 17:01:56 -0800 X-CSE-ConnectionGUID: 3Qejn4ZPTnWlKmvSoTPn0w== X-CSE-MsgGUID: wqJ+12uFRsOnhQFLRL+5yQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,230,1728975600"; d="scan'208";a="127199084" Received: from aschofie-mobl2.amr.corp.intel.com (HELO aschofie-mobl2.lan) ([10.125.110.122]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Dec 2024 17:01:56 -0800 Date: Thu, 12 Dec 2024 17:01:53 -0800 From: Alison Schofield To: Gregory Price Cc: Nathan Fontenot , dan.j.williams@intel.com, linux-cxl@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH] cxl: Update Soft Reserved resources upon region creation Message-ID: References: <20241202155542.22111-1-nathan.fontenot@amd.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Server: rspam05 X-Stat-Signature: eoaf54fwqc7kxdmofcbdqbxork6t11ur X-Rspamd-Queue-Id: 5FE7A160002 X-Rspam-User: X-HE-Tag: 1734051700-54488 X-HE-Meta: U2FsdGVkX19x2MiJ2BY2Ltve+5KOpzeR7C5/7QuZIO/5qF94NoKktZ7kO/JCWvuutGhq+fLVn2Sas7Q1HtL24GVhZfMYMbtzKstCqIaG8sUFMzL/VMlimaBEHZwLxplkl+lZ3FZaUCfnJZnm3cnGoaeNqHLopXd21FBRoW80cTJu/gHYxubpZGvJqEEGbowPf4KhJRNfxHcb4Ls1510ZtX+qNQC4pXCRAr83mUr5jyD3l15960/brL8xB7ycrn1Am075roe0O09VTQKUBkAd+230yX7p9XR1wT6zzpcksg0moLV/6s0s5My63X8jTSWlAFwqp6zlMgXehFBCbTPznCiHmlg/0ktFMjUSXSirg8jDhZ1Y+tVUy5LIAfSifGuDVu1MDm2UoVhpf0K90AN0kjAPvu7eMDYk5AQmaLGV6H5RnPU7b2EQmuO3d2icwe9QF0ggIRLV6HTIyjYSUIDqcEsL97j99tykZJ3ZQjvBm5xjj8Q51SfQc5xoPO7CGQcEk+mRhiufE+S3hUUIyT8G0aWzNkEXo7Z6wQJmYKFeIT+wzI/RHNlYDCX5eQVdQSj1/dhml1dKu7n+eTI9eJhQ5cGNEbscpHVf8nwg397D5jWxxmtadyNII2I8AlLH07HSN2q9YoQ5bUp2VT/tYuhd3MTk5fYs5YckKuMB/1Lbvy4waCFmiL70GYpgI4/i7r8ee0ldlG5EqdcYZ8bpgxoc0ntJk9JJoWsWK5sRuxHG5KzL+3bQBoZogKeT1DoLFE33AW92+RGnEMlGn8gwyqQZ3zWSN3UTOQV2Psr74YLfQEnKi7p+JckMoRSn7DFdK3tr2UtYW8JfesPAZrYErUAffQMA/c8eOXxXMBNgrhLomk8ORgU5bKKDzqWRLFFJE74LIt5NTO6LBAsNoQRFPgl1PvA2Vu8YRQWObGAw2V8W7PIfND6ySLawAHzJ6kGWeWCWHi5AQL1zNleK77k4YJP zYgXapVw M91/oUpDj86ugtA1NhMMmwMunV53M52nKS0iZ0ZGcww0bAaU+1UOROEUQC33soSERn5MExuHwawZm7fm7ts7eZ+janMoI0g26fO4i4ZzU+hCQV+Y7MaH/6Hjr5D6mDr1yebA3yfuCNP7J8aC51iOfgezgvw5Os8Y4K+kK X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Dec 12, 2024 at 05:42:08PM -0500, Gregory Price wrote: > On Mon, Dec 02, 2024 at 09:55:42AM -0600, Nathan Fontenot wrote: > ... snip ... > > diff --git a/kernel/resource.c b/kernel/resource.c > > index a83040fde236..8fc4121a1887 100644 > > --- a/kernel/resource.c > > +++ b/kernel/resource.c > ... snip ... > > +static int srmem_parse_cfmws(union acpi_subtable_headers *hdr, > > + void *arg, const unsigned long unused) > > +{ > > Chiming in a little late to the party here > > I don't think encoding CXL-specific terminology and functionality > directly into kernel/resource.c is wise by any measure. > > The abstraction here is completely inverted, and this is probably > in line with Dan's comments. > > The comments in e820.c allude to a similar issue > > * Prior to inserting SOFT_RESERVED resources we want to check > * for an intersection with potential CXL resources. > > This is similarly inverted - e820 doesn't know anthing about CXL > and it shouldn't have to be made aware of CXL. Mucking with > e820 is *begging* to be bitten elsewhere in parts of the system > that depend on it to be a relatively stable source of truth. > > > This tells me that this patch is trying to solve the wrong problem. > > > Your changelog alludes to supporting hotplug replace > > """ > The current approach of leaving the SOFT RESERVE resource as is can > cause failure during hotplug replace of CXL devices because the > resource is not available for reuse after teardown of the CXL device. > """ > > It sounds like we should be making the SR resource available for > re-use through proper teardown and cleanup of the resource tree, > rather than trying to change fundamental components like e820. > > If the driver was capable of using the SOFT_RESERVED region on > initial setup, it should be capable of re-using that region. > > > Is the issue here that the hotplug-replaced component has a > different capacity? It it being assigned a new region entirely? > Is it exactly the same, but the resource isn't being cleaned up? > > Can you provide more specifics about the exact hotplug interaction > that is happening? That might help understand the issue a bit better. > > > Much of this sounds like we need additional better tear-down > management and possibly additional cxl/acpi features to handle > hotplug of these devices - rather than changing resource.c. Hi Gregory, Never too late, and it serves to revisit and remember to carry along some of the why behind this patch as we adopt it to do more. Admittedly, the more, has also become the more important here! I'm not sure what Nathan first saw that led him to this issue. For me it was - BIOS labels a resource Soft Reserved and programs a region using that range. Later, the existing cxl path to destroy that region does not free up that Soft Reserved range. Users cannot create another region in it's place. Resource lost. We considered simply removing soft reserved resources on region teardown, and you can probably find a patches on lore doing just that. But - the problem grew. Sometimes BIOS creates an SR that is not aligned with the region they go on to program. Stranded resources. That's where the trim and give to DAX path originated. But - the problem grew. Sometimes the CXL driver fails to enumerate that BIOS defined region. More stranded resources. Let's find those too and give them to DAX. This is something we are seeing in the wild now and why Dan raised its priority. Dan is also suggesting that at that last event - failure to enumerate a BIOS defined region, we tear down the entire ACPI0017 toplogy and give everything to DAX. What Dan called, "the minimum requirement": all Soft Reserved ranges end up as dax-devices sounds like the right guideline moving forward. Hope that answers some of your why questions about the SR reuse. It seems the worst problem, failure to enumerate a region, kind of drives the solution now, and we should handle the soft reserveds all the same. Even if Nathan doesn't implement it in one final sweep through the Soft Reserveds, one way or another, those SR ranges must end up as dax-device resources - either CXL managed or BIOS Soft Reserved. --Alison > > > ~Gregory > > > + struct acpi_cedt_cfmws *cfmws; > > + struct srmem_arg *args = arg; > > + struct resource cfmws_res; > > + struct resource *res; > > + > > + res = args->res; > > + > > + cfmws = (struct acpi_cedt_cfmws *)hdr; > > + cfmws_res = DEFINE_RES_MEM(cfmws->base_hpa, > > + cfmws->base_hpa + cfmws->window_size); > > + > > + if (resource_overlaps(&cfmws_res, res)) { > > + args->overlaps += 1; > > + return 1; > > + } > > + > > + return 0; > > +} > > + > > +static bool resource_overlaps_cfmws(struct resource *res) > > +{ > > + struct srmem_arg arg = { > > + .res = res, > > + .overlaps = 0 > > + }; > > + > > + acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, srmem_parse_cfmws, &arg); > > + > > + if (arg.overlaps) > > + return true; > > + > > + return false; > > +} > > + > > +int insert_soft_reserve_resource(struct resource *res) > > +{ > > + if (resource_overlaps_cfmws(res)) { > > + pr_info("Reserving Soft Reserve %pr\n", res); > > + return insert_resource(&srmem_resource, res); > > + } > > + > > + return insert_resource(&iomem_resource, res); > > +} > > +EXPORT_SYMBOL(insert_soft_reserve_resource); > > + > > static void __init > > __reserve_region_with_split(struct resource *root, resource_size_t start, > > resource_size_t end, const char *name) > > -- > > 2.43.0 > >