From: Gregory Price
Date: Thu, 26 Dec 2024 12:25:20 -0700
To: Alison Schofield
Cc: Gregory Price, Nathan Fontenot, dan.j.williams@intel.com, linux-cxl@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] cxl: Update Soft Reserved resources upon region creation
References: <20241202155542.22111-1-nathan.fontenot@amd.com>

On Thu, Dec 12, 2024 at 05:01:53PM -0800, Alison Schofield wrote:
> BIOS labels a resource Soft Reserved and programs a region using
> that range. Later, the existing cxl path to destroy that region
> does not free up that Soft Reserved range. Users cannot create
> another region in its place. Resource lost. We considered simply
> removing soft reserved resources on region teardown, and you can
> probably find patches on lore doing just that.
>
> But - the problem grew. Sometimes BIOS creates an SR that is not
> aligned with the region they go on to program. Stranded resources.
> That's where the trim and give to DAX path originated.
>
> But - the problem grew. Sometimes the CXL driver fails to enumerate
> that BIOS defined region. More stranded resources. Let's find those
> too and give them to DAX. This is something we are seeing in the
> wild now and why Dan raised its priority.
>

Hm, this makes me concerned for what happens on "full hotplug" (literal
physical removal/addition) of CXL devices - kind of like we've seen
proposed with E3.S form factor devices from a variety of vendors.

Like what happens in the following scenario (rhetorical question, I want
to test this with QEMU - but I'm on a plane right now and want to get the
experiment process down).

Boot: No CXL device is present

Post-boot: CXL device is physically hot-plugged
  - there won't be a resource registered, so I would presume the
    ACPI / EFI / CXL drivers would register one.

Event 1: CXL device is shut down and removed
  - Is the resource deleted? I would presume yes.
  - Is this true if the CXL device *was* present at boot time?
    If I'm following correctly ^ this is the present scenario?

Let's assume the device was present at boot, and the resource is not
deleted. Now we have a "stale resource"?

Event 2A: A new CXL device is added
  - Possibility 1: Same capacity - resource is reused?
  - Possibility 2: Lower capacity - resource is chopped up?
  - Possibility 3: Higher capacity - resource is... lost forever?
                   Fails to map? ???

Event 2B: A new CXL device is added on a different PCI dev id,
          then Event 2A occurs.
  - Is the "stale resource" reused here, or is a new one created?

I hadn't really considered the impact of hotplug on the iomem resource
blocks (soft) reserved at boot, but this is concerning.
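
For the experiment itself, one rough way to check the "Is the resource
deleted?" question is to snapshot /proc/iomem before and after the
unplug and compare the Soft Reserved entries. The sketch below is only
that, a sketch: the helper names are hypothetical, the two snapshots are
invented for illustration (not captured from a real machine, and the
nesting shown is not a claim about what the kernel actually produces),
and "no child entries" is just an assumed stand-in for "nothing is
using this range anymore".

```python
import re
from typing import List, NamedTuple, Tuple


class Res(NamedTuple):
    depth: int   # nesting level (/proc/iomem indents two spaces per level)
    start: int
    end: int
    name: str


# Matches lines of the form "<indent><start>-<end> : <name>"
_LINE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def parse_iomem(text: str) -> List[Res]:
    """Parse /proc/iomem-style text into a flat, ordered list of entries."""
    out = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if not m:
            continue  # skip blank or unrecognized lines
        indent, start, end, name = m.groups()
        out.append(Res(len(indent) // 2, int(start, 16), int(end, 16), name.strip()))
    return out


def soft_reserved(text: str) -> List[Tuple[int, int]]:
    """All (start, end) ranges labeled exactly 'Soft Reserved'."""
    return [(r.start, r.end) for r in parse_iomem(text) if r.name == "Soft Reserved"]


def surviving_soft_reserved(before: str, after: str) -> List[Tuple[int, int]]:
    """Soft Reserved ranges that appear unchanged in both snapshots."""
    return sorted(set(soft_reserved(before)) & set(soft_reserved(after)))


def unclaimed_soft_reserved(text: str) -> List[Tuple[int, int]]:
    """Soft Reserved ranges with no child entry nested beneath them."""
    res = parse_iomem(text)
    out = []
    for i, r in enumerate(res):
        if r.name != "Soft Reserved":
            continue
        has_child = i + 1 < len(res) and res[i + 1].depth > r.depth
        if not has_child:
            out.append((r.start, r.end))
    return out


# Invented snapshots: device present at boot, then hot-unplugged.
BEFORE = """\
100000000-27fffffff : System RAM
280000000-47fffffff : Soft Reserved
  280000000-47fffffff : dax0.0
"""

AFTER = """\
100000000-27fffffff : System RAM
280000000-47fffffff : Soft Reserved
"""

if __name__ == "__main__":
    for start, end in surviving_soft_reserved(BEFORE, AFTER):
        print(f"still present after unplug: {start:#x}-{end:#x}")
    for start, end in unclaimed_soft_reserved(AFTER):
        print(f"no child resources:         {start:#x}-{end:#x}")
```

On a real system the two strings would come from reading /proc/iomem
before and after the unplug (as root; as far as I know unprivileged
reads show zeroed addresses). A range that shows up in both lists would
be a candidate for the "stale resource" case above.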
I remember ~1.5 years ago I was prototyping with hotplug behavior in
QEMU and saw that it was possible to do runtime ACPI/PCI add/remove of
CXL devices - this worked. But I didn't look at the effects on iomem
resources - now I'm wondering what happens if I try to hot-unplug a CXL
device that was present at boot.

This won't affect me for the immediate future, but if we're mucking
around in this space, might as well ask the question. I presume we'll
find even worse corner cases here :D :| :[ :<

I do know servers with front-facing E3.S CXL devices intended for
hot-replace exist and are a real use-case. I have no idea how that is
supposed to work in the presence of stale iomem resources.

> Dan is also suggesting that at that last event - failure to enumerate
> a BIOS defined region, we tear down the entire ACPI0017 topology
> and give everything to DAX.
>
> What Dan called, "the minimum requirement": all Soft Reserved ranges
> end up as dax-devices sounds like the right guideline moving forward.
>

I guess the devil's in the details here. I sense an implication that
two distinct pieces of SR-providing hardware (HBM and CXL) could end up
concatenated into a single SR range? That would obviously necessitate
chopping up an SR. So this all makes sense.

But I don't disagree with the need for this, just concerned that we have
CXL-specific logic landing in mm/ and e820 code.

~Gregory