From: Gregory Price <gourry@gourry.net>
To: Alison Schofield <alison.schofield@intel.com>
Cc: Gregory Price <gourry@gourry.net>,
Nathan Fontenot <nathan.fontenot@amd.com>,
dan.j.williams@intel.com, linux-cxl@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH] cxl: Update Soft Reserved resources upon region creation
Date: Thu, 26 Dec 2024 12:25:20 -0700
Message-ID: <Z22toIWJfsgZJn_a@gourry-fedora-PF4VCD3F>
In-Reply-To: <Z1uHgePW0T81r3xC@aschofie-mobl2.lan>
On Thu, Dec 12, 2024 at 05:01:53PM -0800, Alison Schofield wrote:
> BIOS labels a resource Soft Reserved and programs a region using
> that range. Later, the existing cxl path to destroy that region
> does not free up that Soft Reserved range. Users cannot create
> another region in its place. Resource lost. We considered simply
> removing soft reserved resources on region teardown, and you can
> probably find patches on lore doing just that.
>
> But - the problem grew. Sometimes BIOS creates an SR that is not
> aligned with the region they go on to program. Stranded resources.
> That's where the trim and give to DAX path originated.
>
> But - the problem grew. Sometimes the CXL driver fails to enumerate
> that BIOS defined region. More stranded resources. Let's find those
> too and give them to DAX. This is something we are seeing in the
> wild now and why Dan raised its priority.
>
Hm, this makes me concerned for what happens on "full hotplug" (literal
physical removal/addition) of CXL devices - kind of like we've seen
proposed with E3.S form factor devices from a variety of vendors.

What happens in the following scenario? (Rhetorical question: I want
to test this with QEMU, but I'm on a plane right now and want to get
the experiment process down.)

Boot: No CXL device is present

Post-boot: CXL device is physically hot-plugged
  - there won't be a resource registered, so I would presume the ACPI
    / EFI / CXL drivers would register one.

Event 1: CXL device is shut down and removed
  - Is the resource deleted? I would presume yes.
  - Is this true if the CXL device *was* present at boot time?
    If I'm following correctly ^ this is the present scenario?

Let's assume the device was present at boot, and the resource is not
deleted. Now we have a "stale resource"?

Event 2A: A new CXL device is added
  - Possibility 1: Same capacity - resource is reused?
  - Possibility 2: Lower capacity - resource is chopped up?
  - Possibility 3: Higher capacity - resource is... lost forever?
                   Fails to map? ???

Event 2B: A new CXL device is added on a different PCI dev id, then
          Event 2A occurs.
  - Is the "stale resource" reused here, or is a new one created?

I hadn't really considered the impact of hotplug on the iomem resource
blocks (soft) reserved at boot, but this is concerning.

I remember ~1.5 years ago I was prototyping hotplug behavior in
QEMU and saw that it was possible to do runtime ACPI/PCI add/remove of
CXL devices - this worked. But I didn't look at the effects on iomem
resources - now I'm wondering what happens if I try to hot-unplug a CXL
device that was present at boot.
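
One way to make that experiment concrete would be to snapshot /proc/iomem
before and after the unplug and compare the Soft Reserved entries. A
minimal sketch, assuming the usual "start-end : name" layout of
/proc/iomem; the function name and the sample text are made up for
illustration (not captured from a real machine), and real addresses are
only visible when the file is read as root.

```python
import re

# One /proc/iomem line: optional indent, hex "start-end", " : ", name.
_IOMEM_LINE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (.+)$")

def soft_reserved_ranges(iomem_text):
    """Return [(start, end)] for every entry named 'Soft Reserved'."""
    ranges = []
    for line in iomem_text.splitlines():
        m = _IOMEM_LINE.match(line)
        if m and m.group(3).strip() == "Soft Reserved":
            ranges.append((int(m.group(1), 16), int(m.group(2), 16)))
    return ranges

# Hypothetical sample text, for illustration only.
SAMPLE = """\
00000000-00000fff : Reserved
100000000-27fffffff : System RAM
280000000-47fffffff : Soft Reserved
  280000000-47fffffff : CXL Window 0
"""

if __name__ == "__main__":
    for start, end in soft_reserved_ranges(SAMPLE):
        print(f"{start:#x}-{end:#x}")
```

Running this on a copy of /proc/iomem taken before and after the unplug,
and comparing the two lists, should show whether the range went away.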

This won't affect me for the immediate future, but if we're mucking
around in this space, might as well ask the question. I presume we'll
find even worse corner cases here :D :| :[ :<

I do know servers with front-facing E3.S CXL devices intended for
hot-replace exist and are a real use-case. I have no idea how that
is supposed to work in the presence of stale iomem resources.
> Dan is also suggesting that at that last event - failure to enumerate
> a BIOS defined region, we tear down the entire ACPI0017 topology
> and give everything to DAX.
>
> What Dan called, "the minimum requirement": all Soft Reserved ranges
> end up as dax-devices sounds like the right guideline moving forward.
>
I guess the devil's in the details here. I sense an implication that
two distinct pieces of SR-providing hardware (HBM and CXL) could end
up concatenated into a single SR range? That would obviously
necessitate chopping up an SR. So this all makes sense.
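
To spell out what "chopping up" means arithmetically: if a region only
covers part of an SR range, the pieces on either side are the leftovers
that the trim path would hand to DAX. A minimal sketch using inclusive
(start, end) ranges like /proc/iomem; this only illustrates the interval
math, it is not the kernel's resource code, and `trim_range` is a made-up
name.

```python
def trim_range(sr, region):
    """Return the parts of inclusive range `sr` not covered by `region`."""
    sr_start, sr_end = sr
    rg_start, rg_end = region
    # No overlap at all: the whole SR range is left over.
    if rg_end < sr_start or rg_start > sr_end:
        return [sr]
    leftovers = []
    # Piece of the SR range below the region, if any.
    if rg_start > sr_start:
        leftovers.append((sr_start, rg_start - 1))
    # Piece of the SR range above the region, if any.
    if rg_end < sr_end:
        leftovers.append((rg_end + 1, sr_end))
    return leftovers

if __name__ == "__main__":
    # Hypothetical addresses: region sits in the middle of the SR range.
    for start, end in trim_range((0x1000, 0x4fff), (0x2000, 0x2fff)):
        print(f"{start:#x}-{end:#x}")
```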
But I don't disagree with the need for this, just concerned that we have
CXL-specific logic landing in mm/ and e820 code.
~Gregory