From: Gregory Price <gourry@gourry.net>
To: Alison Schofield <alison.schofield@intel.com>
Cc: Gregory Price <gourry@gourry.net>,
	Nathan Fontenot <nathan.fontenot@amd.com>,
	dan.j.williams@intel.com, linux-cxl@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] cxl: Update Soft Reserved resources upon region creation
Date: Thu, 26 Dec 2024 12:25:20 -0700
Message-ID: <Z22toIWJfsgZJn_a@gourry-fedora-PF4VCD3F>
In-Reply-To: <Z1uHgePW0T81r3xC@aschofie-mobl2.lan>

On Thu, Dec 12, 2024 at 05:01:53PM -0800, Alison Schofield wrote:
> BIOS labels a resource Soft Reserved and programs a region using
> that range. Later, the existing cxl path to destroy that region
> does not free up that Soft Reserved range. Users cannot create
> another region in its place. Resource lost. We considered simply
> removing soft reserved resources on region teardown, and you can
> probably find patches on lore doing just that.
> 
> But - the problem grew. Sometimes BIOS creates an SR that is not
> aligned with the region they go on to program. Stranded resources.
> That's where the trim and give to DAX path originated.
> 
> But - the problem grew. Sometimes the CXL driver fails to enumerate
> that BIOS defined region. More stranded resources. Let's find those
> too and give them to DAX. This is something we are seeing in the
> wild now and why Dan raised its priority.
> 

Hm, this makes me concerned for what happens on "full hotplug" (literal
physical removal/addition) of CXL devices - kind of like we've seen
proposed with E3.S form factor devices from a variety of vendors.

For example, what happens in the following scenario? (Rhetorical
question: I want to test this with QEMU, but I'm on a plane right now
and want to get the experiment process down.)

Boot: No CXL device is present

Post-boot: CXL device is physically hot-plugged
 - there won't be a resource registered, so I would presume the ACPI
   / EFI / CXL drivers would register one.

Event 1: CXL device is shut down and removed
   - Is the resource deleted?  I would presume yes.
   - Is this true if the CXL device *was* present at boot time?

     If I'm following correctly ^ this is the present scenario?

Let's assume the device was present at boot, and the resource is not
deleted.  Now we have a "stale resource"?

Event 2A: A new CXL device is added
   - Possibility 1: Same capacity - resource is reused?
   - Possibility 2: Lower capacity - resource is chopped up?
   - Possibility 3: Higher capacity - resource is... lost forever?
                    Fails to map? ???

Event 2B: A new CXL device is added on a different PCI dev id, then
          Event 2A occurs.
   - Is the "stale resource" reused here, or is a new one created?
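
As a rough illustration of the Event 2A question (not from the thread,
and not a description of what the kernel actually does), the three
possibilities reduce to a size comparison between the stale range and
the new device. The names `Outcome` and `classify_replug` are
hypothetical, and the sketch assumes the stale resource is the only
candidate range:

```python
from enum import Enum


class Outcome(Enum):
    REUSE = "same capacity: stale range could be reused as-is"
    SPLIT = "lower capacity: stale range would need to be chopped up"
    NO_FIT = "higher capacity: stale range is too small"


def classify_replug(stale_size: int, new_capacity: int) -> Outcome:
    """Map a (stale range size, new device capacity) pair to a possibility.

    This is only a model of the open question; which outcome the kernel
    really produces is exactly what the experiment is meant to find out.
    """
    if new_capacity == stale_size:
        return Outcome.REUSE
    if new_capacity < stale_size:
        return Outcome.SPLIT
    return Outcome.NO_FIT


GIB = 1 << 30
# Stale 16 GiB range, then devices of equal, smaller, and larger capacity.
for cap in (16 * GIB, 8 * GIB, 32 * GIB):
    print(f"{cap // GIB:>3} GiB -> {classify_replug(16 * GIB, cap).name}")
```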

I hadn't really considered the impact of hotplug on the iomem resource
blocks (soft) reserved at boot, but this is concerning.

I remember ~1.5 years ago I was prototyping with hotplug behavior in
QEMU and saw that it was possible to do runtime ACPI/PCI add/remove of
CXL devices - this worked.  But I didn't look at the effects on iomem
resources - now I'm wondering what happens if I try to hot-unplug a CXL
device that was present at boot.
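
One way to observe the effect on iomem resources during such an
experiment is to snapshot /proc/iomem before and after the unplug and
compare the "Soft Reserved" entries. The sketch below is an assumption
about how that check could be scripted, not something from the thread;
the helper names are hypothetical, and reading real addresses from
/proc/iomem generally requires root (unprivileged reads show zeros):

```python
import re

# A /proc/iomem line looks like "  100000000-1ffffffff : Soft Reserved";
# the leading spaces encode nesting depth in the resource tree.
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def soft_reserved_ranges(iomem_text):
    """Return [(start, end)] for each 'Soft Reserved' entry (end inclusive)."""
    ranges = []
    for line in iomem_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(4).strip() == "Soft Reserved":
            ranges.append((int(m.group(2), 16), int(m.group(3), 16)))
    return ranges


def still_present(before_text, after_text):
    """Soft Reserved ranges that exist both before and after an event."""
    before = set(soft_reserved_ranges(before_text))
    after = set(soft_reserved_ranges(after_text))
    return sorted(before & after)


def read_iomem(path="/proc/iomem"):
    """Read the live resource tree; run as root to see real addresses."""
    with open(path) as f:
        return f.read()
```

Calling `read_iomem()` before and after the hot-unplug and passing both
snapshots to `still_present()` would list any Soft Reserved ranges that
survived the removal.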

This won't affect me for the immediate future, but if we're mucking
around in this space, might as well ask the question.  I presume we'll
find even worse corner cases here :D :| :[ :<

I do know servers with front-facing E3.S CXL devices intended for
hot-replace exist and are a real use-case. I have no idea how that
is supposed to work in the presence of stale iomem resources.

> Dan is also suggesting that at that last event - failure to enumerate
> a BIOS defined region, we tear down the entire ACPI0017 topology
> and give everything to DAX.
> 
> What Dan called, "the minimum requirement": all Soft Reserved ranges
> end up as dax-devices sounds like the right guideline moving forward.
>

I guess the devil's in the details here.  I sense an implication that
two distinct pieces of SR-providing hardware (HBM and CXL) could end up
concatenated into a single SR range?  That would obviously necessitate
chopping up an SR.  So this all makes sense.
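
To make the "chopping up" concrete, here is a minimal sketch of the
range arithmetic, assuming inclusive (start, end) address pairs as
/proc/iomem prints them. It is not the patch's implementation;
`trim_soft_reserved` is a hypothetical name and the function only shows
which pieces of an SR range are left over once a region claims part of
it:

```python
def trim_soft_reserved(sr, region):
    """Return the parts of inclusive range `sr` not covered by `region`.

    Both arguments are (start, end) tuples with inclusive ends. The
    result holds zero, one, or two leftover ranges.
    """
    sr_start, sr_end = sr
    rg_start, rg_end = region

    # No overlap at all: the whole Soft Reserved range is left over.
    if rg_end < sr_start or rg_start > sr_end:
        return [sr]

    leftovers = []
    if rg_start > sr_start:
        # Unclaimed head below the region.
        leftovers.append((sr_start, rg_start - 1))
    if rg_end < sr_end:
        # Unclaimed tail above the region.
        leftovers.append((rg_end + 1, sr_end))
    return leftovers


# A region programmed in the middle of a larger SR range strands a head
# and a tail.
for start, end in trim_soft_reserved((0x1000, 0x4FFF), (0x2000, 0x3FFF)):
    print(f"{start:#x}-{end:#x}")
```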

But I don't disagree with the need for this, just concerned that we have
CXL-specific logic landing in mm/ and e820 code.

~Gregory