From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Gregory Price <gourry@gourry.net>,
Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, vishal.l.verma@intel.com,
dave.jiang@intel.com, osalvador@suse.de,
dan.j.williams@intel.com, ljs@kernel.org,
Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, linux-kernel@vger.kernel.org,
nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH 0/8] dax/kmem: atomic whole-device hotplug via sysfs
Date: Mon, 13 Apr 2026 17:47:01 +0200 [thread overview]
Message-ID: <a4be48f2-ab0a-4808-a7db-2532ec65ad0b@kernel.org> (raw)
In-Reply-To: <ab7_AVLgzLaDRcv5@gourry-fedora-PF4VCD3F>
On 3/21/26 21:26, Gregory Price wrote:
> On Sat, Mar 21, 2026 at 10:40:21AM -0700, Andrew Morton wrote:
>> On Sat, 21 Mar 2026 11:03:56 -0400 Gregory Price <gourry@gourry.net> wrote:
>>
>>> The dax kmem driver currently onlines memory during probe using the
>>> system default policy, with no way to control or query the region state
>>> at runtime - other than by inspecting the state of individual blocks.
>>>
>>> Offlining and removing an entire region requires operating on individual
>>> memory blocks, creating race conditions where external entities can
>>> interfere between the offline and remove steps.
>>>
>>> The problem was discussed specifically in the LPC2025 device memory
>>> sessions - https://lpc.events/event/19/contributions/2016/ - where
>>> it was discussed how the non-atomic interface for dax hotplug is causing
>>> issues in some distributions which have competing userland controllers
>>> that interfere with each other.
>>>
>>> This series adds a sysfs "hotplug" attribute for atomic whole-device
>>> hotplug control, along with the mm and dax plumbing to support it.
>>
>> AI review (which hasn't completed at this time) has a lot to say:
>> https://sashiko.dev/#/patchset/20260321150404.3288786-1-gourry@gourry.net
>
> Looking at the results - i mucked up a UAF during the rebase that i
> didn't catch during testing. Will clean that up.
>
> I also just realized I left an extern in one of the patches that I
> thought I had removed.
>
> So I owe a respin on this in more ways than one.
>
> But on the AI review comment for non-trivial stuff
> ---
>
> Much of the remaining commentary is about either the pre-existing code
> race conditions, or design questions in the space of that race
> condition.
>
> Specifically: userland can still try to twiddle the memoryN/state bits
> while the dax device loops over non-contiguous regions.
dax_kmem_do_hotremove() loops over offline_and_remove_memory(). And once
something was removed, it can no longer get onlined, obviously. So that
should be good.
dax_kmem_do_hotplug() loops over __add_memory_driver_managed() and
onlines the memory directly using the specified policy.
User space can only interact with that after memory was already onlined.
So, really only user space can try offlining the memory after requested
onlining succeeded.
I don't think any udev rules do that? The usually only request to
online, which should be fine.
So if a user does that manually, good for him. We just have to make sure
that stuff keeps working as expected.
Or am I missing a case?
>
> I dropped this commit:
> https://lore.kernel.org/all/20260114235022.3437787-6-gourry@gourry.net/
>
> From the series, because the feedback here:
> https://lore.kernel.org/linux-mm/d1938a63-839b-44a5-a68f-34ad290fef21@kernel.org/
>
> suggested that offline_and_remove_memory() would resolve the race
> condition problem - but the patch proposed actually solved two issues:
>
> 1) Inconsistent hotplug state issue (user is still using the old
> per-block offlining pattern)
>
> 2) The old offline pattern calling BUG() instead of WARN() when trying
> to unbind while things are still online.
>
> But this goes to the issue of: If the race condition in userland has
> been around for many years, is it to be considered a feature we should
> not break - or on what time scale should we consider breaking it?
I think the only thing we care about is that even if udev rules
interact, that things just keeps working as expected.
That should be the case (see above) unless I am missing something?
I'll note that offline_and_remove_memory() can take a long time/forever
to succeed. User space can abort it by sending a critical signal.
For example, if you do
$ echo "unplugged" > magic_device_file
And it hangs, user space can kill the "echo" command, sending a fatal
signal and making offline_and_remove_memory() fail.
The question is, if you want to do your best to revert the other offline
operations and try re-adding/onlining what you already offlined.
offline_and_remove_memory() handles that much nicer internally, as it
tries to revert offlining, and only removes once everything was offlined.
I think I raised it previously, but you could add a
offline_and_remove_memory_ranges() that consumes multiple ranges, and
would do this for you under a single lock_device_hotplug().
--
Cheers,
David
next prev parent reply other threads:[~2026-04-13 15:47 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-21 15:03 Gregory Price
2026-03-21 15:03 ` [PATCH 1/8] mm/memory-tiers: consolidate memory type dedup into mt_get_memory_type() Gregory Price
2026-04-13 15:06 ` David Hildenbrand (Arm)
2026-03-21 15:03 ` [PATCH 2/8] mm/memory: add memory_block_align_range() helper Gregory Price
2026-04-13 15:12 ` David Hildenbrand (Arm)
2026-03-21 15:03 ` [PATCH 3/8] mm/memory_hotplug: pass online_type to online_memory_block() via arg Gregory Price
2026-03-21 15:04 ` [PATCH 4/8] mm/memory_hotplug: export mhp_get_default_online_type Gregory Price
2026-04-13 15:18 ` David Hildenbrand (Arm)
2026-03-21 15:04 ` [PATCH 5/8] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg Gregory Price
2026-04-13 15:21 ` David Hildenbrand (Arm)
2026-03-21 15:04 ` [PATCH 6/8] dax: plumb hotplug online_type through dax Gregory Price
2026-04-13 15:23 ` David Hildenbrand (Arm)
2026-03-21 15:04 ` [PATCH 7/8] dax/kmem: extract hotplug/hotremove helper functions Gregory Price
2026-03-21 15:04 ` [PATCH 8/8] dax/kmem: add sysfs interface for atomic whole-device hotplug Gregory Price
2026-03-21 17:40 ` [PATCH 0/8] dax/kmem: atomic whole-device hotplug via sysfs Andrew Morton
2026-03-21 20:26 ` Gregory Price
2026-04-13 15:47 ` David Hildenbrand (Arm) [this message]
2026-04-13 14:49 ` David Hildenbrand (Arm)
2026-04-13 14:53 ` David Hildenbrand (Arm)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a4be48f2-ab0a-4808-a7db-2532ec65ad0b@kernel.org \
--to=david@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=gourry@gourry.net \
--cc=kernel-team@meta.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=mhocko@suse.com \
--cc=nvdimm@lists.linux.dev \
--cc=osalvador@suse.de \
--cc=rppt@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox