From: ranxiaokai627@163.com
To: david@kernel.org
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org,
jackmanb@google.com, linux-mm@kvack.org, luizcap@redhat.com,
mhocko@suse.com, ran.xiaokai@zte.com.cn, ranxiaokai627@163.com,
surenb@google.com, vbabka@suse.cz, ziy@nvidia.com
Subject: Re: [PATCH] mm/page_owner: fix prematurely released rcu_read_lock()
Date: Thu, 25 Dec 2025 08:17:43 +0000 [thread overview]
Message-ID: <20251225081753.142479-1-ranxiaokai627@163.com> (raw)
In-Reply-To: <c6b766bf-0f05-47c3-bcc6-2f0e1961a864@kernel.org>
Hi, David
>On 12/23/25 10:25, ranxiaokai627@163.com wrote:
>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>> --- a/mm/page_owner.c
>> +++ b/mm/page_owner.c
>> @@ -375,24 +375,25 @@ void __split_page_owner(struct page *page, int old_order, int new_order)
>> void __folio_copy_owner(struct folio *newfolio, struct folio *old)
>> {
>> struct page_ext *page_ext;
>> + struct page_ext *old_page_ext, *new_page_ext;
>> struct page_ext_iter iter;
>> struct page_owner *old_page_owner;
>> struct page_owner *new_page_owner;
>> depot_stack_handle_t migrate_handle;
>>
>> - page_ext = page_ext_get(&old->page);
>> - if (unlikely(!page_ext))
>> + old_page_ext = page_ext_get(&old->page);
>> + if (unlikely(!old_page_ext))
>> return;
>>
>> - old_page_owner = get_page_owner(page_ext);
>> - page_ext_put(page_ext);
>> + old_page_owner = get_page_owner(old_page_ext);
>>
>> - page_ext = page_ext_get(&newfolio->page);
>> - if (unlikely(!page_ext))
>> + new_page_ext = page_ext_get(&newfolio->page);
>> + if (unlikely(!new_page_ext)) {
>> + page_ext_put(old_page_ext);
>> return;
>> + }
>>
>> - new_page_owner = get_page_owner(page_ext);
>> - page_ext_put(page_ext);
>> + new_page_owner = get_page_owner(new_page_ext);
>>
>> migrate_handle = new_page_owner->handle;
>> __update_page_owner_handle(&newfolio->page, old_page_owner->handle,
>> @@ -414,12 +415,12 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
>> * for the new one and the old folio otherwise there will be an imbalance
>> * when subtracting those pages from the stack.
>> */
>> - rcu_read_lock();
>> for_each_page_ext(&old->page, 1 << new_page_owner->order, page_ext, iter) {
>> old_page_owner = get_page_owner(page_ext);
>> old_page_owner->handle = migrate_handle;
>> }
>> - rcu_read_unlock();
>> + page_ext_put(new_page_ext);
>> + page_ext_put(old_page_ext);
>> }
>
>How are you possibly able to call into __split_page_owner() while
>concurrently we are already finished with offlining the memory (-> all
>memory freed and isolated in the buddy) and triggering the notifier?
This patch does not touch __split_page_owner(); did you perhaps mean
__folio_copy_owner()?
You are right: by the time memory_notify(MEM_OFFLINE, &mem_arg) is
called, memory hot-remove has already completed. At that point all
memory in the mem_section has been freed and removed from
zone->free_area[], so those pages can no longer be allocated.
Currently, only read_page_owner() and pagetypeinfo_showmixedcount_print()
genuinely need the RCU read lock to protect against concurrent
MEM_OFFLINE events, because they walk page frames across the whole
system/zone and cannot know in advance the hotplug/remove state of
those PFNs. In the other callers, the RCU read lock merely satisfies
the WARN_ON_ONCE(!rcu_read_lock_held()) assertion in
lookup_page_ext(). Is that right?
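To make that point concrete, my reading of the page_ext_get() /
page_ext_put() pair in mm/page_ext.c is roughly the following
(simplified sketch, not the exact code): holding the page_ext
reference already implies being inside an RCU read-side section,
which is why keeping old_page_ext/new_page_ext alive across the
for_each_page_ext() loop in the patch above should be sufficient:

/* Simplified sketch of mm/page_ext.c, for reference only */
struct page_ext *page_ext_get(const struct page *page)
{
	struct page_ext *page_ext;

	/* Taking the reference enters the RCU read-side section */
	rcu_read_lock();
	page_ext = lookup_page_ext(page);
	if (!page_ext) {
		rcu_read_unlock();
		return NULL;
	}

	return page_ext;
}

void page_ext_put(struct page_ext *page_ext)
{
	if (unlikely(!page_ext))
		return;

	/* Dropping the reference leaves the RCU read-side section */
	rcu_read_unlock();
}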
Regarding page_ext, I have a question: semantically, page_ext should
be one structure per folio, not one per 4K base page. On x86_64,
page_ext consumes 88 bytes, more than struct page itself. As mTHP
adoption grows, avoiding page owner metadata setup/cleanup for tail
pages would yield a performance gain.
Moreover, with folios gradually replacing struct page, will struct
page eventually be superseded by dynamically allocated memdesc types
(memdesc_xxx_t)? If so, wouldn't it be more reasonable to dynamically
allocate a corresponding page_ext structure once a folio has been
allocated? This would add overhead on the memory allocation hot path,
but I believe it could be optimized, for example by allocating several
page frames at once to hold page_ext structures, or by creating a
dedicated kmem_cache for page_ext. What are your suggestions on this
approach of dynamically allocating page_ext structures?
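To illustrate the kmem_cache idea, a purely hypothetical sketch;
folio_page_ext_cachep, folio_page_ext_init() and folio_alloc_page_ext()
do not exist today and the names are made up for this discussion,
only page_ext_size, kmem_cache_create() and kmem_cache_zalloc() are
existing symbols:

/* Purely hypothetical sketch, nothing below exists in current kernels */
static struct kmem_cache *folio_page_ext_cachep;

static int __init folio_page_ext_init(void)
{
	/* page_ext_size already accounts for all enabled extensions */
	folio_page_ext_cachep = kmem_cache_create("folio_page_ext",
						  page_ext_size, 0,
						  SLAB_PANIC, NULL);
	return 0;
}

/* One allocation per folio, instead of one page_ext per base page */
static struct page_ext *folio_alloc_page_ext(gfp_t gfp)
{
	return kmem_cache_zalloc(folio_page_ext_cachep, gfp);
}

The freeing side would then be a single kmem_cache_free() when the
folio is freed, so tail pages would never need their page owner
metadata touched at all.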
>Doesn't make sense, no?