From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Mike Rapoport <rppt@kernel.org>,
David Hildenbrand <david@redhat.com>,
Vlastimil Babka <vbabka@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Tom Lendacky <thomas.lendacky@amd.com>,
"Kalra, Ashish" <ashish.kalra@amd.com>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
linux-mm@kvack.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org, Srikanth Aithal <sraithal@amd.com>
Subject: Re: [PATCH] mm/page_alloc: fix deadlock on cpu_hotplug_lock in __accept_page()
Date: Tue, 1 Apr 2025 10:25:02 +0300 [thread overview]
Message-ID: <mc7rnftmisbx5fpefwaiobngzpbh66yk5535xrfxope4gobu36@2wite5exlfcd> (raw)
In-Reply-To: <b44cef03-46ef-4153-b21a-98aa6ff43a08@intel.com>
On Mon, Mar 31, 2025 at 12:07:07PM -0700, Dave Hansen wrote:
> On 3/29/25 10:10, Kirill A. Shutemov wrote:
> > +		if (system_wq)
> > +			schedule_work(&zone->unaccepted_cleanup);
> > +		else
> > +			unaccepted_cleanup_work(&zone->unaccepted_cleanup);
> > +	}
> > }
>
> The 'system_wq' check seems like an awfully big hack. No other
> schedule_work() user does anything similar that I can find across the tree.
I don't see how it is "an awfully big hack". It is "use system_wq if it is
ready".
Maybe it would be marginally cleaner if schedule_work() were open-coded:
	if (system_wq)
		queue_work(system_wq, &zone->unaccepted_cleanup);
	else
		unaccepted_cleanup_work(&zone->unaccepted_cleanup);
?
>
> Instead of hacking in some internal state, could you use 'system_state',
> like:
>
> 	if (system_state == SYSTEM_BOOTING)
> 		unaccepted_cleanup_work(&zone->unaccepted_cleanup);
> 	else
> 		schedule_work(&zone->unaccepted_cleanup);
Really? The transition points between these states are arbitrarily defined.
Who says that once we are out of SYSTEM_BOOTING we can use system_wq?
Tomorrow we could introduce an additional state between BOOTING and
SCHEDULING and this code would be silently broken. The same applies to any
new state before BOOTING.
> The other method would be to make it more opportunistic? Basically,
> detect when it might deadlock:
>
> 	bool try_to_dec()
> 	{
> 		if (!cpus_read_trylock())
> 			return false;
>
> 		static_branch_dec_cpuslocked(&zones_with_unaccepted_pages);
> 		cpus_read_unlock();
>
> 		return true;
> 	}
>
> That still requires a bit in the zone to say whether the
> static_branch_dec() was deferred or not, though. It's kinda open-coding
> schedule_work().
It will also require special handling for soft CPU online/offline.
--
Kiryl Shutsemau / Kirill A. Shutemov