linux-mm.kvack.org archive mirror
From: Yeoreum Yun <yeoreum.yun@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: akpm@linux-foundation.org, david@kernel.org,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, eddyz87@gmail.com,
	song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, jackmanb@google.com,
	hannes@cmpxchg.org, ziy@nvidia.com, bigeasy@linutronix.de,
	clrkwllms@kernel.org, rostedt@goodmis.org,
	catalin.marinas@arm.com, will@kernel.org, kevin.brodsky@arm.com,
	dev.jain@arm.com, yang@os.amperecomputing.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	bpf@vger.kernel.org, linux-rt-devel@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 0/2] introduce pagetable_alloc_nolock()
Date: Tue, 16 Dec 2025 16:52:13 +0000
Message-ID: <aUGOPd7gNRf1xHEc@e129823.arm.com>
In-Reply-To: <916c17ba-22b1-456e-a184-cb3f60249af7@arm.com>

Hi Ryan,

> On 12/12/2025 16:18, Yeoreum Yun wrote:
> > Some architectures invoke pagetable_alloc() or __get_free_pages()
> > with preemption disabled.
> > For example, in arm64, linear_map_split_to_ptes() calls pagetable_alloc()
> > while spliting block entry to ptes and __kpti_install_ng_mappings()
> > calls __get_free_pages() to create kpti pagetable.
> >
> > Under PREEMPT_RT, calling pagetable_alloc() with
> > preemption disabled is not allowed, because it may acquire
> > a spin lock that becomes sleepable on RT, potentially
> > causing a sleep during page allocation.
> >
> > Since above two functions is called as callback of stop_machine()
> > where its callback is called in preemption disabled,
> > They could make a potential problem. (sleeping in preemption disabled).
> >
> > To address this, introduce pagetable_alloc_nolock() API.
>
> I don't really understand what the problem is that you're trying to fix. As I
> see it, there are 2 call sites in arm64 arch code that are calling into the page
> allocator from stop_machine() - one via via pagetable_alloc() and another via
> __get_free_pages(). But both of those calls are passing in GFP_ATOMIC. It was my
> understanding that the page allocator would ensure it never sleeps when
> GFP_ATOMIC is passed in, (even for PREEMPT_RT)?

Although GFP_ATOMIC is specified, it only affects the watermark used
for the allocation, via __GFP_HIGH. To actually get a page, the
allocator must still take a lock, zone->lock or pcp_lock, in rmqueue().

Both zone->lock and pcp_lock are spinlocks, and spinlocks are sleepable
on PREEMPT_RT. That is why the general memory allocation/free APIs,
everything except the nolock() variants, cannot be called with
preemption disabled: if the lock is contended, they sleep while waiting
for it.

The nolock() variants can be used because they always use trylock, via
the ALLOC_TRYLOCK flag. Without that, even a GFP_ATOMIC allocation can
sleep on PREEMPT_RT.
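
To illustrate the difference, here is a minimal user-space sketch, not
kernel code. It assumes a pthread mutex as a stand-in for a lock that
can sleep (as zone->lock can on PREEMPT_RT). The names model_zone,
model_alloc_blocking() and model_alloc_nolock() are hypothetical and
exist only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone; this is not the kernel's struct zone. */
struct model_zone {
	pthread_mutex_t lock;	/* models a lock that may sleep when contended */
	int free_pages;
};

/* Models the regular path: waits for the lock, so it may sleep. */
static bool model_alloc_blocking(struct model_zone *z)
{
	bool ok = false;

	assert(z != NULL);
	pthread_mutex_lock(&z->lock);
	if (z->free_pages > 0) {
		z->free_pages--;
		ok = true;
	}
	pthread_mutex_unlock(&z->lock);
	return ok;
}

/* Models the trylock path: never waits, fails if the lock is held. */
static bool model_alloc_nolock(struct model_zone *z)
{
	bool ok = false;

	assert(z != NULL);
	if (pthread_mutex_trylock(&z->lock) != 0)
		return false;	/* contended: give up instead of sleeping */
	if (z->free_pages > 0) {
		z->free_pages--;
		ok = true;
	}
	pthread_mutex_unlock(&z->lock);
	return ok;
}
```

The trylock variant trades reliability for safety: it can fail even
when pages are free, so its caller has to handle a failed allocation.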

>
> What is the actual symptom you are seeing?

Since these functions are called during smp_cpus_done(), where there
seems to be no contention, there is no visible problem today. However,
as I mentioned in another thread
(https://lore.kernel.org/all/aT%2FdrjN1BkvyAGoi@e129823.arm.com/),
the current code gives the false impression that GFP_ATOMIC allocations
are "safe to use with preemption disabled", even though they are not on
PREEMPT_RT. That is why I changed it.

>
> If the page allocator is somehow ignoring the GFP_ATOMIC request for PREEMPT_RT,
> then isn't that a bug in the page allocator? I'm not sure why you would change
> the callsites? Can't you just change the page allocator based on GFP_ATOMIC?

It doesn't ignore GFP_ATOMIC. Both of its flags are honored:
  - __GFP_HIGH: lower the watermark check so the allocation can use
    the min reserves.
  - __GFP_KSWAPD_RECLAIM: wake up kswapd if reclaim is required.

But there is a restriction: the page allocation/free APIs cannot be
called in preempt-disabled context on PREEMPT_RT.

That is why I think this is wrong usage, not a page allocator bug.
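
A small sketch of this argument, with assumptions stated up front. The
bit values below are illustrative, not the kernel's. The MODEL_* names
and helper functions are hypothetical. model_lock_may_sleep() simply
encodes the claim made in this mail: no GFP bit changes how the lock is
taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the real definitions are in the kernel headers. */
#define MODEL_GFP_HIGH			0x1u
#define MODEL_GFP_KSWAPD_RECLAIM	0x2u
#define MODEL_GFP_DIRECT_RECLAIM	0x4u

/* GFP_ATOMIC is __GFP_HIGH | __GFP_KSWAPD_RECLAIM. */
#define MODEL_GFP_ATOMIC (MODEL_GFP_HIGH | MODEL_GFP_KSWAPD_RECLAIM)

/* GFP_ATOMIC never asks for direct reclaim. */
static_assert((MODEL_GFP_ATOMIC & MODEL_GFP_DIRECT_RECLAIM) == 0,
	      "GFP_ATOMIC must not include direct reclaim");

static bool model_may_use_min_reserve(unsigned int flags)
{
	return (flags & MODEL_GFP_HIGH) != 0;
}

static bool model_wakes_kswapd(unsigned int flags)
{
	return (flags & MODEL_GFP_KSWAPD_RECLAIM) != 0;
}

static bool model_may_direct_reclaim(unsigned int flags)
{
	return (flags & MODEL_GFP_DIRECT_RECLAIM) != 0;
}

/*
 * Encodes the claim in this mail: whether taking the allocator lock can
 * sleep depends on PREEMPT_RT, not on any GFP flag.
 */
static bool model_lock_may_sleep(unsigned int flags, bool preempt_rt)
{
	(void)flags;	/* deliberately unused: flags do not affect locking */
	return preempt_rt;
}
```

In this model, GFP_ATOMIC rules out direct reclaim, but
model_lock_may_sleep() still returns true on PREEMPT_RT.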

[...]

--
Sincerely,
Yeoreum Yun


