* [PATCH] debugobject: don't wake up kswapd from fill_pool()
[not found] <000000000000008ddb05fb5e2576@google.com>
@ 2023-05-11 13:47 ` Tetsuo Handa
2023-05-12 3:44 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Tetsuo Handa @ 2023-05-11 13:47 UTC (permalink / raw)
To: syzbot, syzkaller-bugs, Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: linux-kernel, linux-mm
syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
(__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
Since fill_pool() might be called with arbitrary locks held,
fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
Also, __GFP_NORETRY is pointless for !__GFP_DIRECT_RECLAIM allocation.
Reported-by: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=fe0c72f0ccbb93786380
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects")
---
lib/debugobjects.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 003edc5ebd67..986adca357b4 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -126,7 +126,7 @@ static const char *obj_states[ODEBUG_STATE_MAX] = {
static void fill_pool(void)
{
- gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
+ gfp_t gfp = __GFP_HIGH | __GFP_NOWARN;
struct debug_obj *obj;
unsigned long flags;
--
2.18.4
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-11 13:47 ` [PATCH] debugobject: don't wake up kswapd from fill_pool() Tetsuo Handa
@ 2023-05-12 3:44 ` Andrew Morton
2023-05-12 10:57 ` Tetsuo Handa
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2023-05-12 3:44 UTC (permalink / raw)
To: Tetsuo Handa
Cc: syzbot, syzkaller-bugs, Ingo Molnar, Thomas Gleixner,
linux-kernel, linux-mm
On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
> Since fill_pool() might be called with arbitrary locks held,
> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
hm. But many GFP_ATOMIC allocation attempts are made with locks held.
Why aren't all such callers buggy, by trying to wake kswapd with locks
held? What's special about this one?
> Also, __GFP_NORETRY is pointless for !__GFP_DIRECT_RECLAIM allocation
>
> Reported-by: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com>
> Closes: https://syzkaller.appspot.com/bug?extid=fe0c72f0ccbb93786380
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects")
> ---
> lib/debugobjects.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 003edc5ebd67..986adca357b4 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -126,7 +126,7 @@ static const char *obj_states[ODEBUG_STATE_MAX] = {
>
> static void fill_pool(void)
> {
> - gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
> + gfp_t gfp = __GFP_HIGH | __GFP_NOWARN;
Does this weaken fill_pool()'s allocation attempt more than necessary?
We can still pass __GFP_HIGH?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-12 3:44 ` Andrew Morton
@ 2023-05-12 10:57 ` Tetsuo Handa
2023-05-12 12:54 ` Thomas Gleixner
0 siblings, 1 reply; 10+ messages in thread
From: Tetsuo Handa @ 2023-05-12 10:57 UTC (permalink / raw)
To: Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, Thomas Gleixner,
linux-kernel, linux-mm
On 2023/05/12 12:44, Andrew Morton wrote:
> On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
>
>> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
>> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
>> Since fill_pool() might be called with arbitrary locks held,
>> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
>
> hm. But many GFP_ATOMIC allocation attempts are made with locks held.
> Why aren't all such callers buggy, by trying to wake kswapd with locks
> held? What's special about this one?
Because debugobject cannot know what locks are held when fill_pool() does
GFP_ATOMIC allocation.
syzbot is reporting base->lock => pgdat->kswapd_wait dependency
add_timer() {
__mod_timer() {
base = lock_timer_base(timer, &flags);
new_base = get_target_base(base, timer->flags);
if (base != new_base) {
raw_spin_unlock(&base->lock);
base = new_base;
raw_spin_lock(&base->lock);
}
debug_timer_activate(timer) {
debug_object_activate(timer, &timer_debug_descr) {
debug_objects_fill_pool() {
fill_pool() {
kmem_cache_zalloc(GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN) {
// wakes kswapd
}
}
}
}
}
raw_spin_unlock_irqrestore(&base->lock, flags);
}
}
when pgdat->kswapd_wait => p->pi_lock dependency
__alloc_pages() {
get_page_from_freelist() {
rmqueue() {
wakeup_kswapd() {
wake_up_interruptible(&pgdat->kswapd_wait) {
__wake_up_common_lock() {
spin_lock_irqsave(&pgdat->kswapd_wait.lock, flags);
__wake_up_common() {
autoremove_wake_function() {
try_to_wake_up() {
raw_spin_lock_irqsave(&p->pi_lock, flags);
raw_spin_unlock_irqrestore(&p->pi_lock, flags);
}
}
}
spin_unlock_irqrestore(&pgdat->kswapd_wait.lock, flags);
}
}
}
}
}
}
and p->pi_lock => rq->__lock => base->lock dependency
wake_up_new_task() {
raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
rq = __task_rq_lock(p, &rf); // acquires rq->lock
activate_task(rq, p, ENQUEUE_NOCLOCK) {
enqueue_task() {
psi_enqueue() {
psi_task_change() {
queue_delayed_work_on() {
__queue_delayed_work() {
add_timer() {
__mod_timer() {
base = lock_timer_base(timer, &flags); // acquires base->lock
debug_timer_activate(timer); // possible base->lock => pgdat->kswapd_wait => p->pi_lock dependency
raw_spin_unlock_irqrestore(&base->lock, flags);
}
}
}
}
}
}
}
}
task_rq_unlock(rq, p, &rf);
}
exists.
All GFP_ATOMIC allocation users are supposed to be aware of what locks
are held, and are supposed to explicitly remove __GFP_KSWAPD_RECLAIM
if waking up kswapd can cause deadlock. But reality is that we can't be
careful enough to error-free. Who would imagine GFP_ATOMIC allocation
while base->lock is held can form circular locking dependency?
>
>> Also, __GFP_NORETRY is pointless for !__GFP_DIRECT_RECLAIM allocation
__GFP_NORETRY is not checked by !__GFP_DIRECT_RECLAIM allocation.
GFP_ATOMIC - __GFP_KSWAPD_RECLAIM is __GFP_HIGH.
>>
>> @@ -126,7 +126,7 @@ static const char *obj_states[ODEBUG_STATE_MAX] = {
>>
>> static void fill_pool(void)
>> {
>> - gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
>> + gfp_t gfp = __GFP_HIGH | __GFP_NOWARN;
>
> Does this weaken fill_pool()'s allocation attempt more than necessary?
> We can still pass __GFP_HIGH?
What do you mean? I think that killing base->lock => pgdat->kswapd_wait
by removing __GFP_KSWAPD_RECLAIM is the right fix. This weakening is
needed for avoiding base->lock => pgdat->kswapd_wait dependency from
debugobject code.
For locking dependency safety, I wish that GFP_ATOMIC / GFP_NOWAIT do not imply
__GFP_KSWAPD_RECLAIM. Such allocations should not try to allocate as many pages
as even __GFP_HIGH fails. And if such allocations try to allocate as many pages
as even __GFP_HIGH fails, they likely already failed before background kswapd
reclaim finds some reusable pages....
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-12 10:57 ` Tetsuo Handa
@ 2023-05-12 12:54 ` Thomas Gleixner
2023-05-12 13:09 ` Tetsuo Handa
0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2023-05-12 12:54 UTC (permalink / raw)
To: Tetsuo Handa, Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, linux-kernel, linux-mm
On Fri, May 12 2023 at 19:57, Tetsuo Handa wrote:
> On 2023/05/12 12:44, Andrew Morton wrote:
>> On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
>>
>>> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
>>> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
>>> Since fill_pool() might be called with arbitrary locks held,
>>> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
https://lore.kernel.org/lkml/871qjldbes.ffs@tglx/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-12 12:54 ` Thomas Gleixner
@ 2023-05-12 13:09 ` Tetsuo Handa
2023-05-12 18:07 ` Thomas Gleixner
0 siblings, 1 reply; 10+ messages in thread
From: Tetsuo Handa @ 2023-05-12 13:09 UTC (permalink / raw)
To: Thomas Gleixner, Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, linux-kernel, linux-mm
On 2023/05/12 21:54, Thomas Gleixner wrote:
> On Fri, May 12 2023 at 19:57, Tetsuo Handa wrote:
>> On 2023/05/12 12:44, Andrew Morton wrote:
>>> On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
>>>
>>>> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
>>>> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
>>>> Since fill_pool() might be called with arbitrary locks held,
>>>> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
>
> https://lore.kernel.org/lkml/871qjldbes.ffs@tglx/
.config says IS_ENABLED(CONFIG_PREEMPT_RT) == false, and lockdep says about
base->lock => pgdat->kswapd_wait => p->pi_lock => rq->__lock => base->lock
dependency but does not say about db->lock.
How can your patch fix this problem?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-12 13:09 ` Tetsuo Handa
@ 2023-05-12 18:07 ` Thomas Gleixner
2023-05-12 23:13 ` Tetsuo Handa
0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2023-05-12 18:07 UTC (permalink / raw)
To: Tetsuo Handa, Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, linux-kernel, linux-mm
On Fri, May 12 2023 at 22:09, Tetsuo Handa wrote:
> On 2023/05/12 21:54, Thomas Gleixner wrote:
>> On Fri, May 12 2023 at 19:57, Tetsuo Handa wrote:
>>> On 2023/05/12 12:44, Andrew Morton wrote:
>>>> On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
>>>>
>>>>> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
>>>>> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
>>>>> Since fill_pool() might be called with arbitrary locks held,
>>>>> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
>>
>> https://lore.kernel.org/lkml/871qjldbes.ffs@tglx/
>
> .config says IS_ENABLED(CONFIG_PREEMPT_RT) == false, and lockdep says about
> base->lock => pgdat->kswapd_wait => p->pi_lock => rq->__lock => base->lock
> dependency but does not say about db->lock.
>
> How can your patch fix this problem?
It's described in the changelog, no?
The main change is to make the refill invocation conditional when the
lookup fails. That's how that code has been from day one.
The patch which closed the race recently wreckaged those refill
oportunities and the fix for that introduced this problem.
Thanks,
tglx
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-12 18:07 ` Thomas Gleixner
@ 2023-05-12 23:13 ` Tetsuo Handa
2023-05-13 8:33 ` Thomas Gleixner
0 siblings, 1 reply; 10+ messages in thread
From: Tetsuo Handa @ 2023-05-12 23:13 UTC (permalink / raw)
To: Thomas Gleixner, Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, linux-kernel, linux-mm
On 2023/05/13 3:07, Thomas Gleixner wrote:
> On Fri, May 12 2023 at 22:09, Tetsuo Handa wrote:
>> On 2023/05/12 21:54, Thomas Gleixner wrote:
>>> On Fri, May 12 2023 at 19:57, Tetsuo Handa wrote:
>>>> On 2023/05/12 12:44, Andrew Morton wrote:
>>>>> On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
>>>>>
>>>>>> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
>>>>>> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
>>>>>> Since fill_pool() might be called with arbitrary locks held,
>>>>>> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
>>>
>>> https://lore.kernel.org/lkml/871qjldbes.ffs@tglx/
>>
>> .config says IS_ENABLED(CONFIG_PREEMPT_RT) == false, and lockdep says about
>> base->lock => pgdat->kswapd_wait => p->pi_lock => rq->__lock => base->lock
>> dependency but does not say about db->lock.
>>
>> How can your patch fix this problem?
>
> It's described in the changelog, no?
I can't find a proof that lookup_object() never returns NULL
when debug_object_activate() is called.
>
> The main change is to make the refill invocation conditional when the
> lookup fails. That's how that code has been from day one.
Making refill conditional helps reducing frequency of doing allocations.
I want a proof that allocations never happens in the worst scenario.
Are you saying that some debugobject function other than debug_object_activate()
guarantees that memory for that object was already allocated before
debug_object_activate() is called for the first time for that object,
_and_ such debugobject function is called without locks held?
>
> The patch which closed the race recently wreckaged those refill
> oportunities and the fix for that introduced this problem.
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-12 23:13 ` Tetsuo Handa
@ 2023-05-13 8:33 ` Thomas Gleixner
2023-05-13 9:33 ` Tetsuo Handa
0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2023-05-13 8:33 UTC (permalink / raw)
To: Tetsuo Handa, Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, linux-kernel, linux-mm,
Peter Zijlstra
On Sat, May 13 2023 at 08:13, Tetsuo Handa wrote:
> On 2023/05/13 3:07, Thomas Gleixner wrote:
>> The main change is to make the refill invocation conditional when the
>> lookup fails. That's how that code has been from day one.
>
> Making refill conditional helps reducing frequency of doing allocations.
> I want a proof that allocations never happens in the worst scenario.
>
> Are you saying that some debugobject function other than debug_object_activate()
> guarantees that memory for that object was already allocated before
> debug_object_activate() is called for the first time for that object,
> _and_ such debugobject function is called without locks held?
The point is that the allocation in activate() only happens when the
tracked entity was not initialized _before_ activate() is invoked.
That's a bug for dynamically allocated entities, but a valid scenario
for statically initialized entities as they can be activated without
prior init() obviously.
For dynamically allocated entities the init() function takes care of the
tracking object allocation and that's where the pool is refilled. So for
those the lookup will never fail.
Now I just stared at __alloc_pages_slowpath() and looked at the
condition for wakeup_all_kswapds(). ALLOC_KSWAPD is set because
debugobject uses GFP_ATOMIC which contains __GFP_KSWAPD_RECLAIM.
So debug objects needs to have s/GFP_ATOMIC/__GFP_HIGH/ to prevent that
wakeup path.
Thanks,
tglx
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-13 8:33 ` Thomas Gleixner
@ 2023-05-13 9:33 ` Tetsuo Handa
2023-05-13 19:42 ` Thomas Gleixner
0 siblings, 1 reply; 10+ messages in thread
From: Tetsuo Handa @ 2023-05-13 9:33 UTC (permalink / raw)
To: Thomas Gleixner, Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, linux-kernel, linux-mm,
Peter Zijlstra
On 2023/05/13 17:33, Thomas Gleixner wrote:
> Now I just stared at __alloc_pages_slowpath() and looked at the
> condition for wakeup_all_kswapds(). ALLOC_KSWAPD is set because
> debugobject uses GFP_ATOMIC which contains __GFP_KSWAPD_RECLAIM.
>
> So debug objects needs to have s/GFP_ATOMIC/__GFP_HIGH/ to prevent that
> wakeup path.
Yes. That is exactly what my patch does.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
2023-05-13 9:33 ` Tetsuo Handa
@ 2023-05-13 19:42 ` Thomas Gleixner
0 siblings, 0 replies; 10+ messages in thread
From: Thomas Gleixner @ 2023-05-13 19:42 UTC (permalink / raw)
To: Tetsuo Handa, Andrew Morton
Cc: syzbot, syzkaller-bugs, Ingo Molnar, linux-kernel, linux-mm,
Peter Zijlstra
On Sat, May 13 2023 at 18:33, Tetsuo Handa wrote:
> On 2023/05/13 17:33, Thomas Gleixner wrote:
>> Now I just stared at __alloc_pages_slowpath() and looked at the
>> condition for wakeup_all_kswapds(). ALLOC_KSWAPD is set because
>> debugobject uses GFP_ATOMIC which contains __GFP_KSWAPD_RECLAIM.
>>
>> So debug objects needs to have s/GFP_ATOMIC/__GFP_HIGH/ to prevent that
>> wakeup path.
>
> Yes. That is exactly what my patch does.
Indeed. For some reason your patch (though you cc'ed me) did not show up
in my inbox. I've grabbed it from lore so no need to resend.
Actually we want both changes.
- Your's to fix the underlying ancient problem.
- The one I did which restores the performance behaviour
Thanks,
tglx
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-05-13 19:42 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <000000000000008ddb05fb5e2576@google.com>
2023-05-11 13:47 ` [PATCH] debugobject: don't wake up kswapd from fill_pool() Tetsuo Handa
2023-05-12 3:44 ` Andrew Morton
2023-05-12 10:57 ` Tetsuo Handa
2023-05-12 12:54 ` Thomas Gleixner
2023-05-12 13:09 ` Tetsuo Handa
2023-05-12 18:07 ` Thomas Gleixner
2023-05-12 23:13 ` Tetsuo Handa
2023-05-13 8:33 ` Thomas Gleixner
2023-05-13 9:33 ` Tetsuo Handa
2023-05-13 19:42 ` Thomas Gleixner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox