linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com>,
	syzkaller-bugs@googlegroups.com, Ingo Molnar <mingo@elte.hu>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()
Date: Fri, 12 May 2023 19:57:07 +0900	[thread overview]
Message-ID: <d642e597-cf7d-b410-16ce-22dff483fd8e@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20230511204458.819f9009d2ef8b46cc163191@linux-foundation.org>

On 2023/05/12 12:44, Andrew Morton wrote:
> On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
> 
>> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
>> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
>> Since fill_pool() might be called with arbitrary locks held,
>> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
> 
> hm.  But many GFP_ATOMIC allocation attempts are made with locks held. 
> Why aren't all such callers buggy, by trying to wake kswapd with locks
> held?  What's special about this one?

Because debugobject cannot know what locks are held when fill_pool() does
GFP_ATOMIC allocation.

syzbot is reporting base->lock => pgdat->kswapd_wait dependency

  add_timer() {
    __mod_timer() {
      base = lock_timer_base(timer, &flags);
      new_base = get_target_base(base, timer->flags);
      if (base != new_base) {
        raw_spin_unlock(&base->lock);
        base = new_base;
        raw_spin_lock(&base->lock);
      }
      debug_timer_activate(timer) {
        debug_object_activate(timer, &timer_debug_descr) {
          debug_objects_fill_pool() {
            fill_pool() {
              kmem_cache_zalloc(GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN) {
                // wakes kswapd
              }
            }
          }
        }
      }
      raw_spin_unlock_irqrestore(&base->lock, flags);
    }
  }

when pgdat->kswapd_wait => p->pi_lock dependency

  __alloc_pages() {
    get_page_from_freelist() {
      rmqueue() {
        wakeup_kswapd() {
          wake_up_interruptible(&pgdat->kswapd_wait) {
            __wake_up_common_lock() {
              spin_lock_irqsave(&pgdat->kswapd_wait.lock, flags);
              __wake_up_common() {
                autoremove_wake_function() {
                  try_to_wake_up() {
                    raw_spin_lock_irqsave(&p->pi_lock, flags);
                    raw_spin_unlock_irqrestore(&p->pi_lock, flags);
                  }
                }
              }
              spin_unlock_irqrestore(&pgdat->kswapd_wait.lock, flags);
            }
          }
        }
      }
    }
  }

and p->pi_lock => rq->__lock => base->lock dependency

  wake_up_new_task() {
    raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
    rq = __task_rq_lock(p, &rf); // acquires rq->lock
    activate_task(rq, p, ENQUEUE_NOCLOCK) {
      enqueue_task() {
        psi_enqueue() {
          psi_task_change() {
            queue_delayed_work_on() {
              __queue_delayed_work() {
                add_timer() {
                  __mod_timer() {
                    base = lock_timer_base(timer, &flags); // acquires base->lock
                    debug_timer_activate(timer); // possible base->lock => pgdat->kswapd_wait => p->pi_lock dependency
                    raw_spin_unlock_irqrestore(&base->lock, flags);
                  }
                }
              }
            }
          }
        }
      }
    }
    task_rq_unlock(rq, p, &rf);
  }

exists.

All GFP_ATOMIC allocation users are supposed to be aware of what locks
are held, and are supposed to explicitly remove __GFP_KSWAPD_RECLAIM
if waking up kswapd can cause deadlock. But reality is that we can't be
careful enough to error-free. Who would imagine GFP_ATOMIC allocation
while base->lock is held can form circular locking dependency?

> 
>> Also, __GFP_NORETRY is pointless for !__GFP_DIRECT_RECLAIM allocation

__GFP_NORETRY is not checked by !__GFP_DIRECT_RECLAIM allocation.
GFP_ATOMIC - __GFP_KSWAPD_RECLAIM is __GFP_HIGH.

>>
>> @@ -126,7 +126,7 @@ static const char *obj_states[ODEBUG_STATE_MAX] = {
>>  
>>  static void fill_pool(void)
>>  {
>> -	gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
>> +	gfp_t gfp = __GFP_HIGH | __GFP_NOWARN;
> 
> Does this weaken fill_pool()'s allocation attempt more than necessary? 
> We can still pass __GFP_HIGH?

What do you mean? I think that killing base->lock => pgdat->kswapd_wait
by removing __GFP_KSWAPD_RECLAIM is the right fix. This weakening is
needed for avoiding base->lock => pgdat->kswapd_wait dependency from
debugobject code.

For locking dependency safety, I wish that GFP_ATOMIC / GFP_NOWAIT do not imply
__GFP_KSWAPD_RECLAIM. Such allocations should not try to allocate as many pages
as even __GFP_HIGH fails. And if such allocations try to allocate as many pages
as even __GFP_HIGH fails, they likely already failed before background kswapd
reclaim finds some reusable pages....



  reply	other threads:[~2023-05-12 10:57 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <000000000000008ddb05fb5e2576@google.com>
2023-05-11 13:47 ` Tetsuo Handa
2023-05-12  3:44   ` Andrew Morton
2023-05-12 10:57     ` Tetsuo Handa [this message]
2023-05-12 12:54       ` Thomas Gleixner
2023-05-12 13:09         ` Tetsuo Handa
2023-05-12 18:07           ` Thomas Gleixner
2023-05-12 23:13             ` Tetsuo Handa
2023-05-13  8:33               ` Thomas Gleixner
2023-05-13  9:33                 ` Tetsuo Handa
2023-05-13 19:42                   ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d642e597-cf7d-b410-16ce-22dff483fd8e@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox