From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C11B3C77B7C for ; Fri, 12 May 2023 10:57:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 318C56B0071; Fri, 12 May 2023 06:57:25 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2C9476B0074; Fri, 12 May 2023 06:57:25 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 190936B0075; Fri, 12 May 2023 06:57:25 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 049126B0071 for ; Fri, 12 May 2023 06:57:25 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id ECDBDC0FC0 for ; Fri, 12 May 2023 10:57:23 +0000 (UTC) X-FDA: 80781301566.19.2471F18 Received: from www262.sakura.ne.jp (www262.sakura.ne.jp [202.181.97.72]) by imf25.hostedemail.com (Postfix) with ESMTP id 33608A0017 for ; Fri, 12 May 2023 10:57:20 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=none; spf=none (imf25.hostedemail.com: domain of penguin-kernel@I-love.SAKURA.ne.jp has no SPF policy when checking 202.181.97.72) smtp.mailfrom=penguin-kernel@I-love.SAKURA.ne.jp; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1683889042; a=rsa-sha256; cv=none; b=qZMU08wGLebOiA/kftN6FNdxV5ALr0gjQ4ZuhZdtnD1aMGF2mtFqWIQue7q/E7mw94EmVs H8vKV6dQ2+TeRXjZhPJPbLpRyCPM2z/V+/GaFQ4UP4wem68ZnYjyheKaSZhytYNOz4ANh7 y2CICIhfdLBA+UgLlrgiSsKG7TmdwpE= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=none; spf=none (imf25.hostedemail.com: domain of penguin-kernel@I-love.SAKURA.ne.jp has no SPF policy when checking 202.181.97.72) smtp.mailfrom=penguin-kernel@I-love.SAKURA.ne.jp; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1683889042; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HZFKgQVLNFmJ0Qa+9NBzc7TeL98znok498nULDXd8wI=; b=z7fROESAI60XJGZPDSPFvc1QTt0HIdhASP1aVYMWm4ADmP141DQ5GoA7Z92hhyvm04YwoE tdhsftBvThDHzOlUhJ3YiIEqi6CBDQXSpdyXsp8BMce6nvn7XOLS/GAkzBdn2nHjI6Zdi1 mo+aKzkODnOk4sFQYzkqRQR7pkmT+/U= Received: from fsav415.sakura.ne.jp (fsav415.sakura.ne.jp [133.242.250.114]) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id 34CAv7ww070055; Fri, 12 May 2023 19:57:07 +0900 (JST) (envelope-from penguin-kernel@I-love.SAKURA.ne.jp) Received: from www262.sakura.ne.jp (202.181.97.72) by fsav415.sakura.ne.jp (F-Secure/fsigk_smtp/550/fsav415.sakura.ne.jp); Fri, 12 May 2023 19:57:07 +0900 (JST) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/fsav415.sakura.ne.jp) Received: from [192.168.1.6] (M106072142033.v4.enabler.ne.jp [106.72.142.33]) (authenticated bits=0) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTPSA id 34CAv6iv070050 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NO); Fri, 12 May 2023 19:57:06 +0900 (JST) (envelope-from penguin-kernel@I-love.SAKURA.ne.jp) Message-ID: Date: Fri, 12 May 2023 19:57:07 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.10.1 Subject: Re: [PATCH] debugobject: don't wake up kswapd from fill_pool() Content-Language: en-US To: Andrew Morton Cc: syzbot , syzkaller-bugs@googlegroups.com, Ingo Molnar , Thomas Gleixner , linux-kernel@vger.kernel.org, linux-mm References: <000000000000008ddb05fb5e2576@google.com> <6577e1fa-b6ee-f2be-2414-a2b51b1c5e30@I-love.SAKURA.ne.jp> <20230511204458.819f9009d2ef8b46cc163191@linux-foundation.org> From: Tetsuo Handa In-Reply-To: <20230511204458.819f9009d2ef8b46cc163191@linux-foundation.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 33608A0017 X-Stat-Signature: pbnsmqaak7jh755r43sumkxyq3ew1nf8 X-HE-Tag: 1683889040-127175 X-HE-Meta: U2FsdGVkX18qnrsDju59hjuKGk7LLeNNvaAQBR9C5guve4RRT21GN1F4Q7WGvLWTh23K5Gg1+yTTQKN5QjRGqf34Peom0eD1BGtwF5rdHna2iEzr/URDC20j4DrIdkdKchPI9EEBaBBkVonjVg5Ab539aPgk+2sJJawYZ62msET5Qd9k3nhQkx14J8mxjSYPE1ZNJ+rav7VrWdd9vl5Obo+DhQ2Sfaoyu3Wa2kYThzH2Ok4EvAstXI3WWYd9AXchuVG9QS4ENnaNre/PRjiZV/G4/+KLzf1f4hdx32cOYQ9MwUvNi3j3QDutmf+Ja4FzciOuNlhcSgvWw5qYYZS6WXsZ2HKjq4GQhLZNRxBRFH1PHZ6VWb1Kwe/X96WOhBtsjGa1mPrjYcLSQCQ69K6AQ1Frpn8sTTSgv1A45RhCOoFCXWkGIlxwFN1TIgTxSWzOLokT92b2TiiJVcpbrkTHYURb2xIDRrrdWP9QbwO5hEejhUx4eC0NlDgsA82D9AiuLwUQG8ctQhKdCK4GmfpdXUX6QD9cWevau3rVxdyEiMUkjpy2XIfo4NGlBE9xP62Vxr0nuGmDOmCupOnW3iTYzI5M5TjtWUyTPl/8P7wIUD3DvuodLtTDe+Ky8qKU2IwAzbI08COVcFBqF6JyksYTQYQAHTSkxcOcoVYCMROzH9oRP97sKQGwN8uMrXaxC6Rt8qennHz3vrZ9hQkC7zCnUHTJ2T//nk7ueMCkELg+fpEU48bULW3yOAgwyD94Q9NHh4bMvZN/L2MHEZowxBcuD5JVJm4hfChmac3+eU9tuT/5PcpXTfksWh/4u5pbb7h+Px2L8eoHgBinbMW9CtpDHnnoBehOAoRGnwvnuk4FhFscujvoWijzqYr3fFg8M9x9hYcvGza19g0xpvB+qnkRLFPvudQJu1YkQdBWlGsqyOqyQ72LMmsrHwPzpeiN2XeR45dZJwHGL2kKvizStpr /rRLZQ8v YSN+qf2DoXv+xa18oabthh8roeHdEDy21dgXKf9xdSnKlKMDmlZ5m3AmHULxuNNuxXxx8g2r+FUI9AkxjFw1yQAaIbFXQuuin+YQ4y8fZ23/K/+A038pTaZo00BjA4YMsTDP7zhPBYhG67tIfrTH9CVoXxXmOJqbKsk0QhCnZAxm7bwKZGHKFXsI5WuCgvaBcZQJqcmrvngpC5Y+J/E/B7FmDFYQknb5DbDaXaSJ32iHK3MvZVZU91AI+IAohDvkGY8NhsxvhXEp+5dzEeFJTuNxR08VaMODeS8nDJjEOX4pZkM3qWGrB/4DO8dZVP3zf+vtE4OND5fxkzZ9X28vVk95tnn5yjNCNTukBQzzBpoq7NdI= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2023/05/12 12:44, Andrew Morton wrote: > On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa wrote: > >> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is >> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd. >> Since fill_pool() might be called with arbitrary locks held, >> fill_pool() should not assume that holding pgdat->kswapd_wait is safe. > > hm. But many GFP_ATOMIC allocation attempts are made with locks held. > Why aren't all such callers buggy, by trying to wake kswapd with locks > held? What's special about this one? Because debugobject cannot know what locks are held when fill_pool() does GFP_ATOMIC allocation. syzbot is reporting base->lock => pgdat->kswapd_wait dependency add_timer() { __mod_timer() { base = lock_timer_base(timer, &flags); new_base = get_target_base(base, timer->flags); if (base != new_base) { raw_spin_unlock(&base->lock); base = new_base; raw_spin_lock(&base->lock); } debug_timer_activate(timer) { debug_object_activate(timer, &timer_debug_descr) { debug_objects_fill_pool() { fill_pool() { kmem_cache_zalloc(GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN) { // wakes kswapd } } } } } raw_spin_unlock_irqrestore(&base->lock, flags); } } when pgdat->kswapd_wait => p->pi_lock dependency __alloc_pages() { get_page_from_freelist() { rmqueue() { wakeup_kswapd() { wake_up_interruptible(&pgdat->kswapd_wait) { __wake_up_common_lock() { spin_lock_irqsave(&pgdat->kswapd_wait.lock, flags); __wake_up_common() { autoremove_wake_function() { try_to_wake_up() { raw_spin_lock_irqsave(&p->pi_lock, flags); raw_spin_unlock_irqrestore(&p->pi_lock, flags); } } } spin_unlock_irqrestore(&pgdat->kswapd_wait.lock, flags); } } } } } } and p->pi_lock => rq->__lock => base->lock dependency wake_up_new_task() { raw_spin_lock_irqsave(&p->pi_lock, rf.flags); rq = __task_rq_lock(p, &rf); // acquires rq->lock activate_task(rq, p, ENQUEUE_NOCLOCK) { enqueue_task() { psi_enqueue() { psi_task_change() { queue_delayed_work_on() { __queue_delayed_work() { add_timer() { __mod_timer() { base = lock_timer_base(timer, &flags); // acquires base->lock debug_timer_activate(timer); // possible base->lock => pgdat->kswapd_wait => p->pi_lock dependency raw_spin_unlock_irqrestore(&base->lock, flags); } } } } } } } } task_rq_unlock(rq, p, &rf); } exists. All GFP_ATOMIC allocation users are supposed to be aware of what locks are held, and are supposed to explicitly remove __GFP_KSWAPD_RECLAIM if waking up kswapd can cause deadlock. But reality is that we can't be careful enough to error-free. Who would imagine GFP_ATOMIC allocation while base->lock is held can form circular locking dependency? > >> Also, __GFP_NORETRY is pointless for !__GFP_DIRECT_RECLAIM allocation __GFP_NORETRY is not checked by !__GFP_DIRECT_RECLAIM allocation. GFP_ATOMIC - __GFP_KSWAPD_RECLAIM is __GFP_HIGH. >> >> @@ -126,7 +126,7 @@ static const char *obj_states[ODEBUG_STATE_MAX] = { >> >> static void fill_pool(void) >> { >> - gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN; >> + gfp_t gfp = __GFP_HIGH | __GFP_NOWARN; > > Does this weaken fill_pool()'s allocation attempt more than necessary? > We can still pass __GFP_HIGH? What do you mean? I think that killing base->lock => pgdat->kswapd_wait by removing __GFP_KSWAPD_RECLAIM is the right fix. This weakening is needed for avoiding base->lock => pgdat->kswapd_wait dependency from debugobject code. For locking dependency safety, I wish that GFP_ATOMIC / GFP_NOWAIT do not imply __GFP_KSWAPD_RECLAIM. Such allocations should not try to allocate as many pages as even __GFP_HIGH fails. And if such allocations try to allocate as many pages as even __GFP_HIGH fails, they likely already failed before background kswapd reclaim finds some reusable pages....