From: Qian Cai <cai@lca.pw>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: mhocko@kernel.org, sergey.senozhatsky.work@gmail.com,
pmladek@suse.com, rostedt@goodmis.org, peterz@infradead.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_isolation: fix a deadlock with printk()
Date: Sat, 5 Oct 2019 20:10:47 -0400
Message-ID: <49F0AD04-6F61-4A1D-BFD5-E0769EC6F103@lca.pw>
In-Reply-To: <20191005162942.b392b9336b860e245106faa2@linux-foundation.org>
> On Oct 5, 2019, at 7:29 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Fri, 4 Oct 2019 12:42:26 -0400 Qian Cai <cai@lca.pw> wrote:
>
> >> It is unsafe to call printk() while zone->lock is held, i.e.,
>>
>> zone->lock --> console_sem
>>
>> because the console could always allocate some memory in different code
>> paths and form locking chains in an opposite order,
>>
>> console_sem --> * --> zone->lock
>>
> >> As a result, it triggers lockdep splats like the one below and in
> >> [1]. It is fine to take zone->lock after has_unmovable_pages() (which
> >> has dump_stack()) in set_migratetype_isolate(). While at it, also
> >> remove a problematic debugging-only printk() in
> >> __offline_isolated_pages(), which will always disable lockdep on
> >> debug kernels.
>>
> >> The problem has probably been there forever, but few developers run
> >> memory offline with lockdep enabled, and admins in the field have not
> >> yet been lucky enough to hit the perfect timing required to trigger a
> >> real deadlock. In addition, there aren't many places that call
> >> printk() while zone->lock is held.
>>
>> WARNING: possible circular locking dependency detected
>> ------------------------------------------------------
>> test.sh/1724 is trying to acquire lock:
> >> 0000000052059ec0 (console_owner){-...}, at: console_unlock+0x328/0xa30
>>
>> but task is already holding lock:
> >> 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: start_isolate_page_range+0x216/0x538
>>
>> which lock already depends on the new lock.
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #2 (&(&zone->lock)->rlock){-.-.}:
>> lock_acquire+0x21a/0x468
>> _raw_spin_lock+0x54/0x68
>> get_page_from_freelist+0x8b6/0x2d28
>> __alloc_pages_nodemask+0x246/0x658
>> __get_free_pages+0x34/0x78
>> sclp_init+0x106/0x690
>> sclp_register+0x2e/0x248
>> sclp_rw_init+0x4a/0x70
>> sclp_console_init+0x4a/0x1b8
>> console_init+0x2c8/0x410
>> start_kernel+0x530/0x6a0
>> startup_continue+0x70/0xd0
>
> This appears to be the core of our problem?
No, that is just one of the many places that could form the lock chain,
console_lock -> other locks -> zone_lock
Another example is,
https://lore.kernel.org/lkml/1568823006.5576.178.camel@lca.pw/
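To make the ordering problem concrete, here is a minimal userspace sketch of the kind of check lockdep performs: record "B taken while A held" edges and refuse any new edge that would close a cycle. It is only an illustration, not lockdep's actual implementation. The enum values and the helpers reachable() and add_dependency() are hypothetical names, and the chain through sclp_lock is an assumed example of the "other locks" in the middle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the locks discussed in this thread. */
enum { CONSOLE_SEM, SCLP_LOCK, ZONE_LOCK, NR_CLASSES };

/* dep[a][b] is true once "b acquired while a is held" has been recorded. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "inner taken while outer is held".
 * Returns false, and records nothing, if 'outer' is already reachable
 * from 'inner', i.e. the new edge would close a circular dependency.
 */
static bool add_dependency(int outer, int inner)
{
	bool seen[NR_CLASSES] = { false };

	assert(outer != inner);
	if (reachable(inner, outer, seen))
		return false;
	dep[outer][inner] = true;
	return true;
}
```

Recording console_sem -> sclp_lock and then sclp_lock -> zone_lock succeeds, after which add_dependency(ZONE_LOCK, CONSOLE_SEM) is rejected. That rejected edge is the printk()-under-zone->lock case.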
It is easier to avoid,
zone_lock -> console_lock
rather than fixing the opposite.
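As a rough userspace sketch of what "avoid zone_lock -> console_lock" means in general: copy whatever needs reporting while the lock is held, and only format or print after dropping it. This is not the patch itself (the patch reorders has_unmovable_pages() relative to zone->lock); the mutex, the first_unmovable_pfn variable and isolate_and_report() are all made-up stand-ins.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the kernel's zone->lock or printk(). */
static pthread_mutex_t zone_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long first_unmovable_pfn = 4242;	/* made-up state */

/*
 * Snapshot the state under the lock, then format the message with no
 * lock held. Returns what snprintf() returns: the length of the message.
 */
static int isolate_and_report(char *buf, size_t len)
{
	unsigned long pfn;

	assert(buf != NULL && len > 0);

	pthread_mutex_lock(&zone_lock);
	pfn = first_unmovable_pfn;	/* only copy data while locked */
	pthread_mutex_unlock(&zone_lock);

	/* Anything that may take console-side locks happens after unlock. */
	return snprintf(buf, len, "unmovable page at pfn %lu", pfn);
}
```

The caller can then pass buf to whatever output path it likes, since the lock has already been released by that point.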
> At initialization time,
> the sclp driver registers an inappropriate dependency with lockdep. It
> does this by calling into the page allocator while holding sclp_lock.
> But we don't *want* to teach lockdep that sclp_lock nests outside
> zone->lock. We want the opposite.
>
> So can we address this class of problem by declaring "thou shalt not
> call the page allocator while holding a lock which can be taken on the
> printk path?". And then declare sclp to be defective.
>
>
> And I think sclp is kinda buggy-but-lucky anyway: if console output is
> directed to sclp device #0 and we're then trying to initialize sclp
> device #1 then any printk which happens during that initialization will
> deadlock. The driver escapes this by only supporting a single device
> system-wide but it's not a model which drivers should generally follow.
>
> (And if sclp will only ever support a single device system-wide, why
> the heck does it need to take sclp_lock() on the device initialization
> path??)
>
>