linux-mm.kvack.org archive mirror
From: Qian Cai <cai@lca.pw>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: mhocko@kernel.org, sergey.senozhatsky.work@gmail.com,
	pmladek@suse.com, rostedt@goodmis.org, peterz@infradead.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_isolation: fix a deadlock with printk()
Date: Sat, 5 Oct 2019 20:10:47 -0400
Message-ID: <49F0AD04-6F61-4A1D-BFD5-E0769EC6F103@lca.pw>
In-Reply-To: <20191005162942.b392b9336b860e245106faa2@linux-foundation.org>

> On Oct 5, 2019, at 7:29 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Fri,  4 Oct 2019 12:42:26 -0400 Qian Cai <cai@lca.pw> wrote:
> 
>> It is unsafe to call printk() while zone->lock was held, i.e.,
>> 
>> zone->lock --> console_sem
>> 
>> because the console could always allocate some memory in different code
>> paths and form locking chains in an opposite order,
>> 
>> console_sem --> * --> zone->lock
>> 
>> As a result, it triggers lockdep splats like the one below and in [1].
>> It is fine to take zone->lock after has_unmovable_pages() (which has
>> dump_stack()) in set_migratetype_isolate(). While at it, also remove a
>> problematic debugging-only printk() in __offline_isolated_pages(),
>> which will always disable lockdep on debug kernels.
>> 
>> The problem has probably been there forever, but few developers run
>> memory offline with lockdep enabled, and admins in the field have not
>> yet been lucky enough to hit the perfect timing required to trigger a
>> real deadlock. In addition, there aren't many places that call printk()
>> while zone->lock is held.
>> 
>> WARNING: possible circular locking dependency detected
>> ------------------------------------------------------
>> test.sh/1724 is trying to acquire lock:
>> 0000000052059ec0 (console_owner){-...}, at: console_unlock+0x328/0xa30
>> 
>> but task is already holding lock:
>> 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: start_isolate_page_range+0x216/0x538
>> 
>> which lock already depends on the new lock.
>> 
>> the existing dependency chain (in reverse order) is:
>> 
>> -> #2 (&(&zone->lock)->rlock){-.-.}:
>>       lock_acquire+0x21a/0x468
>>       _raw_spin_lock+0x54/0x68
>>       get_page_from_freelist+0x8b6/0x2d28
>>       __alloc_pages_nodemask+0x246/0x658
>>       __get_free_pages+0x34/0x78
>>       sclp_init+0x106/0x690
>>       sclp_register+0x2e/0x248
>>       sclp_rw_init+0x4a/0x70
>>       sclp_console_init+0x4a/0x1b8
>>       console_init+0x2c8/0x410
>>       start_kernel+0x530/0x6a0
>>       startup_continue+0x70/0xd0
> 
> This appears to be the core of our problem?

No, that is just one of the many places that could form the lock chain,

console_lock -> other locks -> zone_lock

Another example is,

https://lore.kernel.org/lkml/1568823006.5576.178.camel@lca.pw/

It is easier to avoid the ordering

zone_lock -> console_lock

than to fix every path that takes the locks in the opposite order.
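
To make that concrete, here is a minimal userspace sketch of the idea, not the actual patch. It assumes pthread mutexes as stand-ins for zone->lock and the console lock; zone_lock_acquire(), console_print(), page_is_unmovable() and isolate_range() are hypothetical names made up for illustration. The reporting is simply moved out from under the zone lock, so the zone_lock -> console_lock dependency is never created, and a small assert plays the role of lockdep.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins (hypothetical), not the kernel's zone->lock/console_sem. */
static pthread_mutex_t zone_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool zone_lock_held;	/* per-thread "is it held" flag */
int console_messages;				/* number of messages printed */

void zone_lock_acquire(void)
{
	pthread_mutex_lock(&zone_lock);
	zone_lock_held = true;
}

void zone_lock_release(void)
{
	zone_lock_held = false;
	pthread_mutex_unlock(&zone_lock);
}

/* Stand-in for printk(): takes console_lock, so it must not nest in zone_lock. */
void console_print(const char *msg)
{
	/* Poor man's lockdep: catch zone_lock -> console_lock on this thread. */
	assert(!zone_lock_held);
	pthread_mutex_lock(&console_lock);
	fputs(msg, stderr);
	console_messages++;
	pthread_mutex_unlock(&console_lock);
}

/* Hypothetical check: pretend every fourth pfn is unmovable. */
bool page_is_unmovable(unsigned long pfn)
{
	return pfn % 4 == 3;
}

/* Returns 0 on success, -1 if an unmovable page was found in [start, end). */
int isolate_range(unsigned long start, unsigned long end)
{
	unsigned long pfn, bad_pfn = 0;
	bool found = false;
	char buf[64];

	zone_lock_acquire();
	for (pfn = start; pfn < end; pfn++) {
		if (page_is_unmovable(pfn)) {
			bad_pfn = pfn;	/* only record it while locked */
			found = true;
			break;
		}
	}
	zone_lock_release();

	if (!found)
		return 0;

	/* Report only after zone_lock has been dropped. */
	snprintf(buf, sizeof(buf), "unmovable page at pfn %lu\n", bad_pfn);
	console_print(buf);
	return -1;
}
```

Calling console_print() inside the locked loop instead would trip the assert, which is the userspace analogue of the splat above.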

>  At initialization time,
> the sclp driver registers an inappropriate dependency with lockdep.  It
> does this by calling into the page allocator while holding sclp_lock. 
> But we don't *want* to teach lockdep that sclp_lock nests outside
> zone->lock.  We want the opposite.
> 
> So can we address this class of problem by declaring "thou shalt not
> call the page allocator while holding a lock which can be taken on the
> printk path?".  And then declare sclp to be defective.
> 
> 
> And I think sclp is kinda buggy-but-lucky anyway: if console output is
> directed to sclp device #0 and we're then trying to initialize sclp
> device #1 then any printk which happens during that initialization will
> deadlock.  The driver escapes this by only supporting a single device
> system-wide but it's not a model which drivers should generally follow.
> 
> (And if sclp will only ever support a single device system-wide, why
> the heck does it need to take sclp_lock() on the device initialization
> path??)
> 
> 
