Date: Tue, 8 Oct 2019 10:40:31 +0200
From: Michal Hocko
To: Qian Cai
Cc: Petr Mladek, akpm@linux-foundation.org, sergey.senozhatsky.work@gmail.com,
	rostedt@goodmis.org, peterz@infradead.org, linux-mm@kvack.org,
	john.ogness@linutronix.de, david@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()
Message-ID: <20191008084031.GC6681@dhcp22.suse.cz>
References: <1570228005-24979-1-git-send-email-cai@lca.pw>
	<20191007143002.l37bt2lzqtnqjqxu@pathway.suse.cz>
	<1570460350.5576.290.camel@lca.pw>
	<20191007151237.GP2381@dhcp22.suse.cz>
	<1570462407.5576.292.camel@lca.pw>
In-Reply-To: <1570462407.5576.292.camel@lca.pw>

On Mon 07-10-19 11:33:27, Qian Cai wrote:
> On Mon, 2019-10-07 at 17:12 +0200, Michal Hocko wrote:
> > On Mon 07-10-19 10:59:10, Qian Cai wrote:
> > [...]
> > > It is almost impossible to eliminate all the indirect call chains from
> > > console_sem/console_owner_lock to zone->lock because it is too normal that
> > > something later needs to allocate some memory dynamically, so as long as it
> > > directly calls printk() with zone->lock held, it will be in trouble.
> >
> > Do you have any example where the console driver really _has_ to
> > allocate? Because I have a hard time believing this is going to work at
> > all as the atomic context doesn't allow any memory reclaim and
> > such an allocation would be too easy to fail so the allocation cannot
> > really rely on it.
>
> I don't know how to explain to you clearly, but let me repeat again one last
> time. It is not necessary for the console driver to allocate directly,
> considering this example:
>
> CPU0:                 CPU1:         CPU2:           CPU3:
> console_sem->lock                                   zone->lock
>                       pi->lock
> pi->lock                            rq_lock
>                       rq->lock
>                                     zone->lock
>                                                     console_sem->lock
>
> Here it only needs someone to hold the rq_lock and allocate some memory.

Is the scheduler really allocating while holding the rq lock?

> This is
> also true for port_lock. Since the deadlock could involve a lot of CPUs and a
> longer lock chain, it is impossible to predict which allocation made
> while holding a lock could end up in the same problematic lock chain.

And that is exactly what I've said earlier. Locks used by consoles should
really better be tail locks because otherwise they are going to create
arbitrary dependency chains. The zone->lock is in no way special here.

> > So again, crippling the MM code just because of lockdep false positives
> > or a broken console driver sounds like a wrong way to approach the
> > problem.
> >
> > > [  297.425964] -> #1 (&port_lock_key){-.-.}:
> > > [  297.425967]        __lock_acquire+0x5b3/0xb40
> > > [  297.425967]        lock_acquire+0x126/0x280
> > > [  297.425968]        _raw_spin_lock_irqsave+0x3a/0x50
> > > [  297.425969]        serial8250_console_write+0x3e4/0x450
> > > [  297.425970]        univ8250_console_write+0x4b/0x60
> > > [  297.425970]        console_unlock+0x501/0x750
> > > [  297.425971]        vprintk_emit+0x10d/0x340
> > > [  297.425972]        vprintk_default+0x1f/0x30
> > > [  297.425972]        vprintk_func+0x44/0xd4
> > > [  297.425973]        printk+0x9f/0xc5
> > > [  297.425974]        register_console+0x39c/0x520
> > > [  297.425975]        univ8250_console_init+0x23/0x2d
> > > [  297.425975]        console_init+0x338/0x4cd
> > > [  297.425976]        start_kernel+0x534/0x724
> > > [  297.425977]        x86_64_start_reservations+0x24/0x26
> > > [  297.425977]        x86_64_start_kernel+0xf4/0xfb
> > > [  297.425978]        secondary_startup_64+0xb6/0xc0
> >
> > This is early init code again so the lockdep report sounds like a false
> > positive to me.
>
> This is just the tip of the iceberg to show the lock dependency,

Does this tip point to a real deadlock or merely a class of lockdep
false dependencies?

> console_owner --> port_lock_key
>
> which could easily happen everywhere with a simple printk().

-- 
Michal Hocko
SUSE Labs
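
For readers following the four-CPU example quoted above, here is a minimal
sketch of why that chain is a deadlock candidate. It is a toy model in
Python, not lockdep's actual algorithm: each pair is read as "this lock is
held while that one is acquired", and a depth-first search looks for a cycle.
The lock names come from the example in this thread; find_cycle and CHAIN are
hypothetical names made up for illustration.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle in the 'held -> then acquired' graph, or None."""
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    visiting = []  # locks on the current DFS path, in acquisition order
    done = set()   # locks already proven not to lead into a cycle

    def dfs(node):
        if node in visiting:
            # Reached a lock that is already on the path: that closes a cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in graph[node]:
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    # Iterate over a copy, since dfs() may add leaf nodes to the defaultdict.
    for start in list(graph):
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


# Assumed reading of the quoted example: one (held, wanted) pair per CPU.
CHAIN = [
    ("console_sem", "pi_lock"),    # CPU0
    ("pi_lock", "rq_lock"),        # CPU1
    ("rq_lock", "zone_lock"),      # CPU2
    ("zone_lock", "console_sem"),  # CPU3: printk() with zone->lock held
]

if __name__ == "__main__":
    print(find_cycle(CHAIN))       # the full four-lock cycle
    print(find_cycle(CHAIN[:-1]))  # None once the CPU3 edge is gone
```

In this model, removing any single edge breaks the cycle. Dropping the CPU3
edge corresponds to not calling printk() with zone->lock held, while dropping
the CPU0 edge corresponds to console locks being tail locks, i.e. nothing
else is acquired under them.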