Date: Fri, 23 Jun 2023 20:17:40 +0200
From: Michal Hocko
To: Sebastian Andrzej Siewior
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Luis Claudio R. Goncalves", Andrew Morton, Boqun Feng, Ingo Molnar,
	John Ogness, Mel Gorman, Peter Zijlstra, Petr Mladek, Tetsuo Handa,
	Thomas Gleixner, Waiman Long, Will Deacon
Subject: Re: [PATCH v2 2/2] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().
References: <20230623171232.892937-1-bigeasy@linutronix.de> <20230623171232.892937-3-bigeasy@linutronix.de>
In-Reply-To: <20230623171232.892937-3-bigeasy@linutronix.de>

On Fri 23-06-23 19:12:32, Sebastian Andrzej Siewior wrote:
> __build_all_zonelists() acquires zonelist_update_seq by first disabling
> interrupts via local_irq_save() and then acquiring the seqlock with
> write_seqlock(). This is troublesome and leads to problems on
> PREEMPT_RT. The problem is that the inner spinlock_t becomes a sleeping
> lock on PREEMPT_RT and must not be acquired with disabled interrupts.
>
> The API provides write_seqlock_irqsave() which does the right thing in
> one step.
> printk_deferred_enter() has to be invoked in non-migrate-able context to
> ensure that deferred printing is enabled and disabled on the same CPU.
> This is the case after zonelist_update_seq has been acquired.
>
> There was discussion on the first submission that the order should be:
> 	local_irq_disable();
> 	printk_deferred_enter();
> 	write_seqlock();
>
> to avoid pitfalls like having an unaccounted printk() coming from
> write_seqlock_irqsave() before printk_deferred_enter() is invoked. The
> only origin of such a printk() can be a lockdep splat because the
> lockdep annotation happens after the sequence count is incremented.
> This is exceptional and subject to change.
>
> It was also pointed that PREEMPT_RT can be affected by the printk
> problem since its write_seqlock_irqsave() does not really disable
> interrupts.
> This isn't the case because PREEMPT_RT's printk
> implementation differs from the mainline implementation in two important
> aspects:
> - Printing happens in a dedicated threads and not at during the
>   invocation of printk().
> - In emergency cases where synchronous printing is used, a different
>   driver is used which does not use tty_port::lock.
>
> Acquire zonelist_update_seq with write_seqlock_irqsave() and then defer
> printk output.
>
> Fixes: 1007843a91909 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock")
> Signed-off-by: Sebastian Andrzej Siewior

Thanks for extending the changelog. This is much clearer IMO. One nit
below which I hadn't noticed before. Anyway
Acked-by: Michal Hocko

> ---
>  mm/page_alloc.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 47421bedc12b7..99b7e7d09c5c0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5808,11 +5808,10 @@ static void __build_all_zonelists(void *data)
>  	unsigned long flags;
>
>  	/*
> -	 * Explicitly disable this CPU's interrupts before taking seqlock
> -	 * to prevent any IRQ handler from calling into the page allocator
> -	 * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock.
> +	 * The zonelist_update_seq must be acquired with irqsave because the
> +	 * reader can be invoked from IRQ with GFP_ATOMIC.
>  	 */
> -	local_irq_save(flags);
> +	write_seqlock_irqsave(&zonelist_update_seq, flags);
>  	/*
>  	 * Explicitly disable this CPU's synchronous printk() before taking
>  	 * seqlock to prevent any printk() from trying to hold port->lock, for

This is not the case anymore because the lock ordering has flipped:
printk is now deferred after the seqlock has been taken, not before. I
would just extend the comment above with something like:
	 * Also disable synchronous printk() to prevent any printk() from trying
	 * to hold port->lock, for tty_insert_flip_string_and_push_buffer() on
	 * other CPU might be calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with
	 * port->lock held.

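For reference, the ordering the function ends up with after this patch can
be modelled in user space. This is only a sketch under stated assumptions:
the *_stub helpers, build_all_zonelists_model() and model_event_is() are
hypothetical names that merely record the call order. They are not the
kernel primitives and take no locks.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical user-space model: records call order only, no real locking. */
#define MAX_EVENTS 8

static const char *events[MAX_EVENTS];
static int nr_events;

static void record(const char *name)
{
	assert(nr_events < MAX_EVENTS);
	events[nr_events++] = name;
}

/* Stand-in for write_seqlock_irqsave(): lock and IRQ-off in one step. */
static void write_seqlock_irqsave_stub(unsigned long *flags)
{
	*flags = 1;	/* pretend this is the saved IRQ state */
	record("write_seqlock_irqsave");
}

/* Stand-in for write_sequnlock_irqrestore(). */
static void write_sequnlock_irqrestore_stub(unsigned long flags)
{
	(void)flags;
	record("write_sequnlock_irqrestore");
}

static void printk_deferred_enter_stub(void)
{
	record("printk_deferred_enter");
}

static void printk_deferred_exit_stub(void)
{
	record("printk_deferred_exit");
}

/* Mirrors the post-patch ordering: seqlock first, then defer printk. */
static void build_all_zonelists_model(void)
{
	unsigned long flags;

	nr_events = 0;
	write_seqlock_irqsave_stub(&flags);
	printk_deferred_enter_stub();
	record("rebuild zonelists");
	printk_deferred_exit_stub();
	write_sequnlock_irqrestore_stub(flags);
}

/* Returns 1 if the idx-th recorded event has the given name. */
static int model_event_is(int idx, const char *name)
{
	return idx < nr_events && strcmp(events[idx], name) == 0;
}
```

Calling build_all_zonelists_model() and then checking model_event_is(0,
"write_seqlock_irqsave") and model_event_is(1, "printk_deferred_enter")
shows the flipped order that makes the old "before taking seqlock" wording
in the comment stale.
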
> @@ -5820,7 +5819,6 @@ static void __build_all_zonelists(void *data)
>  	 * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held.
>  	 */
>  	printk_deferred_enter();
> -	write_seqlock(&zonelist_update_seq);
>
>  #ifdef CONFIG_NUMA
>  	memset(node_load, 0, sizeof(node_load));
> @@ -5857,9 +5855,8 @@ static void __build_all_zonelists(void *data)
>  #endif
>  	}
>
> -	write_sequnlock(&zonelist_update_seq);
>  	printk_deferred_exit();
> -	local_irq_restore(flags);
> +	write_sequnlock_irqrestore(&zonelist_update_seq, flags);
>  }
>
>  static noinline void __init
> --
> 2.40.1

-- 
Michal Hocko
SUSE Labs