Date: Thu, 22 Jun 2023 14:09:03 +0200
From: Michal Hocko
To: Tetsuo Handa
Cc: Sebastian Andrzej Siewior, linux-mm@kvack.org, "Luis Claudio R. Goncalves", Andrew Morton, Mel Gorman, Thomas Gleixner, Petr Mladek
Subject: Re: [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().
In-Reply-To: <7758a46f-69a9-c585-53e0-9b1b220b75c0@I-love.SAKURA.ne.jp>

On Thu 22-06-23 19:58:33, Tetsuo Handa wrote:
> On 2023/06/22 16:18, Michal Hocko wrote:
> >>> It is explained as the first deadlock scenario in commit 1007843a9190
> >>> ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock").
> >>> We have to disable IRQ before making zonelist_update_seq.seqcount odd.
> >>>
> >>
> >> Since we must replace local_irq_save() + write_seqlock() with write_seqlock_irqsave() for
> >> CONFIG_PREEMPT_RT=y case but we must not replace local_irq_save() + write_seqlock() with
> >> write_seqlock_irqsave() for CONFIG_PREEMPT_RT=n case, the proper fix is something like below?
> >
> > Now, I am confused. Why write_seqlock_irqsave is not allowed for !RT?
> > Let me quote the changelog and the scenario 1:
> >  write_seqlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount odd
> >  // e.g. timer interrupt handler runs at this moment
> >    some_timer_func() {
> >      kmalloc(GFP_ATOMIC) {
> >        __alloc_pages_slowpath() {
> >          read_seqbegin(&zonelist_update_seq) {
> >            // spins forever because zonelist_update_seq.seqcount is odd
> >          }
> >        }
> >      }
> >    }
> >  // e.g. timer interrupt handler finishes
> >  write_sequnlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount even
> >
> > This is clearly impossible with write_seqlock_irqsave as interrupts are
> > disabled before the lock is taken.
>
> Well, it seems that "I don't want to replace" rather than "we must not replace".

OK, so this is an alternative fix rather than the proposed fix being
incorrect.

> I reread the thread but I couldn't find why nobody suggested write_seqlock_irqsave().
> The reason I proposed the
>
>   local_irq_save() => printk_deferred_enter() => write_seqlock()
>
> ordering implies a precaution in case write_seqlock() involves printk() (e.g. lockdep,
> KCSAN, soft-lockup warning), in addition to "local_irq_save() before printk_deferred_enter()"
> requirement. Maybe people in that thread were happy with preserving this precaution...

Precaution is a fair argument. I am not sure it is the strongest one to
justify the ugly RT special casing though. I would propose to go with
Sebastian's patch as a clear fix and if you really care about the
precaution then make sure you describe potential problems.

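For illustration only, and not taken from the changelog: a minimal
userspace sketch of scenario 1 quoted above. It models a single CPU with
a sequence counter and one pending timer interrupt. Every sim_* name is
hypothetical and exists only in this sketch; it is not kernel code and
ignores the spinlock half of the seqlock as well as PREEMPT_RT. The only
point it tries to show is the ordering: if interrupts are off before the
counter goes odd, the handler cannot observe the odd value.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these sim_* names exist in the kernel. */
struct sim_cpu {
	unsigned int seq;	/* models zonelist_update_seq.seqcount */
	bool irqs_off;		/* models the local IRQ-disabled state */
	bool irq_pending;	/* a timer interrupt waiting to be delivered */
	bool reader_stuck;	/* set if the handler observed an odd seq */
};

/* Models some_timer_func() -> kmalloc(GFP_ATOMIC) -> read_seqbegin(). */
static void sim_timer_irq(struct sim_cpu *c)
{
	/*
	 * A real reader would spin while seq is odd. On the same CPU the
	 * interrupted writer can never resume, so the sketch only records
	 * the condition instead of spinning.
	 */
	if (c->seq & 1)
		c->reader_stuck = true;
}

/* Deliver the pending interrupt, but only if IRQs are enabled. */
static void sim_irq_window(struct sim_cpu *c)
{
	if (c->irq_pending && !c->irqs_off) {
		c->irq_pending = false;
		sim_timer_irq(c);
	}
}

static void sim_write_begin(struct sim_cpu *c)
{
	c->seq++;		/* seq becomes odd */
	assert(c->seq & 1);
}

static void sim_write_end(struct sim_cpu *c)
{
	assert(c->seq & 1);
	c->seq++;		/* seq becomes even again */
}

/*
 * Run one update with an interrupt arriving inside the write section.
 * irq_safe selects the "IRQs off first" ordering. Returns true if the
 * handler would have spun forever.
 */
static bool sim_update(bool irq_safe)
{
	struct sim_cpu c = { .irq_pending = true };

	if (irq_safe)
		c.irqs_off = true;	/* IRQs disabled before seq goes odd */
	sim_write_begin(&c);
	sim_irq_window(&c);		/* interrupt arrives mid-update */
	sim_write_end(&c);
	c.irqs_off = false;		/* IRQs restored after seq is even */
	sim_irq_window(&c);		/* a deferred interrupt runs now */
	return c.reader_stuck;
}
```

In this model sim_update(false) reports the stuck reader and
sim_update(true) does not, because the interrupt is only delivered after
the counter is even again.
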
> You commented
>
>   There shouldn't be any other locks (apart from hotplug) taken in that path IIRC.
>
> at https://lkml.kernel.org/ZCrYQj+2/uMtqNBm@dhcp22.suse.cz .
>
> If __build_all_zonelists() is already serialized by hotplug lock, we don't
> need to call spin_lock(&zonelist_update_seq.lock) and we will be able to
> replace write_seqlock(&zonelist_update_seq) with
> write_seqcount_begin(&zonelist_update_seq.seqcount) like
> cpuset_change_task_nodemask() does?

Maybe, I haven't really dived into this deeper. One way or the other RT
requires special IRQ handling along with the seq lock, no?
-- 
Michal Hocko
SUSE Labs