From: Tetsuo Handa
Date: Thu, 22 Jun 2023 19:58:33 +0900
Subject: Re: [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().
To: Michal Hocko
Cc: Sebastian Andrzej Siewior, linux-mm@kvack.org, "Luis Claudio R. Goncalves", Andrew Morton, Mel Gorman, Thomas Gleixner, Petr Mladek
Message-ID: <7758a46f-69a9-c585-53e0-9b1b220b75c0@I-love.SAKURA.ne.jp>
References: <20230621104034.HT6QnNkQ@linutronix.de> <0e9fc992-8e05-2e63-b3b1-d8d3ce89fc16@I-love.SAKURA.ne.jp> <20230621130641.-5iueY1I@linutronix.de> <20230621143421.BgHjJklo@linutronix.de> <01031ffe-c81f-9cec-76fb-e70d548429cf@I-love.SAKURA.ne.jp> <8b6d3f39-c573-ca2b-957b-8c48c2fa68ad@I-love.SAKURA.ne.jp>

On 2023/06/22 16:18, Michal Hocko wrote:
>>> It is explained as the first deadlock scenario in commit 1007843a9190
>>> ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock").
>>> We have to disable IRQ before making zonelist_update_seq.seqcount odd.
>>>
>>
>> Since we must replace local_irq_save() + write_seqlock() with write_seqlock_irqsave() for
>> CONFIG_PREEMPT_RT=y case but we must not replace local_irq_save() + write_seqlock() with
>> write_seqlock_irqsave() for CONFIG_PREEMPT_RT=n case, the proper fix is something like below?
>
> Now, I am confused. Why write_seqlock_irqsave is not allowed for !RT?
> Let me quote the changelog and the scenario 1:
>         write_seqlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount odd
>         // e.g. timer interrupt handler runs at this moment
>           some_timer_func() {
>             kmalloc(GFP_ATOMIC) {
>               __alloc_pages_slowpath() {
>                 read_seqbegin(&zonelist_update_seq) {
>                   // spins forever because zonelist_update_seq.seqcount is odd
>                 }
>               }
>             }
>           }
>         // e.g. timer interrupt handler finishes
>         write_sequnlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount even
>
> This is clearly impossible with write_seqlock_irqsave as interrupts are
> disabled before the lock is taken.

Well, it seems that it was "I don't want to replace" rather than "we must not replace".
I reread the thread but I couldn't find why nobody suggested write_seqlock_irqsave().
I proposed the

  local_irq_save() => printk_deferred_enter() => write_seqlock()

ordering as a precaution in case write_seqlock() involves printk() (e.g. lockdep,
KCSAN, soft-lockup warning), in addition to the "local_irq_save() before
printk_deferred_enter()" requirement. Maybe people in that thread were happy with
preserving this precaution...

You commented

  There shouldn't be any other locks (apart from hotplug) taken in that path IIRC.

at https://lkml.kernel.org/ZCrYQj+2/uMtqNBm@dhcp22.suse.cz . If __build_all_zonelists() is
already serialized by the hotplug lock, we don't need to call
spin_lock(&zonelist_update_seq.lock), and we will be able to replace
write_seqlock(&zonelist_update_seq) with
write_seqcount_begin(&zonelist_update_seq.seqcount) like cpuset_change_task_nodemask() does?
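
To make scenario 1 above concrete, here is a toy, single-threaded userspace model of it. This is only a sketch and not kernel code: every name (toy_seqlock, toy_timer_handler, toy_irq_point and so on) is hypothetical, the "interrupt" is simulated by a pending flag, and the handler merely records that it would have spun instead of actually spinning. It models only the "disable IRQs before the count becomes odd" rule; it says nothing about PREEMPT_RT or printk_deferred_enter().

```c
#include <stdbool.h>

/* Toy model only: all names below are hypothetical, not kernel APIs. */
struct toy_seqlock { unsigned int seq; };

static struct toy_seqlock zl;          /* stands in for zonelist_update_seq */
static bool irqs_enabled = true;       /* simulated local IRQ state */
static bool irq_pending;               /* a simulated timer IRQ is waiting */
static int handler_would_spin;         /* handler saw an odd count */
static int handler_ok;                 /* handler saw an even count */

/* Models read_seqbegin() inside a timer handler; records instead of spinning. */
static void toy_timer_handler(void)
{
    if (zl.seq & 1)
        handler_would_spin++;
    else
        handler_ok++;
}

/* A point where a pending simulated IRQ is delivered, if IRQs are enabled. */
static void toy_irq_point(void)
{
    if (irq_pending && irqs_enabled) {
        irq_pending = false;
        toy_timer_handler();
    }
}

static void toy_irq_disable(void) { irqs_enabled = false; }
static void toy_irq_enable(void)  { irqs_enabled = true; toy_irq_point(); }

static void toy_write_begin(void) { zl.seq++; }   /* count becomes odd */
static void toy_write_end(void)   { zl.seq++; }   /* count becomes even */

/* Scenario 1: the IRQ fires while the count is odd. */
static void update_without_irq_disable(void)
{
    toy_write_begin();
    irq_pending = true;
    toy_irq_point();        /* handler runs now and would spin forever */
    toy_write_end();
}

/* IRQs are disabled before the count becomes odd, so the IRQ is deferred. */
static void update_with_irq_disable(void)
{
    toy_irq_disable();
    toy_write_begin();
    irq_pending = true;
    toy_irq_point();        /* nothing happens: IRQs are off */
    toy_write_end();
    toy_irq_enable();       /* handler runs here and sees an even count */
}
```

Calling update_without_irq_disable() bumps handler_would_spin, while update_with_irq_disable() only bumps handler_ok, because the simulated handler cannot run until the count is even again.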