Date: Tue, 23 Aug 2022 15:25:05 +0200
From: Michal Hocko
To: Mel Gorman
Cc: David Hildenbrand, Patrick Daly, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Juergen Gross
Subject: Re: Race condition in build_all_zonelists() when offlining movable zone
In-Reply-To: <20220823125850.o3nhkmikmv7vyxq4@suse.de>

On Tue 23-08-22 13:58:50, Mel Gorman wrote:
> On Tue, Aug 23, 2022 at 02:18:27PM +0200, Michal Hocko wrote:
> > On Tue 23-08-22 12:09:46, Mel Gorman wrote:
> > > On Tue, Aug 23, 2022 at 12:34:09PM +0200, David Hildenbrand wrote:
> > > > > @@ -6553,7 +6576,7 @@ static void __build_all_zonelists(void *data)
> > > > >  #endif
> > > > >  	}
> > > > >
> > > > > -	spin_unlock(&lock);
> > > > > +	write_sequnlock(&zonelist_update_seq);
> > > > >  }
> > > > >
> > > > >  static noinline void __init
> > > > >
> > > >
> > > > LGTM. The "retry_cpuset" label might deserve a better name now.
> > > >
> > >
> > > Good point ... "restart"?
> > >
> > > > Would
> > > >
> > > > Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones
> > > > with pages managed by the buddy allocator")
> > > >
> > > > be correct?
> > > >
> > >
> > > Not specifically because the bug is due to a zone being completely removed
> > > resulting in a rebuild. This race probably existed ever since memory
> > > hotremove could theoretically remove a complete zone. A Cc: Stable would
> > > be appropriate as it'll apply with fuzz back to at least 5.4.210 but beyond
> > > that, it should be driven by a specific bug report showing that hot-remove
> > > of a full zone was possible and triggered the race.
> >
> > I do not think so. 6aa303defb74 has changed the zonelist building and
> > changed the check from pfn range (populated) to managed (with a memory).
>
> I'm not 100% convinced. The present_pages should have been the spanned range
> minus any holes that exist in the zone. If the zone is completely removed,
> the span should be zero meaning present and managed are both zero. No?

IIRC (and David will correct me if I am mixing this up), the difference
is that zonelists are rebuilt during memory offlining, and that is when
managed pages are removed from the allocator. The zone itself still has
that physical range populated, and so this patch would have made a
difference.

Now, you are right that this is likely possible even without that commit,
but it is highly unlikely because physical hotremove is a very rare
operation and the race window would be so large that it would likely be
infeasible.
-- 
Michal Hocko
SUSE Labs
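
To make the populated-vs-managed distinction above concrete, here is a
minimal userspace sketch. It is a toy model, not kernel code: the field
names mirror struct zone and the two predicates mirror populated_zone()
and managed_zone(), but struct zone_model and the model_* helpers are
hypothetical, and the offlining behaviour is modelled only as described in
this reply (managed pages drop to zero while the physical range is still
populated).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the three page counters of a zone (assumption: simplified). */
struct zone_model {
	unsigned long spanned_pages;	/* pfn range covered, holes included */
	unsigned long present_pages;	/* spanned range minus holes */
	unsigned long managed_pages;	/* pages handed to the buddy allocator */
};

/* Pre-6aa303defb74 style check: does the zone have a populated pfn range? */
static bool populated_zone(const struct zone_model *z)
{
	return z->present_pages != 0;
}

/* Post-6aa303defb74 style check: does the allocator manage any pages? */
static bool managed_zone(const struct zone_model *z)
{
	return z->managed_pages != 0;
}

/* Hypothetical helper: offline every page, leaving the physical range in place. */
static void model_offline_all(struct zone_model *z)
{
	assert(z->managed_pages <= z->present_pages);
	z->managed_pages = 0;
}

/* Hypothetical helper: physical hot-remove of the whole zone. */
static void model_hotremove_all(struct zone_model *z)
{
	assert(z->present_pages <= z->spanned_pages);
	z->spanned_pages = 0;
	z->present_pages = 0;
	z->managed_pages = 0;
}
```

In this model, a zone that has only been offlined still passes
populated_zone() but fails managed_zone(), so a zonelist rebuild keyed on
the managed check drops it at offline time. Under the populated check the
zone would only disappear after model_hotremove_all(), which corresponds
to the much rarer physical hot-remove case.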