Message-ID: <8fd1a56d-5a22-4bde-59a5-169a4696219e@suse.cz>
Date: Wed, 24 May 2023 11:21:43 +0200
Subject: Re: [PATCH] mm: compaction: avoid GFP_NOFS ABBA deadlock
To: Johannes Weiner, Andrew Morton
Cc: Mel Gorman, Michal Hocko, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kernel-team@fb.com
References: <20230519111359.40475-1-hannes@cmpxchg.org>
From: Vlastimil Babka
In-Reply-To: <20230519111359.40475-1-hannes@cmpxchg.org>

On 5/19/23 13:13, Johannes Weiner wrote:
> During stress testing with higher-order allocations, a deadlock
> scenario was observed in compaction: One GFP_NOFS allocation was
> sleeping on mm/compaction.c::too_many_isolated(), while all CPUs in
> the system were busy with compactors spinning on buffer locks held by
> the sleeping GFP_NOFS allocation.
>
> Reclaim is susceptible to this same deadlock; we fixed it by granting
> GFP_NOFS allocations additional LRU isolation headroom, to ensure it
> makes forward progress while holding fs locks that other reclaimers
> might acquire. Do the same here.
>
> This code has been like this since compaction was initially merged,
> and I only managed to trigger this with out-of-tree patches that
> dramatically increase the contexts that do GFP_NOFS compaction. While
> the issue is real, it seems theoretical in nature given existing
> allocation sites. Worth fixing now, but no Fixes tag or stable CC.
>
> Signed-off-by: Johannes Weiner

So IIUC the change is done by not giving GFP_NOFS extra headroom, but
instead restricting the headroom of __GFP_FS allocations. But the
original one was probably too generous anyway so it should be fine?

Acked-by: Vlastimil Babka

> ---
>  mm/compaction.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> v2:
> - clarify too_many_isolated() comment (Mel)
> - split isolation deadlock from no-contiguous-anon lockups as that's
>   a different scenario and deserves its own patch
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index c8bcdea15f5f..c9a4b6dffcf2 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -745,8 +745,9 @@ isolate_freepages_range(struct compact_control *cc,
>  }
>
>  /* Similar to reclaim, but different enough that they don't share logic */
> -static bool too_many_isolated(pg_data_t *pgdat)
> +static bool too_many_isolated(struct compact_control *cc)
>  {
> +	pg_data_t *pgdat = cc->zone->zone_pgdat;
>  	bool too_many;
>
>  	unsigned long active, inactive, isolated;
> @@ -758,6 +759,17 @@ static bool too_many_isolated(pg_data_t *pgdat)
>  	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
>  			node_page_state(pgdat, NR_ISOLATED_ANON);
>
> +	/*
> +	 * Allow GFP_NOFS to isolate past the limit set for regular
> +	 * compaction runs. This prevents an ABBA deadlock when other
> +	 * compactors have already isolated to the limit, but are
> +	 * blocked on filesystem locks held by the GFP_NOFS thread.
> +	 */
> +	if (cc->gfp_mask & __GFP_FS) {
> +		inactive >>= 3;
> +		active >>= 3;
> +	}
> +
>  	too_many = isolated > (inactive + active) / 2;
>  	if (!too_many)
>  		wake_throttle_isolated(pgdat);
> @@ -806,7 +818,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  	 * list by either parallel reclaimers or compaction. If there are,
>  	 * delay for some time until fewer pages are isolated
>  	 */
> -	while (unlikely(too_many_isolated(pgdat))) {
> +	while (unlikely(too_many_isolated(cc))) {
>  		/* stop isolation if there are still pages not migrated */
>  		if (cc->nr_migratepages)
>  			return -EAGAIN;
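
To make the headroom arithmetic in the quoted hunk concrete, here is a
minimal userspace sketch of the limit computation. It is an illustration
under stated assumptions, not kernel code: MODEL_GFP_FS,
isolation_limit(), too_many_isolated_model() and
isolation_model_selfcheck() are hypothetical names, and the page counts
are made up. It only mirrors the shifts and the division shown in the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __GFP_FS bit (value is illustrative). */
#define MODEL_GFP_FS 0x80u

/*
 * Isolation limit as computed in the quoted hunk: callers that may enter
 * the filesystem (__GFP_FS set) get both LRU counts shifted right by 3
 * before the sum is halved; callers without it (e.g. GFP_NOFS) keep the
 * original (inactive + active) / 2 limit.
 */
unsigned long isolation_limit(unsigned long active, unsigned long inactive,
			      unsigned int gfp_mask)
{
	if (gfp_mask & MODEL_GFP_FS) {
		inactive >>= 3;
		active >>= 3;
	}
	return (inactive + active) / 2;
}

/* Model of the throttling decision: true means the caller has to wait. */
bool too_many_isolated_model(unsigned long isolated, unsigned long active,
			     unsigned long inactive, unsigned int gfp_mask)
{
	return isolated > isolation_limit(active, inactive, gfp_mask);
}

/* Self-check with made-up counts: 8000 active and 8000 inactive pages. */
void isolation_model_selfcheck(void)
{
	/* __GFP_FS caller: (1000 + 1000) / 2 = 1000 pages of headroom. */
	assert(isolation_limit(8000, 8000, MODEL_GFP_FS) == 1000);
	/* NOFS caller: (8000 + 8000) / 2 = 8000 pages of headroom. */
	assert(isolation_limit(8000, 8000, 0) == 8000);
	/* With 1500 pages isolated, only the __GFP_FS caller is throttled. */
	assert(too_many_isolated_model(1500, 8000, 8000, MODEL_GFP_FS));
	assert(!too_many_isolated_model(1500, 8000, 8000, 0));
}
```

In this model the NOFS limit is unchanged from before the patch, while
the __GFP_FS limit drops to roughly one eighth of it, which matches the
reading that the patch restricts __GFP_FS callers rather than raising
the GFP_NOFS limit.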