Message-ID: <30d2c07a-e928-64f0-361e-60f8a05db815@suse.cz>
Date: Mon, 31 Jul 2023 22:25:35 +0200
Subject: Re: [PATCH] mm: compaction: fix endless looping over same migrate block
To: Johannes Weiner, Andrew Morton
Cc: Mel Gorman, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20230731172450.1632195-1-hannes@cmpxchg.org>
From: Vlastimil Babka
In-Reply-To: <20230731172450.1632195-1-hannes@cmpxchg.org>

On 7/31/23 19:24, Johannes Weiner wrote:
> During stress testing, the following situation was observed:
>
>      70 root      39  19       0      0      0 R 100.0   0.0 959:29.92 khugepaged
>  310936 root      20   0   84416  25620    512 R  99.7   1.5 642:37.22 hugealloc
>
> Tracing shows isolate_migratepages_block() endlessly looping over the
> first block in the DMA zone:
>
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA order=9 ret=no_suitable_page
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA order=9 ret=no_suitable_page
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA order=9 ret=no_suitable_page
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA order=9 ret=no_suitable_page
>   hugealloc-310936 [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0
>
> The problem is that the function tries to test and set the skip bit
> once on the block, to avoid skipping on its own skip-set, using
> pageblock_aligned() on the pfn as a test. But because this is the DMA
> zone which starts at pfn 1, this is never true for the first block,
> and the skip bit isn't set or tested at all. As a result,
> fast_find_migrateblock() returns the same pageblock over and over.
>
> If the pfn isn't pageblock-aligned, also check if it's the start of
> the zone to ensure test-and-set-exactly-once on unaligned ranges.
>
> Thanks to Vlastimil Babka for the help in debugging this.
>
> Fixes: 90ed667c03fe ("Revert "Revert "mm/compaction: fix set skip in fast_find_migrateblock""")

Yeah, I suggested this commit for Fixes: because before it (or the
previous, reverted attempt) the skip bit would be set in
fast_find_migrateblock(). So even though the issue of not handling
unaligned zones properly is older, it wouldn't have caused an endless loop
before that commit. Since 90ed667c03fe is only in rc1, we don't need stable.

> Signed-off-by: Johannes Weiner

Reviewed-by: Vlastimil Babka

> ---
>  mm/compaction.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index dbc9f86b1934..eacca2794e47 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -912,11 +912,12 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  
>  		/*
>  		 * Check if the pageblock has already been marked skipped.
> -		 * Only the aligned PFN is checked as the caller isolates
> +		 * Only the first PFN is checked as the caller isolates
>  		 * COMPACT_CLUSTER_MAX at a time so the second call must
>  		 * not falsely conclude that the block should be skipped.
>  		 */
> -		if (!valid_page && pageblock_aligned(low_pfn)) {
> +		if (!valid_page && (pageblock_aligned(low_pfn) ||
> +				    low_pfn == cc->zone->zone_start_pfn)) {
>  			if (!isolation_suitable(cc, page)) {
>  				low_pfn = end_pfn;
>  				folio = NULL;
> @@ -2002,7 +2003,8 @@ static isolate_migrate_t isolate_migratepages(struct compact_control *cc)
>  		 * before making it "skip" so other compaction instances do
>  		 * not scan the same block.
>  		 */
> -		if (pageblock_aligned(low_pfn) &&
> +		if ((pageblock_aligned(low_pfn) ||
> +		     low_pfn == cc->zone->zone_start_pfn) &&
>  		    !fast_find_block && !isolation_suitable(cc, page))
>  			continue;
>  
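
For illustration only, the effect of the changed condition can be modelled
in user space. This is a rough sketch, not kernel code: PAGEBLOCK_NR_PAGES
is assumed to be 512 (a pageblock order of 9), and old_check(), new_check()
and count_checks() are hypothetical helpers that only mirror the two
conditions from the diff. count_checks() counts how many pfns in a range
would trigger the skip-bit test.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: 512 pages per pageblock (order 9). */
#define PAGEBLOCK_NR_PAGES 512UL

/* User-space stand-in for pageblock_aligned(): pfn starts a pageblock. */
static bool pageblock_aligned(unsigned long pfn)
{
	return (pfn & (PAGEBLOCK_NR_PAGES - 1)) == 0;
}

/* Condition before the patch: only aligned pfns trigger the skip test. */
static bool old_check(unsigned long low_pfn)
{
	return pageblock_aligned(low_pfn);
}

/* Condition after the patch: the zone's first pfn triggers it as well. */
static bool new_check(unsigned long low_pfn, unsigned long zone_start_pfn)
{
	return pageblock_aligned(low_pfn) || low_pfn == zone_start_pfn;
}

/*
 * Count the pfns in [start, end) at which the skip bit would be tested.
 * Exactly one per pageblock is what the patch description asks for.
 */
static unsigned long count_checks(unsigned long zone_start_pfn,
				  unsigned long start, unsigned long end,
				  bool use_new)
{
	unsigned long pfn, n = 0;

	assert(start <= end);
	for (pfn = start; pfn < end; pfn++) {
		bool hit = use_new ? new_check(pfn, zone_start_pfn)
				   : old_check(pfn);
		if (hit)
			n++;
	}
	return n;
}
```

With a zone starting at pfn 1, count_checks(1, 1, 512, false) gives 0 for
the first (partial) block, i.e. the skip bit is never tested there, while
count_checks(1, 1, 512, true) gives 1. For a fully aligned block such as
[512, 1024) both variants give 1, so aligned blocks behave as before.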