From: Tianyang Zhang
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Tianyang Zhang
Subject: [PATCH] mm/page_alloc.c: Avoid infinite retries caused by cpuset race
Date: Wed, 16 Apr 2025 16:24:05 +0800
Message-Id: <20250416082405.20988-1-zhangtianyang@loongson.cn>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

__alloc_pages_slowpath() does not detect changes to ac->nodemask on the
retry path, although cpuset can modify it in parallel. For processes
whose mempolicy is MPOL_BIND, ac->nodemask can therefore change under
the allocator: should_reclaim_retry() then decides based on the latest
nodemask and jumps to retry, while get_page_from_freelist() only
traverses the zonelist starting from ac->preferred_zoneref, which was
selected using the stale nodemask. In some cases this causes infinite
retries.

	cpu 64:
	__alloc_pages_slowpath {
		/* ..... */
	retry:
		/* ac->nodemask = 0x1, ac->preferred->zone->nid = 1 */
		if (alloc_flags & ALLOC_KSWAPD)
			wake_all_kswapds(order, gfp_mask, ac);
		/* cpu 1:
		cpuset_write_resmask
		    update_nodemask
			update_nodemasks_hier
			    update_tasks_nodemask
				mpol_rebind_task
				 mpol_rebind_policy
				  mpol_rebind_nodemask
		// mempolicy->nodes has been modified,
		// which ac->nodemask points to
		*/
		/* ac->nodemask = 0x3, ac->preferred->zone->nid = 1 */
		if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
					 did_some_progress > 0, &no_progress_loops))
			goto retry;
	}

Starting multiple instances of the LTP cpuset01 test simultaneously
reproduces this issue quickly on a multi-node server once memory
pressure peaks and swap is enabled.

Signed-off-by: Tianyang Zhang
---
 mm/page_alloc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fd6b865cb1ab..1e82f5214a42 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4530,6 +4530,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	}
 
 retry:
+	/*
+	 * Deal with possible cpuset update races or zonelist updates to avoid
+	 * infinite retries.
+	 */
+	if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
+	    check_retry_zonelist(zonelist_iter_cookie))
+		goto restart;
+
 	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
 	if (alloc_flags & ALLOC_KSWAPD)
 		wake_all_kswapds(order, gfp_mask, ac);
-- 
2.20.1
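
For illustration only, the race described in the commit message can be
modelled in plain userspace C. This is a simplified, single-threaded
sketch and not kernel code: struct model, make_model(), first_allowed(),
try_alloc(), should_retry(), slowpath() and MAX_LOOPS are all
hypothetical names, the "concurrent" cpuset write is simulated inline,
and the plain sequence-counter comparison merely stands in for the
cookie checks the patch adds at the retry label. The model assumes a
zonelist ordered node 1 then node 0, with free pages only on node 1.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES  2
#define MAX_LOOPS 1000	/* cap so the unfixed variant terminates in this model */

/* Hypothetical stand-in for allocator state; not a real kernel type. */
struct model {
	unsigned long nodemask;		/* bit n set => node n is allowed */
	unsigned int seq;		/* bumped on every nodemask update */
	int zonelist[NR_NODES];		/* node ids in fallback order */
	int free_pages[NR_NODES];	/* free pages, indexed by node id */
};

/* Assumed starting state: only node 0 allowed, but only node 1 has memory. */
static struct model make_model(void)
{
	struct model m = { .nodemask = 0x1, .seq = 0,
			   .zonelist = { 1, 0 }, .free_pages = { 0, 4 } };
	return m;
}

/* First zonelist index whose node is allowed by mask, or -1 if none. */
static int first_allowed(const struct model *m, unsigned long mask)
{
	for (int i = 0; i < NR_NODES; i++)
		if (mask & (1UL << m->zonelist[i]))
			return i;
	return -1;
}

/* Scans only from the cached preferred index, like the freelist walk. */
static bool try_alloc(struct model *m, int preferred)
{
	for (int i = preferred; i >= 0 && i < NR_NODES; i++) {
		int nid = m->zonelist[i];

		if ((m->nodemask & (1UL << nid)) && m->free_pages[nid] > 0) {
			m->free_pages[nid]--;
			return true;
		}
	}
	return false;
}

/* Scans the whole zonelist with the current mask, like the retry check. */
static bool should_retry(const struct model *m)
{
	for (int i = 0; i < NR_NODES; i++) {
		int nid = m->zonelist[i];

		if ((m->nodemask & (1UL << nid)) && m->free_pages[nid] > 0)
			return true;
	}
	return false;
}

/*
 * Returns the number of loop iterations on success, 0 if the allocation
 * fails without retrying, or -1 if MAX_LOOPS is exceeded (a livelock).
 */
static int slowpath(struct model *m, bool check_cookie)
{
	unsigned int cookie;
	int preferred, loops = 0;
	bool raced = false;

restart:
	cookie = m->seq;
	preferred = first_allowed(m, m->nodemask);
retry:
	if (check_cookie && cookie != m->seq)
		goto restart;		/* mask changed: recompute preferred */
	if (++loops > MAX_LOOPS)
		return -1;
	if (try_alloc(m, preferred))
		return loops;
	if (!raced) {			/* simulated parallel cpuset write */
		m->nodemask = 0x3;
		m->seq++;
		raced = true;
	}
	if (should_retry(m))
		goto retry;
	return 0;
}
```

In this model, slowpath(&m, false) keeps retrying with the stale
preferred index until it hits MAX_LOOPS, while slowpath(&m, true)
notices the changed counter, restarts, and allocates from node 1 on the
next pass.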