From: Wupeng Ma <mawupeng1@huawei.com>
To: linux-mm@kvack.org
Subject: [RFC PATCH] mm: Drain PCP during direct reclaim
Date: Fri, 6 Jun 2025 14:59:30 +0800
Message-ID: <20250606065930.3535912-1-mawupeng1@huawei.com>
Memory retained in Per-CPU Pages (PCP) caches can prevent hugepage
allocations from succeeding despite sufficient free system memory. This
occurs because:

1. Hugepage allocations don't actively trigger PCP draining
2. The direct reclaim path fails to trigger drain_all_pages() when:
   a) All zone pages are free/hugetlb (!did_some_progress)
   b) Compaction is skipped due to costly-order watermarks (COMPACT_SKIPPED)

Reproduction:
- Allocate a page and free it via put_page() so it is released to the PCP
- Observe hugepage reservation failure

Solution: Actively drain the PCP lists during direct reclaim. This
increases the page allocation success rate by making stranded pages
available to allocations of any order.
Verification:
This issue can be reproduced easily in ZONE_MOVABLE with the following
steps:

w/o this patch:
# numactl -m 2 dd if=/dev/urandom of=/dev/shm/testfile bs=4k count=64
# rm -f /dev/shm/testfile
# sync
# echo 3 > /proc/sys/vm/drop_caches
# echo 2048 > /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
# cat /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
2029

w/ this patch:
# numactl -m 2 dd if=/dev/urandom of=/dev/shm/testfile bs=4k count=64
# rm -f /dev/shm/testfile
# sync
# echo 3 > /proc/sys/vm/drop_caches
# echo 2048 > /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
# cat /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
2047

Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
---
 mm/page_alloc.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ef3c07266b3..464f2e48651e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4137,28 +4137,22 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 {
 	struct page *page = NULL;
 	unsigned long pflags;
-	bool drained = false;
 
 	psi_memstall_enter(&pflags);
 	*did_some_progress = __perform_reclaim(gfp_mask, order, ac);
-	if (unlikely(!(*did_some_progress)))
-		goto out;
-
-retry:
-	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
+	if (likely(*did_some_progress))
+		page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 
 	/*
 	 * If an allocation failed after direct reclaim, it could be because
 	 * pages are pinned on the per-cpu lists or in high alloc reserves.
 	 * Shrink them and try again
 	 */
-	if (!page && !drained) {
+	if (!page) {
 		unreserve_highatomic_pageblock(ac, false);
 		drain_all_pages(NULL);
-		drained = true;
-		goto retry;
+		page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 	}
-out:
 	psi_memstall_leave(&pflags);
 
 	return page;
-- 
2.43.0