Message-ID: <6dc8df31-eb01-4382-8467-c5510f75531e@linux.alibaba.com>
Date: Mon, 17 Jun 2024 19:36:13 +0800
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page
To: Barry Song <21cnbao@gmail.com>
Cc: yangge1116, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, liuzixing@hygon.cn, Johannes Weiner, Vlastimil Babka, Zi Yan
References: <1717492460-19457-1-git-send-email-yangge1116@126.com> <82d31425-86d7-16fa-d09b-fcb203de0986@126.com> <7087d0af-93d8-4d49-94f4-dc846a4e2b98@linux.alibaba.com>
From: Baolin Wang
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 2024/6/17 18:43, Barry Song wrote:
> On Thu, Jun 6, 2024 at 3:07 PM Baolin Wang wrote:
>>
>>
>>
>> On 2024/6/4 20:36, yangge1116 wrote:
>>>
>>>
>>> On 2024/6/4 8:01 PM, Baolin Wang wrote:
>>>> Cc Johannes, Zi and Vlastimil.
>>>>
>>>> On 2024/6/4 17:14, yangge1116@126.com wrote:
>>>>> From: yangge
>>>>>
>>>>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>>>> THP-sized allocations") no longer differentiates the migration type
>>>>> of pages in THP-sized PCP list, it's possible to get a CMA page from
>>>>> the list, in some cases, it's not acceptable, for example, allocating
>>>>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>>>>>
>>>>> The patch forbids allocating non-CMA THP-sized page from THP-sized
>>>>> PCP list to avoid the issue above.
>>>>>
>>>>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>>>> THP-sized allocations")
>>>>> Signed-off-by: yangge
>>>>> ---
>>>>>  mm/page_alloc.c | 10 ++++++++++
>>>>>  1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>> index 2e22ce5..0bdf471 100644
>>>>> --- a/mm/page_alloc.c
>>>>> +++ b/mm/page_alloc.c
>>>>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
>>>>>          WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>>>>>          if (likely(pcp_allowed_order(order))) {
>>>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>>> +                if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
>>>>> +                                                order != HPAGE_PMD_ORDER) {
>>>>
>>>> Seems you will also miss the non-CMA THP from the PCP, so I wonder if
>>>> we can add a migratetype comparison in __rmqueue_pcplist(), and if
>>>> it's not suitable, then fallback to buddy?
>>>
>>> Yes, we may miss some non-CMA THPs in the PCP. But, if add a migratetype
>>> comparison in __rmqueue_pcplist(), we may need to compare many times
>>> because of pcp batch.
>>
>> I mean we can only compare once, focusing on CMA pages.
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3734fe7e67c0..960a3b5744d8 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2973,6 +2973,11 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
>>                  }
>>
>>                  page = list_first_entry(list, struct page, pcp_list);
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +                if (order == HPAGE_PMD_ORDER && !is_migrate_movable(migratetype) &&
>> +                    is_migrate_cma(get_pageblock_migratetype(page)))
>> +                        return NULL;
>> +#endif
>
> This doesn't seem ideal either. It's possible that the PCP still has many
> non-CMA folios, but due to bad luck, the first entry is "always" CMA.
> In this case, allocations with is_migrate_movable(migratetype) == false
> will always lose the chance to use the PCP. It also appears to incur a
> PCP spin lock/unlock.

Yes, these are just some ideas to mitigate the issue...

>
> I don't see an ideal solution unless we bring back the CMA PCP :-)

I tend to agree, and the overhead of adding a CMA PCP list seems like it
could be acceptable?
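
To make the trade-off above concrete, here is a rough userspace model. It is
not kernel code: every toy_* name is made up for illustration, the "list" is a
plain array, and a single cma_allowed flag stands in for the real
migratetype/alloc_flags checks. Variant A compares only the first entry of one
shared list, as in the diff quoted above; variant B assumes a separate CMA
list, which is one possible reading of "bring back the CMA PCP".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy migratetypes; the names echo the kernel's, nothing else does. */
enum toy_migratetype { TOY_UNMOVABLE, TOY_MOVABLE, TOY_CMA };

#define TOY_PCP_MAX 16

/* Toy PCP list: pages[head] is the next "page" to be handed out. */
struct toy_pcp_list {
	enum toy_migratetype pages[TOY_PCP_MAX];
	size_t head;
	size_t count;
};

static void toy_pcp_add_tail(struct toy_pcp_list *l, enum toy_migratetype mt)
{
	assert(l->head + l->count < TOY_PCP_MAX);
	l->pages[l->head + l->count] = mt;
	l->count++;
}

static bool toy_pcp_pop(struct toy_pcp_list *l, enum toy_migratetype *out)
{
	if (l->count == 0)
		return false;
	*out = l->pages[l->head++];
	l->count--;
	return true;
}

/*
 * Variant A: one shared list, compare only the first entry.
 * Returns false when the caller would have to fall back to buddy.
 */
static bool toy_alloc_head_check(struct toy_pcp_list *l, bool cma_allowed)
{
	enum toy_migratetype mt;

	if (l->count == 0)
		return false;
	if (!cma_allowed && l->pages[l->head] == TOY_CMA)
		return false;	/* head stays put, so the next call misses too */
	return toy_pcp_pop(l, &mt);
}

/* Variant B (assumption): CMA pages are kept on a list of their own. */
struct toy_pcp_split {
	struct toy_pcp_list non_cma;
	struct toy_pcp_list cma;
};

static void toy_split_free(struct toy_pcp_split *s, enum toy_migratetype mt)
{
	toy_pcp_add_tail(mt == TOY_CMA ? &s->cma : &s->non_cma, mt);
}

static bool toy_alloc_split(struct toy_pcp_split *s, bool cma_allowed)
{
	enum toy_migratetype mt;

	/* Trying non-CMA first is an arbitrary choice made for this sketch. */
	if (toy_pcp_pop(&s->non_cma, &mt))
		return true;
	return cma_allowed && toy_pcp_pop(&s->cma, &mt);
}

/* PCP hits for `tries` requests that must not receive a CMA page. */
static int toy_hits_head_check(int tries)
{
	struct toy_pcp_list l = {0};
	int hits = 0;

	toy_pcp_add_tail(&l, TOY_CMA);	/* the unlucky first entry */
	for (int i = 0; i < 4; i++)
		toy_pcp_add_tail(&l, TOY_MOVABLE);
	for (int i = 0; i < tries; i++)
		hits += toy_alloc_head_check(&l, false);
	return hits;
}

static int toy_hits_split(int tries)
{
	struct toy_pcp_split s = {0};
	int hits = 0;

	toy_split_free(&s, TOY_CMA);
	for (int i = 0; i < 4; i++)
		toy_split_free(&s, TOY_MOVABLE);
	for (int i = 0; i < tries; i++)
		hits += toy_alloc_split(&s, false);
	return hits;
}
```

With one CMA page at the head and four non-CMA pages behind it,
toy_hits_head_check(4) gets no PCP hits at all, while toy_hits_split(4) gets
four. The model says nothing about the real cost of an extra list (memory per
CPU, locking, batch refill), which is the open question here.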