From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
Date: Thu, 20 Jun 2024 08:33:53 +0800
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] mm/page_alloc: add one PCP list for THP
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org,
 baolin.wang@linux.alibaba.com, mgorman@techsingularity.net,
 liuzixing@hygon.cn
References: <1718801672-30152-1-git-send-email-yangge1116@126.com>
From: Ge Yang
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender:
 owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
List-Subscribe:
List-Unsubscribe:

On 2024/6/20 6:28, Barry Song wrote:
> On Thu, Jun 20, 2024 at 12:55 AM wrote:
>>
>> From: yangge
>>
>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>> THP-sized allocations") no longer differentiates the migration type
>> of pages in the THP-sized PCP list, it's possible that non-movable
>> allocation requests may get a CMA page from the list, which is not
>> acceptable in some cases.
>>
>> If a large amount of CMA memory is configured in the system (for
>> example, the CMA memory accounts for 50% of the system memory),
>> starting a virtual machine with device passthrough will get stuck.
>> While the virtual machine is starting, it calls
>> pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally
>> if a page is present and in a CMA area, pin_user_pages_remote() will
>> migrate the page from the CMA area to a non-CMA area because of the
>> FOLL_LONGTERM flag. But if non-movable allocation requests return
>> CMA memory, migrate_longterm_unpinnable_pages() will migrate a CMA
>> page to another CMA page, which will fail to pass the check in
>> check_and_migrate_movable_pages() and cause the migration to loop
>> endlessly.
>> Call trace:
>> pin_user_pages_remote
>> --__gup_longterm_locked // endless loops in this function
>> ----_get_user_pages_locked
>> ----check_and_migrate_movable_pages
>> ------migrate_longterm_unpinnable_pages
>> --------alloc_migration_target
>>
>> This problem will also have a negative impact on CMA itself. For
>> example, when CMA is borrowed by THP, and we need to reclaim it
>> through cma_alloc() or dma_alloc_coherent(), we must move those
>> pages out to ensure CMA's users can retrieve that contiguous memory.
>> Currently, CMA's memory is occupied by non-movable pages, meaning
>> we can't relocate them. As a result, cma_alloc() is more likely to
>> fail.
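
The endless migration described above can be modelled in user space. This is
only a toy sketch: toy_page, toy_alloc_target() and toy_pin_longterm() are
hypothetical names, not kernel APIs, and the max_tries cap is an assumption
added so the model terminates (the loop described in the call trace has no
such cap).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_page {
	bool in_cma;	/* true if the page sits in a CMA area */
};

/*
 * Hypothetical stand-in for the migration-target allocator. With a single
 * shared THP PCP list, a non-movable request may be handed a CMA page;
 * with separate lists it is not.
 */
static struct toy_page toy_alloc_target(bool shared_thp_list)
{
	struct toy_page page = { .in_cma = shared_thp_list };

	return page;
}

/*
 * Model of the long-term pin loop: keep migrating until the page is
 * outside CMA. Returns the number of migrations performed, or -1 if
 * max_tries migrations did not move the page out of CMA.
 */
static int toy_pin_longterm(struct toy_page page, bool shared_thp_list,
			    int max_tries)
{
	int tries = 0;

	assert(max_tries > 0);

	while (page.in_cma) {
		if (tries == max_tries)
			return -1;	/* would spin forever without the cap */
		page = toy_alloc_target(shared_thp_list);
		tries++;
	}
	return tries;
}
```

In this model, pinning a CMA page with shared_thp_list set to true never
converges and returns -1, while with it set to false a single migration is
enough and the function returns 1.
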
>>
>> To fix the problem above, we add one PCP list for THP, which will
>> not introduce a new cacheline for struct per_cpu_pages. THP will
>> have 2 PCP lists: one PCP list is used by MOVABLE allocation, and
>> the other PCP list is used by UNMOVABLE allocation. MOVABLE
>> allocation contains GFP_MOVABLE, and UNMOVABLE allocation contains
>> GFP_UNMOVABLE and GFP_RECLAIMABLE.
>>
>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
>
> Please add the below tag
>
> Cc:
>
> And I don't think 'mm/page_alloc: add one PCP list for THP' is a good
> title. Maybe:
>
> 'mm/page_alloc: Separate THP PCP into movable and non-movable categories'
>
> Whenever you send a new version, please add things like 'PATCH V2', 'PATCH V3'.
> You have already missed several version numbers, so we may have to start from V2
> though V2 is wrong.
>

Ok, thanks.

>> Signed-off-by: yangge
>> ---
>>  include/linux/mmzone.h | 9 ++++-----
>>  mm/page_alloc.c        | 9 +++++++--
>>  2 files changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index b7546dd..cb7f265 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -656,13 +656,12 @@ enum zone_watermarks {
>>  };
>>
>>  /*
>> - * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. One additional list
>> - * for THP which will usually be GFP_MOVABLE. Even if it is another type,
>> - * it should not contribute to serious fragmentation causing THP allocation
>> - * failures.
>> + * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. Two additional lists
>> + * are added for THP. One PCP list is used by GFP_MOVABLE, and the other PCP list
>> + * is used by GFP_UNMOVABLE and GFP_RECLAIMABLE.
>>   */
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> -#define NR_PCP_THP 1
>> +#define NR_PCP_THP 2
>>  #else
>>  #define NR_PCP_THP 0
>>  #endif
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 8f416a0..0a837e6 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -504,10 +504,15 @@ static void bad_page(struct page *page, const char *reason)
>>
>>  static inline unsigned int order_to_pindex(int migratetype, int order)
>>  {
>> +	bool __maybe_unused movable;
>> +
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>  	if (order > PAGE_ALLOC_COSTLY_ORDER) {
>>  		VM_BUG_ON(order != HPAGE_PMD_ORDER);
>> -		return NR_LOWORDER_PCP_LISTS;
>> +
>> +		movable = migratetype == MIGRATE_MOVABLE;
>> +
>> +		return NR_LOWORDER_PCP_LISTS + movable;
>>  	}
>>  #else
>>  	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
>> @@ -521,7 +526,7 @@ static inline int pindex_to_order(unsigned int pindex)
>>  	int order = pindex / MIGRATE_PCPTYPES;
>>
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> -	if (pindex == NR_LOWORDER_PCP_LISTS)
>> +	if (pindex >= NR_LOWORDER_PCP_LISTS)
>>  		order = HPAGE_PMD_ORDER;
>>  #else
>>  	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
>> --
>> 2.7.4
>>
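
For reference, the index mapping after the patch can be checked outside the
kernel. The following is a minimal user-space sketch of the CONFIG_TRANSPARENT_HUGEPAGE
branch only, not the kernel code itself: assert() stands in for VM_BUG_ON(),
HPAGE_PMD_ORDER is assumed to be 9 (2 MiB THP on 4 KiB base pages), and the
enum and macros are restated locally on the assumption that they match
include/linux/mmzone.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Values assumed for illustration; the kernel derives them from its config. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define HPAGE_PMD_ORDER 9	/* assumption: 2 MiB THP, 4 KiB base pages */

enum migratetype {
	MIGRATE_UNMOVABLE,
	MIGRATE_MOVABLE,
	MIGRATE_RECLAIMABLE,
	MIGRATE_PCPTYPES,	/* number of migratetypes kept on PCP lists */
};

#define NR_PCP_THP 2
#define NR_LOWORDER_PCP_LISTS (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))
#define NR_PCP_LISTS (NR_LOWORDER_PCP_LISTS + NR_PCP_THP)

/* Map (migratetype, order) to a PCP list index, as in the patched helper. */
static unsigned int order_to_pindex(int migratetype, int order)
{
	if (order > PAGE_ALLOC_COSTLY_ORDER) {
		bool movable = migratetype == MIGRATE_MOVABLE;

		assert(order == HPAGE_PMD_ORDER);	/* stand-in for VM_BUG_ON() */
		/* THP: first slot for non-movable types, second for movable. */
		return NR_LOWORDER_PCP_LISTS + movable;
	}
	return (MIGRATE_PCPTYPES * order) + migratetype;
}

/* Inverse mapping: both THP slots must report HPAGE_PMD_ORDER. */
static int pindex_to_order(unsigned int pindex)
{
	int order = pindex / MIGRATE_PCPTYPES;

	if (pindex >= NR_LOWORDER_PCP_LISTS)
		order = HPAGE_PMD_ORDER;
	return order;
}
```

Under these assumptions NR_LOWORDER_PCP_LISTS is 12, so THP-sized
MIGRATE_UNMOVABLE and MIGRATE_RECLAIMABLE requests share index 12 and
MIGRATE_MOVABLE requests use index 13. This also shows why pindex_to_order()
needs ">=" rather than "==": with "==", index 13 would fall through to
13 / MIGRATE_PCPTYPES, i.e. order 4.
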