From: Ge Yang
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, baolin.wang@linux.alibaba.com, liuzixing@hygon.cn, Mel Gorman
Subject: Re: [PATCH] mm/page_alloc: add one PCP list for THP
Date: Wed, 19 Jun 2024 19:09:20 +0800
Message-ID: <86e39cc3-cae9-485a-9854-c998fb906cc2@126.com>
References: <1718790499-28151-1-git-send-email-yangge1116@126.com>

On 2024/6/19 18:13, Barry Song wrote:
> +Mel, the original author of commit 5d0a661d808f
> ("mm/page_alloc: use only one PCP list..."
>
>
> On Wed, Jun 19, 2024 at 9:48 PM wrote:
>>
>> From: yangge
>>
>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>> THP-sized allocations") no longer differentiates the migration type
>> of pages in THP-sized PCP list, it's possible that non-movable
>> allocation requests may get a CMA page from the list, which in some
>> cases is not acceptable.
>>
>> If a large amount of CMA memory is configured in the system (for
>> example, the CMA memory accounts for 50% of the system memory),
>> starting a virtual machine with device passthrough will get stuck.
>> While starting the virtual machine, it will call
>> pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally
>> if a page is present and in CMA area, pin_user_pages_remote() will
>> migrate the page from CMA area to non-CMA area because of
>> FOLL_LONGTERM flag. But if non-movable allocation requests return
>> CMA memory, migrate_longterm_unpinnable_pages() will migrate a CMA
>> page to another CMA page, which will fail to pass the check in
>> check_and_migrate_movable_pages() and cause the migration to loop
>> endlessly.
>> Call trace:
>> pin_user_pages_remote
>> --__gup_longterm_locked // loops endlessly in this function
>> ----_get_user_pages_locked
>> ----check_and_migrate_movable_pages
>> ------migrate_longterm_unpinnable_pages
>> --------alloc_migration_target
>>
>
> Please also describe its potential negative impact on cma_alloc().
>
>> To fix the problem above, we add one PCP list for THP, which will
>> not introduce a new cacheline. THP will have 2 PCP lists, one PCP
>
> not introduce a new cacheline for struct per_cpu_pages.
>
>> list is used by MOVABLE allocation, and the other PCP list is used
>> by UNMOVABLE allocation.
>> MOVABLE allocation contains GFP_MOVABLE,
>> and UNMOVABLE allocation contains GFP_UNMOVABLE and GFP_RECLAIMABLE.
>>
>> Link: https://lore.kernel.org/all/1717492460-19457-1-git-send-email-yangge1116@126.com/
>
> no this tag.
>
>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
>
> Cc: ?
>

Thanks, I will prepare the next version.

>> Signed-off-by: yangge
>> ---
>>  include/linux/mmzone.h | 9 ++++-----
>>  mm/page_alloc.c        | 9 +++++++--
>>  2 files changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index b7546dd..cb7f265 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -656,13 +656,12 @@ enum zone_watermarks {
>>  };
>>
>>  /*
>> - * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. One additional list
>> - * for THP which will usually be GFP_MOVABLE. Even if it is another type,
>> - * it should not contribute to serious fragmentation causing THP allocation
>> - * failures.
>> + * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. Two additional lists
>> + * are added for THP. One PCP list is used by GFP_MOVABLE, and the other PCP list
>> + * is used by GFP_UNMOVABLE and GFP_RECLAIMABLE.
>>   */
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> -#define NR_PCP_THP 1
>> +#define NR_PCP_THP 2
>>  #else
>>  #define NR_PCP_THP 0
>>  #endif
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 8f416a0..0ecbde3 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -504,10 +504,15 @@ static void bad_page(struct page *page, const char *reason)
>>
>>  static inline unsigned int order_to_pindex(int migratetype, int order)
>>  {
>> +	bool __maybe_unused movable;
>> +
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>  	if (order > PAGE_ALLOC_COSTLY_ORDER) {
>>  		VM_BUG_ON(order != HPAGE_PMD_ORDER);
>> -		return NR_LOWORDER_PCP_LISTS;
>> +
>> +		movable = migratetype == MIGRATE_MOVABLE;
>> +
>> +		return NR_LOWORDER_PCP_LISTS + movable;
>>  	}
>>  #else
>>  	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
>> @@ -521,7 +526,7 @@ static inline int pindex_to_order(unsigned int pindex)
>>  	int order = pindex / MIGRATE_PCPTYPES;
>>
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> -	if (pindex == NR_LOWORDER_PCP_LISTS)
>> +	if (order > PAGE_ALLOC_COSTLY_ORDER)
>
> pindex >= NR_LOWORDER_PCP_LISTS
>
>>  		order = HPAGE_PMD_ORDER;
>>  #else
>>  	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
>> --
>> 2.7.4
>>
>
> Thanks
> Barry
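
For reference, the index mapping discussed above can be checked outside the kernel. The following is only a minimal userspace sketch, not the kernel code: the constant values (MIGRATE_PCPTYPES = 3, PAGE_ALLOC_COSTLY_ORDER = 3, HPAGE_PMD_ORDER = 9) are assumptions matching a typical x86-64 configuration with 4 KiB pages, the helper pcp_roundtrip_ok() is hypothetical, and pindex_to_order() uses the "pindex >= NR_LOWORDER_PCP_LISTS" check suggested in the review.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values (typical x86-64, 4 KiB pages); not taken from the patch. */
#define MIGRATE_PCPTYPES        3
#define PAGE_ALLOC_COSTLY_ORDER 3
#define HPAGE_PMD_ORDER         9
#define NR_LOWORDER_PCP_LISTS   (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))
#define NR_PCP_THP              2
#define NR_PCP_LISTS            (NR_LOWORDER_PCP_LISTS + NR_PCP_THP)

enum { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RECLAIMABLE };

/* Map (migratetype, order) to a PCP list index, with two THP lists. */
static unsigned int order_to_pindex(int migratetype, int order)
{
	if (order > PAGE_ALLOC_COSTLY_ORDER) {
		/* THP: list 0 for unmovable/reclaimable, list 1 for movable. */
		bool movable = (migratetype == MIGRATE_MOVABLE);

		return NR_LOWORDER_PCP_LISTS + movable;
	}
	return MIGRATE_PCPTYPES * order + migratetype;
}

/* Inverse mapping: recover the allocation order from a PCP list index. */
static int pindex_to_order(unsigned int pindex)
{
	if (pindex >= NR_LOWORDER_PCP_LISTS)
		return HPAGE_PMD_ORDER;
	return pindex / MIGRATE_PCPTYPES;
}

/* Hypothetical helper: check that every list index maps back to its order. */
static bool pcp_roundtrip_ok(void)
{
	for (int order = 0; order <= PAGE_ALLOC_COSTLY_ORDER; order++)
		for (int mt = 0; mt < MIGRATE_PCPTYPES; mt++)
			if (pindex_to_order(order_to_pindex(mt, order)) != order)
				return false;

	for (int mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
		unsigned int pindex = order_to_pindex(mt, HPAGE_PMD_ORDER);

		if (pindex >= NR_PCP_LISTS)
			return false;
		if (pindex_to_order(pindex) != HPAGE_PMD_ORDER)
			return false;
	}
	return true;
}
```

With these assumed constants, both THP indexes (12 and 13) give pindex / MIGRATE_PCPTYPES == 4, so the "order > PAGE_ALLOC_COSTLY_ORDER" test in the patch and the "pindex >= NR_LOWORDER_PCP_LISTS" test select the same lists; the latter just states the intent more directly.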