From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <03bc60a1-bd71-4acf-804d-312202625be8@126.com>
Date: Tue, 18 Feb 2025 15:25:38 +0800
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH V2] mm/hugetlb: wait for hugepage folios to be freed
To: Muchun Song
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, 21cnbao@gmail.com, david@redhat.com, baolin.wang@linux.alibaba.com, osalvador@suse.de, liuzixing@hygon.cn
References: <1739604026-2258-1-git-send-email-yangge1116@126.com>
From: Ge Yang
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender:
 owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org

On 2025/2/18 15:05, Muchun Song wrote:
>
>
>> On Feb 15, 2025, at 15:20, yangge1116@126.com wrote:
>>
>> From: Ge Yang
>>
>> Since the introduction of commit b65d4adbc0f0 ("mm: hugetlb: defer freeing
>> of HugeTLB pages"), which supports deferring the freeing of HugeTLB pages,
>> the allocation of contiguous memory through cma_alloc() may fail
>> probabilistically.
>>
>> In the CMA allocation process, if it is found that the CMA area is occupied
>> by in-use hugepage folios, these in-use hugepage folios need to be migrated
>> to another location. When there are no available hugepage folios in the
>> free HugeTLB pool during the migration of in-use HugeTLB pages, new folios
>> are allocated from the buddy system. A temporary state is set on the newly
>> allocated folio. Upon completion of the hugepage folio migration, the
>> temporary state is transferred from the new folios to the old folios.
>> Normally, when the old folios with the temporary state are freed, it is
>> directly released back to the buddy system. However, due to the deferred
>> freeing of HugeTLB pages, the PageBuddy() check fails, ultimately leading
>> to the failure of cma_alloc().
>>
>> Here is a simplified call trace illustrating the process:
>> cma_alloc()
>>     ->__alloc_contig_migrate_range() // Migrate in-use hugepage
>>         ->unmap_and_move_huge_page()
>>             ->folio_putback_hugetlb() // Free old folios
>>     ->test_pages_isolated()
>>         ->__test_page_isolated_in_pageblock()
>>             ->PageBuddy(page) // Check if the page is in buddy
>>
>> To resolve this issue, we have implemented a function named
>> wait_for_hugepage_folios_freed(). This function ensures that the hugepage
>> folios are properly released back to the buddy system after their migration
>> is completed. By invoking wait_for_hugepage_folios_freed() before calling
>> PageBuddy(), we ensure that PageBuddy() will succeed.
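
As an aside, the ordering problem described in the commit message above can
be modelled in plain userspace C. This is only an illustrative sketch, not
kernel code: page_in_buddy[], defer_free(), flush_deferred_frees(),
range_is_free() and test_range_isolated() are all hypothetical names made up
for the example, and the "deferred work" is just an array that is drained
synchronously, standing in for the workqueue-based freeing discussed in this
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8

/* Hypothetical stand-in for "this page is in the buddy allocator". */
static bool page_in_buddy[NR_PAGES];

/* Hypothetical stand-in for the shared deferred-free work queue. */
static size_t pending[NR_PAGES];
static size_t nr_pending;

/* Start from a state where every page is free and nothing is queued. */
static void reset_all_free(void)
{
	for (size_t pfn = 0; pfn < NR_PAGES; pfn++)
		page_in_buddy[pfn] = true;
	nr_pending = 0;
}

/* Queue a page for deferred freeing: it is not in "buddy" yet. */
static void defer_free(size_t pfn)
{
	assert(pfn < NR_PAGES && nr_pending < NR_PAGES);
	page_in_buddy[pfn] = false;
	pending[nr_pending++] = pfn;
}

/* Drain the queue; plays the role of flushing the deferred-free work. */
static void flush_deferred_frees(void)
{
	for (size_t i = 0; i < nr_pending; i++)
		page_in_buddy[pending[i]] = true;
	nr_pending = 0;
}

/* Analogue of the PageBuddy() scan: true only if every page is free. */
static bool range_is_free(size_t start, size_t end)
{
	for (size_t pfn = start; pfn < end; pfn++)
		if (!page_in_buddy[pfn])
			return false;
	return true;
}

/* Mirrors the ordering in the patch: flush first, then test the range. */
static bool test_range_isolated(size_t start, size_t end)
{
	flush_deferred_frees();
	return range_is_free(start, end);
}
```

In this model, calling range_is_free() right after defer_free() reports the
range as busy even though the page is on its way back, which is the failure
mode described for cma_alloc(). Calling test_range_isolated() instead drains
the queue first, so the same check passes.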
>>
>> Fixes: b65d4adbc0f0 ("mm: hugetlb: defer freeing of HugeTLB pages")
>
> The actual blamed commit should be the
>
> commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
>
> which is the first to introduce the delayed work to free the hugetlb pages.
> It was removed by commit db71ef79b59bb2 and then was brought back by commit
> b65d4adbc0f0 immediately.
>

Ok, thanks.

>> Signed-off-by: Ge Yang
>> ---
>>
>> V2:
>> - flush all folios at once suggested by David
>>
>>  include/linux/hugetlb.h |  5 +++++
>>  mm/hugetlb.c            |  8 ++++++++
>>  mm/page_isolation.c     | 10 ++++++++++
>>  3 files changed, 23 insertions(+)
>>
>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>> index 6c6546b..04708b0 100644
>> --- a/include/linux/hugetlb.h
>> +++ b/include/linux/hugetlb.h
>> @@ -697,6 +697,7 @@ bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
>>
>>  int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
>>  int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
>> +void wait_for_hugepage_folios_freed(void);
>>  struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>>  				unsigned long addr, bool cow_from_owner);
>>  struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
>> @@ -1092,6 +1093,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
>>  	return 0;
>>  }
>>
>> +static inline void wait_for_hugepage_folios_freed(void)
>> +{
>> +}
>> +
>>  static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>>  					   unsigned long addr,
>>  					   bool cow_from_owner)
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 30bc34d..36dd3e4 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2955,6 +2955,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
>>  	return ret;
>>  }
>>
>> +void wait_for_hugepage_folios_freed(void)
>
> We usually use the "hugetlb" term now instead of "huge_page" to differentiate it from THP.
> So I suggest naming it wait_for_hugetlb_folios_freed().
>
>> +{
>> +	struct hstate *h;
>> +
>> +	for_each_hstate(h)
>> +		flush_free_hpage_work(h);
>
> Because all hstates use the shared work to defer the freeing of hugetlb pages, we only
> need to flush once. Directly using flush_work(&free_hpage_work) is enough.
>

Ok, thanks.

>> +}
>> +
>>  typedef enum {
>>  	/*
>>  	 * For either 0/1: we checked the per-vma resv map, and one resv
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 8ed53ee0..f56cf02 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -615,6 +615,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
>>  	int ret;
>>
>>  	/*
>> +	 * Due to the deferred freeing of HugeTLB folios, the hugepage folios may
>> +	 * not immediately release to the buddy system. This can cause PageBuddy()
>> +	 * to fail in __test_page_isolated_in_pageblock(). To ensure that the
>> +	 * hugepage folios are properly released back to the buddy system, we
>
> hugetlb folios.
>

Ok, thanks.

> Muchun,
> Thanks.
>
>> +	 * invoke the wait_for_hugepage_folios_freed() function to wait for the
>> +	 * release to complete.
>> +	 */
>> +	wait_for_hugepage_folios_freed();
>> +
>> +	/*
>>  	 * Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
>>  	 * pages are not aligned to pageblock_nr_pages.
>>  	 * Then we just check migratetype first.
>> --
>> 2.7.4
>>