From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <333e584c-2688-4a3f-bc1f-2e84d5215005@126.com>
Date: Sat, 21 Dec 2024 20:04:59 +0800
MIME-Version: 1.0
Subject: Re: [PATCH] replace free hugepage folios after migration
To: David Hildenbrand, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org,
 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, muchun.song@linux.dev,
 liuzixing@hygon.cn, Oscar Salvador, Michal Hocko
References: <1734503588-16254-1-git-send-email-yangge1116@126.com>
 <0b41cc6b-5c93-408f-801f-edd9793cb979@redhat.com>
 <1241b567-88b6-462c-9088-8f72a45788b7@126.com>
From: Ge Yang
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 2024/12/21 0:30, David Hildenbrand wrote:
> On 20.12.24 09:56, Ge Yang wrote:
>>
>>
>> On 2024/12/20 0:40, David Hildenbrand wrote:
>>> On 18.12.24 07:33, yangge1116@126.com wrote:
>>>> From: yangge
>>>
>>> CCing Oscar, who worked on migrating these pages during memory offlining
>>> and alloc_contig_range().
>>>
>>>>
>>>> My machine has 4 NUMA nodes, each equipped with 32GB of memory. I
>>>> have configured each NUMA node with 16GB of CMA and 16GB of in-use
>>>> hugetlb pages. The allocation of contiguous memory via the
>>>> cma_alloc() function can fail probabilistically.
>>>>
>>>> The cma_alloc() function may fail if it sees an in-use hugetlb page
>>>> within the allocation range, even if that page has already been
>>>> migrated. When in-use hugetlb pages are migrated, they may simply
>>>> be released back into the free hugepage pool instead of being
>>>> returned to the buddy system.
>>>> This can cause the test_pages_isolated() function check to fail,
>>>> ultimately leading to the failure of the cma_alloc() function:
>>>> cma_alloc()
>>>>       __alloc_contig_migrate_range() // migrate in-use hugepage
>>>>       test_pages_isolated()
>>>>           __test_page_isolated_in_pageblock()
>>>>                PageBuddy(page) // check if the page is in buddy
>>>
>>> I thought this would be working as expected, at least we tested it with
>>> alloc_contig_range / virtio-mem a while ago.
>>>
>>> On the memory_offlining path, we migrate hugetlb folios, but also
>>> dissolve any remaining free folios even if it means that we will go
>>> below the requested number of hugetlb pages in our pool.
>>>
>>> During alloc_contig_range(), we only migrate them, to then free them up
>>> after migration.
>>>
>>> Under which circumstances does it apply that "they may simply be
>>> released back into the free hugepage pool instead of being returned to
>>> the buddy system"?
>>>
>>
>> After migration, in-use hugetlb pages are only released back to the
>> hugetlb pool and are not returned to the buddy system.
>
> We had
>
> commit ae37c7ff79f1f030e28ec76c46ee032f8fd07607
> Author: Oscar Salvador
> Date:   Tue May 4 18:35:29 2021 -0700
>
>     mm: make alloc_contig_range handle in-use hugetlb pages
>     alloc_contig_range() will fail if it finds a HugeTLB page within the
>     range, without a chance to handle them.  Since HugeTLB pages can be
>     migrated as any LRU or Movable page, it does not make sense to bail out
>     without trying.  Enable the interface to recognize in-use HugeTLB pages so
>     we can migrate them, and have much better chances to succeed the call.
>
>
> And I am trying to figure out if it never worked correctly, or if
> something changed that broke it.
>
>
> In start_isolate_page_range()->isolate_migratepages_block(), we do the
>
>     ret = isolate_or_dissolve_huge_page(page, &cc->migratepages);
>
> to add these folios to the cc->migratepages list.
>
> In __alloc_contig_migrate_range(), we migrate the pages using
> migrate_pages().
>
>
> After that, the src hugetlb folios should still be isolated? Yes. But I'm
> getting confused when these pages get un-isolated and put back to
> hugetlb/freed.
>

If the migration succeeds, folio_putback_active_hugetlb() is called to
release the src hugetlb folios back to the free hugetlb pool.

trace:
unmap_and_move_huge_page
    folio_putback_active_hugetlb
        folio_put
            free_huge_folio

alloc_contig_range_noprof
    __alloc_contig_migrate_range
    if (test_pages_isolated()) // check whether the hugetlb pages are in buddy
        isolate_freepages_range // grab isolated pages from the freelists
    else
        undo_isolate_page_range // undo the isolation

>
>>
>> The specific steps for reproduction are as follows:
>> 1. Reserve hugetlb pages. Some of these hugetlb pages are allocated
>> within the CMA area.
>> echo 10240 > /proc/sys/vm/nr_hugepages
>>
>> 2. To ensure that hugetlb pages are in an in-use state, we can use the
>> following command.
>> qemu-system-x86_64 \
>>     -mem-prealloc \
>>     -mem-path /dev/hugepage/ \
>>     ...
>>
>> 3. At this point, using cma_alloc() to allocate contiguous memory may
>> result in a probable failure.
>>
>
> Will these free hugetlb folios become surplus pages? I would have assumed
> they get freed immediately to the buddy, or does your config maybe allow for
> surplus pages?
>

These freed hugetlb folios will not become surplus pages. I have not
configured the system to allow surplus pages.
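
To make the failure mode in the trace above easier to follow, here is a
minimal user-space sketch in C. It is not kernel code: the enum, the
model_* function names and the replace_free flag are all hypothetical and
only mimic the states discussed in this thread. The "replace" step is an
assumption based on the patch subject, namely that a free hugetlb folio in
the range is replaced so that the old one is returned to buddy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code. All names here are hypothetical. */
enum page_state {
    PAGE_IN_USE_HUGETLB,  /* in-use hugetlb page inside the requested range */
    PAGE_FREE_HUGETLB,    /* free, but held in the free hugetlb pool */
    PAGE_BUDDY            /* free in the buddy allocator */
};

/* Migration step: the source folio is put back to the free hugetlb pool. */
void model_migrate(enum page_state *page)
{
    if (*page == PAGE_IN_USE_HUGETLB)
        *page = PAGE_FREE_HUGETLB;
}

/* Assumed effect of the patch: a free hugetlb folio in the range is
 * replaced, so the old one goes back to buddy. */
void model_replace_free_hugetlb(enum page_state *page)
{
    if (*page == PAGE_FREE_HUGETLB)
        *page = PAGE_BUDDY;
}

/* Stand-in for the PageBuddy() check: passes only if every page is in buddy. */
bool model_range_isolated(const enum page_state *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pages[i] != PAGE_BUDDY)
            return false;
    }
    return true;
}

/* Returns true if the modelled contiguous allocation would succeed. */
bool model_alloc_contig(enum page_state *pages, size_t n, bool replace_free)
{
    for (size_t i = 0; i < n; i++) {
        model_migrate(&pages[i]);
        if (replace_free)
            model_replace_free_hugetlb(&pages[i]);
    }
    return model_range_isolated(pages, n);
}
```

In this model, a range holding one in-use hugetlb page fails the check when
replace_free is false, because the migrated source page stays in the free
hugetlb pool. With replace_free set to true the same range passes.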