Message-ID: <113f25e0-7eed-405b-9369-bc23b780d315@huawei.com>
Date: Thu, 15 Aug 2024 10:58:51 +0800
Subject: Re: [PATCH RFC] mm: skip gigantic pages in isolate_single_pageblock() when mem offline
From: Kefeng Wang
To: Zi Yan
CC: Matthew Wilcox, Andrew Morton, David Hildenbrand, Oscar Salvador
References: <20240813125226.1478800-1-wangkefeng.wang@huawei.com>
 <92fedec5-62c9-4ec0-9d4c-a722b30aa63c@huawei.com>
 <905740F8-58C6-4333-8EA1-4A53C95CC1FE@nvidia.com>
 <50FEEE33-49CA-48B5-B4C5-964F1BE25D43@nvidia.com>
In-Reply-To: <50FEEE33-49CA-48B5-B4C5-964F1BE25D43@nvidia.com>

On 2024/8/14 22:53, Zi Yan wrote:
> On 13 Aug 2024, at 22:01, Kefeng Wang wrote:
>
>> On 2024/8/13 22:59, Zi Yan wrote:
>>> On 13 Aug 2024, at 10:46, Kefeng Wang wrote:
>>>
>>>> On 2024/8/13 22:03, Matthew Wilcox wrote:
>>>>> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>>>>>> The gigantic page size may be larger than the memory block size, so
>>>>>> memory offline always fails in this case after commit b2c9e2fbba32
>>>>>> ("mm: make alloc_contig_range work at pageblock granularity"),
>>>>>>
>>>>>> offline_pages
>>>>>>   start_isolate_page_range
>>>>>>     isolate_single_pageblock(isolate_before=true)
>>>>>>       isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>>>>     isolate_single_pageblock(isolate_before=false)
>>>>>>       isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>>>>         __alloc_contig_migrate_range
>>>>>>           isolate_migratepages_range
>>>>>>             isolate_migratepages_block
>>>>>>               isolate_or_dissolve_huge_page
>>>>>>                 if (hstate_is_gigantic(h))
>>>>>>                     return -ENOMEM;
>>>>>>
>>>>>> [ 15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>>>>>
>>>>>> Fix it by skipping __alloc_contig_migrate_range() when gigantic
>>>>>> pages are met during memory offline, which returns to the original
>>>>>> logic for handling gigantic pages.
>>>>>
>>>>> This seems like the wrong way to fix this. The logic in the next
>>>>> PageHuge() section seems like it's specifically supposed to handle
>>>>> gigantic pages. So you've just made that dead code, but instead of
>>>>> removing it, you've left it there to confuse everyone?
>>>>
>>>> isolate_single_pageblock() in start_isolate_page_range() is called
>>>> from both memory offline and contig allocation (alloc_contig_pages()).
>>>> This change only restores the behavior for the memory offline code;
>>>> contig allocation still fails.
>>>>
>>>> Memory offline has its own path to isolate/migrate pages or dissolve
>>>> free hugetlb folios, so I think we don't depend on
>>>> __alloc_contig_migrate_range().
>>>>>
>>>>> I admit to not understanding this code terribly well.
>>>>>
>>>> From a quick search [1], isolate_single_pageblock() was added for
>>>> contig allocation, but it has negative effects on memory hotplug.
>>>> Zi Yan, could you comment?
>>>>
>>>> [1] https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/
>>>
>>> Probably we can isolate the hugetlb page and use migrate_page() instead of
>>> __alloc_contig_migrate_range() in the section below, since we are targeting
>>> only hugetlb pages here. It should solve the issue.
>>
>> For contig allocation, I think we must isolate/migrate pages in
>> __alloc_contig_migrate_range(). But for memory offline (especially for
>> gigantic hugepages), as mentioned above, we already have our own path to
>> isolate/migrate used pages and dissolve the free pages, so
>> start_isolate_page_range() only needs to mark the page range
>> MIGRATE_ISOLATE, which is what we did before b2c9e2fbba32:
>>
>>   start_isolate_page_range
>>   scan_movable_pages
>>   do_migrate_range
>>   dissolve_free_hugetlb_folios
>>
>> Do we really need to isolate/migrate the hugetlb page for the memory
>> offline path?
>
> For memory offline path, there is do_migrate_range() to move the pages.
> For contig allocation, there is __alloc_contig_migrate_range() after
> isolation to migrate the pages.
>
> The migration code in isolate_single_pageblock() is not needed.
> Something like this would be OK, just skip the page and let either
> do_migrate_range() or __alloc_contig_migrate_range() handle it:

Oh, right, for alloc_contig_range() we do have another
__alloc_contig_migrate_range() after start_isolate_page_range(), so we
could drop the following code,

>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 042937d5abe4..587d723711c5 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -402,23 +402,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>
>  #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>  			if (PageHuge(page)) {
> -				int page_mt = get_pageblock_migratetype(page);
> -				struct compact_control cc = {
> -					.nr_migratepages = 0,
> -					.order = -1,
> -					.zone = page_zone(pfn_to_page(head_pfn)),
> -					.mode = MIGRATE_SYNC,
> -					.ignore_skip_hint = true,
> -					.no_set_skip_hint = true,
> -					.gfp_mask = gfp_flags,
> -					.alloc_contig = true,
> -				};
> -				INIT_LIST_HEAD(&cc.migratepages);
> -
> -				ret = __alloc_contig_migrate_range(&cc, head_pfn,
> -							head_pfn + nr_pages, page_mt);
> -				if (ret)
> -					goto failed;
>  				pfn = head_pfn + nr_pages;
>  				continue;
>  			}

But we need to remove the CONFIG_COMPACTION/CMA guard too, though?

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 042937d5abe4..785c2d320631 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -395,30 +395,8 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 			unsigned long head_pfn = page_to_pfn(head);
 			unsigned long nr_pages = compound_nr(head);
 
-			if (head_pfn + nr_pages <= boundary_pfn) {
-				pfn = head_pfn + nr_pages;
-				continue;
-			}
-
-#if defined CONFIG_COMPACTION || defined CONFIG_CMA
-			if (PageHuge(page)) {
-				int page_mt = get_pageblock_migratetype(page);
-				struct compact_control cc = {
-					.nr_migratepages = 0,
-					.order = -1,
-					.zone = page_zone(pfn_to_page(head_pfn)),
-					.mode = MIGRATE_SYNC,
-					.ignore_skip_hint = true,
-					.no_set_skip_hint = true,
-					.gfp_mask = gfp_flags,
-					.alloc_contig = true,
-				};
-				INIT_LIST_HEAD(&cc.migratepages);
-
-				ret = __alloc_contig_migrate_range(&cc, head_pfn,
-							head_pfn + nr_pages, page_mt);
-				if (ret)
-					goto failed;
+			if (head_pfn + nr_pages <= boundary_pfn ||
+			    PageHuge(page)) {
 				pfn = head_pfn + nr_pages;
 				continue;
 			}
@@ -432,7 +410,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 			 */
 			VM_WARN_ON_ONCE_PAGE(PageLRU(page), page);
 			VM_WARN_ON_ONCE_PAGE(__PageMovable(page), page);
-#endif
 			goto failed;
 		}
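
For anyone following the control flow, here is a rough userspace sketch of
what the compound-page part of the scan loop reduces to with the diff above.
It is not kernel code: struct model_page, model_lookup() and model_scan() are
hypothetical names made up for illustration, the free (buddy) page handling
is left out entirely, and a plain -1 stands in for the "goto failed" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: one entry per (compound) page. */
struct model_page {
	unsigned long head_pfn;	/* first pfn covered by this page */
	unsigned long nr_pages;	/* pfns covered; 1 for an order-0 page */
	bool compound;		/* models PageCompound() */
	bool hugetlb;		/* models PageHuge() */
};

/* Find the entry covering pfn, or NULL if the model has a hole there. */
static const struct model_page *model_lookup(const struct model_page *pages,
					      size_t n, unsigned long pfn)
{
	for (size_t i = 0; i < n; i++)
		if (pfn >= pages[i].head_pfn &&
		    pfn < pages[i].head_pfn + pages[i].nr_pages)
			return &pages[i];
	return NULL;
}

/*
 * Model of the scan over [start_pfn, boundary_pfn) after the change.
 * Returns 0 if the scan reaches the boundary, -1 where the kernel code
 * would take the failure path.
 */
static int model_scan(const struct model_page *pages, size_t n,
		      unsigned long start_pfn, unsigned long boundary_pfn)
{
	unsigned long pfn = start_pfn;

	while (pfn < boundary_pfn) {
		const struct model_page *page = model_lookup(pages, n, pfn);

		assert(page);	/* the model must describe every pfn scanned */

		if (page->compound) {
			unsigned long end = page->head_pfn + page->nr_pages;

			/*
			 * Skip compound pages that end at or before the
			 * boundary, and hugetlb pages of any size; migrating
			 * them is left to the caller's own path.
			 */
			if (end <= boundary_pfn || page->hugetlb) {
				pfn = end;
				continue;
			}
			return -1;	/* other compound page crosses the boundary */
		}
		pfn++;
	}
	return 0;
}
```

In this model, a hugetlb entry larger than the scanned range (say 262144
pfns against a boundary at pfn 32768, which assumes 4 KiB base pages, a 1 GiB
gigantic page and a 128 MiB memory block like the one in the log above) is
simply skipped and the scan succeeds, while a non-hugetlb compound page that
crosses the boundary still fails.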