Message-ID: <56b9a389-4c36-4892-962d-b45878f30f4d@huawei.com>
Date: Wed, 31 Jul 2024 13:09:25 +0800
Subject: Re: [PATCH 2/4] mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range()
To: David Hildenbrand, Andrew Morton
CC: Oscar Salvador, Miaohe Lin, Naoya Horiguchi
References: <20240725011647.1306045-1-wangkefeng.wang@huawei.com> <20240725011647.1306045-3-wangkefeng.wang@huawei.com>
From: Kefeng Wang

On 2024/7/30 18:26, David Hildenbrand wrote:
> On 25.07.24 03:16, Kefeng Wang wrote:
>> The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
>> pages to be offlined") don't handle the hugetlb pages, the dead loop
>> still occur if offline a hwpoison hugetlb, luckly, after
>> the commit
>> e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory
>> section with hwpoisoned hugepage"), the HPageMigratable of hugetlb
>> page will be clear, and the hwpoison hugetlb page will be skipped in
>> scan_movable_pages(), so the deed loop issue is fixed.
>
> did you mean "endless loop" ?

Exactly, I will fix the wording.

>
>>
>> However if the HPageMigratable() check passed(without reference and
>> lock), the hugetlb page may be hwpoisoned, it won't cause issue since
>> the hwpoisoned page will be handled correctly in the next movable
>> pages scan loop, and it will be isolated in do_migrate_range() and
>> but fails to migrated. In order to avoid the unnecessary isolation and
>> unify all hwpoisoned page handling, let's unconditionally check hwpoison
>> firstly, and if it is a hwpoisoned hugetlb page, try to unmap it as
>> the catch all safety net like normal page does.
>
> But what's the benefit here besides slightly faster handling in an
> absolute corner case (I strongly suspect that we don't care)?

Yes, it is very much a corner case. The goal is to move isolate_hugetlb()
after the HWPoison check, so that isolation and folio conversion can then
be unified (patch 4). To do that, we must correctly handle the hugetlb
unmap when we meet a hwpoisoned page.
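To illustrate the ordering only, here is a minimal user-space sketch, not the
kernel code. Every name in it (struct fake_folio, fake_unmap(),
sketch_migrate_range()) is a hypothetical stand-in invented for this example,
and the unmap here always succeeds, which the real one does not guarantee.
It shows the hwpoison check running first for every folio type, so that only
healthy folios reach a single isolation step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; this is NOT the kernel's struct folio. */
struct fake_folio {
	bool hugetlb;	/* models a hugetlb folio; deliberately unused below */
	bool hwpoison;	/* models PageHWPoison() on the scanned page */
	bool mapped;	/* models "still mapped into some page table" */
	bool isolated;	/* set once the folio is queued for migration */
};

/* Models the "try to unmap" safety net. Assumption: it always succeeds. */
static void fake_unmap(struct fake_folio *f)
{
	f->mapped = false;
}

/* Walk the range and return how many folios were isolated for migration. */
static size_t sketch_migrate_range(struct fake_folio *folios, size_t n)
{
	size_t isolated = 0;

	assert(folios != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct fake_folio *f = &folios[i];

		/* Check hwpoison first, regardless of the folio type. */
		if (f->hwpoison) {
			if (f->mapped)
				fake_unmap(f);
			continue;	/* never isolated, never migrated */
		}

		/* Only healthy folios reach the common isolation step. */
		f->isolated = true;
		isolated++;
	}

	return isolated;
}
```

Because the hugetlb case no longer needs its own early "isolate and continue"
branch, the type of the folio does not affect the control flow in this model.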
>
>>
>> Signed-off-by: Kefeng Wang
>> ---
>>   mm/memory_hotplug.c | 27 ++++++++++++++++-----------
>>   1 file changed, 16 insertions(+), 11 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 66267c26ca1b..ccaf4c480aed 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>          folio = page_folio(page);
>>          head = &folio->page;
>> -        if (PageHuge(page)) {
>> -            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> -            isolate_hugetlb(folio, &source);
>> -            continue;
>> -        } else if (PageTransHuge(page))
>> -            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>> -
>>          /*
>>           * HWPoison pages have elevated reference counts so the migration would
>>           * fail on them. It also doesn't make any sense to migrate them in the
>>           * first place. Still try to unmap such a page in case it is still mapped
>> -         * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
>> -         * the unmap as the catch all safety net).
>> +         * (keep the unmap as the catch all safety net).
>>           */
>> -        if (PageHWPoison(page)) {
>> +        if (unlikely(PageHWPoison(page))) {
>
> We're not checking the head page here, will this work reliably for
> hugetlb? (I recall some difference in per-page hwpoison handling between
> hugetlb and THP due to the vmemmap optimization)

Before this change, do_migrate_range() did not try to unmap a hwpoisoned
hugetlb page; we relied on it already having been unmapped in
memory_failure(). As the comment mentions, that unmap may fail, so this
adds a new safeguard that tries to unmap the page again here, although we
don't need to guarantee that it succeeds.

unmap_posioned_folio() is used to correctly handle hugetlb pages in shared
mappings when we meet a hwpoisoned page (which may be the head page or a
subpage).

>
>