From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 6/8] mm/memory-failure: Convert memory_failure() to use a folio
To: Matthew Wilcox
CC: Naoya Horiguchi, Andrew Morton
References: <20240229212036.2160900-1-willy@infradead.org> <20240229212036.2160900-7-willy@infradead.org>
From: Miaohe Lin
Message-ID: <5eab08d7-ae38-4f99-401f-f361466e34e0@huawei.com>
Date: Tue, 12 Mar 2024 15:07:39 +0800
Content-Type: text/plain; charset="utf-8"

On 2024/3/11 20:31, Matthew Wilcox wrote:
> On Fri, Mar 08, 2024 at 04:48:33PM +0800, Miaohe Lin wrote:
>> On 2024/3/1 5:20, Matthew Wilcox (Oracle) wrote:
>>> @@ -2277,8 +2277,8 @@ int memory_failure(unsigned long pfn, int flags)
>>>  		}
>>>  	}
>>>  
>>> -	hpage = compound_head(p);
>>> -	if (PageTransHuge(hpage)) {
>>> +	folio = page_folio(p);
>>> +	if (folio_test_large(folio)) {
>>>  		/*
>>>  		 * The flag must be set after the refcount is bumped
>>>  		 * otherwise it may race with THP split.
> [...]
>>> @@ -2318,11 +2319,11 @@ int memory_failure(unsigned long pfn, int flags)
>>>  	 * race window. If this happens, we could try again to hopefully
>>>  	 * handle the page next round.
>>>  	 */
>>> -	if (PageCompound(p)) {
>>> +	if (folio_test_large(folio)) {
>>
>> folio_test_large() only checks whether PG_head is set but PageCompound() also checks PageTail().
>> So folio_test_large() and PageCompound() are not equivalent?
>
> Assuming we have a refcount on this page so it can't be simultaneously
> split/freed/whatever, these three sequences are equivalent:

If the page is stable once a page refcount is held, I agree that the three
sequences below are equivalent.

>
> 1	if (PageCompound(p))
>
> 2	struct page *head = compound_head(p);
> 2	if (PageHead(head))
>
> 3	struct folio *folio = page_folio(p);
> 3	if (folio_test_large(folio))
>
> .
>

But please see the commit below:

"""
commit f37d4298aa7f8b74395aa13c728677e2ed86fdaf
Author: Andi Kleen
Date:   Wed Aug 6 16:06:49 2014 -0700

    hwpoison: fix race with changing page during offlining

    When a hwpoison page is locked it could change state due to parallel
    modifications.  The original compound page can be torn down and then
    this 4k page becomes part of a differently-size compound page is is a
    standalone regular page.

    Check after the lock if the page is still the same compound page.

    We could go back, grab the new head page and try again but it should be
    quite rare, so I thought this was safest.  A retry loop would be more
    difficult to test and may have more side effects.

    The hwpoison code by design only tries to handle cases that are
    reasonably common in workloads, as visible in page-flags.

    I'm not really that concerned about handling this (likely rare case),
    just not crashing on it.

    Signed-off-by: Andi Kleen
    Acked-by: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a013bc94ebbe..44c6bd201d3a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1172,6 +1172,16 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 
 	lock_page(hpage);
 
+	/*
+	 * The page could have changed compound pages during the locking.
+	 * If this happens just bail out.
+	 */
+	if (compound_head(p) != hpage) {
+		action_result(pfn, "different compound page after locking", IGNORED);
+		res = -EBUSY;
+		goto out;
+	}
+
 	/*
 	 * We use page flags to determine what action should be taken, but
 	 * the flags can be modified by the error containment action.  One
"""

It says a page could still become part of a differently-sized compound page
due to parallel modifications, even when an extra page refcount is held and
the page is locked. But this commit is old (ten years ago), so things might
have changed. Any thoughts?

Thanks.
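
To make the question concrete, here is a minimal userspace sketch of the three sequences and of the recheck from the 2014 commit. It is only a toy model and not kernel code: struct toy_page and every toy_*/seq* helper are hypothetical names invented for illustration, and the model assumes a tail page simply stores a pointer to its head. It shows that the three checks agree while the page is stable, and that a tail page resolves to a different head once the compound page is torn down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel layout. */
struct toy_page {
	bool head;                /* models PG_head */
	struct toy_page *head_of; /* non-NULL only on tail pages */
};

/* Models compound_head() / page_folio(): tail -> head, otherwise self. */
static struct toy_page *toy_compound_head(struct toy_page *p)
{
	return p->head_of ? p->head_of : p;
}

/* Sequence 1: PageCompound(p), i.e. head or tail. */
static bool seq1_page_compound(struct toy_page *p)
{
	return p->head || p->head_of != NULL;
}

/* Sequence 2: compound_head(p), then PageHead(head). */
static bool seq2_page_head_of_head(struct toy_page *p)
{
	struct toy_page *head = toy_compound_head(p);

	return head->head;
}

/* Sequence 3: page_folio(p), then folio_test_large(folio). */
static bool seq3_folio_test_large(struct toy_page *p)
{
	struct toy_page *folio = toy_compound_head(p);

	return folio->head;
}

/* On a stable page all three sequences must give the same answer. */
static bool toy_is_large(struct toy_page *p)
{
	bool r = seq1_page_compound(p);

	assert(r == seq2_page_head_of_head(p));
	assert(r == seq3_folio_test_large(p));
	return r;
}

/* Build one compound page out of n consecutive toy pages. */
static void toy_make_compound(struct toy_page *pages, size_t n)
{
	pages[0].head = true;
	pages[0].head_of = NULL;
	for (size_t i = 1; i < n; i++) {
		pages[i].head = false;
		pages[i].head_of = &pages[0];
	}
}

/* Tear the compound page down into n standalone pages. */
static void toy_split(struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].head = false;
		pages[i].head_of = NULL;
	}
}

/* Models the 2014 recheck: compound_head(p) != hpage after locking. */
static bool toy_changed_after_lock(struct toy_page *p, struct toy_page *hpage)
{
	return toy_compound_head(p) != hpage;
}
```

In this model the disagreement can only appear if the page changes between looking up the head and testing the flag, which is exactly the window the 2014 recheck was guarding against. Whether that window still exists with a refcount held is the open question above.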