From: Miaohe Lin
To: Hao Ge
CC: Hao Ge
Subject: Re: [PATCH] codetag: debug: mark codetags for pages which transitioned from being poison to unpoison as empty
Date: Thu, 22 Aug 2024 16:04:10 +0800
In-Reply-To: <20240822025800.13380-1-hao.ge@linux.dev>
References: <20240822025800.13380-1-hao.ge@linux.dev>

On 2024/8/22 10:58, Hao Ge wrote:
> From: Hao Ge
>

Thanks for your patch.

> The PG_hwpoison page will be caught and isolated on the entrance to
> the free buddy page pool. so,when we clear this flag and return it
> to the buddy system,mark codetags for pages as empty.
>

Is the scenario below what causes the problem?

1. Pages are allocated.
   pgalloc_tag_add() is called from prep_new_page().
2. Pages are hwpoisoned. memory_failure() sets the PG_hwpoison flag, and
   pgalloc_tag_sub() is called when the pages are caught and isolated on
   the entrance to buddy.
3. unpoison_memory() clears the flag and sends the pages to buddy.
   pgalloc_tag_sub() is called again in free_pages_prepare().

So there is an imbalance: pgalloc_tag_add() is called once and
pgalloc_tag_sub() is called twice?

If so, let's think about a more complicated scenario:

1. Same as above.
2. Pages are hwpoisoned, but memory_failure() fails to handle them. So the
   PG_hwpoison flag is set but pgalloc_tag_sub() is not called (the pages
   are not sent to buddy).
3. unpoison_memory() clears the flag and calls clear_page_tag_ref() without
   calling pgalloc_tag_sub() first. Will this cause a problem?

Though this should be really rare...

Thanks.
.

> It was detected by [1] and the following WARN occurred:
>
> [ 113.930443][ T3282] ------------[ cut here ]------------
> [ 113.931105][ T3282] alloc_tag was not set
> [ 113.931576][ T3282] WARNING: CPU: 2 PID: 3282 at ./include/linux/alloc_tag.h:130 pgalloc_tag_sub.part.66+0x154/0x164
> [ 113.932866][ T3282] Modules linked in: hwpoison_inject fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_man4
> [ 113.941638][ T3282] CPU: 2 UID: 0 PID: 3282 Comm: madvise11 Kdump: loaded Tainted: G W 6.11.0-rc4-dirty #18
> [ 113.943003][ T3282] Tainted: [W]=WARN
> [ 113.943453][ T3282] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
> [ 113.944378][ T3282] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 113.945319][ T3282] pc : pgalloc_tag_sub.part.66+0x154/0x164
> [ 113.946016][ T3282] lr : pgalloc_tag_sub.part.66+0x154/0x164
> [ 113.946706][ T3282] sp : ffff800087093a10
> [ 113.947197][ T3282] x29: ffff800087093a10 x28: ffff0000d7a9d400 x27: ffff80008249f0a0
> [ 113.948165][ T3282] x26: 0000000000000000 x25: ffff80008249f2b0 x24: 0000000000000000
> [ 113.949134][ T3282] x23: 0000000000000001 x22: 0000000000000001 x21: 0000000000000000
> [ 113.950597][ T3282] x20: ffff0000c08fcad8 x19: ffff80008251e000 x18: ffffffffffffffff
> [ 113.952207][ T3282] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800081746210
> [ 113.953161][ T3282] x14: 0000000000000000 x13: 205d323832335420 x12: 5b5d353031313339
> [ 113.954120][ T3282] x11: ffff800087093500 x10: 000000000000005d x9 : 00000000ffffffd0
> [ 113.955078][ T3282] x8 : 7f7f7f7f7f7f7f7f x7 : ffff80008236ba90 x6 : c0000000ffff7fff
> [ 113.956036][ T3282] x5 : ffff000b34bf4dc8 x4 : ffff8000820aba90 x3 : 0000000000000001
> [ 113.956994][ T3282] x2 : ffff800ab320f000 x1 : 841d1e35ac932e00 x0 : 0000000000000000
> [ 113.957962][ T3282] Call trace:
> [ 113.958350][ T3282] pgalloc_tag_sub.part.66+0x154/0x164
> [ 113.959000][ T3282] pgalloc_tag_sub+0x14/0x1c
> [ 113.959539][ T3282] free_unref_page+0xf4/0x4b8
> [ 113.960096][ T3282] __folio_put+0xd4/0x120
> [ 113.960614][ T3282] folio_put+0x24/0x50
> [ 113.961103][ T3282] unpoison_memory+0x4f0/0x5b0
> [ 113.961678][ T3282] hwpoison_unpoison+0x30/0x48 [hwpoison_inject]
> [ 113.962436][ T3282] simple_attr_write_xsigned.isra.34+0xec/0x1cc
> [ 113.963183][ T3282] simple_attr_write+0x38/0x48
> [ 113.963750][ T3282] debugfs_attr_write+0x54/0x80
> [ 113.964330][ T3282] full_proxy_write+0x68/0x98
> [ 113.964880][ T3282] vfs_write+0xdc/0x4d0
> [ 113.965372][ T3282] ksys_write+0x78/0x100
> [ 113.965875][ T3282] __arm64_sys_write+0x24/0x30
> [ 113.966440][ T3282] invoke_syscall+0x7c/0x104
> [ 113.966984][ T3282] el0_svc_common.constprop.1+0x88/0x104
> [ 113.967652][ T3282] do_el0_svc+0x2c/0x38
> [ 113.968893][ T3282] el0_svc+0x3c/0x1b8
> [ 113.969379][ T3282] el0t_64_sync_handler+0x98/0xbc
> [ 113.969980][ T3282] el0t_64_sync+0x19c/0x1a0
> [ 113.970511][ T3282] ---[ end trace 0000000000000000 ]---
>
> Link [1]: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise11.c
>
> Fixes: a8fc28dad6d5 ("alloc_tag: introduce clear_page_tag_ref() helper function")
> Cc: stable@vger.kernel.org # v6.10
> Signed-off-by: Hao Ge
> ---
>  mm/memory-failure.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 7066fc84f351..570388c41532 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -2623,6 +2623,12 @@ int unpoison_memory(unsigned long pfn)
>
>  	folio_put(folio);
>  	if (TestClearPageHWPoison(p)) {
> +		/* the PG_hwpoison page will be caught and isolated
> +		 * on the entrance to the free buddy page pool.
> +		 * so,when we clear this flag and return it to the buddy system,
> +		 * clear it's codetag
> +		 */
> +		clear_page_tag_ref(p);
>  		folio_put(folio);
>  		ret = 0;
>  	}
>
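
For reference, the two scenarios asked about in this reply can be walked through with a small userspace model. This is only a sketch under assumptions, not kernel code: every `model_*` name is hypothetical, the "empty" sentinel only loosely stands in for what clear_page_tag_ref() does, and the warning counter only loosely stands in for the "alloc_tag was not set" WARN. It shows the add/sub bookkeeping being questioned, nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-callsite allocation counter. */
struct model_tag { long live; };

/* Hypothetical sentinel meaning "tag reference was marked empty". */
static struct model_tag model_empty;

/* Hypothetical page: NULL tag means "no tag set". */
struct model_page { struct model_tag *tag; };

/* Counts how often a sub was attempted with no tag set. */
static int model_warnings;

static void model_tag_add(struct model_page *page, struct model_tag *tag)
{
	assert(page != NULL && tag != NULL);
	page->tag = tag;
	tag->live++;
}

static void model_tag_sub(struct model_page *page)
{
	assert(page != NULL);
	if (page->tag == NULL) {
		model_warnings++;	/* models "alloc_tag was not set" */
		return;
	}
	if (page->tag != &model_empty)
		page->tag->live--;	/* only a real tag is decremented */
	page->tag = NULL;
}

static void model_clear_tag_ref(struct model_page *page)
{
	assert(page != NULL);
	page->tag = &model_empty;
}

/*
 * First scenario: add on allocation, sub when the poisoned page is
 * isolated, sub again when the unpoisoned page is freed.
 * Returns the number of warnings; *live_out gets the final counter.
 */
static int model_scenario_double_sub(bool clear_before_free, long *live_out)
{
	struct model_tag tag = { 0 };
	struct model_page page = { NULL };

	model_warnings = 0;
	model_tag_add(&page, &tag);		/* step 1: allocation */
	model_tag_sub(&page);			/* step 2: isolation */
	if (clear_before_free)
		model_clear_tag_ref(&page);	/* what the patch adds */
	model_tag_sub(&page);			/* step 3: free after unpoison */
	*live_out = tag.live;
	return model_warnings;
}

/*
 * Second scenario: isolation never happened, so the tag is still set
 * when it is marked empty and the page is freed.
 */
static int model_scenario_unhandled(long *live_out)
{
	struct model_tag tag = { 0 };
	struct model_page page = { NULL };

	model_warnings = 0;
	model_tag_add(&page, &tag);		/* step 1: allocation */
	model_clear_tag_ref(&page);		/* step 3: clear without sub */
	model_tag_sub(&page);			/* free after unpoison */
	*live_out = tag.live;
	return model_warnings;
}
```

In this model the first scenario warns once without the clear and stays silent with it, while the second scenario stays silent but leaves the counter at 1, which is the kind of accounting drift the second question is about. Whether the real code behaves this way depends on details the model does not capture.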