Subject: Re: [PATCH 1/2] mm: support poison recovery from do_cow_fault()
To: Kefeng Wang, Andrew Morton
CC: Jane Chu, Naoya Horiguchi, Tony Luck, Jiaqi Yan, David Hildenbrand
References: <20240906024201.1214712-1-wangkefeng.wang@huawei.com> <20240906024201.1214712-2-wangkefeng.wang@huawei.com>
From: Miaohe Lin
Message-ID: <08ecad24-99e4-717c-6de0-5f7b708aad38@huawei.com>
Date: Tue, 10 Sep 2024 09:58:58 +0800
In-Reply-To: <20240906024201.1214712-2-wangkefeng.wang@huawei.com>

On 2024/9/6 10:42, Kefeng Wang wrote:
> Like commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on
> write faults"), there is another path which could crash because it does
> not have recovery code where poison is consumed by
> the kernel in do_cow_fault(). A crash call trace from an old kernel is
> shown below, but it could also happen in the latest mainline code:
>
>   CPU: 7 PID: 3248 Comm: mpi Kdump: loaded Tainted: G OE 5.10.0 #1
>   pc : copy_page+0xc/0xbc
>   lr : copy_user_highpage+0x50/0x9c
>   Call trace:
>     copy_page+0xc/0xbc
>     do_cow_fault+0x118/0x2bc
>     do_fault+0x40/0x1a4
>     handle_pte_fault+0x154/0x230
>     __handle_mm_fault+0x1a8/0x38c
>     handle_mm_fault+0xf0/0x250
>     do_page_fault+0x184/0x454
>     do_translation_fault+0xac/0xd4
>     do_mem_abort+0x44/0xbc
>
> Fix it by using copy_mc_user_highpage() to handle this case and return
> VM_FAULT_HWPOISON for cow fault.
>
> Signed-off-by: Kefeng Wang
> ---
>  mm/memory.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 42674c0748cb..d310c073a1b3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5089,7 +5089,10 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf)
>  	if (ret & VM_FAULT_DONE_COW)
>  		return ret;
>
> -	copy_user_highpage(vmf->cow_page, vmf->page, vmf->address, vma);
> +	if (copy_mc_user_highpage(vmf->cow_page, vmf->page, vmf->address, vma)) {
> +		ret = VM_FAULT_HWPOISON;
> +		goto uncharge_out;
> +	}

When copy_mc_user_highpage() fails, vmf->page is still locked and we still
hold the extra refcount on it. So shouldn't we call unlock_page(vmf->page)
and put_page(vmf->page) before the goto uncharge_out?

Thanks.
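To make the concern concrete, here is a small userspace model of the error path. It is not kernel code and not the actual fix: every model_* name is a hypothetical stand-in, and it assumes only what is stated above, namely that the source page is locked and holds an extra reference when the copy is attempted. It sketches one way the failure branch could release both before jumping to the cleanup label.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a page; not the kernel's struct page. */
struct model_page {
	bool locked;
	int refcount;
	bool poisoned;	/* reading this page "consumes poison" */
};

enum model_fault { MODEL_OK = 0, MODEL_HWPOISON = 1 };

/* Models the fault handler handing back a locked page with an extra ref. */
static void model_lock_and_get(struct model_page *p)
{
	assert(!p->locked);
	p->locked = true;
	p->refcount++;
}

static void model_unlock_page(struct model_page *p)
{
	assert(p->locked);
	p->locked = false;
}

static void model_put_page(struct model_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Models copy_mc_user_highpage(): a nonzero return means the copy failed. */
static int model_copy_mc(struct model_page *dst, const struct model_page *src)
{
	(void)dst;
	return src->poisoned ? -1 : 0;
}

static enum model_fault model_do_cow_fault(struct model_page *cow,
					   struct model_page *src)
{
	enum model_fault ret = MODEL_OK;

	cow->refcount++;		/* models allocating the cow page */
	model_lock_and_get(src);	/* source page arrives locked + referenced */

	if (model_copy_mc(cow, src)) {
		ret = MODEL_HWPOISON;
		/* The point raised above: drop the lock and ref before bailing out. */
		model_unlock_page(src);
		model_put_page(src);
		goto uncharge_out;
	}

	/* Success: the cow page stays mapped (not modelled); release the source. */
	model_unlock_page(src);
	model_put_page(src);
	return ret;

uncharge_out:
	model_put_page(cow);		/* models freeing the unused cow page */
	return ret;
}
```

Without the two release calls in the failure branch, the model would return with src still locked and its refcount one higher than on entry, which is the leak being asked about.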