Subject: Re: [PATCH -next] mm: hwpoison: support recovery from HugePage copy-on-write faults
From: Liu Shixin
To: Andrew Morton, Mike Kravetz
CC: Naoya Horiguchi, Tony Luck, Miaohe Lin, Muchun Song
Date: Thu, 13 Apr 2023 09:49:31 +0800
Message-ID: <28bf1701-d2c6-ee2a-d92d-a603e1a1b3dd@huawei.com>
In-Reply-To: <20230412145718.0bcb7dd98112a3010711ad0b@linux-foundation.org>
References: <20230411092741.780679-1-liushixin2@huawei.com> <20230412181350.GA22818@monkey> <20230412145718.0bcb7dd98112a3010711ad0b@linux-foundation.org>

On 2023/4/13 5:57, Andrew Morton wrote:
> On Wed, 12 Apr 2023 11:13:50 -0700 Mike Kravetz wrote:
>
>> On 04/11/23 17:27, Liu Shixin wrote:
>>> Patch a873dfe1032a ("mm, hwpoison: try to recover from copy-on write faults")
>>> introduced a new copy_mc_user_highpage() function, and fixed the kernel crash
>>> when the kernel is copying a normal page as the result of a copy-on-write
>>> fault and runs into an uncorrectable error. But it doesn't work for HugeTLB.
>> Andrew asked about user-visible effects.  Perhaps, a better way of
>> stating this in the commit message might be:
>>
>> Commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on write
>> faults") introduced the routine copy_mc_user_highpage() to gracefully
>> handle copying of user pages with uncorrectable errors.  Previously,
>> such copies would result in a kernel crash.  hugetlb has separate code
>> paths for copy-on-write and does not benefit from the changes made in
>> commit a873dfe1032a.
>>
>> Modify hugetlb copy-on-write code paths to use copy_mc_user_highpage()
>> so that they can also gracefully handle uncorrectable errors in user
>> pages.  This involves changing the hugetlb specific routine
>> copy_user_folio() from type void to int so that it can return an error.
>> Modify the hugetlb userfaultfd code in the same way so that it can return
>> -EHWPOISON if it encounters an uncorrectable error.
> Thanks, but...  what are the runtime effects?  What does hugetlb
> presently do when encountering these uncorrectable errors?

I have tested the HugeTLB case using Tony's testcase [1], with
MAP_HUGETLB added to the mapping flags.

Before this patch, the kernel crashes when it hits the uncorrectable
error. After this patch, if the error occurs during copy-on-write, the
process is killed; if the error occurs in the userfaultfd path, the
operation returns -EHWPOISON.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/aegl/ras-tools.git
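
For anyone who wants to see the shape of the copy-on-write path being
discussed, here is a minimal userspace sketch. It is not Tony's
testcase: it only maps a private buffer (optionally with MAP_HUGETLB),
forks, and lets the child write so that the kernel has to copy the
page. It does not inject any memory error; that step has to be done
separately. The names map_private(), cow_once() and HPAGE_SIZE are made
up for this illustration, and the 2 MiB huge page size is an
assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed huge page size (2 MiB); check Hugepagesize in /proc/meminfo. */
#define HPAGE_SIZE (2UL * 1024 * 1024)

/* Map a private anonymous buffer, optionally backed by hugetlb pages. */
void *map_private(size_t len, int use_hugetlb)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;

	assert(len > 0);
	if (use_hugetlb)
		flags |= MAP_HUGETLB;
	return mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
}

/*
 * Populate the buffer, fork, and let the child write to it so that the
 * kernel must copy the page (copy-on-write).
 * Returns 0 if the child exited cleanly, the signal number if the child
 * was killed by a signal, or -1 on setup failure or non-zero child exit.
 */
int cow_once(size_t len, int use_hugetlb)
{
	char *buf = map_private(len, use_hugetlb);
	int status = 0;
	pid_t pid;

	if (buf == MAP_FAILED)
		return -1;
	memset(buf, 0x5a, len);		/* parent faults the page in */

	pid = fork();
	if (pid == 0) {
		buf[0] = 1;		/* write fault triggers the COW copy */
		_exit(0);
	}
	if (pid < 0 || waitpid(pid, &status, 0) < 0) {
		munmap(buf, len);
		return -1;
	}
	munmap(buf, len);

	if (WIFSIGNALED(status))
		return WTERMSIG(status);
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

#ifdef COW_DEMO_MAIN
int main(void)
{
	int ret = cow_once(HPAGE_SIZE, 1);

	if (ret < 0)
		printf("setup failed (are huge pages reserved?)\n");
	else if (ret > 0)
		printf("child killed by signal %d\n", ret);
	else
		printf("child completed the COW write\n");
	return ret < 0;
}
#endif
```

Build with -DCOW_DEMO_MAIN to get a standalone program. Without an
injected error the child simply completes the write. With the patch
applied and an uncorrectable error in the source huge page, the
expectation from the test described above is that the child is killed
rather than the kernel crashing.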