Message-ID: <584eedd3-9369-9df1-39e2-62e331abdcc0@bytedance.com>
Date: Sun, 5 Jun 2022 12:24:24 +0800
Subject: Re: Re: [PATCH] mm/memory-failure: don't allow to unpoison hw corrupted page
To: Andrew Morton
Cc: naoya.horiguchi@nec.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Tony Luck, Wu Fengguang
References: <20220604103229.3378591-1-pizhenwei@bytedance.com> <20220604115616.b7d5912ac5a37db608f67b78@linux-foundation.org>
From: zhenwei pi
In-Reply-To: <20220604115616.b7d5912ac5a37db608f67b78@linux-foundation.org>

On 6/5/22 02:56, Andrew Morton wrote:
> On Sat, 4 Jun 2022 18:32:29 +0800 zhenwei pi wrote:
>
>> Currently unpoison_memory(unsigned long pfn) is designed for soft
>> poison(hwpoison-inject) only. Unpoisoning a hardware corrupted page
>> puts page back buddy only, this leads BUG during accessing on the
>> corrupted KPTE.
>>
>> Do not allow to unpoison hardware corrupted page in unpoison_memory()
>> to avoid BUG like this:
>>
>>  Unpoison: Software-unpoisoned page 0x61234
>>  BUG: unable to handle page fault for address: ffff888061234000
>
> Thanks.
>
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -2090,6 +2090,7 @@ int unpoison_memory(unsigned long pfn)
>>  {
>>  	struct page *page;
>>  	struct page *p;
>> +	pte_t *kpte;
>>  	int ret = -EBUSY;
>>  	int freeit = 0;
>>  	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
>> @@ -2101,6 +2102,13 @@ int unpoison_memory(unsigned long pfn)
>>  	p = pfn_to_page(pfn);
>>  	page = compound_head(p);
>>
>> +	kpte = virt_to_kpte((unsigned long)page_to_virt(p));
>> +	if (kpte && !pte_present(*kpte)) {
>> +		unpoison_pr_info("Unpoison: Page was hardware poisoned %#lx\n",
>> +				 pfn, &unpoison_rs);
>> +		return -EPERM;
>> +	}
>> +
>>  	mutex_lock(&mf_mutex);
>>
>>  	if (!PageHWPoison(p)) {
>
> I guess we don't want to let fault injection crash the kernel, so a
> cc:stable seems appropriate here.
>
> Can we think up a suitable Fixes: commit? I'm suspecting this bug has
> been there for a long time?
>

Sure!

On 2009-Dec-16, hwpoison_unpoison() was introduced into Linux in commit
847ce401df392 ("HWPOISON: Add unpoisoning support"):
...
There is no hardware level unpoisioning, so this cannot be used for
real memory errors, only for software injected errors.
...

Both the commit log and the comment in the source code say that this
function should be used for software-level unpoisoning only.
Unfortunately, there is no check for this in hwpoison_unpoison().

On 2020-May-20, commit 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire
page if the whole page is affected and poisoned") started clearing the
KPTE. Since then, unpoisoning a hardware corrupted page leads to the BUG
described in this patch.

Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support")
Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")
Cc: Wu Fengguang
Cc: Tony Luck

-- 
zhenwei pi