From: Yang Shi
Date: Thu, 20 Oct 2022 11:42:36 -0700
Subject: Re: [PATCH] hugetlbfs: don't delete error page from pagecache
To: Mike Kravetz
Cc: James Houghton, Muchun Song, Naoya Horiguchi, Miaohe Lin, Andrew Morton, Axel Rasmussen, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20221018200125.848471-1-jthoughton@google.com>

On Wed, Oct 19, 2022 at 11:55 AM Mike Kravetz wrote:
>
> On 10/19/22 11:31, Yang Shi wrote:
> > On Tue, Oct 18, 2022 at 1:01 PM James Houghton wrote:
> > >
> > > This change is very similar to the change that was made for shmem [1],
> > > and it solves the same problem but for HugeTLBFS instead.
> > >
> > > Currently, when poison is found in a HugeTLB page, the page is removed
> > > from the page cache. That means that attempting to map or read that
> > > hugepage in the future will result in a new hugepage being allocated
> > > instead of notifying the user that the page was poisoned.
> > > As [1] states, this is effectively memory corruption.
> > >
> > > The fix is to leave the page in the page cache. If the user attempts to
> > > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > > EIO, the same error code that shmem uses. For attempts to map the page,
> > > the thread will get a BUS_MCEERR_AR SIGBUS.
> > >
> > > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> > >
> > > Signed-off-by: James Houghton
> >
> > Thanks for the patch. Yes, we should do the same thing for hugetlbfs.
> > When I was working on shmem I did look into hugetlbfs too. But the
> > problem is we actually make the whole hugetlb page unavailable even
> > though just one 4K sub-page is hwpoisoned. It may be fine for a 2M
> > hugetlb page, but it may waste a lot of memory for a 1G hugetlb
> > page, particularly in the page fault path.
>
> One thing that complicated this a bit is the vmemmap optimizations for
> hugetlb. However, I believe Naoya may have addressed this recently.
>
> > So I discussed this with Mike offline last year, and I was told Google
> > was working on PTE-mapped hugetlb pages. That should be able to solve
> > the problem. And we'd like to have the high-granularity hugetlb
> > mapping support as the predecessor.
>
> Yes, I went back in my notes and noticed it had been one year. No offense
> intended to James and his great work on HGM. However, in hindsight we should
> have fixed this in some way without waiting for an HGM-based solution.
>
> > There were some other details, but I can't remember all of them. I
> > have to refresh my memory by rereading the email discussions...
>
> I think the complicating factor was vmemmap optimization. As mentioned
> above, this may have already been addressed by Naoya in patches to
> indicate which sub-page(s) had the actual error.
>
> As Yang Shi notes, this patch makes the entire hugetlb page inaccessible.
> With some work, we could allow reads to everything but the sub-page with
> error. However, this should be much easier with HGM. And, we could
> potentially even do page faults everywhere but the sub-page with error.
>
> I still think it may be better to wait for HGM instead of trying to do
> read access to all but sub-page with error now. But, entirely open to
> other opinions.

I have no strong preference about which goes first.

> I plan to do a review of this patch a little later.
> --
> Mike Kravetz
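
For anyone following this from the userspace side, here is a rough sketch of how the two failure modes described in the quoted commit message (EIO from a syscall, and a BUS_MCEERR_AR SIGBUS on a mapped access) might be told apart. It is only an illustration under assumptions: the helper names is_poison_sigbus(), is_poison_read_error() and install_sigbus_handler() are hypothetical and not part of the patch, and the fallback value 4 for BUS_MCEERR_AR is the Linux ABI value, used only if libc does not expose the constant.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Linux si_code for "memory error consumed by this thread".
 * Fall back to the kernel ABI value if libc hides the constant. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif

/* Hypothetical helper: nonzero if the signal is the SIGBUS that the
 * commit message describes for accesses to a mapped poisoned page. */
static int is_poison_sigbus(int signo, int si_code)
{
    return signo == SIGBUS && si_code == BUS_MCEERR_AR;
}

/* Hypothetical helper: nonzero if a read()/pread() style call failed
 * with EIO, the error the commit message describes for syscalls. */
static int is_poison_read_error(ssize_t ret, int err)
{
    return ret < 0 && err == EIO;
}

/* Handler restricted to async-signal-safe calls (write, _exit). */
static void sigbus_handler(int signo, siginfo_t *info, void *ctx)
{
    (void)ctx;
    if (is_poison_sigbus(signo, info->si_code)) {
        static const char msg[] = "SIGBUS: poisoned page accessed\n";
        ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
        (void)n;
    }
    _exit(1);
}

/* Install the handler with SA_SIGINFO so si_code is available.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
static int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

A caller would run install_sigbus_handler() once at startup, and after a failed pread() on a hugetlbfs file check is_poison_read_error(ret, errno) before retrying, since with this patch a retry would not silently get a fresh page.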