From: Yang Shi
Date: Thu, 23 Sep 2021 13:39:49 -0700
Subject: Re: [v2 PATCH 1/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
To: "Kirill A. Shutemov"
Cc: HORIGUCHI NAOYA(堀口 直也), Hugh Dickins, "Kirill A. Shutemov",
 Matthew Wilcox, Peter Xu, Oscar Salvador, Andrew Morton, Linux MM,
 Linux FS-devel Mailing List, Linux Kernel Mailing List

On Thu, Sep 23, 2021 at 10:15 AM Yang Shi wrote:
>
> On Thu, Sep 23, 2021 at 7:39 AM Kirill A. Shutemov wrote:
> >
> > On Wed, Sep 22, 2021 at 08:28:26PM -0700, Yang Shi wrote:
> > > When handling shmem page fault the THP with corrupted subpage could be PMD
> > > mapped if certain conditions are satisfied. But kernel is supposed to
> > > send SIGBUS when trying to map hwpoisoned page.
> > >
> > > There are two paths which may do PMD map: fault around and regular fault.
> > >
> > > Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> > > the thing was even worse in fault around path. The THP could be PMD mapped as
> > > long as the VMA fits regardless what subpage is accessed and corrupted. After
> > > this commit as long as head page is not corrupted the THP could be PMD mapped.
> > >
> > > In the regulat fault path the THP could be PMD mapped as long as the corrupted
> >
> > s/regulat/regular/
> >
> > > page is not accessed and the VMA fits.
> > >
> > > This loophole could be fixed by iterating every subpage to check if any
> > > of them is hwpoisoned or not, but it is somewhat costly in page fault path.
> > >
> > > So introduce a new page flag called HasHWPoisoned on the first tail page. It
> > > indicates the THP has hwpoisoned subpage(s). It is set if any subpage of THP
> > > is found hwpoisoned by memory failure and cleared when the THP is freed or
> > > split.
> > >
> > > Cc:
> > > Suggested-by: Kirill A. Shutemov
> > > Signed-off-by: Yang Shi
> > > ---
> >
> > ...
> >
> > > diff --git a/mm/filemap.c b/mm/filemap.c
> > > index dae481293b5d..740b7afe159a 100644
> > > --- a/mm/filemap.c
> > > +++ b/mm/filemap.c
> > > @@ -3195,12 +3195,14 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
> > >  	}
> > >
> > >  	if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
> > > -		vm_fault_t ret = do_set_pmd(vmf, page);
> > > -		if (!ret) {
> > > -			/* The page is mapped successfully, reference consumed. */
> > > -			unlock_page(page);
> > > -			return true;
> > > -		}
> > > +		vm_fault_t ret = do_set_pmd(vmf, page);
> > > +		if (ret == VM_FAULT_FALLBACK)
> > > +			goto out;
> >
> > Hm.. What? I don't get it. Who will establish page table in the pmd then?
>
> Aha, yeah. It should jump to the below PMD populate section. Will fix
> it in the next version.
>
> >
> > > +		if (!ret) {
> > > +			/* The page is mapped successfully, reference consumed. */
> > > +			unlock_page(page);
> > > +			return true;
> > > +		}
> > >  	}
> > >
> > >  	if (pmd_none(*vmf->pmd)) {
> > > @@ -3220,6 +3222,7 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
> > >  		return true;
> > >  	}
> > >
> > > +out:
> > >  	return false;
> > >  }
> > >
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index 5e9ef0fc261e..0574b1613714 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -2426,6 +2426,8 @@ static void __split_huge_page(struct page *page, struct list_head *list,
> > >  	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
> > >  	lruvec = lock_page_lruvec(head);
> > >
> > > +	ClearPageHasHWPoisoned(head);
> > > +
> >
> > Do we serialize the new flag with lock_page() or what? I mean what
> > prevents the flag being set again after this point, but before
> > ClearPageCompound()?
>
> No, not in this patch. But I think we could use refcount. THP split
> would freeze refcount and the split is guaranteed to succeed after
> that point, so refcount can be checked in memory failure. The
> SetPageHasHWPoisoned() call could be moved to __get_hwpoison_page()
> when get_page_unless_zero() bumps the refcount successfully. If the
> refcount is zero it means the THP is under split or being freed, we
> don't care about these two cases.

Setting the flag in __get_hwpoison_page() would make this patch depend
on patch #3. However, this patch probably will be backported to older
versions. To ease the backport, I'd like to have the refcount check in
the same place where THP is checked, so something like "if
(PageTransHuge(hpage) && page_count(hpage) != 0)". Then the call to
set the flag could be moved to __get_hwpoison_page() in the following
patch (after patch #3). Does this sound good to you?

>
> The THP might be mapped before this flag is set, but the process will
> be killed later, so it seems fine.
>
> >
> > --
> >  Kirill A. Shutemov