From: HORIGUCHI NAOYA (堀口 直也)
To: Matthew Wilcox
CC: "Kirill A. Shutemov", "linux-mm@kvack.org", "Kirill A. Shutemov", Hugh Dickins, Andi Kleen
Subject: Re: File THP and HWPoison
Date: Thu, 18 Mar 2021 17:25:12 +0000
Message-ID: <20210318172512.GA30960@hori.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20210318145716.GO3420@casper.infradead.org>
References: <20210316140947.GA3420@casper.infradead.org> <20210318140843.7dv3wnxg4geplrjx@box> <20210318145716.GO3420@casper.infradead.org>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Thu, Mar 18, 2021 at 02:57:16PM +0000, Matthew Wilcox wrote:
> On Thu, Mar 18, 2021 at 05:08:43PM +0300, Kirill A. Shutemov wrote:
> > On Tue, Mar 16, 2021 at 02:09:47PM +0000, Matthew Wilcox wrote:
> > > If we get a memory failure in the middle of a file THP, I think we handle
> > > it poorly.
> > >
> > > int memory_failure(unsigned long pfn, int flags)
> > > ...
> > >	if (TestSetPageHWPoison(p)) {
> > > ...
> > >	orig_head = hpage = compound_head(p);
> > > ...
> > >	if (PageTransHuge(hpage)) {
> > >		if (try_to_split_thp_page(p, "Memory Failure") < 0) {
> > >			action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
> > >			return -EBUSY;
> > >		}
> > >
> > > static int try_to_split_thp_page(struct page *page, const char *msg)
> > > {
> > >	lock_page(page);
> > >	if (!PageAnon(page) || unlikely(split_huge_page(page))) {
> > >		unsigned long pfn = page_to_pfn(page);
> > >
> > >		unlock_page(page);
> > >		if (!PageAnon(page))
> > >			pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
> > >		else
> > >			pr_info("%s: %#lx: thp split failed\n", msg, pfn);
> > >		put_page(page);
> > >		return -EBUSY;
> > >
> > > So (for some reason) we don't even try to split a file THP.  But then,
> > > if we take a page fault on a file THP:
> > >
> > > static struct page *next_uptodate_page(struct page *page,
> > > ...
> > >	if (PageHWPoison(page))
> > >		goto skip;
> > > (... but we're only testing the head page here, which isn't necessarily
> > > the one which got the error ...)
> > >
> > >	if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
> > >		vm_fault_t ret = do_set_pmd(vmf, page);
> > >
> > > So we now map the PMD-sized page into userspace, even though it has a
> > > HWPoison in it.
> > >
> > > I think there are two things that we should be doing:
> > >
> > > 1. Attempt to split THPs which are file-backed.  That makes most of this
> > > problem disappear because there won't be THPs with HWPoison, mostly.
> >
> > +Naoya. Could you give more context here?

Recently, I tried to address the problem in
https://lore.kernel.org/linux-mm/20210209062128.453814-1-nao.horiguchi@gmail.com/
but the patch was found to be incorrect because the related page table
entries disappeared after split_huge_page() succeeded.  I intended to study
this further, but didn't get to it this week because I was looking at other
review requests.

A pmd mapping for an anonymous thp is replaced with 512 pte mappings by
split_huge_page(), so I'm wondering why we don't do the same for shmem thp.

>
> I did some git archaeology and found this check was introduced in
> 7f6bf39bbdd1 ("mm/hwpoison: fix panic due to split huge zero page") where
> it wasn't intended to catch _file_ pages at all, but the zero page.
> I suspect that nobody thought to look at this when introducing THP
> for shmem.

Yes, 7f6bf39bbdd1 was written before thp page cache existed, so we did not
consider it at that time.

Thanks,
Naoya Horiguchi

>
> > > 2. When the THP fails to split, use a spare page flag to indicate that
> > > the THP contains a HWPoison bit in one of its subpages.  There are a
> > > lot of PF_SECOND flags available for this purpose.
> > >
> > > but I know almost nothing about the memory-failure subsystem and I'm
> > > still learning all the complexities of THPs, so it's entirely possible
> > > I've overlooked something important.
> >
> > I wonder if it would be cleaner to switch PG_hwpoison to PF_HEAD: if
> > split failed we poison whole compound page. Yes, we will waste more
> > memory, but it makes it much cleaner for user: just check if the page is
> > poisoned.
>
> I think that's a poor quality implementation ... it'd cause processes
> to die that weren't even touching the page that had hwpoison.  Using
> a PF_SECOND bit lets us do the check as cheaply as if we made hwpoison
> PF_HEAD.
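
For illustration only, the difference between the checks discussed in this
thread can be sketched as a toy userspace model.  This is not kernel code:
struct toy_thp and every toy_* function are hypothetical names made up for
this sketch, and the 512 subpages assume a 2MB THP made of 4KB base pages.
It shows that testing only the head page misses an error in a tail page,
while a single summary flag on the compound page (the role a PF_SECOND flag
would play in the proposal) catches it just as cheaply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SUBPAGES 512	/* assumption: 2MB THP / 4KB base pages */

/* Toy stand-in for a compound page; NOT the kernel's struct page. */
struct toy_thp {
	bool subpage_poisoned[NR_SUBPAGES];	/* models per-page PG_hwpoison */
	bool has_poisoned_subpage;		/* models the proposed summary flag */
};

/* Models marking one subpage as poisoned after the split has failed. */
static void toy_mark_poison(struct toy_thp *thp, size_t idx)
{
	assert(idx < NR_SUBPAGES);
	thp->subpage_poisoned[idx] = true;
	thp->has_poisoned_subpage = true;
}

/* Models the current fault-path check: only the head page is tested. */
static bool toy_head_only_check(const struct toy_thp *thp)
{
	return thp->subpage_poisoned[0];
}

/* Models the proposed check: one flag test covers every subpage. */
static bool toy_flag_check(const struct toy_thp *thp)
{
	return thp->has_poisoned_subpage;
}

/* With per-subpage state kept, only the bad subpage itself is fatal. */
static bool toy_access_is_fatal(const struct toy_thp *thp, size_t idx)
{
	assert(idx < NR_SUBPAGES);
	return thp->subpage_poisoned[idx];
}
```

In this model, poisoning subpage 37 leaves toy_head_only_check() false, so
a PMD mapping would still be installed, whereas toy_flag_check() becomes
true and the fault path could fall back to PTE mappings.  Accesses to the
other subpages stay non-fatal, which is the objection raised above against
moving PG_hwpoison to PF_HEAD.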