Subject: Re: [PATCH 3/3] mm:fix gup_pud_range
To: Qiujun Huang
CC: Aneesh Kumar K.V
References: <1568994684-1425-1-git-send-email-hqjagain@gmail.com> <1a162778-41b9-4428-1058-82aaf82314b1@nvidia.com>
From: John Hubbard
Date: Fri, 20 Sep 2019 18:19:07 -0700

On 9/20/19 5:33 PM, Qiujun Huang wrote:
>> On 9/20/19 8:51 AM, Qiujun Huang wrote:
...
>> It would be nice if this spelled out a little more clearly what's
>> wrong. I think you and Aneesh are saying that the entry is really
>> a swap entry, created by the MCE response to a bad page?
> do_machine_check->
>   do_memory_failure->
>     memory_failure->
>       hwpoison_user_mappings
> will updated PUD level PTE entry as a swap entry.
>
> static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>                      unsigned long address, void *arg)
> {
> ...
>         if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
>                 pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));

OK, that helps. Let's add something approximately like this to the
commit description:

do_machine_check()
  do_memory_failure()
    memory_failure()
      hwpoison_user_mappings()
        try_to_unmap()
          pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));

...and now we have a swap entry that indicates that the page entry
refers to a bad (and poisoned) page of memory, but gup_fast() at this
level of the page table was ignoring swap entries, and incorrectly
assuming that "!pxd_none() == valid and present".

And this was not just a poisoned page problem, but a general swap entry
problem. So, any swap entry type (device memory migration, numa migration,
or just regular swapping) could lead to the same problem.

Fix this by checking for pxd_present(), instead of pxd_none().

> ...
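
To make the "!pxd_none() == valid and present" mistake concrete, here is
a rough userspace sketch. It is only a toy model: the entry layout, the
toy_* names, and the old_check_walks()/new_check_walks() helpers are all
made up for illustration and are not the kernel's real definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy entry layout; NOT any real architecture's format. */
typedef uint64_t toy_pxd_t;

#define TOY_PRESENT   (1ULL << 0)  /* entry may be walked or dereferenced */
#define TOY_SWP_MARK  (1ULL << 1)  /* software marker: swap-style entry */
#define TOY_SWP_SHIFT 2            /* swap payload lives above the marker */

/* An entry is "none" only when it is completely empty. */
static bool toy_pxd_none(toy_pxd_t e)
{
	return e == 0;
}

/* An entry is "present" only when the present bit is set. */
static bool toy_pxd_present(toy_pxd_t e)
{
	return (e & TOY_PRESENT) != 0;
}

/*
 * Build a non-present entry that still carries data, standing in for
 * a hwpoison, migration, or ordinary swap entry.
 */
static toy_pxd_t toy_make_swap_entry(uint64_t payload)
{
	return (payload << TOY_SWP_SHIFT) | TOY_SWP_MARK;
}

/* Old behavior: anything that is not none is assumed to be walkable. */
static bool old_check_walks(toy_pxd_t e)
{
	return !toy_pxd_none(e);
}

/* Patched behavior: bail out on none, and also on not-present. */
static bool new_check_walks(toy_pxd_t e)
{
	if (toy_pxd_none(e))
		return false;
	if (!toy_pxd_present(e))
		return false;
	return true;
}
```

In this model a swap-style entry is neither none nor present, so the old
check walks into it and the new one stops, which is the same three-way
distinction (none / present / not-none-but-not-present) described above.
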
>>
>>>
>>> Signed-off-by: Qiujun Huang
>>> ---
>>>  mm/gup.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/mm/gup.c b/mm/gup.c
>>> index 98f13ab..6157ed9 100644
>>> --- a/mm/gup.c
>>> +++ b/mm/gup.c
>>> @@ -2230,6 +2230,8 @@ static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
>>>  		next = pud_addr_end(addr, end);
>>>  		if (pud_none(pud))
>>>  			return 0;
>>> +		if (unlikely(!pud_present(pud)))
>>> +			return 0;
>>
>> If the MCE hwpoison behavior puts in swap entries, then it seems like all
>> page table walkers would need to check for p*d_present(), and maybe at all
>> levels too, right?
> I think so
>>

Should those changes be part of this fix, do you think?

thanks,
-- 
John Hubbard
NVIDIA