From: Dan Williams
Date: Wed, 16 Oct 2019 13:02:08 -0700
Subject: Re: [RFC PATCH] mm: Fix a huge pud insertion race during faulting
To: Thomas Hellström (VMware)
Cc: "Kirill A. Shutemov", Matthew Wilcox, linux-mm, Linux Kernel Mailing List, Thomas Hellstrom
References: <20191008093711.3410-1-thomas_os@shipmail.org> <20191015100653.ittq4b2mx7pszky5@box> <3a16a199-a4bd-5503-3146-3fb24bfb2638@shipmail.org>
In-Reply-To: <3a16a199-a4bd-5503-3146-3fb24bfb2638@shipmail.org>

On Tue, Oct 15, 2019 at 10:59 PM Thomas Hellström (VMware) wrote:
>
> Hi, Dan,
>
> On 10/16/19 3:44 AM, Dan Williams wrote:
> > On Tue, Oct 15, 2019 at 3:06 AM Kirill A. Shutemov wrote:
> >> On Tue, Oct 08, 2019 at 11:37:11AM +0200, Thomas Hellström (VMware) wrote:
> >>> From: Thomas Hellstrom
> >>>
> >>> A huge pud page can theoretically be faulted in racing with pmd_alloc()
> >>> in __handle_mm_fault(). That will lead to pmd_alloc() returning an
> >>> invalid pmd pointer. Fix this by adding a pud_trans_unstable() function
> >>> similar to pmd_trans_unstable() and check whether the pud is really stable
> >>> before using the pmd pointer.
> >>>
> >>> Race:
> >>> Thread 1:             Thread 2:                 Comment
> >>> create_huge_pud()                               Fallback - not taken.
> >>>                       create_huge_pud()         Taken.
> >>> pmd_alloc()                                     Returns an invalid pointer.
> >>>
> >>> Cc: Matthew Wilcox
> >>> Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
> >>> Signed-off-by: Thomas Hellstrom
> >>> ---
> >>> RFC: We include pud_devmap() as an unstable PUD flag. Is this correct?
> >>>      Do the same for pmds?
> >> I *think* it is correct and we should do the same for PMD, but I may be
> >> wrong.
> >>
> >> Dan, Matthew, could you comment on this?
> > The _devmap() check in these paths near _trans_unstable() has always
> > been about avoiding assumptions that the corresponding page might be
> > page cache or anonymous which for dax it's neither and does not behave
> > like a typical page.
>
> The concern here is that _trans_huge() returns false for _devmap()
> pages, which means that also _trans_unstable() returns false.
>
> Still, I figure someone could zap the entry at any time using madvise(),
> so AFAICT the entry is indeed unstable, and it's a bug not to include
> _devmap() in the _trans_unstable() functions?

Yes, I can't think of a case where it is wrong to include _devmap() in a
_trans_unstable(). It may be unnecessary if the given path can't
reasonably ever encounter a file-backed dax mapping, but it's otherwise
ok to always consider _devmap().
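
For illustration only, here is a minimal userspace sketch of the check
being discussed. It is not the patch and not kernel code: the toy_*
names, the flag bits and the entry layout are all hypothetical. It
assumes a _trans_huge() that returns false for devmap entries, as
described above, and shows why the "unstable" check has to test
_devmap() explicitly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy entry; not a real page table layout. */
#define TOY_PRESENT (UINT64_C(1) << 0)
#define TOY_HUGE    (UINT64_C(1) << 1)
#define TOY_DEVMAP  (UINT64_C(1) << 2)

/* Toy stand-in for a pud entry that another thread may change. */
typedef struct { _Atomic uint64_t val; } toy_pud_t;

static bool toy_pud_none(uint64_t v)
{
	return v == 0;
}

/* Assumption: huge but not devmap, so devmap entries report false here. */
static bool toy_pud_trans_huge(uint64_t v)
{
	return (v & (TOY_HUGE | TOY_DEVMAP)) == TOY_HUGE;
}

static bool toy_pud_devmap(uint64_t v)
{
	return (v & TOY_DEVMAP) != 0;
}

/*
 * True when the entry must not be treated as a pointer to a lower-level
 * table. The value is read once so all three tests see the same snapshot.
 */
static bool toy_pud_trans_unstable(toy_pud_t *pud)
{
	uint64_t v = atomic_load(&pud->val);

	return toy_pud_none(v) || toy_pud_trans_huge(v) || toy_pud_devmap(v);
}

enum toy_walk { TOY_WALK_RETRY, TOY_WALK_DESCEND };

/* A walker only descends to the next level when the entry is stable. */
static enum toy_walk toy_walk_pud(toy_pud_t *pud)
{
	return toy_pud_trans_unstable(pud) ? TOY_WALK_RETRY : TOY_WALK_DESCEND;
}
```

In this model, dropping the toy_pud_devmap() test would make a huge
devmap entry look stable, and the walker would descend through an entry
that does not point at a lower-level table.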