Date: Tue, 10 Mar 2020 22:09:06 +0100
From: Michal Hocko
To: Jann Horn
Cc: Minchan Kim, Linux-MM, kernel list, Daniel Colascione, Dave Hansen, "Joel Fernandes (Google)"
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?
Message-ID: <20200310210906.GD8447@dhcp22.suse.cz>
References: <20200310184814.GA8447@dhcp22.suse.cz>

On Tue 10-03-20 20:11:45, Jann Horn wrote:
> On Tue, Mar 10, 2020 at 7:48 PM Michal Hocko wrote:
> > On Tue 10-03-20 19:08:28, Jann Horn wrote:
> > > Hi!
> > >
> > > From looking at the source code, it looks to me as if using
> > > MADV_PAGEOUT on a CoW anonymous mapping will page out the page if
> > > possible, even if other processes still have the same page mapped. Is
> > > that correct?
> > >
> > > If so, that's probably bad in environments where many processes (with
> > > different privileges) are forked from a single zygote process (like
> > > Android and Chrome), I think? If you accidentally call it on a CoW
> > > anonymous mapping with shared pages, you'll degrade the performance of
> > > other processes. And if an attacker does it intentionally, they could
> > > use that to aid with exploiting race conditions or weird
> > > microarchitectural stuff (e.g. the new https://lviattack.eu/lvi.pdf
> > > talks about "the assumption that attackers can provoke page faults or
> > > microcode assists for (arbitrary) load operations in the victim
> > > domain").
> > >
> > > Should madvise_cold_or_pageout_pte_range() maybe refuse to operate on
> > > pages with mapcount>1, or something like that? Or does it already do
> > > that, and I just missed the check?
> >
> > I have brought up side channel attacks earlier [1] but only in the
> > context of shared page cache pages. I didn't really consider shared
> > anonymous pages to be a real problem.
> > I was under impression that CoW
> > pages shouldn't be a real problem because any security sensible
> > applications shouldn't allow untrusted code to be forked and CoW
> > anything really important. I believe we have made this assumption
> > in other places - IIRC on gup with FOLL_FORCE but I admit I have
> > very happily forgot most details.

I have quickly checked FOLL_FORCE and it is careful to break CoW on
write access.

> Android has a "zygote" process that starts up the whole Java
> environment with a bunch of libraries before entering into a loop that
> fork()s off a child every time the user wants to launch an app. So all
> the apps, and even browser renderer processes, on the device share
> many CoW VMAs. See
> .

I still have to think about how this could be used for any reasonable
attack. But the simplest workaround is certainly to back off on pages
that are mapped multiple times, as we already do for THP. Something like
the following should work, but I haven't tested it at all:

diff --git a/mm/madvise.c b/mm/madvise.c
index 43b47d3fae02..02daa447bf47 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -351,6 +351,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 				goto regular_page;
 			return 0;
 		}
+
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			goto huge_unlock;
 
 		if (pmd_young(orig_pmd)) {
 			pmdp_invalidate(vma, addr, pmd);
@@ -426,6 +430,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			continue;
 		}
 
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			continue;
+
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
 
 		if (pte_young(ptent)) {
-- 
Michal Hocko
SUSE Labs
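
For anyone who wants to observe the behaviour discussed in this thread from
userspace, here is a rough, untested-on-your-kernel sketch. It is not part of
the patch above. It assumes Linux 5.4 or later (for MADV_PAGEOUT), a system
with swap configured, and that mincore() is a good enough residency probe,
which is only approximately true because pages still sitting in the swap cache
can be reported as resident. The helper names map_and_touch(),
resident_pages() and child_pageout_then_count() are made up for this example,
and the fallback value 21 for MADV_PAGEOUT is assumed from the uapi headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* assumed value from the Linux 5.4+ uapi headers */
#endif

#define MAX_PAGES 64	/* arbitrary limit for this sketch */

/* Map npages of private anonymous memory and write to every page. */
static unsigned char *map_and_touch(size_t npages)
{
	size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
	memset(p, 0x5a, len);	/* fault in every page */
	return p;
}

/* Number of pages mincore() reports as resident, or -1 on error. */
static long resident_pages(void *addr, size_t npages)
{
	unsigned char vec[MAX_PAGES];
	size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
	long count = 0;

	assert(npages <= MAX_PAGES);
	if (mincore(addr, len, vec) != 0)
		return -1;
	for (size_t i = 0; i < npages; i++)
		count += vec[i] & 1;	/* bit 0: page is resident */
	return count;
}

/*
 * Fork a child that shares the pages CoW and calls MADV_PAGEOUT on them,
 * then report how many pages are still resident as seen by the parent.
 */
static long child_pageout_then_count(void *addr, size_t npages)
{
	size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: the call is advisory, so a failure is ignored here. */
		madvise(addr, len, MADV_PAGEOUT);
		_exit(0);
	}
	if (waitpid(pid, NULL, 0) < 0)
		return -1;
	return resident_pages(addr, npages);
}
```

Calling map_and_touch(8) followed by child_pageout_then_count() on the result
gives the parent's resident page count after the child's madvise() call. With
a mapcount check like the one proposed above, the count would be expected to
stay at 8, because every page is mapped by both processes. Without such a
check and with swap available, it may drop below 8. Treat any single run as
indicative only.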