From: Jann Horn
Date: Tue, 29 Nov 2022 16:30:26 +0100
Subject: Re: [PATCH v4 2/3] mm/khugepaged: Fix GUP-fast interaction by sending IPI
To: Yang Shi
Cc: security@kernel.org, Andrew Morton, David Hildenbrand, Peter Xu, John Hubbard, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20221128180252.1684965-1-jannh@google.com> <20221128180252.1684965-2-jannh@google.com>

On Mon, Nov 28, 2022 at 11:10 PM Yang Shi wrote:
>
> On Mon, Nov 28, 2022 at 12:12 PM Jann Horn wrote:
> >
> > On Mon, Nov 28, 2022 at 9:10 PM Yang Shi wrote:
> > > On Mon, Nov 28, 2022 at 11:57 AM Jann Horn wrote:
> > > >
> > > > On Mon, Nov 28, 2022 at 8:54 PM Yang Shi wrote:
> > > > >
> > > > > On Mon, Nov 28, 2022 at 10:03 AM Jann Horn wrote:
> > > > > >
> > > > > > Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP
> > > > > > collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to
> > > > > > ensure that the page table was not removed by khugepaged in between.
> > > > > >
> > > > > > However, lockless_pages_from_mm() still requires that the page table is not
> > > > > > concurrently freed or reused to store non-PTE data. Otherwise, problems
> > > > > > can occur because:
> > > > > >
> > > > > >  - deposited page tables can be freed when a THP page somewhere in the
> > > > > >    mm is removed
> > > > > >  - some architectures store non-PTE information inside deposited page
> > > > > >    tables (see radix__pgtable_trans_huge_deposit())
> > > > > >
> > > > > > Additionally, lockless_pages_from_mm() is also somewhat brittle with
> > > > > > regards to page tables being repeatedly moved back and forth, but
> > > > > > that shouldn't be an issue in practice.
> > > > > >
> > > > > > Fix it by sending IPIs (if the architecture uses
> > > > > > semi-RCU-style page table freeing) before freeing/reusing page tables.
> > > > > >
> > > > > > As noted in mm/gup.c, on configs that define CONFIG_HAVE_FAST_GUP,
> > > > > > there are two possible cases:
> > > > > >
> > > > > >  1. CONFIG_MMU_GATHER_RCU_TABLE_FREE is set, causing
> > > > > >     tlb_remove_table_sync_one() to send an IPI to synchronize with
> > > > > >     lockless_pages_from_mm().
> > > > > >  2. CONFIG_MMU_GATHER_RCU_TABLE_FREE is unset, indicating that all
> > > > > >     TLB flushes are already guaranteed to send IPIs.
> > > > > >     tlb_remove_table_sync_one() will do nothing, but we've already
> > > > > >     run pmdp_collapse_flush(), which did a TLB flush, which must have
> > > > > >     involved IPIs.
> > > > >
> > > > > I'm trying to catch up with the discussion after the holiday break. I
> > > > > understand you switched from always allocating a new page table page
> > > > > (we decided before) to sending IPIs to serialize against fast-GUP,
> > > > > this is fine to me.
> > > > >
> > > > > So the code now looks like:
> > > > >     pmdp_collapse_flush()
> > > > >     sending IPI
> > > > >
> > > > > But the missing part is how we reached "TLB flushes are already
> > > > > guaranteed to send IPIs" when CONFIG_MMU_GATHER_RCU_TABLE_FREE is
> > > > > unset? ARM64 doesn't do it IIRC. Or did I miss something?
> > > >
> > > > From arch/arm64/Kconfig:
> > > >
> > > >         select MMU_GATHER_RCU_TABLE_FREE
> > > >
> > > > CONFIG_MMU_GATHER_RCU_TABLE_FREE is not a config option that the user
> > > > can freely toggle; it is an option selected by the architecture.
> > >
> > > Aha, I see :-) BTW, shall we revert "mm: gup: fix the fast GUP race
> > > against THP collapse"? It seems not necessary anymore if this approach
> > > is used IIUC.
> >
> > Yeah, I agree.
>
> Since this patch could solve two problems: the use-after-free of the
> data page (pinned by fast-GUP) and the page table page and my patch
> will be reverted, so could you please catch both issues in this
> patch's commit log? I'd like to preserve the description of the issue
> fixed by my patch. I think that it is helpful to see the information
> about all the fixed problems in one commit instead of digging into
> another reverted commit.

OK, I will rewrite the commit message to describe the overall problem,
including the part addressed by your patch.
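
For reference, the two cases from the quoted commit message can be
modelled roughly as below. This is only a userspace sketch, not kernel
code: every model_* name is hypothetical, the bool parameter stands in
for whether the architecture selects MMU_GATHER_RCU_TABLE_FREE, and the
TLB flush in case 1 is conservatively assumed to send no IPIs at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every model_* name is hypothetical, not a kernel API. */

static int ipi_rounds;	/* simulated IPI broadcasts since the PMD was cleared */

/*
 * Stands in for pmdp_collapse_flush(): clear the PMD, then flush the TLB.
 * Modelling assumption: the flush counts as an IPI round only when the
 * architecture does not select MMU_GATHER_RCU_TABLE_FREE (case 2). In
 * case 1 a real flush may or may not use IPIs; it is counted as zero here.
 */
static void model_pmdp_collapse_flush(bool rcu_table_free)
{
	if (!rcu_table_free)
		ipi_rounds++;
}

/* Stands in for tlb_remove_table_sync_one(): IPI in case 1, no-op in case 2. */
static void model_tlb_remove_table_sync_one(bool rcu_table_free)
{
	if (rcu_table_free)
		ipi_rounds++;
}

/* Stands in for freeing or reusing (depositing) the old page table. */
static void model_reuse_page_table(void)
{
	/* A lockless walker running with IRQs off must have finished by now. */
	assert(ipi_rounds >= 1);
}

/* Returns the number of IPI rounds seen before the page table was reused. */
int model_collapse(bool rcu_table_free)
{
	ipi_rounds = 0;
	model_pmdp_collapse_flush(rcu_table_free);
	model_tlb_remove_table_sync_one(rcu_table_free);
	model_reuse_page_table();
	return ipi_rounds;
}
```

In this model, model_collapse(true) and model_collapse(false) each see
exactly one IPI round before the page table is reused, which is the
property the commit message relies on in both configurations.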