From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 28 Nov 2022 15:15:49 -0500
From: Peter Xu
To: Jann Horn
Cc: Yang Shi, security@kernel.org, Andrew Morton, David Hildenbrand, John Hubbard, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4 2/3] mm/khugepaged: Fix GUP-fast interaction by sending IPI
References: <20221128180252.1684965-1-jannh@google.com> <20221128180252.1684965-2-jannh@google.com>

On Mon, Nov 28, 2022 at 08:56:54PM +0100, Jann Horn wrote:
> On Mon, Nov 28, 2022 at 8:54 PM Yang Shi wrote:
> >
> > On Mon, Nov 28, 2022 at 10:03 AM Jann Horn wrote:
> > >
> > > Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP
> > > collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to
> > > ensure that the page table was not removed by khugepaged in between.
> > >
> > > However, lockless_pages_from_mm() still requires that the page table is not
> > > concurrently freed or reused to store non-PTE data. Otherwise, problems
> > > can occur because:
> > >
> > >  - deposited page tables can be freed when a THP page somewhere in the
> > >    mm is removed
> > >  - some architectures store non-PTE information inside deposited page
> > >    tables (see radix__pgtable_trans_huge_deposit())
> > >
> > > Additionally, lockless_pages_from_mm() is also somewhat brittle with
> > > regards to page tables being repeatedly moved back and forth, but
> > > that shouldn't be an issue in practice.
> > >
> > > Fix it by sending IPIs (if the architecture uses
> > > semi-RCU-style page table freeing) before freeing/reusing page tables.
> > >
> > > As noted in mm/gup.c, on configs that define CONFIG_HAVE_FAST_GUP,
> > > there are two possible cases:
> > >
> > >  1. CONFIG_MMU_GATHER_RCU_TABLE_FREE is set, causing
> > >     tlb_remove_table_sync_one() to send an IPI to synchronize with
> > >     lockless_pages_from_mm().
> > >  2. CONFIG_MMU_GATHER_RCU_TABLE_FREE is unset, indicating that all
> > >     TLB flushes are already guaranteed to send IPIs.
> > >     tlb_remove_table_sync_one() will do nothing, but we've already
> > >     run pmdp_collapse_flush(), which did a TLB flush, which must have
> > >     involved IPIs.
> >
> > I'm trying to catch up with the discussion after the holiday break. I
> > understand you switched from always allocating a new page table page
> > (we decided before) to sending IPIs to serialize against fast-GUP,
> > this is fine to me.
> >
> > So the code now looks like:
> >     pmdp_collapse_flush()
> >     sending IPI
> >
> > But the missing part is how we reached "TLB flushes are already
> > guaranteed to send IPIs" when CONFIG_MMU_GATHER_RCU_TABLE_FREE is
> > unset? ARM64 doesn't do it IIRC. Or did I miss something?
>
> From arch/arm64/Kconfig:
>
>     select MMU_GATHER_RCU_TABLE_FREE
>
> CONFIG_MMU_GATHER_RCU_TABLE_FREE is not a config option that the user
> can freely toggle; it is an option selected by the architecture.

True. I think I understand what Yang is confused about, and I had the same
question (asked in the old threads, but I haven't got a confirmation yet):
it is also true that arm64 does not use IPIs for TLB flushes (according to
the arm64 version of __flush_tlb_range), so PPC doesn't seem to be the only
one.

I mentioned PPC only because I saw the comment in mmu_gather.c:

 * Architectures that do not have this (PPC) need to delay the freeing by some
 * other means, this is that means.

So I think that comment is obsolete.

In short, IIUC there's just an implicit dependency that any
!MMU_GATHER_RCU_TABLE_FREE arch must use IPIs for TLB flushes (not vice
versa, hence arm64 can have RCU_TABLE_FREE), or something could be broken.

-- 
Peter Xu
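
To make the implicit dependency above concrete, here is a minimal userspace
sketch in C. It is not kernel code: struct arch_model, its two flags, the
three example entries and collapse_serializes_gup_fast() are all
hypothetical names invented for illustration. It only encodes the reasoning
from this thread, assuming that GUP-fast runs with IRQs disabled so that any
IPI sent after the pmd is cleared has to wait for a concurrent walker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two per-arch properties discussed above. */
struct arch_model {
	const char *name;
	bool rcu_table_free;      /* arch selects MMU_GATHER_RCU_TABLE_FREE */
	bool tlb_flush_sends_ipi; /* every TLB flush interrupts remote CPUs */
};

/*
 * Returns true if the collapse path is serialized against GUP-fast,
 * i.e. at least one IPI is sent between clearing the pmd and
 * freeing/reusing the page table.
 */
static bool collapse_serializes_gup_fast(const struct arch_model *a)
{
	/* pmdp_collapse_flush(): an IPI only if TLB flushes use IPIs. */
	bool ipi_from_flush = a->tlb_flush_sends_ipi;
	/* tlb_remove_table_sync_one(): an IPI only with RCU table free. */
	bool ipi_from_sync = a->rcu_table_free;

	return ipi_from_flush || ipi_from_sync;
}

/* Illustrative entries; only the arm64-like one is taken from the thread. */
static const struct arch_model arm64_like = { "arm64-like", true, false };
static const struct arch_model ipi_flush_only = { "ipi-flush-only", false, true };
static const struct arch_model neither = { "neither", false, false };

/* Asserts the invariant for the two configurations expected to be safe. */
static void check_safe_models(void)
{
	assert(collapse_serializes_gup_fast(&arm64_like));
	assert(collapse_serializes_gup_fast(&ipi_flush_only));
}
```

The "neither" entry is the case that would be broken: an arch that leaves
MMU_GATHER_RCU_TABLE_FREE unset and also flushes the TLB without IPIs gets
no serialization from either step.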