Date: Tue, 2 Jan 2024 11:23:46 +0800
From: Jisheng Zhang
To: Alexandre Ghiti
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Will Deacon,
	Aneesh Kumar K.V, Andrew Morton, Nick Piggin, Peter Zijlstra,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 3/4] riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU
References: <20231219175046.2496-1-jszhang@kernel.org>
	<20231219175046.2496-4-jszhang@kernel.org>

On Sun, Dec 31, 2023 at 07:32:47AM +0100, Alexandre Ghiti wrote:
> On 19/12/2023 18:50, Jisheng Zhang wrote:
> > In order to implement fast gup we need to ensure that the page
> > table walker is protected from page table pages being freed from
> > under it.
> >
> > riscv situation is more complicated than other architectures: some
> > riscv platforms may use IPI to perform TLB shootdown, for example,
> > those platforms which support AIA, usually the riscv_ipi_for_rfence is
> > true on these platforms; some riscv platforms may rely on the SBI to
> > perform TLB shootdown, usually the riscv_ipi_for_rfence is false on
> > these platforms. To keep software pagetable walkers safe in this case
> > we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the
> > comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in
> > include/asm-generic/tlb.h for more details.
> >
> > This patch enables MMU_GATHER_RCU_TABLE_FREE, then use
> >
> > *tlb_remove_page_ptdesc() for those platforms which use IPI to perform
> > TLB shootdown;
> >
> > *tlb_remove_ptdesc() for those platforms which use SBI to perform TLB
> > shootdown;
>
> Can you elaborate a bit more on what those functions do differently and why
> we need to differentiate IPI vs SBI TLB shootdown? I don't understand this.

Hi Alex,

If IPI is used, the local_irq_save() in lockless_pages_from_mm() on the
fast gup code path blocks the page table pages from being freed; I think
the comments there are excellent. If SBI is used, however, the
local_irq_save() in lockless_pages_from_mm() can't achieve that goal,
because local_irq_save() only disables the S-mode IPI interrupt; it can't
disable M-mode interrupts, which the SBI implementation uses to shoot
down TLB entries. So we need the MMU_GATHER_RCU_TABLE_FREE helper for
the SBI case.
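To make that concrete, here is a simplified paraphrase of the fast gup
pattern (modelled on lockless_pages_from_mm() in mm/gup.c; the walk
itself is reduced to a hypothetical gup_fast_walk() helper and all error
handling is dropped). The only point it illustrates is that the walk
runs with local interrupts disabled:

/*
 * Simplified sketch, not the exact mm/gup.c code.  The walk runs with
 * local (S-mode) interrupts disabled.  When the table free is announced
 * via IPI, the freeing CPU has to wait for this CPU to re-enable
 * interrupts before it can finish the shootdown and free the page, so
 * the walker never sees a freed table.  An SBI-based shootdown runs in
 * M-mode and is not held off by this, hence the RCU based free.
 */
static unsigned long lockless_walk(struct mm_struct *mm, unsigned long start,
				   unsigned long end, struct page **pages)
{
	unsigned long flags, nr_pinned = 0;

	local_irq_save(flags);
	/* hypothetical stand-in for the pgd/p4d/pud/pmd/pte walk */
	gup_fast_walk(mm, start, end, pages, &nr_pinned);
	local_irq_restore(flags);

	return nr_pinned;
}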
Thanks

>
> > Both case mean that disabling interrupts will block the free and
> > protect the fast gup page walker.
> >
> > Signed-off-by: Jisheng Zhang
> > ---
> >   arch/riscv/Kconfig               |  1 +
> >   arch/riscv/include/asm/pgalloc.h | 23 ++++++++++++++++++-----
> >   arch/riscv/include/asm/tlb.h     | 18 ++++++++++++++++++
> >   3 files changed, 37 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 24c1799e2ec4..d3555173d9f4 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -147,6 +147,7 @@ config RISCV
> >   	select IRQ_FORCED_THREADING
> >   	select KASAN_VMALLOC if KASAN
> >   	select LOCK_MM_AND_FIND_VMA
> > +	select MMU_GATHER_RCU_TABLE_FREE if SMP && MMU
> >   	select MODULES_USE_ELF_RELA if MODULES
> >   	select MODULE_SECTIONS if MODULES
> >   	select OF
> > diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
> > index 3c5e3bd15f46..deaf971253a2 100644
> > --- a/arch/riscv/include/asm/pgalloc.h
> > +++ b/arch/riscv/include/asm/pgalloc.h
> > @@ -102,7 +102,10 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
> >   		struct ptdesc *ptdesc = virt_to_ptdesc(pud);
> >
> >   		pagetable_pud_dtor(ptdesc);
> > -		tlb_remove_page_ptdesc(tlb, ptdesc);
> > +		if (riscv_use_ipi_for_rfence())
> > +			tlb_remove_page_ptdesc(tlb, ptdesc);
> > +		else
> > +			tlb_remove_ptdesc(tlb, ptdesc);
> >   	}
> >   }
> >
> > @@ -136,8 +139,12 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
> >   static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
> >   				  unsigned long addr)
> >   {
> > -	if (pgtable_l5_enabled)
> > -		tlb_remove_page_ptdesc(tlb, virt_to_ptdesc(p4d));
> > +	if (pgtable_l5_enabled) {
> > +		if (riscv_use_ipi_for_rfence())
> > +			tlb_remove_page_ptdesc(tlb, virt_to_ptdesc(p4d));
> > +		else
> > +			tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
> > +	}
> >   }
> >
> >   #endif /* __PAGETABLE_PMD_FOLDED */
> > @@ -169,7 +176,10 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
> >   		struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
> >
> >   		pagetable_pmd_dtor(ptdesc);
> > -		tlb_remove_page_ptdesc(tlb, ptdesc);
> > +		if (riscv_use_ipi_for_rfence())
> > +			tlb_remove_page_ptdesc(tlb, ptdesc);
> > +		else
> > +			tlb_remove_ptdesc(tlb, ptdesc);
> >   	}
> >
> >   #endif /* __PAGETABLE_PMD_FOLDED */
> > @@ -180,7 +190,10 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
> >   		struct ptdesc *ptdesc = page_ptdesc(pte);
> >
> >   		pagetable_pte_dtor(ptdesc);
> > -		tlb_remove_page_ptdesc(tlb, ptdesc);
> > +		if (riscv_use_ipi_for_rfence())
> > +			tlb_remove_page_ptdesc(tlb, ptdesc);
> > +		else
> > +			tlb_remove_ptdesc(tlb, ptdesc);
> >   	}
> >   #endif /* CONFIG_MMU */
> >
> > diff --git a/arch/riscv/include/asm/tlb.h b/arch/riscv/include/asm/tlb.h
> > index 1eb5682b2af6..a0b8b853503f 100644
> > --- a/arch/riscv/include/asm/tlb.h
> > +++ b/arch/riscv/include/asm/tlb.h
> > @@ -10,6 +10,24 @@ struct mmu_gather;
> >
> >   static void tlb_flush(struct mmu_gather *tlb);
> >
> > +#ifdef CONFIG_MMU
> > +#include <linux/swap.h>
> > +
> > +/*
> > + * While riscv platforms with riscv_ipi_for_rfence as true require an IPI to
> > + * perform TLB shootdown, some platforms with riscv_ipi_for_rfence as false use
> > + * SBI to perform TLB shootdown. To keep software pagetable walkers safe in this
> > + * case we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the
> > + * comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h
> > + * for more details.
> > + */
> > +static inline void __tlb_remove_table(void *table)
> > +{
> > +	free_page_and_swap_cache(table);
> > +}
> > +
> > +#endif /* CONFIG_MMU */
> > +
> >   #define tlb_flush tlb_flush
> >   #include <asm-generic/tlb.h>
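For completeness: with MMU_GATHER_RCU_TABLE_FREE selected, the generic
mmu_gather code defers the actual free of a page table page through RCU,
so on the SBI path the walker is protected by the grace period rather
than by blocking an IPI. Roughly like the following (a simplified,
hypothetical sketch, not the real mm/mmu_gather.c code, which batches
several tables per RCU callback and has a fallback path; the
deferred_table type and helpers below are made up for illustration):

/*
 * Hypothetical sketch of an RCU based table free: the page is only
 * handed back via __tlb_remove_table() after an RCU grace period has
 * elapsed, which cannot complete while a lockless walker is still
 * running with interrupts disabled.
 */
struct deferred_table {
	struct rcu_head rcu;
	void *table;
};

static void deferred_table_free_rcu(struct rcu_head *head)
{
	struct deferred_table *dt = container_of(head, struct deferred_table, rcu);

	__tlb_remove_table(dt->table);	/* the arch hook added in asm/tlb.h above */
	kfree(dt);
}

static void deferred_table_free(void *table)
{
	struct deferred_table *dt = kmalloc(sizeof(*dt), GFP_ATOMIC);

	if (dt) {
		dt->table = table;
		call_rcu(&dt->rcu, deferred_table_free_rcu);
	}
	/* the real code falls back to an RCU-synchronizing free if allocation fails */
}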