From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39EB6C47074 for ; Thu, 4 Jan 2024 10:45:16 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 972C86B03B5; Thu, 4 Jan 2024 05:45:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9226F6B03B6; Thu, 4 Jan 2024 05:45:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7AA786B03B7; Thu, 4 Jan 2024 05:45:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 6485F6B03B5 for ; Thu, 4 Jan 2024 05:45:15 -0500 (EST) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 3084A1C08A4 for ; Thu, 4 Jan 2024 10:45:15 +0000 (UTC) X-FDA: 81641296590.23.92988B0 Received: from relay6-d.mail.gandi.net (relay6-d.mail.gandi.net [217.70.183.198]) by imf11.hostedemail.com (Postfix) with ESMTP id 2AE3E40015 for ; Thu, 4 Jan 2024 10:45:12 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=none; spf=pass (imf11.hostedemail.com: domain of alex@ghiti.fr designates 217.70.183.198 as permitted sender) smtp.mailfrom=alex@ghiti.fr; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1704365113; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UNW0+8L5x3ecwkKJ1GguSCZjulwXlQpn23zQDVgotzo=; b=z23qErZ/Lh+WspIWfJYXje/VOQ8xCXaJdbKBbPBwL1o2+8++Z9T8t12UjCSl33/iKMRX+L z82bB9iEmUcHKog6RJS7gr78nBmbruONo06NPfj2NlMWDQssrJvFY/rU0RMF0e4dpR8h+W 1Qvkh0P4PiBalmOaxP7eIzu4VeSEAYI= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1704365113; a=rsa-sha256; cv=none; b=1qUFr+XtzTY8x7a0eA95aVj0T6xQ+Fc70Mq7pPa5gcVbCD+I84LAhMlLxCIVn7SueBF7Ct f0YSiPvH88Ik86VjnDaH2hU74WwRT5pEwRZU7h5cA0KcNEuEq2OAl6JI2j7igEhvJ1X453 O6egMFxDYyD+jbgco93U6X5BqPEGemw= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=none; spf=pass (imf11.hostedemail.com: domain of alex@ghiti.fr designates 217.70.183.198 as permitted sender) smtp.mailfrom=alex@ghiti.fr; dmarc=none Received: by mail.gandi.net (Postfix) with ESMTPSA id 7F999C000A; Thu, 4 Jan 2024 10:45:08 +0000 (UTC) Message-ID: <3b9fb9f0-86e8-4a76-be69-9f4598434c6c@ghiti.fr> Date: Thu, 4 Jan 2024 11:45:08 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH 3/4] riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU Content-Language: en-US To: Jisheng Zhang Cc: Paul Walmsley , Palmer Dabbelt , Albert Ou , Will Deacon , "Aneesh Kumar K . V" , Andrew Morton , Nick Piggin , Peter Zijlstra , linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org References: <20231219175046.2496-1-jszhang@kernel.org> <20231219175046.2496-4-jszhang@kernel.org> From: Alexandre Ghiti In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-GND-Sasl: alex@ghiti.fr X-Rspamd-Queue-Id: 2AE3E40015 X-Rspam-User: X-Stat-Signature: 3871qqofqyib4w1sfy4azgiep7ri8h98 X-Rspamd-Server: rspam03 X-HE-Tag: 1704365112-958857 X-HE-Meta: U2FsdGVkX1/sIEAIA2bZ9iPbkgqq2EvSTx5SxL3HIobo/FVdtqxmyoOIUF+eOO1a07j2Auin6Ga8eATuRIcMe+l//0roqT4yOtvPemPYmsBQzcLywb1PPqxxkwlZsfWsUt4vtiTxJvRVUS5463ZzHvWP8ah8Hn+rpW6OsZrQDKKLSVJND0G9YP8sLm2yLqAOjh+qPpvRe1XxOvzSJXykTbHlIhe0KKo3LegouINicaNi41AZhhRUPGLw1d3XgFHzrgEva/rDApATj+2kZrqPae0dZoBozKLBAH5Jz6H2GIXbT6K3sxi1CwWeqXo5iV9vejFjhhq6lw/v2RLmpgkDMScocBtlMDh9UStOG42VI3PGyM4JRhxqiGgDh6NTNRoqcCcFk5tsaO0J+TctwlBMo7M22/d6b0m6l3K+Ajn/VVxzJsSZw15upKZwd+goGHzHNC3/FgKZGtXEdrBs1V/xSSkaMvKUfsK9Ei+C48o+Mhy+m/pSNBt5oWNiN+x1D2Bon2Wk9DpahxDe5WUSsk8Ew65xVvP5ApJiRXYtxpb31c+iW5QMnX6Q1qFHQk4I0HPB5DfpY7ki2hoEIzcg/Lo8P37bGFIDAqOQ0H9xgqbfpe97XnLZ4+cpxWTn7FRMxbAYIbG4fqVS19lqi5Iai4mIYMTkhdJiCoCRLaWC0z0zU5HKoF+Tw6W5aOMppTXcGX+n9eYsmZxxTaHpVRx4J3PYGB0l0qmQpNaOYrR+oKLmBibWzL+DacdQdQTlpRBYiBhxQEp/tiedhV5Au6wc+tuQxme7X3XSxbzLw/k3tDioRqTWjoaglmtteq+CFq1QXTPAlCY9x8l3gl/lHau11jaslFiPbWcyJqc3D4wetlaoFrgq2FnTKhtQNBmDuRacMBTx X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi Jisheng, On 02/01/2024 04:23, Jisheng Zhang wrote: > On Sun, Dec 31, 2023 at 07:32:47AM +0100, Alexandre Ghiti wrote: >> On 19/12/2023 18:50, Jisheng Zhang wrote: >>> In order to implement fast gup we need to ensure that the page >>> table walker is protected from page table pages being freed from >>> under it. >>> >>> riscv situation is more complicated than other architectures: some >>> riscv platforms may use IPI to perform TLB shootdown, for example, >>> those platforms which support AIA, usually the riscv_ipi_for_rfence is >>> true on these platforms; some riscv platforms may rely on the SBI to >>> perform TLB shootdown, usually the riscv_ipi_for_rfence is false on >>> these platforms. To keep software pagetable walkers safe in this case >>> we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the >>> comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in >>> include/asm-generic/tlb.h for more details. >>> >>> This patch enables MMU_GATHER_RCU_TABLE_FREE, then use >>> >>> *tlb_remove_page_ptdesc() for those platforms which use IPI to perform >>> TLB shootdown; >>> >>> *tlb_remove_ptdesc() for those platforms which use SBI to perform TLB >>> shootdown; >> >> Can you elaborate a bit more on what those functions do differently and why >> we need to differentiate IPI vs SBI TLB shootdown? I don't understand this. > Hi Alex, > > If IPI, the local_irq_save in lockless_pages_from_mm() of fast gup code > path will block page table pages from being freed, I think the comments > there is execellent. > > If SBI, the local_irq_save in lockless_pages_from_mm() can't acchieve > the goal however. Because local_irq_save() only disable S-privilege IPI irq, > it can't disable M-privilege's, which the SBI implementation use to > shootdown TLB entry. So we need MMU_GATHER_RCU_TABLE_FREE helper for > SBI case. Ok, I get it now, can you add the following link to your commit description https://elixir.bootlin.com/linux/v6.6/source/mm/mmu_gather.c#L162 ? It describes the problem very clearly. > Thanks > >>> Both case mean that disabling interrupts will block the free and >>> protect the fast gup page walker. >>> >>> Signed-off-by: Jisheng Zhang >>> --- >>> arch/riscv/Kconfig | 1 + >>> arch/riscv/include/asm/pgalloc.h | 23 ++++++++++++++++++----- >>> arch/riscv/include/asm/tlb.h | 18 ++++++++++++++++++ >>> 3 files changed, 37 insertions(+), 5 deletions(-) >>> >>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig >>> index 24c1799e2ec4..d3555173d9f4 100644 >>> --- a/arch/riscv/Kconfig >>> +++ b/arch/riscv/Kconfig >>> @@ -147,6 +147,7 @@ config RISCV >>> select IRQ_FORCED_THREADING >>> select KASAN_VMALLOC if KASAN >>> select LOCK_MM_AND_FIND_VMA >>> + select MMU_GATHER_RCU_TABLE_FREE if SMP && MMU >>> select MODULES_USE_ELF_RELA if MODULES >>> select MODULE_SECTIONS if MODULES >>> select OF >>> diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h >>> index 3c5e3bd15f46..deaf971253a2 100644 >>> --- a/arch/riscv/include/asm/pgalloc.h >>> +++ b/arch/riscv/include/asm/pgalloc.h >>> @@ -102,7 +102,10 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud, >>> struct ptdesc *ptdesc = virt_to_ptdesc(pud); >>> pagetable_pud_dtor(ptdesc); >>> - tlb_remove_page_ptdesc(tlb, ptdesc); >>> + if (riscv_use_ipi_for_rfence()) >>> + tlb_remove_page_ptdesc(tlb, ptdesc); >>> + else >>> + tlb_remove_ptdesc(tlb, ptdesc); >>> } >>> } >>> @@ -136,8 +139,12 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d) >>> static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d, >>> unsigned long addr) >>> { >>> - if (pgtable_l5_enabled) >>> - tlb_remove_page_ptdesc(tlb, virt_to_ptdesc(p4d)); >>> + if (pgtable_l5_enabled) { >>> + if (riscv_use_ipi_for_rfence()) >>> + tlb_remove_page_ptdesc(tlb, virt_to_ptdesc(p4d)); >>> + else >>> + tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d)); >>> + } >>> } >>> #endif /* __PAGETABLE_PMD_FOLDED */ >>> @@ -169,7 +176,10 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd, >>> struct ptdesc *ptdesc = virt_to_ptdesc(pmd); >>> pagetable_pmd_dtor(ptdesc); >>> - tlb_remove_page_ptdesc(tlb, ptdesc); >>> + if (riscv_use_ipi_for_rfence()) >>> + tlb_remove_page_ptdesc(tlb, ptdesc); >>> + else >>> + tlb_remove_ptdesc(tlb, ptdesc); >>> } >>> #endif /* __PAGETABLE_PMD_FOLDED */ >>> @@ -180,7 +190,10 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, >>> struct ptdesc *ptdesc = page_ptdesc(pte); >>> pagetable_pte_dtor(ptdesc); >>> - tlb_remove_page_ptdesc(tlb, ptdesc); >>> + if (riscv_use_ipi_for_rfence()) >>> + tlb_remove_page_ptdesc(tlb, ptdesc); >>> + else >>> + tlb_remove_ptdesc(tlb, ptdesc); >>> } >>> #endif /* CONFIG_MMU */ >>> diff --git a/arch/riscv/include/asm/tlb.h b/arch/riscv/include/asm/tlb.h >>> index 1eb5682b2af6..a0b8b853503f 100644 >>> --- a/arch/riscv/include/asm/tlb.h >>> +++ b/arch/riscv/include/asm/tlb.h >>> @@ -10,6 +10,24 @@ struct mmu_gather; >>> static void tlb_flush(struct mmu_gather *tlb); >>> +#ifdef CONFIG_MMU >>> +#include >>> + >>> +/* >>> + * While riscv platforms with riscv_ipi_for_rfence as true require an IPI to >>> + * perform TLB shootdown, some platforms with riscv_ipi_for_rfence as false use >>> + * SBI to perform TLB shootdown. To keep software pagetable walkers safe in this >>> + * case we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the >>> + * comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h A direct link would be better, I did not find the comment you mention here and even though, it can still move around later. And then you can add: Reviewed-by: Alexandre Ghiti Thanks! Alex >>> + * for more details. >>> + */ >>> +static inline void __tlb_remove_table(void *table) >>> +{ >>> + free_page_and_swap_cache(table); >>> +} >>> + >>> +#endif /* CONFIG_MMU */ >>> + >>> #define tlb_flush tlb_flush >>> #include > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv