From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 8 Jun 2023 10:25:59 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0
Subject: Re: [PATCH v2] riscv: mm: Pre-allocate PGD entries for vmalloc/modules area
Content-Language: en-US
To: Björn Töpel, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv@lists.infradead.org
Cc: Björn Töpel, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux@rivosinc.com, Alexandre Ghiti, Joerg Roedel
References: <20230531093817.665799-1-bjorn@kernel.org>
From: Alexandre Ghiti
In-Reply-To: <20230531093817.665799-1-bjorn@kernel.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

Hi Björn,

On 31/05/2023 11:38, Björn Töpel wrote:
> From: Björn Töpel
>
> The RISC-V port requires that kernel PGD entries be synchronized
> between MMs. This is done via the vmalloc_fault() function, which
> simply copies the PGD entries from init_mm to the faulting one.
>
> Historically, faulting in PGD entries has been a source of both bugs
> [1] and poor performance.
>
> One way to get rid of vmalloc faults is by pre-allocating the PGD
> entries. Pre-allocating the entries potentially wastes 64 * 4K (65 on
> SV39).
> The pre-allocation function is pulled from Jörg Rödel's x86
> work, with the addition of 3-level page tables (PMD allocations).
>
> The pmd_alloc() function needs the ptlock cache to be initialized
> (when split page locks are enabled), so the pre-allocation is done in a
> RISC-V specific pgtable_cache_init() implementation.
>
> Pre-allocate the kernel PGD entries for the vmalloc/modules area, but
> only for 64b platforms.
>
> Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ # [1]
> Signed-off-by: Björn Töpel
> ---
> v1->v2: Fixed broken !MMU build.
> ---
>  arch/riscv/mm/fault.c | 16 ++----------
>  arch/riscv/mm/init.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 60 insertions(+), 14 deletions(-)
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 8685f85a7474..b023fb311e28 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -238,24 +238,12 @@ void handle_page_fault(struct pt_regs *regs)
>  	 * only copy the information from the master page table,
>  	 * nothing more.
>  	 */
> -	if (unlikely((addr >= VMALLOC_START) && (addr < VMALLOC_END))) {
> +	if ((!IS_ENABLED(CONFIG_MMU) || !IS_ENABLED(CONFIG_64BIT)) &&
> +	    unlikely(addr >= VMALLOC_START && addr < VMALLOC_END)) {
>  		vmalloc_fault(regs, code, addr);
>  		return;
>  	}
>
> -#ifdef CONFIG_64BIT
> -	/*
> -	 * Modules in 64bit kernels lie in their own virtual region which is not
> -	 * in the vmalloc region, but dealing with page faults in this region
> -	 * or the vmalloc region amounts to doing the same thing: checking that
> -	 * the mapping exists in init_mm.pgd and updating user page table, so
> -	 * just use vmalloc_fault.
> -	 */
> -	if (unlikely(addr >= MODULES_VADDR && addr < MODULES_END)) {
> -		vmalloc_fault(regs, code, addr);
> -		return;
> -	}
> -#endif
>  	/* Enable interrupts if they were enabled in the parent context.
>  	 */
>  	if (!regs_irqs_disabled(regs))
>  		local_irq_enable();
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 747e5b1ef02d..45ceaff5679e 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -1363,3 +1363,61 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>  	return vmemmap_populate_basepages(start, end, node, NULL);
>  }
>  #endif
> +
> +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> +/*
> + * Pre-allocates page-table pages for a specific area in the kernel
> + * page-table. Only the level which needs to be synchronized between
> + * all page-tables is allocated because the synchronization can be
> + * expensive.
> + */
> +static void __init preallocate_pgd_pages_range(unsigned long start, unsigned long end,
> +					       const char *area)
> +{
> +	unsigned long addr;
> +	const char *lvl;
> +
> +	for (addr = start; addr < end && addr >= start; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
> +		pgd_t *pgd = pgd_offset_k(addr);
> +		p4d_t *p4d;
> +		pud_t *pud;
> +		pmd_t *pmd;
> +
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
> +
> +		if (pgtable_l5_enabled)
> +			continue;
> +
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
> +
> +		if (pgtable_l4_enabled)
> +			continue;
> +
> +		lvl = "pmd";
> +		pmd = pmd_alloc(&init_mm, pud, addr);
> +		if (!pmd)
> +			goto failed;
> +	}
> +	return;
> +
> +failed:
> +	/*
> +	 * The pages have to be there now or they will be missing in
> +	 * process page-tables later.
> +	 */
> +	panic("Failed to pre-allocate %s pages for %s area\n", lvl, area);
> +}
> +
> +void __init pgtable_cache_init(void)
> +{
> +	preallocate_pgd_pages_range(VMALLOC_START, VMALLOC_END, "vmalloc");
> +	if (IS_ENABLED(CONFIG_MODULES))
> +		preallocate_pgd_pages_range(MODULES_VADDR, MODULES_END, "bpf/modules");
> +}
> +#endif
>
> base-commit: ac9a78681b921877518763ba0e89202254349d1b

You can add:

Reviewed-by: Alexandre Ghiti

Thanks!

Alex