From: Vladimir Murzin <vladimir.murzin@arm.com>
To: linux-arch-owner@vger.kernel.org, linux-mm@kvack.org
Cc: dennis@kernel.org, tj@kernel.org, cl@linux.com, akpm@linux-foundation.org, npiggin@gmail.com, hch@lst.de, arnd@arndb.de, vladimir.murzin@arm.com
Subject: [PATCH] percpu: km: Use for SMP+NOMMU
Date: Tue, 30 Nov 2021 17:29:53 +0000
Message-Id:
<20211130172954.129587-1-vladimir.murzin@arm.com>

I have recently updated the kernel for an R-class (SMP+NOMMU) platform and observed the following build failure:

  arm-none-linux-gnueabihf-ld: mm/percpu.o: in function `pcpu_post_unmap_tlb_flush':
  mm/percpu-vm.c:188: undefined reference to `flush_tlb_kernel_range'

ARM NOMMU declares flush_tlb_kernel_range() but doesn't define it, so I tried to understand what pulls that function in...

The first candidate is 93274f1dd6b0 ("percpu: flush tlb in pcpu_reclaim_populated()"), which added another user of pcpu_post_unmap_tlb_flush(), yet a simple revert did not make things better...

The second candidate turned out to be 4ad0ae8c64ac ("mm/vmalloc: remove unmap_kernel_range"), more precisely its NOMMU part. This one is interesting.
Before the conversion to vmap_pages_range_noflush() we had:

  static inline int map_kernel_range_noflush(unsigned long start,
		  unsigned long size, pgprot_t prot, struct page **pages)
  {
	  return size >> PAGE_SHIFT;
  }

  static int __pcpu_map_pages(unsigned long addr, struct page **pages,
			      int nr_pages)
  {
	  return map_kernel_range_noflush(addr, nr_pages << PAGE_SHIFT,
					  PAGE_KERNEL, pages);
  }

  static int pcpu_map_pages(struct pcpu_chunk *chunk,
			    struct page **pages, int page_start, int page_end)
  {
	  unsigned int cpu, tcpu;
	  int i, err;

	  for_each_possible_cpu(cpu) {
		  err = __pcpu_map_pages(pcpu_chunk_addr(chunk, cpu, page_start),
					 &pages[pcpu_page_idx(cpu, page_start)],
					 page_end - page_start);
		  if (err < 0)
			  goto err;

		  for (i = page_start; i < page_end; i++)
			  pcpu_set_page_chunk(pages[pcpu_page_idx(cpu, i)],
					      chunk);
	  }
	  return 0;
  err:
	  for_each_possible_cpu(tcpu) {
		  if (tcpu == cpu)
			  break;
		  __pcpu_unmap_pages(pcpu_chunk_addr(chunk, tcpu, page_start),
				     page_end - page_start);
	  }
	  pcpu_post_unmap_tlb_flush(chunk, page_start, page_end);
	  return err;
  }

Here __pcpu_map_pages() can never return a negative value, so the compiler optimizes the error path away and pcpu_post_unmap_tlb_flush() is never referenced.

After the conversion to vmap_pages_range_noflush() we got:

  static inline int vmap_pages_range_noflush(unsigned long addr,
		  unsigned long end, pgprot_t prot, struct page **pages,
		  unsigned int page_shift)
  {
	  return -EINVAL;
  }

  static int __pcpu_map_pages(unsigned long addr, struct page **pages,
			      int nr_pages)
  {
	  return vmap_pages_range_noflush(addr, addr + (nr_pages << PAGE_SHIFT),
					  PAGE_KERNEL, pages, PAGE_SHIFT);
  }

with pcpu_map_pages() unchanged. Here __pcpu_map_pages() always returns a negative value, so the compiler cannot optimize the error path away and pcpu_post_unmap_tlb_flush() stays referenced.

Now it is reasonably clear why it built before and why it refuses to build now. Next is to understand how to fix it.
I noticed [1] the following comment from Nicholas:

> Previous code had a strange NOMMU implementation of
> map_kernel_range_noflush that came in with commit b554cb426a955
> ("NOMMU: support SMP dynamic percpu_alloc") which would return
> success if the size was <= PAGE_SIZE, but that has no way of working
> on NOMMU because the page can not be mapped to the new start address
> even if there is only one of them. So change this code to always
> return failure.
>
> NOMMU probably needs to take a closer look at what it does with
> percpu-vm.c and carve out a special case there if necessary rather
> than pretend vmap works.

So I started looking into it. The first thing I noticed is that UP+NOMMU doesn't run into the same issue. The reason is that it pulls in mm/percpu-km.c via:

  config NEED_PER_CPU_KM
	  depends on !SMP
	  bool
	  default y

Looking further into the history of the kernel memory based allocator, it appears it was designed with SMP in mind; at least that is mentioned in b0c9778b1d07 ("percpu: implement kernel memory based chunk allocation"):

> Implement an alternate percpu chunk management based on kernel memory
> for nommu SMP architectures.

... and even later, when NEED_PER_CPU_KM was introduced by bbddff054587 ("percpu: use percpu allocator on UP too"):

> Currently, users of percpu allocators need to handle UP differently,
> which is somewhat fragile and ugly. Other than small amount of
> memory, there isn't much to lose by enabling percpu allocator on UP.
> It can simply use kernel memory based chunk allocation which was added
> for SMP archs w/o MMUs.

I could not find a justification for prohibiting SMP+NOMMU in the patch discussion [2] either. It looks like the dependency was an oversight and SMP+NOMMU was simply not considered. This was probably also the reason for b554cb426a95 ("NOMMU: support SMP dynamic percpu_alloc"), even though that patch's author and reviewers were also involved in the original b0c9778b1d07 ("percpu: implement kernel memory based chunk allocation")...
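For illustration, the kind of change I have in mind is to relax the dependency quoted above so that NOMMU selects percpu-km even on SMP. A hypothetical sketch (the exact condition here is my assumption; the actual hunk is in the patch itself):

  config NEED_PER_CPU_KM
	  depends on !SMP || !MMU
	  bool
	  default y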
Unless I'm missing something, I'm proposing to bring the kernel memory based allocator back for use with SMP+NOMMU.

Vladimir Murzin (1):
  percpu: km: ensure it is used with NOMMU (either UP or SMP)

 mm/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

-- 
2.7.4