From: Vladimir Murzin <vladimir.murzin@arm.com>
To: linux-arch-owner@vger.kernel.org, linux-mm@kvack.org
Cc: dennis@kernel.org, tj@kernel.org, cl@linux.com,
akpm@linux-foundation.org, npiggin@gmail.com, hch@lst.de,
arnd@arndb.de, vladimir.murzin@arm.com
Subject: [PATCH] percpu: km: Use for SMP+NOMMU
Date: Tue, 30 Nov 2021 17:29:53 +0000 [thread overview]
Message-ID: <20211130172954.129587-1-vladimir.murzin@arm.com> (raw)
I have recently updated kernel for R-class (SMP+NOMMU) and observed
following build failure:
arm-none-linux-gnueabihf-ld: mm/percpu.o: in function `pcpu_post_unmap_tlb_flush':
mm/percpu-vm.c:188: undefined reference to `flush_tlb_kernel_range'
ARM NOMMU declare flush_tlb_kernel_range() but doesn't define it, so I
tried to understand what pulls that function...
First candidate is 93274f1dd6b0 ("percpu: flush tlb in
pcpu_reclaim_populated()") which added another user of
pcpu_post_unmap_tlb_flush(), yet simple revert did not make things
better...
The second candidate turned to be 4ad0ae8c64ac ("mm/vmalloc: remove
unmap_kernel_range"), more precisely NOMMU part. This one is
interesting. Before conversion to vmap_pages_range_noflush() we had
static inline int
map_kernel_range_noflush(unsigned long start, unsigned long size,
pgprot_t prot, struct page **pages)
{
return size >> PAGE_SHIFT;
}
static int __pcpu_map_pages(unsigned long addr, struct page **pages,
int nr_pages)
{
return map_kernel_range_noflush(addr, nr_pages << PAGE_SHIFT,
PAGE_KERNEL, pages);
}
static int pcpu_map_pages(struct pcpu_chunk *chunk,
struct page **pages, int page_start, int page_end)
{
unsigned int cpu, tcpu;
int i, err;
for_each_possible_cpu(cpu) {
err = __pcpu_map_pages(pcpu_chunk_addr(chunk, cpu, page_start),
&pages[pcpu_page_idx(cpu, page_start)],
page_end - page_start);
if (err < 0)
goto err;
for (i = page_start; i < page_end; i++)
pcpu_set_page_chunk(pages[pcpu_page_idx(cpu, i)],
chunk);
}
return 0;
err:
for_each_possible_cpu(tcpu) {
if (tcpu == cpu)
break;
__pcpu_unmap_pages(pcpu_chunk_addr(chunk, tcpu, page_start),
page_end - page_start);
}
pcpu_post_unmap_tlb_flush(chunk, page_start, page_end);
return err;
}
Here __pcpu_map_pages() would never return negative value, so
compiler optimizes error path and pcpu_post_unmap_tlb_flush() never
referenced.
After conversion to vmap_pages_range_noflush() we got
static inline
int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
pgprot_t prot, struct page **pages, unsigned int page_shift)
{
return -EINVAL;
}
static int __pcpu_map_pages(unsigned long addr, struct page **pages,
int nr_pages)
{
return vmap_pages_range_noflush(addr, addr + (nr_pages << PAGE_SHIFT),
PAGE_KERNEL, pages, PAGE_SHIFT);
}
and pcpu_map_pages() unchanged.
Here __pcpu_map_pages() always return negative value, so compiler
cannot optimise error path and pcpu_post_unmap_tlb_flush() stay
referenced.
Now it is kind of clear why it worked before and why it refuses to build.
Next is to understand how to fix that. I noticed [1] following comment
from Nicholas:
> Previous code had a strange NOMMU implementation of
> map_kernel_range_noflush that came in with commit b554cb426a955
> ("NOMMU: support SMP dynamic percpu_alloc") which would return
> success if the size was <= PAGE_SIZE, but that has no way of working
> on NOMMU because the page can not be mapped to the new start address
> even if there is only one of them. So change this code to always
> return failure.
>
> NOMMU probably needs to take a closer look at what it does with
> percpu-vm.c and carve out a special case there if necessary rather
> than pretend vmap works.
So I started looking into it. First thing I noticed is than UP+NOMMU
doesn't run into the same issue. The reason is that they pull
mm/percpu-km.c with
config NEED_PER_CPU_KM
depends on !SMP
bool
default y
Looking more into history of kernel memory based allocator it looks
like it was designed with SMP in mind at least it is mentioned in
b0c9778b1d07 ("percpu: implement kernel memory based chunk
allocation"):
> Implement an alternate percpu chunk management based on kernel memeory
> for nommu SMP architectures.
... and even latter when NEED_PER_CPU_KM was introduced by
bbddff054587 ("percpu: use percpu allocator on UP too"):
> Currently, users of percpu allocators need to handle UP differently,
> which is somewhat fragile and ugly. Other than small amount of
> memory, there isn't much to lose by enabling percpu allocator on UP.
> It can simply use kernel memory based chunk allocation which was added
> for SMP archs w/o MMUs.
I could not find justification on prohibiting SMP+NOMMU in patch
discussion [2] either. It looks like that dependency was overlooked and
did not consider SMP+NOMMU.
This probably also was a reason for b554cb426a95 ("NOMMU: support SMP
dynamic percpu_alloc"), yet patch author and reviewers were also included
into original b0c9778b1d07 ("percpu: implement kernel memory based chunk
allocation")...
Unless I'm missing something I'm proposing bringing kernel memory based
allocator back for use with SMP+NOMMU
Vladimir Murzin (1):
percpu: km: ensure it is used with NOMMU (either UP or SMP)
mm/Kconfig | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--
2.7.4
next reply other threads:[~2021-11-30 17:30 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-30 17:29 Vladimir Murzin [this message]
2021-11-30 17:29 ` [PATCH] percpu: km: ensure it is used with NOMMU (either UP or SMP) Vladimir Murzin
2021-11-30 17:41 ` Dennis Zhou
2021-12-01 11:51 ` Vladimir Murzin
2021-12-03 21:02 ` Dennis Zhou
2021-12-06 8:27 ` Vladimir Murzin
2021-12-06 12:01 ` Rob Landley
2021-12-06 16:21 ` Rich Felker
2021-12-06 17:54 ` Dennis Zhou
2021-12-14 16:29 ` Geert Uytterhoeven
2021-12-14 17:26 ` Dennis Zhou
2021-12-14 19:02 ` Geert Uytterhoeven
2021-12-14 19:18 ` Dennis Zhou
2021-12-14 20:12 ` Geert Uytterhoeven
2021-12-14 20:50 ` Dennis Zhou
2021-12-15 7:56 ` Geert Uytterhoeven
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211130172954.129587-1-vladimir.murzin@arm.com \
--to=vladimir.murzin@arm.com \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=cl@linux.com \
--cc=dennis@kernel.org \
--cc=hch@lst.de \
--cc=linux-arch-owner@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=npiggin@gmail.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox