[PATCH] percpu: km: Use for SMP+NOMMU

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Vladimir Murzin <vladimir.murzin@arm.com>
To: linux-arch-owner@vger.kernel.org, linux-mm@kvack.org
Cc: dennis@kernel.org, tj@kernel.org, cl@linux.com,
	akpm@linux-foundation.org, npiggin@gmail.com, hch@lst.de,
	arnd@arndb.de, vladimir.murzin@arm.com
Subject: [PATCH] percpu: km: Use for SMP+NOMMU
Date: Tue, 30 Nov 2021 17:29:53 +0000	[thread overview]
Message-ID: <20211130172954.129587-1-vladimir.murzin@arm.com> (raw)

I have recently updated kernel for R-class (SMP+NOMMU) and observed
following build failure:

arm-none-linux-gnueabihf-ld: mm/percpu.o: in function `pcpu_post_unmap_tlb_flush':
mm/percpu-vm.c:188: undefined reference to `flush_tlb_kernel_range'

ARM NOMMU declare flush_tlb_kernel_range() but doesn't define it, so I
tried to understand what pulls that function...

First candidate is 93274f1dd6b0 ("percpu: flush tlb in
pcpu_reclaim_populated()") which added another user of
pcpu_post_unmap_tlb_flush(), yet simple revert did not make things
better...

The second candidate turned to be 4ad0ae8c64ac ("mm/vmalloc: remove
unmap_kernel_range"), more precisely NOMMU part. This one is
interesting. Before conversion to vmap_pages_range_noflush() we had

static inline int
map_kernel_range_noflush(unsigned long start, unsigned long size,
                        pgprot_t prot, struct page **pages)
{
        return size >> PAGE_SHIFT;
}

static int __pcpu_map_pages(unsigned long addr, struct page **pages,
                            int nr_pages)
{
        return map_kernel_range_noflush(addr, nr_pages << PAGE_SHIFT,
                                        PAGE_KERNEL, pages);
}

static int pcpu_map_pages(struct pcpu_chunk *chunk,
                          struct page **pages, int page_start, int page_end)
{
        unsigned int cpu, tcpu;
        int i, err;

        for_each_possible_cpu(cpu) {
                err = __pcpu_map_pages(pcpu_chunk_addr(chunk, cpu, page_start),
                                       &pages[pcpu_page_idx(cpu, page_start)],
                                       page_end - page_start);
                if (err < 0)
                        goto err;

                for (i = page_start; i < page_end; i++)
                        pcpu_set_page_chunk(pages[pcpu_page_idx(cpu, i)],
                                            chunk);
        }
        return 0;
err:
        for_each_possible_cpu(tcpu) {
                if (tcpu == cpu)
                        break;
                __pcpu_unmap_pages(pcpu_chunk_addr(chunk, tcpu, page_start),
                                   page_end - page_start);
        }
        pcpu_post_unmap_tlb_flush(chunk, page_start, page_end);
        return err;
}

Here __pcpu_map_pages() would never return negative value, so
compiler optimizes error path and pcpu_post_unmap_tlb_flush() never
referenced.

After conversion to vmap_pages_range_noflush() we got

static inline
int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
                pgprot_t prot, struct page **pages, unsigned int page_shift)
{
       return -EINVAL;
}

static int __pcpu_map_pages(unsigned long addr, struct page **pages,
                            int nr_pages)
{
        return vmap_pages_range_noflush(addr, addr + (nr_pages << PAGE_SHIFT),
                                        PAGE_KERNEL, pages, PAGE_SHIFT);
}

and pcpu_map_pages() unchanged.

Here __pcpu_map_pages() always return negative value, so compiler
cannot optimise error path and pcpu_post_unmap_tlb_flush() stay
referenced.

Now it is kind of clear why it worked before and why it refuses to build.

Next is to understand how to fix that. I noticed [1] following comment
from Nicholas:

> Previous code had a strange NOMMU implementation of
> map_kernel_range_noflush that came in with commit b554cb426a955
> ("NOMMU: support SMP dynamic percpu_alloc") which would return
> success if the size was <= PAGE_SIZE, but that has no way of working
> on NOMMU because the page can not be mapped to the new start address
> even if there is only one of them. So change this code to always
> return failure.
>
> NOMMU probably needs to take a closer look at what it does with
> percpu-vm.c and carve out a special case there if necessary rather
> than pretend vmap works.

So I started looking into it. First thing I noticed is than UP+NOMMU
doesn't run into the same issue. The reason is that they pull
mm/percpu-km.c with

config NEED_PER_CPU_KM
        depends on !SMP
        bool
        default y

Looking more into history of kernel memory based allocator it looks
like it was designed with SMP in mind at least it is mentioned in
b0c9778b1d07 ("percpu: implement kernel memory based chunk
allocation"):

> Implement an alternate percpu chunk management based on kernel memeory
> for nommu SMP architectures. 

... and even latter when NEED_PER_CPU_KM was introduced by
bbddff054587 ("percpu: use percpu allocator on UP too"):

> Currently, users of percpu allocators need to handle UP differently,
> which is somewhat fragile and ugly.  Other than small amount of
> memory, there isn't much to lose by enabling percpu allocator on UP.
> It can simply use kernel memory based chunk allocation which was added
> for SMP archs w/o MMUs.

I could not find justification on prohibiting SMP+NOMMU in patch
discussion [2] either. It looks like that dependency was overlooked and
did not consider SMP+NOMMU.

This probably also was a reason for b554cb426a95 ("NOMMU: support SMP
dynamic percpu_alloc"), yet patch author and reviewers were also included
into original b0c9778b1d07 ("percpu: implement kernel memory based chunk
allocation")...

Unless I'm missing something I'm proposing bringing kernel memory based
allocator back for use with SMP+NOMMU

Vladimir Murzin (1):
  percpu: km: ensure it is used with NOMMU (either UP or SMP)

 mm/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

-- 
2.7.4

next             reply	other threads:[~2021-11-30 17:30 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-30 17:29 Vladimir Murzin [this message]
2021-11-30 17:29 ` [PATCH] percpu: km: ensure it is used with NOMMU (either UP or SMP) Vladimir Murzin
2021-11-30 17:41   ` Dennis Zhou
2021-12-01 11:51     ` Vladimir Murzin
2021-12-03 21:02       ` Dennis Zhou
2021-12-06  8:27         ` Vladimir Murzin
2021-12-06 12:01         ` Rob Landley
2021-12-06 16:21           ` Rich Felker
2021-12-06 17:54             ` Dennis Zhou
2021-12-14 16:29       ` Geert Uytterhoeven
2021-12-14 17:26         ` Dennis Zhou
2021-12-14 19:02           ` Geert Uytterhoeven
2021-12-14 19:18             ` Dennis Zhou
2021-12-14 20:12               ` Geert Uytterhoeven
2021-12-14 20:50                 ` Dennis Zhou
2021-12-15  7:56                   ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211130172954.129587-1-vladimir.murzin@arm.com \
    --to=vladimir.murzin@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=cl@linux.com \
    --cc=dennis@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-arch-owner@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox