linux-mm.kvack.org archive mirror
From: Jerome Glisse <jglisse@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH] mm: fix maxnode for mbind(), set_mempolicy() and migrate_pages()
Date: Tue, 23 Jul 2024 09:19:07 -0700
Message-ID: <CAPTQFZSuhMOUhNH06ePUXCiQtqOxUbbn5cT+cD=GqXnkVSGD=w@mail.gmail.com>
In-Reply-To: <0c390494-e6ba-4cde-aace-cd726f2409a1@redhat.com>


On Mon, 22 Jul 2024 at 06:09, David Hildenbrand <david@redhat.com> wrote:

> On 20.07.24 19:35, Jerome Glisse wrote:
> > Because of the maxnode bug, there is no way to bind or migrate pages to
> > the last node in a multi-node NUMA system unless you lie about maxnode
> > when making the mbind, set_mempolicy or migrate_pages syscall.
> >
> > The man pages for those syscalls describe maxnode as the number of bits
> > in the node bitmap ("bit mask of nodes containing up to maxnode bits").
> > Thus if maxnode is n, we expect an n-bit bitmap, which means that the
> > mask of valid bits is ((1 << n) - 1). The get_nodes() decrement leads
> > to the mask being ((1 << (n - 1)) - 1).
> >
> > The three syscalls use a common helper, get_nodes(), and the first thing
> > this helper does is decrement maxnode by 1, which leads to using n-1 bits
> > in the provided mask of nodes (see get_bitmap(), a helper function of
> > get_nodes()).
> >
> > This leads to two bugs: either the last node in the provided bitmap is
> > not used by any of the three syscalls, or the syscall errors out and
> > returns EINVAL if the only bit set in the bitmap is the last bit in the
> > mask of nodes (that bit is ignored because of the bug, and an empty
> > mask of nodes is an invalid argument).
> >
> > I am surprised this bug was never caught ... it has been in the kernel
> > since forever.
>
> Let's look at QEMU: backends/hostmem.c
>
>      /*
>       * We can have up to MAX_NODES nodes, but we need to pass maxnode+1
>       * as argument to mbind() due to an old Linux bug (feature?) which
>       * cuts off the last specified node. This means backend->host_nodes
>       * must have MAX_NODES+1 bits available.
>       */
>
> Which means that it's been known for a long time, and the workaround
> seems to be pretty easy.
>
> So I wonder if we rather want to update the documentation to match reality.
>

I think it is odd to ask callers to pass maxnode+1 to work around the bug.
If we apply this patch, QEMU would continue to work as is, and callers that
were not aware of the bug would be fixed. So I would say applying this patch
does more good than harm. Long term, QEMU can drop its workaround or keep it
for backward compatibility with older kernels.
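
To make the arithmetic concrete, here is a small user-space sketch of the two
masks from the patch description and of why a caller that already passes
maxnode+1 should see the same node set either way. This is not the kernel
code: documented_mask(), decremented_mask() and effective_nodes() are
hypothetical helpers, the bitmap is limited to a single 64-bit word, and
maxnode is assumed to be at least 1.

```c
#include <assert.h>
#include <stdint.h>

/* Valid-bit mask if maxnode is read literally as "number of bits",
 * i.e. ((1 << n) - 1). Hypothetical helper, single 64-bit word only. */
static uint64_t documented_mask(unsigned int maxnode)
{
    assert(maxnode <= 64);
    if (maxnode == 64)
        return UINT64_MAX;          /* avoid an undefined 64-bit shift */
    return (UINT64_C(1) << maxnode) - 1;
}

/* Valid-bit mask when maxnode is decremented before use, as the patch
 * description says get_nodes() does: ((1 << (n - 1)) - 1). */
static uint64_t decremented_mask(unsigned int maxnode)
{
    assert(maxnode >= 1 && maxnode <= 64);
    return documented_mask(maxnode - 1);
}

/* Nodes that survive masking; 0 stands for the "empty mask" case that
 * the description says ends in EINVAL. */
static uint64_t effective_nodes(uint64_t bitmap, uint64_t mask)
{
    return bitmap & mask;
}
```

With four nodes and only the last one (bit 3) set, decremented_mask(4) drops
the bit and leaves an empty set, while documented_mask(4) keeps it. Passing 5
instead of 4 keeps bit 3 under both masks, because the extra bit in the
caller's bitmap is zero.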

