From: Florian Weimer <fweimer@redhat.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
Ralph Campbell <rcampbell@nvidia.com>,
Linux MM <linux-mm@kvack.org>,
longman@redhat.com, Linux API <linux-api@vger.kernel.org>,
Andi Kleen <ak@linux.intel.com>
Subject: Re: No system call to determine MAX_NUMNODES?
Date: Wed, 13 Feb 2019 15:25:14 +0100
Message-ID: <87d0nvepf9.fsf@oldenburg2.str.redhat.com>
In-Reply-To: <4dab8a83-803a-56e0-6bbf-bdf581f2d1b4@suse.cz> (Vlastimil Babka's message of "Wed, 13 Feb 2019 10:26:48 +0100")

* Vlastimil Babka:

> On 2/7/19 1:27 AM, Alexander Duyck wrote:
>> On Wed, Feb 6, 2019 at 3:13 PM Ralph Campbell <rcampbell@nvidia.com> wrote:
>>>
>>> I was using the latest git://git.cmpxchg.org/linux-mmotm.git and noticed
>>> a new issue compared to 5.0.0-rc5.
>>>
>>> It looks like there is no convenient way to query the kernel's value for
>>> MAX_NUMNODES yet this is used in kernel_get_mempolicy() to validate the
>>> 'maxnode' parameter to the GET_MEMPOLICY(2) system call.
>>> Otherwise, EINVAL is returned.
>>>
>>> Searching the internet for get_mempolicy yields some references that
>>> recommend reading /proc/<pid>/status and parsing the line "Mems_allowed:".
>>>
>>> Running "cat /proc/self/status | grep Mems_allowed:" I get:
>>> With 5.0.0-rc5:
>>> Mems_allowed: 00000000,00000001
>>> With 5.0.0-rc5-mm1:
>>> Mems_allowed: 1
>>> (both kernels were config'ed with CONFIG_NODES_SHIFT=6)
>>>
>>> Clearly, there should be a better way to query MAX_NUMNODES like
>>> sysconf(), sysctl(), or libnuma.
>>
>> Really we shouldn't need to know that. That just tells us about how
>> the kernel was built, it doesn't really provide any information about
>> the layout of the system.
>>
>>> I searched for the patch that changed /proc/self/status but didn't find it.
>>
>> The patch you are looking for is located at:
>> http://lkml.kernel.org/r/1545405631-6808-1-git-send-email-longman@redhat.com
>
> Hmm looks like libnuma [1] uses that /proc/self/status parsing approach for
> numa_num_possible_nodes() and it's also mentioned in man numa(3), and comment in
> code mentions that libcpuset does that as well. I'm afraid we can't just break this.

Oh-oh. This looks utterly broken to me in the face of process
migration.

Is this used for anything important? Perhaps sizing data structures in
user space?

Thanks,
Florian
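
As a rough illustration of why sizing anything from the printed width of
"Mems_allowed:" is fragile, here is a minimal Python sketch. It is not
libnuma's code: the helper name mask_width_bits is hypothetical, and the
only inputs assumed are the two sample lines Ralph quoted from 5.0.0-rc5
and 5.0.0-rc5-mm1.

```python
def mask_width_bits(status_text):
    """Infer a bit count from the printed width of the Mems_allowed mask.

    Hypothetical helper for illustration only. It counts hex digits and
    multiplies by 4, so the result depends on how the kernel formats the
    mask, not on how many nodes the system actually has.
    """
    for line in status_text.splitlines():
        # "Mems_allowed_list:" does not match because of the trailing colon.
        if line.startswith("Mems_allowed:"):
            mask = line.split(":", 1)[1].strip()
            hex_digits = mask.replace(",", "")
            return 4 * len(hex_digits)
    return None  # no Mems_allowed line found


# The two outputs quoted in the thread (CONFIG_NODES_SHIFT=6 in both cases).
SAMPLE_RC5 = "Mems_allowed:\t00000000,00000001\n"
SAMPLE_RC5_MM1 = "Mems_allowed:\t1\n"

if __name__ == "__main__":
    print(mask_width_bits(SAMPLE_RC5))      # 64
    print(mask_width_bits(SAMPLE_RC5_MM1))  # 4
    try:
        with open("/proc/self/status") as f:
            print(mask_width_bits(f.read()))
    except OSError:
        pass  # not on Linux, or /proc is unavailable
```

The same kernel configuration yields 64 with one output format and 4 with
the other, so a buffer sized this way follows the formatting of the file
rather than the kernel's node limit.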