From: "Christoph Lameter (Ampere)" <cl@gentwo.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de, axboe@kernel.dk,
linux-kernel@vger.kernel.org, mingo@redhat.com,
dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
Andrew Morton <akpm@linux-foundation.org>,
urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
Arnd Bergmann <arnd@arndb.de>,
linux-api@vger.kernel.org, linux-mm@kvack.org,
linux-arch@vger.kernel.org, malteskarupke@web.de
Subject: Re: [PATCH v1 11/14] futex: Implement FUTEX2_NUMA
Date: Fri, 25 Oct 2024 12:36:28 -0700 (PDT)
Message-ID: <887eadb6-6142-3edf-0a25-d33b2219b90d@gentwo.org>
In-Reply-To: <20241025085815.GG14555@noisy.programming.kicks-ass.net>
Sorry, I saw this after the other email.
On Fri, 25 Oct 2024, Peter Zijlstra wrote:
> > Could we follow NUMA policies like with other metadata allocations during
> > system call processing?
>
> I had a quick look at this, and since the mempolicy stuff is per vma,
> and we don't have the vma, this is going to be terribly expensive --
> mmap_lock and all that.
There is a memory policy for the task as a whole, in current->mempolicy,
that is used for slab allocations and other allocations that are not
vma bound. Use that.
> Using memory policies is probably okay -- but still risky, since you get
> the extra failure case where if you change the mempolicy between WAIT
> and WAKE things will not match and sadness happens, but that *SHOULD*
> hopefully not happen a lot. Mempolicies are typically fairly static.
Right.
> > That way the placement of the futex can be controlled by the task's memory
> > policy. We could skip the FUTEX2_NUMA option.
>
> That doesn't work. If we don't have storage for the node across
> WAIT/WAKE, then the node must be deterministic per futex_hash().
> Otherwise wake has no chance of finding the entry.
You can get a node number following the current task mempolicy by calling
mempolicy_slab_node() and keep using that node for the future.
It is also possible to check if the policy is interleave and then follow
the distributed hash scheme.
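A rough userspace sketch of that idea, not kernel code and not taken from
the patch. Everything here is an assumption made for illustration: struct
task_policy stands in for current->mempolicy, policy_node() stands in for
mempolicy_slab_node(), and struct futex_slot is hypothetical per-futex
storage that remembers the node once it has been chosen.

```c
#include <assert.h>
#include <stdint.h>

#define NODE_UNSET (-1)

/* Hypothetical, simplified stand-in for the task-wide memory policy. */
enum policy_mode { POLICY_DEFAULT, POLICY_PREFERRED, POLICY_INTERLEAVE };

struct task_policy {
	enum policy_mode mode;
	int preferred_node;	/* only meaningful for POLICY_PREFERRED */
};

/* Hypothetical per-futex storage that remembers the chosen node. */
struct futex_slot {
	int node;		/* NODE_UNSET until the first lookup */
};

/* Stand-in for mempolicy_slab_node(): preferred node, else the local one. */
static int policy_node(const struct task_policy *pol, int local_node)
{
	return pol->mode == POLICY_PREFERRED ? pol->preferred_node : local_node;
}

/*
 * Pick the node whose hash table holds this futex.
 * Interleave: spread by hash, which is deterministic for a given key.
 * Otherwise: ask the policy once and keep using that node afterwards.
 */
static int futex_pick_node(struct futex_slot *slot,
			   const struct task_policy *pol,
			   int local_node, uint32_t hash, int nr_nodes)
{
	assert(nr_nodes > 0);

	if (pol->mode == POLICY_INTERLEAVE)
		return (int)(hash % (uint32_t)nr_nodes);

	if (slot->node == NODE_UNSET)
		slot->node = policy_node(pol, local_node);

	return slot->node;
}
```

With the node remembered in the slot, a later policy change does not move
the futex: a second call with a different preferred node still returns the
node picked on the first call. Where such storage would live is exactly
the open question in the quoted text above, so treat the slot as a
placeholder.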
> The current scheme where we determine node based on hash bits is fully
> deterministic and WAIT/WAKE will agree on which node-hash to use. The
> interleave is no worse than the global hash today -- OTOH it also isn't
> better.
This is unexpected and strange behavior for those familiar with NUMA. We have
tools to set memory policies for tasks, and those policies should be used
throughout.
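For reference, the hash-based scheme described in the last quoted
paragraph can be sketched roughly as below. This is only an illustration:
the function name, the struct, and the way the hash is split between node
and bucket are assumptions, and the actual bit selection in the patch may
differ. The point it shows is that the node depends on nothing but the
futex key hash, so WAIT and WAKE always agree, and that no task policy
enters into the choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which node's table, and which bucket in it. */
struct futex_bucket_ref {
	int node;
	uint32_t bucket;
};

/*
 * Derive both node and bucket from the key hash alone. Any two callers
 * that hash the same futex key land on the same node and bucket,
 * regardless of which CPU or memory policy they run under.
 */
static struct futex_bucket_ref futex_locate(uint32_t key_hash, int nr_nodes,
					    uint32_t buckets_per_node)
{
	struct futex_bucket_ref ref;

	assert(nr_nodes > 0 && buckets_per_node > 0);

	ref.node = (int)(key_hash % (uint32_t)nr_nodes);
	ref.bucket = (key_hash / (uint32_t)nr_nodes) % buckets_per_node;
	return ref;
}
```

Since nothing but the hash is consulted, the placement is effectively an
interleave over all nodes, which is the behavior being objected to here.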