Date: Fri, 25 Oct 2024 10:58:15 +0200
From: Peter Zijlstra
To: "Christoph Lameter (Ampere)"
Cc: tglx@linutronix.de, axboe@kernel.dk, linux-kernel@vger.kernel.org,
	mingo@redhat.com, dvhart@infradead.org, dave@stgolabs.net,
	andrealmeid@igalia.com, Andrew Morton, urezki@gmail.com,
	hch@infradead.org, lstoakes@gmail.com, Arnd Bergmann,
	linux-api@vger.kernel.org, linux-mm@kvack.org,
	linux-arch@vger.kernel.org, malteskarupke@web.de
Subject: Re: [PATCH v1 11/14] futex: Implement FUTEX2_NUMA
Message-ID: <20241025085815.GG14555@noisy.programming.kicks-ass.net>
References: <20230721102237.268073801@infradead.org>
	<20230721105744.434742902@infradead.org>
	<9dc04e4c-2adc-5084-4ea1-b200d82be29f@linux.com>
In-Reply-To: <9dc04e4c-2adc-5084-4ea1-b200d82be29f@linux.com>

On Wed, Jun 12, 2024 at 10:23:00AM -0700, Christoph Lameter (Ampere) wrote:
> > When FUTEX2_NUMA is not set, the node is simply an extension of the
> > hash, such that traditional futexes are still interleaved over the
> > nodes.
>
> Could we follow NUMA policies like with other metadata allocations during
> system call processing?

I had a quick look at this, and since the mempolicy stuff is per vma, and
we don't have the vma, this is going to be terribly expensive -- mmap_lock
and all that.

Once lockless vma lookups land (soonish, perhaps), this could be
reconsidered. But for now there just isn't a sane way to do this.

Using memory policies is probably okay -- but still risky, since you get
the extra failure case where if you change the mempolicy between WAIT and
WAKE things will not match and sadness happens, but that *SHOULD*
hopefully not happen a lot. Mempolicies are typically fairly static.

> If there is no NUMA task policy then the futex
> should be placed on the local NUMA node.
> That way the placement of the futex can be controlled by the task's memory
> policy. We could skip the FUTEX2_NUMA option.

That doesn't work. If we don't have storage for the node across
WAIT/WAKE, then the node must be deterministic per futex_hash().
Otherwise wake has no chance of finding the entry.

Consider our random unbound task with no policies etc. (default state)
doing FUTEX_WAIT and going to sleep while on node-0; its sibling thread,
which happens to run on node-1, issues FUTEX_WAKE.

If they disagree on determining 'node', then they will not find a match,
the wakeup doesn't happen, and userspace gets really sad.

The current scheme where we determine node based on hash bits is fully
deterministic and WAIT/WAKE will agree on which node-hash to use.
The interleave is no worse than the global hash today -- OTOH it also isn't better.
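
To make the WAIT/WAKE argument concrete, here is a minimal userspace
sketch. It is not the kernel's code: toy_hash(), struct toy_table,
pick_node() and the two policies are hypothetical names invented for
illustration, and the real hash function and table layout differ. It only
shows that a node derived from hash bits is the same for waiter and waker,
while a node derived from the caller's current location is not.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_NODES   2u	/* hypothetical: two NUMA nodes */
#define NR_BUCKETS 8u	/* hypothetical: buckets per node-hash */

/* Toy model: one small table of waiter counts per node. */
struct toy_table {
	unsigned int waiters[NR_NODES][NR_BUCKETS];
};

enum node_policy {
	NODE_FROM_HASH,		/* node is an extension of the hash */
	NODE_FROM_CALLER,	/* node is wherever the caller runs */
};

static void toy_init(struct toy_table *t)
{
	memset(t, 0, sizeof(*t));
}

/* Stand-in multiplicative hash of the futex address (assumption). */
static uint32_t toy_hash(uint64_t uaddr)
{
	return (uint32_t)((uaddr * 0x9E3779B97F4A7C15ull) >> 32);
}

static unsigned int pick_node(enum node_policy policy, uint32_t hash,
			      unsigned int caller_node)
{
	assert(caller_node < NR_NODES);
	if (policy == NODE_FROM_HASH)
		return hash % NR_NODES;	/* same answer on every CPU */
	return caller_node;		/* depends on where we run */
}

/* Queue one waiter for uaddr, as seen from caller_node. */
static void toy_wait(struct toy_table *t, enum node_policy policy,
		     uint64_t uaddr, unsigned int caller_node)
{
	uint32_t h = toy_hash(uaddr);
	unsigned int node = pick_node(policy, h, caller_node);

	t->waiters[node][(h / NR_NODES) % NR_BUCKETS]++;
}

/* Wake everything queued for uaddr; returns the number woken. */
static unsigned int toy_wake(struct toy_table *t, enum node_policy policy,
			     uint64_t uaddr, unsigned int caller_node)
{
	uint32_t h = toy_hash(uaddr);
	unsigned int node = pick_node(policy, h, caller_node);
	unsigned int *slot = &t->waiters[node][(h / NR_NODES) % NR_BUCKETS];
	unsigned int woken = *slot;

	*slot = 0;
	return woken;
}
```

With NODE_FROM_HASH, a waiter queued from node-0 is found by a waker on
node-1. With NODE_FROM_CALLER the same pair looks in different per-node
tables and the wake returns 0, which is the lost wakeup described above.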