Message-ID:
 <4e9d1f6d-9cd8-493c-9440-b46a99f1c8af@suse.cz>
Date: Tue, 30 Jul 2024 17:49:51 +0200
Subject: Re: [RFC PATCH 12/20] kthread: Implement preferred affinity
To: Frederic Weisbecker, LKML
Cc: Andrew Morton, Kees Cook, Peter Zijlstra, Thomas Gleixner,
 Michal Hocko, linux-mm@kvack.org, "Paul E. McKenney", Neeraj Upadhyay,
 Joel Fernandes, Boqun Feng, Zqiang, rcu@vger.kernel.org
References: <20240726215701.19459-1-frederic@kernel.org>
 <20240726215701.19459-13-frederic@kernel.org>
From: Vlastimil Babka
In-Reply-To: <20240726215701.19459-13-frederic@kernel.org>

On 7/26/24 11:56 PM, Frederic Weisbecker wrote:
> Affining kthreads follow either of three existing different patterns:
>
> 1) Per-CPU kthreads must stay affine to a single CPU and never execute
>    relevant code on any other CPU. This is currently handled by smpboot
>    code which takes care of CPU-hotplug operations.
>
> 2) Kthreads that _have_ to be affine to a specific set of CPUs and can't
>    run anywhere else. The affinity is set through kthread_bind_mask()
>    and the subsystem takes care by itself to handle CPU-hotplug operations.
>
> 3) Kthreads that have a _preferred_ affinity but that can run anywhere
>    without breaking correctness. Userspace can overwrite the affinity.
>    It is set manually like any other task and CPU-hotplug is supposed
>    to be handled by the relevant subsystem so that the task is properly
>    reaffined whenever a given CPU from the preferred affinity comes up
>    or down. Also care must be taken so that the preferred affinity
>    doesn't cross housekeeping cpumask boundaries.
>
> Currently the preferred affinity pattern has at least 4 identified
> users, with more or less success when it comes to handle CPU-hotplug
> operations and housekeeping cpumask.
>
> Provide an infrastructure to handle this usecase pattern.
> A new kthread_affine_preferred() API is introduced, to be used just like
> kthread_bind_mask(), right after kthread creation and before the first
> wake up. The kthread is then affine right away to the cpumask passed
> through the API if it has online housekeeping CPUs. Otherwise it will
> be affine to all online housekeeping CPUs as a last resort.
>
> It is aware of CPU hotplug events such that:
>
> * When a housekeeping CPU goes up and is part of the preferred affinity
>   of a given kthread, it is added to its applied affinity set (and
>   possibly the default last resort online housekeeping set is removed
>   from the set).
>
> * When a housekeeping CPU goes down while it was part of the preferred
>   affinity of a kthread, it is removed from the kthread's applied
>   affinity. The last resort is to affine the kthread to all online
>   housekeeping CPUs.
>
> Signed-off-by: Frederic Weisbecker

Acked-by: Vlastimil Babka

Nit:

> +int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask)
> +{
> +	struct kthread *kthread = to_kthread(p);
> +	cpumask_var_t affinity;
> +	unsigned long flags;
> +	int ret;
> +
> +	if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) {
> +		WARN_ON(1);
> +		return -EINVAL;
> +	}
> +

Should we also fail if kthread->preferred_affinity already exists, in
case somebody calls this twice?

Also, for some of the use cases (kswapd, kcompactd) it would make sense
to be able to add the cpus of a node as they are onlined. It seems we
don't do that currently, except for some corner case handling in
kcompactd, but maybe we should? I wonder whether the current
implementation of onlining a completely new node with cpus does the
right thing as a result of the individual onlining operations, or
whether we end up affined to a single cpu (or none).

But that would need some kind of kthread_affine_preferred_update()
implementation?
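To make the two questions above concrete, here is a minimal userspace
sketch, not kernel code. It models cpumasks as 64-bit bitmaps and follows
only the changelog's description of the applied affinity (preferred CPUs
that are online housekeeping CPUs, otherwise all online housekeeping
CPUs). The quoted hunk does not show kthread_fetch_affinity(), so the
real helper may differ. Every model_* name, the cpumask_t typedef, and
the choice of -EEXIST for a repeated call are assumptions made for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: one bit per CPU. All names are hypothetical. */
typedef uint64_t cpumask_t;

struct model_kthread {
	bool has_preferred;	/* stands in for kthread->preferred_affinity != NULL */
	cpumask_t preferred;	/* mask requested by the subsystem */
	cpumask_t applied;	/* mask the thread is currently allowed to run on */
};

/*
 * Applied mask as described in the changelog: the preferred CPUs that are
 * online housekeeping CPUs, else all online housekeeping CPUs.
 */
cpumask_t model_fetch_affinity(const struct model_kthread *k,
			       cpumask_t online, cpumask_t housekeeping)
{
	cpumask_t usable = online & housekeeping;
	cpumask_t wanted = k->preferred & usable;

	return wanted ? wanted : usable;
}

/* First-time setup; a second call is rejected (assumed error code). */
int model_affine_preferred(struct model_kthread *k, cpumask_t mask,
			   cpumask_t online, cpumask_t housekeeping)
{
	if (k->has_preferred)
		return -EEXIST;

	k->has_preferred = true;
	k->preferred = mask;
	k->applied = model_fetch_affinity(k, online, housekeeping);
	return 0;
}

/* Hypothetical update path: replace the preferred mask after setup. */
int model_affine_preferred_update(struct model_kthread *k, cpumask_t mask,
				  cpumask_t online, cpumask_t housekeeping)
{
	if (!k->has_preferred)
		return -EINVAL;

	k->preferred = mask;
	k->applied = model_fetch_affinity(k, online, housekeeping);
	return 0;
}

/* Hotplug model: recompute from the stored preferred mask on each event. */
void model_hotplug_update(struct model_kthread *k,
			  cpumask_t online, cpumask_t housekeeping)
{
	k->applied = model_fetch_affinity(k, online, housekeeping);
}
```

In this model, onlining a node's CPUs one at a time converges on the full
preferred mask, because every hotplug event recomputes from the stored
preferred mask rather than from the previously applied one. That only
holds if the node's CPUs were already part of the preferred mask at setup
time. If they were not known then, something like the update helper would
be needed.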
> +	if (!zalloc_cpumask_var(&affinity, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	kthread->preferred_affinity = kzalloc(sizeof(struct cpumask), GFP_KERNEL);
> +	if (!kthread->preferred_affinity) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	mutex_lock(&kthreads_hotplug_lock);
> +	cpumask_copy(kthread->preferred_affinity, mask);
> +	list_add_tail(&kthread->hotplug_node, &kthreads_hotplug);
> +	kthread_fetch_affinity(kthread, affinity);
> +
> +	/* It's safe because the task is inactive. */
> +	raw_spin_lock_irqsave(&p->pi_lock, flags);
> +	do_set_cpus_allowed(p, mask);
> +	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
> +
> +	mutex_unlock(&kthreads_hotplug_lock);
> +out:
> +	free_cpumask_var(affinity);
> +
> +	return 0;
> +}
> +
> +static int kthreads_hotplug_update(void)
> +{
> +	cpumask_var_t affinity;
> +	struct kthread *k;
> +	int err = 0;
> +
> +	if (list_empty(&kthreads_hotplug))
> +		return 0;
> +
> +	if (!zalloc_cpumask_var(&affinity, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	list_for_each_entry(k, &kthreads_hotplug, hotplug_node) {
> +		if (WARN_ON_ONCE(!k->preferred_affinity)) {
> +			err = -EINVAL;
> +			break;
> +		}
> +		kthread_fetch_affinity(k, affinity);
> +		set_cpus_allowed_ptr(k->task, affinity);
> +	}
> +
> +	free_cpumask_var(affinity);
> +
> +	return err;
> +}
> +
> +static int kthreads_offline_cpu(unsigned int cpu)
> +{
> +	int ret = 0;
> +
> +	mutex_lock(&kthreads_hotplug_lock);
> +	cpumask_clear_cpu(cpu, &kthread_online_mask);
> +	ret = kthreads_hotplug_update();
> +	mutex_unlock(&kthreads_hotplug_lock);
> +
> +	return ret;
> +}
> +
> +static int kthreads_online_cpu(unsigned int cpu)
> +{
> +	int ret = 0;
> +
> +	mutex_lock(&kthreads_hotplug_lock);
> +	cpumask_set_cpu(cpu, &kthread_online_mask);
> +	ret = kthreads_hotplug_update();
> +	mutex_unlock(&kthreads_hotplug_lock);
> +
> +	return ret;
> +}
> +
> +static int kthreads_init(void)
> +{
> +	return cpuhp_setup_state(CPUHP_AP_KTHREADS_ONLINE, "kthreads:online",
> +				kthreads_online_cpu, kthreads_offline_cpu);
> +}
> +early_initcall(kthreads_init);
> +
>  void __kthread_init_worker(struct kthread_worker *worker,
>  				const char *name,
>  				struct lock_class_key *key)