From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: "Frederic Weisbecker" <frederic@kernel.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Gabriele Monaco" <gmonaco@redhat.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Lai Jiangshan" <jiangshanlai@gmail.com>,
	"Marco Crivellari" <marco.crivellari@suse.com>,
	"Michal Hocko" <mhocko@suse.com>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Phil Auld" <pauld@redhat.com>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Simon Horman" <horms@kernel.org>, "Tejun Heo" <tj@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Waiman Long" <longman@redhat.com>,
	"Will Deacon" <will@kernel.org>,
	cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-block@vger.kernel.org, linux-mm@kvack.org,
	linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 32/33] genirq: Correctly handle preferred kthreads affinity
Date: Mon, 13 Oct 2025 22:31:45 +0200
Message-ID: <20251013203146.10162-33-frederic@kernel.org>
In-Reply-To: <20251013203146.10162-1-frederic@kernel.org>

[CHECKME: Do some IRQ threads have strong affinity requirements? In
which case they should use kthread_bind()...]

IRQ threads set their affinity through a direct call to the scheduler
(set_cpus_allowed_ptr()). As a result, this affinity may not be
preserved across CPU hotplug events or cpuset isolated partition
updates, and it may not honour housekeeping constraints.

For example, simply creating a cpuset isolated partition overwrites
the affinity of every IRQ thread with the set of non-isolated CPUs.

To prevent this, use the appropriate kthread affinity APIs, which take
care of the preferred affinity during these kinds of events.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/irq/manage.c | 47 +++++++++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 19 deletions(-)
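
The difference between a direct affinity call and a preferred affinity
can be illustrated with a toy userspace model. This is only a sketch
and not kernel code: the names below (cpumask_model_t, kthread_model,
model_*) are hypothetical, and the policy is an assumption, namely that
the effective mask is the preferred mask restricted to the housekeeping
CPUs, falling back to all housekeeping CPUs when the two do not
overlap. The point is that the preference is remembered, so it can be
reapplied whenever the housekeeping set changes.

```c
#include <stdint.h>

/* Toy model, not kernel code: one bit per CPU, up to 64 CPUs. */
typedef uint64_t cpumask_model_t;

struct kthread_model {
	cpumask_model_t preferred;	/* affinity requested by the thread's owner */
	cpumask_model_t effective;	/* affinity currently applied */
};

/*
 * Assumed policy: preferred AND housekeeping, or every housekeeping
 * CPU if the intersection is empty.
 */
static cpumask_model_t model_resolve(cpumask_model_t preferred,
				     cpumask_model_t housekeeping)
{
	cpumask_model_t mask = preferred & housekeeping;

	return mask ? mask : housekeeping;
}

/* Record the preference and apply it against the current housekeeping set. */
static void model_affine_preferred(struct kthread_model *t,
				   cpumask_model_t preferred,
				   cpumask_model_t housekeeping)
{
	t->preferred = preferred;
	t->effective = model_resolve(preferred, housekeeping);
}

/* Housekeeping set changed: recompute from the remembered preference. */
static void model_housekeeping_update(struct kthread_model *t,
				      cpumask_model_t housekeeping)
{
	t->effective = model_resolve(t->preferred, housekeeping);
}
```

In this model, a thread preferring CPUs 2-3 (0x0C) keeps running there
while both are housekeeping CPUs, shrinks to CPU 2 if CPU 3 becomes
isolated, and returns to CPUs 2-3 once the isolation is lifted. A
one-shot affinity call has no remembered preference to return to.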

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index c94837382037..d96f6675c888 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -176,15 +176,15 @@ bool irq_can_set_affinity_usr(unsigned int irq)
 }
 
 /**
- * irq_set_thread_affinity - Notify irq threads to adjust affinity
+ * irq_thread_notify_affinity - Notify irq threads to adjust affinity
  * @desc:	irq descriptor which has affinity changed
  *
  * Just set IRQTF_AFFINITY and delegate the affinity setting to the
- * interrupt thread itself. We can not call set_cpus_allowed_ptr() here as
- * we hold desc->lock and this code can be called from hard interrupt
+ * interrupt thread itself. We can not call kthread_affine_preferred_update()
+ * here as we hold desc->lock and this code can be called from hard interrupt
  * context.
  */
-static void irq_set_thread_affinity(struct irq_desc *desc)
+static void irq_thread_notify_affinity(struct irq_desc *desc)
 {
 	struct irqaction *action;
 
@@ -283,7 +283,7 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 		fallthrough;
 	case IRQ_SET_MASK_OK_NOCOPY:
 		irq_validate_effective_affinity(data);
-		irq_set_thread_affinity(desc);
+		irq_thread_notify_affinity(desc);
 		ret = 0;
 	}
 
@@ -1032,11 +1032,26 @@ static void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *a
 	}
 
 	if (valid)
-		set_cpus_allowed_ptr(current, mask);
+		kthread_affine_preferred_update(current, mask);
 	free_cpumask_var(mask);
 }
+
+static inline void irq_thread_set_affinity(struct task_struct *t,
+					   struct irq_desc *desc)
+{
+	const struct cpumask *mask;
+
+	if (cpumask_available(desc->irq_common_data.affinity))
+		mask = irq_data_get_effective_affinity_mask(&desc->irq_data);
+	else
+		mask = cpu_possible_mask;
+
+	kthread_affine_preferred(t, mask);
+}
 #else
 static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { }
+static inline void irq_thread_set_affinity(struct task_struct *t,
+					   struct irq_desc *desc) { }
 #endif
 
 static int irq_wait_for_interrupt(struct irq_desc *desc,
@@ -1384,7 +1399,8 @@ static void irq_nmi_teardown(struct irq_desc *desc)
 }
 
 static int
-setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
+setup_irq_thread(struct irqaction *new, struct irq_desc *desc,
+		 unsigned int irq, bool secondary)
 {
 	struct task_struct *t;
 
@@ -1405,16 +1421,9 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
 	 * references an already freed task_struct.
 	 */
 	new->thread = get_task_struct(t);
-	/*
-	 * Tell the thread to set its affinity. This is
-	 * important for shared interrupt handlers as we do
-	 * not invoke setup_affinity() for the secondary
-	 * handlers as everything is already set up. Even for
-	 * interrupts marked with IRQF_NO_BALANCE this is
-	 * correct as we want the thread to move to the cpu(s)
-	 * on which the requesting code placed the interrupt.
-	 */
-	set_bit(IRQTF_AFFINITY, &new->thread_flags);
+
+	irq_thread_set_affinity(t, desc);
+
 	return 0;
 }
 
@@ -1486,11 +1495,11 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 	 * thread.
 	 */
 	if (new->thread_fn && !nested) {
-		ret = setup_irq_thread(new, irq, false);
+		ret = setup_irq_thread(new, desc, irq, false);
 		if (ret)
 			goto out_mput;
 		if (new->secondary) {
-			ret = setup_irq_thread(new->secondary, irq, true);
+			ret = setup_irq_thread(new->secondary, desc, irq, true);
 			if (ret)
 				goto out_thread;
 		}
-- 
2.51.0


