From: Thomas Gleixner <tglx@kernel.org>
To: "Qiliang Yuan" <realwujing@gmail.com>,
"Ingo Molnar" <mingo@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Juri Lelli" <juri.lelli@redhat.com>,
"Vincent Guittot" <vincent.guittot@linaro.org>,
"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ben Segall" <bsegall@google.com>, "Mel Gorman" <mgorman@suse.de>,
"Valentin Schneider" <vschneid@redhat.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Frederic Weisbecker" <frederic@kernel.org>,
"Neeraj Upadhyay" <neeraj.upadhyay@kernel.org>,
"Joel Fernandes" <joelagnelf@nvidia.com>,
"Josh Triplett" <josh@joshtriplett.org>,
"Boqun Feng" <boqun@kernel.org>,
"Uladzislau Rezki" <urezki@gmail.com>,
"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
"Lai Jiangshan" <jiangshanlai@gmail.com>,
Zqiang <qiang.zhang@linux.dev>,
"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
"Ingo Molnar" <mingo@kernel.org>, "Tejun Heo" <tj@kernel.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Vlastimil Babka" <vbabka@kernel.org>,
"Suren Baghdasaryan" <surenb@google.com>,
"Michal Hocko" <mhocko@suse.com>,
"Brendan Jackman" <jackmanb@google.com>,
"Johannes Weiner" <hannes@cmpxchg.org>, "Zi Yan" <ziy@nvidia.com>,
"Waiman Long" <longman@redhat.com>,
"Chen Ridong" <chenridong@huaweicloud.com>,
"Michal Koutný" <mkoutny@suse.com>,
"Jonathan Corbet" <corbet@lwn.net>,
"Shuah Khan" <skhan@linuxfoundation.org>,
"Shuah Khan" <shuah@kernel.org>
Cc: linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
Qiliang Yuan <realwujing@gmail.com>
Subject: Re: [PATCH v2 04/12] tick/nohz: Transition to dynamic full dynticks state management
Date: Tue, 14 Apr 2026 23:57:52 +0200
Message-ID: <87tstddx27.ffs@tglx>
In-Reply-To: <20260413-wujing-dhm-v2-4-06df21caba5d@gmail.com>
On Mon, Apr 13 2026 at 15:43, Qiliang Yuan wrote:
> Context:
> Full dynticks (NOHZ_FULL) is typically a static configuration determined
> at boot time. DHEI extends this to support runtime activation.
I have no idea what DHEI is. Provide proper information and not magic
acronyms.
> Problem:
> Switching to NOHZ_FULL at runtime requires careful synchronization
> of context tracking and housekeeping states. Re-invoking setup logic
> multiple times could lead to inconsistencies or warnings, and RCU
> dependency checks often prevented tick suppression in Zero-Conf setups.
And that careful synchronization is best achieved with an opaque
notifier callchain which relies on build time ordering. Impressive.
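To illustrate the ordering concern, here is a minimal user-space sketch
of a priority-sorted callback chain. It assumes the insertion rule used
by the kernel's notifier chains (higher priority first, equal priorities
in registration order). struct nb, chain_register() and chain_order()
are hypothetical names for this model, not kernel APIs.

```c
#include <stddef.h>

/* Hypothetical model of a notifier block: only what ordering needs. */
struct nb {
	const char *name;
	int priority;
	struct nb *next;
};

/*
 * Insert sorted by descending priority. A new entry goes behind all
 * existing entries of the same priority, so for equal priorities the
 * call order is simply the registration order.
 */
static void chain_register(struct nb **head, struct nb *n)
{
	while (*head) {
		if (n->priority > (*head)->priority)
			break;
		head = &(*head)->next;
	}
	n->next = *head;
	*head = n;
}

/* Store the names in call order into out[] and return how many. */
static size_t chain_order(const struct nb *head, const char **out, size_t max)
{
	size_t n = 0;

	for (; head && n < max; head = head->next)
		out[n++] = head->name;
	return n;
}
```

If two subsystems register with the same priority, nothing in the chain
itself documents which one runs first. The order then follows from
whatever decides the registration order, which is presumably what
"build time ordering" refers to.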
> Solution:
> - Replace the static tick_nohz_full_enabled() checks with a dynamic
> tick_nohz_full_running state variable.
That variable existed before, and you are describing the what but not
why this is required or how it is correct vs. the other checks in
tick_nohz_full_enabled(). Also, what's static about that function aside
from being marked static inline?
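A minimal user-space model of that difference, assuming the existing
helper combines a context tracking check with tick_nohz_full_running
(worth verifying against include/linux/tick.h). The booleans and the
function names nohz_full_enabled() and nohz_full_running_only() are
stand-ins for illustration only.

```c
#include <stdbool.h>

static bool context_tracking_on;	/* stand-in for context_tracking_enabled() */
static bool tick_nohz_full_running;	/* models the kernel variable of that name */

/* Assumed shape of the existing helper: two conditions, not one. */
static bool nohz_full_enabled(void)
{
	if (!context_tracking_on)
		return false;
	return tick_nohz_full_running;
}

/* What a bare check of the variable evaluates instead. */
static bool nohz_full_running_only(void)
{
	return tick_nohz_full_running;
}
```

The two only agree as long as context tracking is guaranteed to be
enabled whenever the variable is set, which is the kind of argument the
changelog would have to make.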
> - Refactor tick_nohz_full_setup to be safe for runtime invocation,
> adding guards against re-initialization and ensuring IRQ work
> interrupt support.
Refactoring has to be done in a preparatory patch and not mixed into a
patch which changes functionality.
> - Implement boot-time pre-activation of context tracking (shadow
> init) for all possible CPUs to avoid instruction flow issues during
> dynamic transitions.
Again lots of hand waving without a proper explanation.
> - Hook into housekeeping_notifier_list to update NO_HZ states dynamically.
See above.
> This provides the core state machine for reliable, on-demand tick
> suppression and high-performance isolation.
I can find a lot of hacks, but definitely not the slightest notion of a
state machine. Don't throw random buzzwords into a changelog if there is
no evidence for their existence.
> +static int tick_nohz_housekeeping_reconfigure(struct notifier_block *nb,
> +					      unsigned long action, void *data)
> +{
> +	struct housekeeping_update *upd = data;
> +	int cpu;
> +
> +	if (action == HK_UPDATE_MASK && upd->type == HK_TYPE_TICK) {
> +		cpumask_var_t non_housekeeping_mask;
> +
> +		if (!alloc_cpumask_var(&non_housekeeping_mask, GFP_KERNEL))
> +			return NOTIFY_BAD;
> +
> +		cpumask_andnot(non_housekeeping_mask, cpu_possible_mask, upd->new_mask);
> +
> +		if (!tick_nohz_full_mask) {
> +			if (!zalloc_cpumask_var(&tick_nohz_full_mask, GFP_KERNEL)) {
> +				free_cpumask_var(non_housekeeping_mask);
> +				return NOTIFY_BAD;
> +			}
> +		}
> +
> +		/* Kick all CPUs to re-evaluate tick dependency before change */
> +		for_each_online_cpu(cpu)
> +			tick_nohz_full_kick_cpu(cpu);
That solves what?
> +		cpumask_copy(tick_nohz_full_mask, non_housekeeping_mask);
What's the exact point of this non_housekeeping_mask?
Why can't you simply do:
	cpumask_andnot(tick_nohz_full_mask, cpu_possible_mask, upd->new_mask);
That'd be too simple and comprehensible, right?
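As a minimal user-space sketch of why the temporary adds nothing: the
model below uses a 64-bit word as a stand-in for struct cpumask. mask_t,
mask_andnot() and the two nohz_mask_*() functions are hypothetical names
for illustration, not kernel APIs.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, max 64 CPUs. */
typedef uint64_t mask_t;

/* Models cpumask_andnot(dst, a, b): dst = a & ~b. */
static void mask_andnot(mask_t *dst, mask_t a, mask_t b)
{
	*dst = a & ~b;
}

/* Variant from the patch: compute into a temporary, then copy. */
static mask_t nohz_mask_via_temp(mask_t possible, mask_t housekeeping)
{
	mask_t tmp, full;

	mask_andnot(&tmp, possible, housekeeping);
	full = tmp;	/* models cpumask_copy() */
	return full;
}

/* Suggested variant: compute straight into the destination. */
static mask_t nohz_mask_direct(mask_t possible, mask_t housekeeping)
{
	mask_t full;

	mask_andnot(&full, possible, housekeeping);
	return full;
}
```

Both variants produce the same mask. In the quoted patch the direct form
would also drop one allocation and the NOTIFY_BAD path that goes with it.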
> +		tick_nohz_full_running = !cpumask_empty(tick_nohz_full_mask);
> +
> +		/*
> +		 * If nohz_full is running, the timer duty must be on a housekeeper.
> +		 * If the current timer CPU is not a housekeeper, or no duty is assigned,
> +		 * pick the first housekeeper and assign it.
> +		 */
> +		if (tick_nohz_full_running) {
> +			int timer_cpu = READ_ONCE(tick_do_timer_cpu);
New line between declaration and code.
> +			if (timer_cpu == TICK_DO_TIMER_NONE ||
> +			    !cpumask_test_cpu(timer_cpu, upd->new_mask)) {
No line break required. You have 100 characters.
> +				int next_timer = cpumask_first(upd->new_mask);
next_timer? Please pick variable names which are comprehensible and self
explaining. Also why can't you re-use timer_cpu, which would be actually useful?
> +				if (next_timer < nr_cpu_ids)
How can upd->new_mask be empty? That'd be a bug, no?
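A minimal user-space sketch of the selection with a single variable,
assuming at most 64 CPUs and TICK_DO_TIMER_NONE being -1. mask_t,
mask_first(), mask_test_cpu() and pick_timer_cpu() are hypothetical
stand-ins for illustration, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS			64	/* assumption: mask fits in one word */
#define TICK_DO_TIMER_NONE	(-1)	/* assumed value of the kernel constant */

typedef uint64_t mask_t;

/* Models cpumask_first(): lowest set bit, or NR_CPUS if the mask is empty. */
static int mask_first(mask_t m)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (m & ((mask_t)1 << cpu))
			return cpu;
	}
	return NR_CPUS;
}

/* Models cpumask_test_cpu() with a range check for the NONE value. */
static bool mask_test_cpu(int cpu, mask_t m)
{
	return cpu >= 0 && cpu < NR_CPUS && (m & ((mask_t)1 << cpu));
}

/*
 * Keep the current timekeeper if it is a housekeeping CPU, otherwise
 * move the duty to the first housekeeping CPU. A result of NR_CPUS
 * means the housekeeping mask was empty, which is a caller bug and
 * should be flagged rather than skipped silently.
 */
static int pick_timer_cpu(int timer_cpu, mask_t housekeeping)
{
	if (timer_cpu == TICK_DO_TIMER_NONE || !mask_test_cpu(timer_cpu, housekeeping))
		timer_cpu = mask_first(housekeeping);
	return timer_cpu;
}
```

Re-using timer_cpu removes the second variable and leaves exactly one
value for the caller to validate and store.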
> +					WRITE_ONCE(tick_do_timer_cpu, next_timer);
> +			}
> +		}
> +
> +		/* Kick all CPUs again to apply new nohz full state */
> +		for_each_online_cpu(cpu)
> +			tick_nohz_full_kick_cpu(cpu);
This whole thing lacks an explanation why it is even remotely correct.
> void __init tick_nohz_init(void)
...
> +	if (!tick_nohz_full_mask) {
> +		if (!slab_is_available())
> +			alloc_bootmem_cpumask_var(&tick_nohz_full_mask);
> +		else
> +			zalloc_cpumask_var(&tick_nohz_full_mask, GFP_KERNEL);
>  	}
I've seen the same code sequence before. Copy & paste is simpler than
providing helper functions.....
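A minimal user-space sketch of such a helper. The allocators are modeled
with calloc() and counters. slab_available, early_alloc_mask(),
slab_alloc_mask() and nohz_full_mask_alloc() are hypothetical names; the
real code would call alloc_bootmem_cpumask_var() and
zalloc_cpumask_var() as in the quoted hunk.

```c
#include <stdbool.h>
#include <stdlib.h>

static bool slab_available;		/* models slab_is_available() */
static int early_allocs, slab_allocs;	/* counters for the sketch only */
static unsigned long *nohz_full_mask;	/* models tick_nohz_full_mask */

/* Stand-in for the boot-time allocator. */
static unsigned long *early_alloc_mask(void)
{
	early_allocs++;
	return calloc(1, sizeof(unsigned long));
}

/* Stand-in for the slab-backed, zeroing allocator. */
static unsigned long *slab_alloc_mask(void)
{
	slab_allocs++;
	return calloc(1, sizeof(unsigned long));
}

/*
 * Allocate the mask exactly once, choosing the allocator by boot stage.
 * Returns false if the allocation failed, so every caller has one
 * place to check instead of a copied if/else sequence.
 */
static bool nohz_full_mask_alloc(void)
{
	if (nohz_full_mask)
		return true;

	nohz_full_mask = slab_available ? slab_alloc_mask() : early_alloc_mask();
	return nohz_full_mask != NULL;
}
```

A single helper also makes it harder to forget the failure check, which
the quoted zalloc_cpumask_var() call in tick_nohz_init() does not have.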
> - if (IS_ENABLED(CONFIG_PM_SLEEP_SMP) &&
> - !IS_ENABLED(CONFIG_PM_SLEEP_SMP_NONZERO_CPU)) {
> - cpu = smp_processor_id();
> + housekeeping_register_notifier(&tick_nohz_housekeeping_nb);
>
> - if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> - pr_warn("NO_HZ: Clearing %d from nohz_full range "
> - "for timekeeping\n", cpu);
> - cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> + if (tick_nohz_full_running) {
This indentation and the resulting goto mess can be completely avoided
if you actually refactor the code and not just claim to do so.
Again, this does too many things at once and then explains them badly,
which makes it unreviewable.
Thanks,
tglx