From: Marc Zyngier <maz@kernel.org>
To: Yu Zhao <yuzhao@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Muchun Song <muchun.song@linux.dev>,
Thomas Gleixner <tglx@linutronix.de>,
Will Deacon <will@kernel.org>,
Douglas Anderson <dianders@chromium.org>,
Mark Rutland <mark.rutland@arm.com>,
Nanyong Sun <sunnanyong@huawei.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 3/6] irqchip/gic-v3: support SGI broadcast
Date: Tue, 22 Oct 2024 16:03:30 +0100
Message-ID: <86a5ew41tp.wl-maz@kernel.org>
In-Reply-To: <20241021042218.746659-4-yuzhao@google.com>
On Mon, 21 Oct 2024 05:22:15 +0100,
Yu Zhao <yuzhao@google.com> wrote:
>
> GIC v3 and later support SGI broadcast, i.e., the mode that routes
> interrupts to all PEs in the system excluding the local CPU.
>
> Supporting this mode can avoid looping through all the remote CPUs
> when broadcasting SGIs, especially for systems with 200+ CPUs. The
> performance improvement can be measured with the rest of this series
> booted with "hugetlb_free_vmemmap=on irqchip.gicv3_pseudo_nmi=1":
>
> cd /sys/kernel/mm/hugepages/
> echo 600 >hugepages-1048576kB/nr_hugepages
> echo 2048kB >hugepages-1048576kB/demote_size
> perf record -g -- bash -c "echo 600 >hugepages-1048576kB/demote"
>
> gic_ipi_send_mask() bash sys time
> Before: 38.14% 0m10.513s
> After: 0.20% 0m5.132s
>
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> ---
> drivers/irqchip/irq-gic-v3.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index ce87205e3e82..42c39385e1b9 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1394,9 +1394,20 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
> gic_write_sgi1r(val);
> }
>
> +static void gic_broadcast_sgi(unsigned int irq)
> +{
> + u64 val;
> +
> + val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) | (irq << ICC_SGI1R_SGI_ID_SHIFT);

As picked up by the test bot, please fix the 32bit build.

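For illustration only, here is a userspace sketch of how the register value could be composed without depending on the width of unsigned long. It assumes the 32bit failure comes from BIT() expanding to an unsigned long shift with the routing mode bit at position 40; the thread does not spell that out. The helper name sgi1r_broadcast_val() is hypothetical, and the two bit positions are redefined locally to match what include/linux/irqchip/arm-gic-v3.h provides, to the best of my knowledge. In the driver itself the equivalent would presumably be BIT_ULL() plus a u64 cast on irq.

```c
#include <stdint.h>

/* Local copies of the field positions; assumed to match arm-gic-v3.h. */
#define ICC_SGI1R_SGI_ID_SHIFT		24
#define ICC_SGI1R_IRQ_ROUTING_MODE_BIT	40

/*
 * Hypothetical helper: build an ICC_SGI1R_EL1 value with the routing
 * mode bit set. Both operands are widened to 64 bits before shifting,
 * so the result is the same whether unsigned long is 32 or 64 bits.
 */
static uint64_t sgi1r_broadcast_val(unsigned int irq)
{
	return (UINT64_C(1) << ICC_SGI1R_IRQ_ROUTING_MODE_BIT) |
	       ((uint64_t)irq << ICC_SGI1R_SGI_ID_SHIFT);
}
```
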
> +
> + pr_devel("CPU %d: broadcasting SGI %u\n", smp_processor_id(), irq);
> + gic_write_sgi1r(val);
> +}
> +
> static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
> {
> int cpu;
> + cpumask_t broadcast;
>
> if (WARN_ON(d->hwirq >= 16))
> return;
> @@ -1407,6 +1418,13 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
> */
> dsb(ishst);
>
> + cpumask_copy(&broadcast, cpu_present_mask);

Why cpu_present_mask? I'd expect that cpu_online_mask should be the
correct mask to use -- we don't IPI offline CPUs, in general.

> + cpumask_clear_cpu(smp_processor_id(), &broadcast);
> + if (cpumask_equal(&broadcast, mask)) {
> + gic_broadcast_sgi(d->hwirq);
> + goto done;
> + }

So the (valid) case where you would IPI *everyone* is not handled as a
fast path? That seems like a missed opportunity.

This also seems like an expensive way to do it. How about something
like:

	int mcnt = cpumask_weight(mask);
	int ocnt = cpumask_weight(cpu_online_mask);

	if (mcnt == ocnt) {
		/* Broadcast to all CPUs including self */
	} else if (mcnt == (ocnt - 1) &&
		   !cpumask_test_cpu(smp_processor_id(), mask)) {
		/* Broadcast to all but self */
	}

which avoids the copy+update+full compare.
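
To make the counting argument concrete, a minimal userspace model of that check follows. It is a sketch, not driver code: a uint64_t stands in for a cpumask, and mask_weight(), pick_sgi_path() and the sgi_path enum are hypothetical names. It also relies on two assumptions that the snippet above leaves implicit: every CPU in the mask is online, and the sending CPU is itself online. Without those, equal weights would not imply equal sets.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: which delivery path the sender would take. */
enum sgi_path { SGI_PER_CPU, SGI_BCAST_ALL, SGI_BCAST_OTHERS };

/* Count set bits; stands in for cpumask_weight() in this model. */
static int mask_weight(uint64_t m)
{
	int n = 0;

	for (; m; m &= m - 1)
		n++;
	return n;
}

/*
 * Assumes mask is a subset of online and that self is set in online.
 * Under those assumptions, comparing weights is enough to tell whether
 * mask covers every online CPU, or every online CPU except the sender.
 */
static enum sgi_path pick_sgi_path(uint64_t mask, uint64_t online, int self)
{
	int mcnt = mask_weight(mask);
	int ocnt = mask_weight(online);
	bool self_in_mask = mask & (UINT64_C(1) << self);

	if (mcnt == ocnt)
		return SGI_BCAST_ALL;
	if (mcnt == ocnt - 1 && !self_in_mask)
		return SGI_BCAST_OTHERS;
	return SGI_PER_CPU;
}
```

With four online CPUs (online = 0xf) and CPU 0 sending, a mask of 0xe selects the all-but-self path, while 0x7 has the right weight but includes the sender, so it falls back to per-CPU delivery.
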
Thanks,
M.
--
Without deviation from the norm, progress is not possible.