linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Yu Zhao <yuzhao@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Muchun Song <muchun.song@linux.dev>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	Douglas Anderson <dianders@chromium.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Nanyong Sun <sunnanyong@huawei.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 3/6] irqchip/gic-v3: support SGI broadcast
Date: Tue, 22 Oct 2024 16:03:30 +0100	[thread overview]
Message-ID: <86a5ew41tp.wl-maz@kernel.org> (raw)
In-Reply-To: <20241021042218.746659-4-yuzhao@google.com>

On Mon, 21 Oct 2024 05:22:15 +0100,
Yu Zhao <yuzhao@google.com> wrote:
> 
> GIC v3 and later support SGI broadcast, i.e., the mode that routes
> interrupts to all PEs in the system excluding the local CPU.
> 
> Supporting this mode can avoid looping through all the remote CPUs
> when broadcasting SGIs, especially for systems with 200+ CPUs. The
> performance improvement can be measured with the rest of this series
> booted with "hugetlb_free_vmemmap=on irqchip.gicv3_pseudo_nmi=1":
> 
>   cd /sys/kernel/mm/hugepages/
>   echo 600 >hugepages-1048576kB/nr_hugepages
>   echo 2048kB >hugepages-1048576kB/demote_size
>   perf record -g -- bash -c "echo 600 >hugepages-1048576kB/demote"
> 
>          gic_ipi_send_mask()  bash sys time
> Before:  38.14%               0m10.513s
> After:    0.20%               0m5.132s
> 
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> ---
>  drivers/irqchip/irq-gic-v3.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index ce87205e3e82..42c39385e1b9 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1394,9 +1394,20 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
>  	gic_write_sgi1r(val);
>  }
>  
> +static void gic_broadcast_sgi(unsigned int irq)
> +{
> +	u64 val;
> +
> +	val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) | (irq << ICC_SGI1R_SGI_ID_SHIFT);

As picked up by the test bot, please fix the 32bit build.

> +
> +	pr_devel("CPU %d: broadcasting SGI %u\n", smp_processor_id(), irq);
> +	gic_write_sgi1r(val);
> +}
> +
>  static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>  {
>  	int cpu;
> +	cpumask_t broadcast;
>  
>  	if (WARN_ON(d->hwirq >= 16))
>  		return;
> @@ -1407,6 +1418,13 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>  	 */
>  	dsb(ishst);
>  
> +	cpumask_copy(&broadcast, cpu_present_mask);

Why cpu_present_mask? I'd expect that cpu_online_mask should be the
correct mask to use -- we don't IPI offline CPUs, in general.

> +	cpumask_clear_cpu(smp_processor_id(), &broadcast);
> +	if (cpumask_equal(&broadcast, mask)) {
> +		gic_broadcast_sgi(d->hwirq);
> +		goto done;
> +	}

So the (valid) case where you would IPI *everyone* is not handled as a
fast path? That seems a missed opportunity.

This also seem an like expensive way to do it. How about something
like:

	int mcnt = cpumask_weight(mask);
	int ocnt = cpumask_weight(cpu_online_mask);
	if (mcnt == ocnt)  {
		/* Broadcast to all CPUs including self */
	} else if (mcnt == (ocnt - 1) &&
		   !cpumask_test_cpu(smp_processor_id(), mask)) {
		/* Broadcast to all but self */
	}

which avoids the copy+update_full compare.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


  parent reply	other threads:[~2024-10-22 15:03 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-21  4:22 [PATCH v1 0/6] mm/arm64: re-enable HVO Yu Zhao
2024-10-21  4:22 ` [PATCH v1 1/6] mm/hugetlb_vmemmap: batch update PTEs Yu Zhao
2024-10-21  4:22 ` [PATCH v1 2/6] mm/hugetlb_vmemmap: add arch-independent helpers Yu Zhao
2024-10-21  4:22 ` [PATCH v1 3/6] irqchip/gic-v3: support SGI broadcast Yu Zhao
2024-10-22  0:24   ` kernel test robot
2024-10-22 15:03   ` Marc Zyngier [this message]
2024-10-25  5:07     ` Yu Zhao
2024-10-25 16:14       ` Marc Zyngier
2024-10-25 17:31         ` Yu Zhao
2024-10-29 19:02           ` Marc Zyngier
2024-10-29 19:53             ` Yu Zhao
2024-10-21  4:22 ` [PATCH v1 4/6] arm64: broadcast IPIs to pause remote CPUs Yu Zhao
2024-10-22 16:15   ` Marc Zyngier
2024-10-28 22:11     ` Yu Zhao
2024-10-29 19:36       ` Marc Zyngier
2024-10-31 18:10         ` Yu Zhao
2024-10-21  4:22 ` [PATCH v1 5/6] arm64: pause remote CPUs to update vmemmap Yu Zhao
2024-10-21  4:22 ` [PATCH v1 6/6] arm64: select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP Yu Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86a5ew41tp.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=dianders@chromium.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=muchun.song@linux.dev \
    --cc=sunnanyong@huawei.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox