From: Yu Zhao
Date: Thu, 24 Oct 2024 23:07:45 -0600
Subject: Re: [PATCH v1 3/6] irqchip/gic-v3: support SGI broadcast
To: Marc Zyngier
Cc: Andrew Morton, Catalin Marinas, Muchun Song, Thomas Gleixner,
    Will Deacon, Douglas Anderson, Mark Rutland, Nanyong Sun,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
References: <20241021042218.746659-1-yuzhao@google.com>
    <20241021042218.746659-4-yuzhao@google.com>
    <86a5ew41tp.wl-maz@kernel.org>
In-Reply-To: <86a5ew41tp.wl-maz@kernel.org>

Hi Marc,

On Tue, Oct 22, 2024 at 9:03 AM Marc Zyngier wrote:
>
> On Mon, 21 Oct 2024 05:22:15 +0100,
> Yu Zhao wrote:
> >
> > GIC v3 and later support SGI broadcast, i.e., the mode that routes
> > interrupts to all PEs in the system excluding the local CPU.
> >
> > Supporting this mode can avoid looping through all the remote CPUs
> > when broadcasting SGIs, especially for systems with 200+ CPUs.
> > The performance improvement can be measured with the rest of this
> > series booted with "hugetlb_free_vmemmap=on irqchip.gicv3_pseudo_nmi=1":
> >
> >   cd /sys/kernel/mm/hugepages/
> >   echo 600 >hugepages-1048576kB/nr_hugepages
> >   echo 2048kB >hugepages-1048576kB/demote_size
> >   perf record -g -- bash -c "echo 600 >hugepages-1048576kB/demote"
> >
> >          gic_ipi_send_mask()  bash sys time
> > Before:  38.14%               0m10.513s
> > After:    0.20%               0m5.132s
> >
> > Signed-off-by: Yu Zhao
> > ---
> >  drivers/irqchip/irq-gic-v3.c | 20 +++++++++++++++++++-
> >  1 file changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > index ce87205e3e82..42c39385e1b9 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -1394,9 +1394,20 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
> >       gic_write_sgi1r(val);
> >  }
> >
> > +static void gic_broadcast_sgi(unsigned int irq)
> > +{
> > +     u64 val;
> > +
> > +     val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) | (irq << ICC_SGI1R_SGI_ID_SHIFT);
>
> As picked up by the test bot, please fix the 32bit build.

Will do.

> > +
> > +     pr_devel("CPU %d: broadcasting SGI %u\n", smp_processor_id(), irq);
> > +     gic_write_sgi1r(val);
> > +}
> > +
> >  static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
> >  {
> >       int cpu;
> > +     cpumask_t broadcast;
> >
> >       if (WARN_ON(d->hwirq >= 16))
> >               return;
> > @@ -1407,6 +1418,13 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
> >        */
> >       dsb(ishst);
> >
> > +     cpumask_copy(&broadcast, cpu_present_mask);
>
> Why cpu_present_mask? I'd expect that cpu_online_mask should be the
> correct mask to use -- we don't IPI offline CPUs, in general.

This is exactly because "we don't IPI offline CPUs, in general",
assuming "we" means the kernel, not GIC.
My interpretation of what the GIC spec says ("0b1: Interrupts routed
to all PEs in the system, excluding self") is that it broadcasts IPIs
to "cpu_present_mask" (minus the local one). So if the kernel uses
"cpu_online_mask" here, GIC would send IPIs to offline CPUs
(cpu_present_mask ^ cpu_online_mask), and I don't know whether that
is defined behavior.

But if you actually meant GIC doesn't IPI offline CPUs, then yes, here
the kernel should use "cpu_online_mask".

> > +     cpumask_clear_cpu(smp_processor_id(), &broadcast);
> > +     if (cpumask_equal(&broadcast, mask)) {
> > +             gic_broadcast_sgi(d->hwirq);
> > +             goto done;
> > +     }
>
> So the (valid) case where you would IPI *everyone* is not handled as a
> fast path? That seems a missed opportunity.

You are right: it should handle that case.

> This also seems like an expensive way to do it. How about something
> like:
>
>         int mcnt = cpumask_weight(mask);
>         int ocnt = cpumask_weight(cpu_online_mask);
>         if (mcnt == ocnt) {
>                 /* Broadcast to all CPUs including self */

Does the comment mean the following two steps?
1. Broadcasting to everyone else.
2. Sending to self.

My understanding of the "Interrupt Routing Mode" is that it can't
broadcast to all CPUs including self, and therefore we need the above
two steps, which can still be a lot faster. Is my understanding
correct?

>         } else if (mcnt == (ocnt - 1) &&
>                    !cpumask_test_cpu(smp_processor_id(), mask)) {
>                 /* Broadcast to all but self */
>         }
>
> which avoids the copy+update_full compare.

Thank you.
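
To make the weight-based check discussed above concrete, here is a
rough userspace sketch, not driver code. The assumptions are mine:
CPU masks are modelled as a plain uint64_t (so at most 64 CPUs), the
target mask is assumed to be a subset of the online mask, and the
names pick_sgi_path(), mask_weight(), sgi1r_broadcast_val() and the
SGI_PATH_* values are hypothetical. The two bit positions are what I
believe include/linux/irqchip/arm-gic-v3.h defines and should be
checked against the header. The sketch also builds the register value
with 64-bit operands, which is one possible way to keep the shifts
well defined on a 32-bit build. It does not settle whether the
broadcast is safe when some present CPUs are offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; verify against arm-gic-v3.h. */
#define SGI1R_SGI_ID_SHIFT		24
#define SGI1R_IRQ_ROUTING_MODE_BIT	40

/* Hypothetical labels for the three ways the SGI could be sent. */
enum sgi_path {
	SGI_PATH_PER_CPU,		/* existing per-cluster loop */
	SGI_PATH_BROADCAST,		/* one broadcast write: all but self */
	SGI_PATH_BROADCAST_SELF,	/* broadcast, then a targeted SGI to self */
};

/*
 * Register value for a broadcast SGI. Both operands are widened to
 * 64 bits before shifting, so nothing depends on sizeof(unsigned long).
 */
static uint64_t sgi1r_broadcast_val(unsigned int sgi)
{
	assert(sgi < 16);	/* SGIs are interrupt IDs 0..15 */
	return (UINT64_C(1) << SGI1R_IRQ_ROUTING_MODE_BIT) |
	       ((uint64_t)sgi << SGI1R_SGI_ID_SHIFT);
}

/* Stand-in for cpumask_weight(): number of set bits. */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Choose a send path by comparing weights only, as suggested in the
 * review. This is only valid if mask is a subset of online.
 */
static enum sgi_path pick_sgi_path(uint64_t mask, uint64_t online,
				   unsigned int self)
{
	unsigned int mcnt = mask_weight(mask);
	unsigned int ocnt = mask_weight(online);
	bool self_in_mask = (mask >> self) & 1;

	if (mcnt == ocnt)
		return SGI_PATH_BROADCAST_SELF;
	if (mcnt == ocnt - 1 && !self_in_mask)
		return SGI_PATH_BROADCAST;
	return SGI_PATH_PER_CPU;
}
```

With four online CPUs (mask 0xF) and CPU 0 as the sender, a target of
0xF selects the broadcast-plus-self path, 0xE selects the plain
broadcast, and 0x7 falls back to the per-CPU loop because the sender
is in the mask while another CPU is not.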