From: Linus Torvalds <torvalds@linuxfoundation.org>
Date: Sat, 8 Jan 2022 16:27:48 -0800
Subject: Re: [PATCH 16/23] sched: Use lightweight hazard pointers to grab lazy mms
To: Andy Lutomirski <luto@kernel.org>, Will Deacon, Catalin Marinas
Cc: Andrew Morton, Linux-MM, Nicholas Piggin, Anton Blanchard,
    Benjamin Herrenschmidt, Paul Mackerras, Randy Dunlap, linux-arch,
    the arch/x86 maintainers, Rik van Riel, Dave Hansen,
    Peter Zijlstra (Intel), Nadav Amit, Mathieu Desnoyers
In-Reply-To: <3586aa63-2dd2-4569-b9b9-f51080962ff2@www.fastmail.com>

On Sat, Jan 8, 2022 at 2:04 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> So this requires that all architectures actually walk all relevant
> CPUs to see if an IPI is needed and send that IPI. On architectures
> that actually need an IPI anyway (x86 bare metal, powerpc (I think),
> and others), fine. But on architectures with a broadcast-to-all-CPUs
> flush (ARM64, IIUC), the extra IPI will be much, much slower than a
> simple load-acquire in a loop.

... hmm. How about a hybrid scheme?

 (a) Architectures that already require that IPI anyway for TLB
invalidation (i.e. x86, but others too) just make the rule be that the
TLB flush done by exit_mmap() gets rid of any lazy TLB mm references.
Which they already do.

 (b) Architectures like arm64 that do hw-assisted TLB shootdown will
have an ASID allocator model, and what you do is use that to either

     (b') increment/decrement the mm_count at mm ASID
allocation/freeing time, or

     (b'') use the existing ASID tracking data to find the CPUs that
have that ASID.

 (c) Can you really imagine hardware TLB shootdown without ASID
allocation? That doesn't seem to make sense. But if it exists, maybe
that kind of crazy case would do the percpu array walking.

(Honesty in advertising: I don't know the arm64 ASID code - I used to
know the old alpha version I wrote in a previous lifetime - but afaik
any ASID allocator has to be able to track the CPUs that have a
particular ASID in use and be able to invalidate it.)

Hmm. The x86 maintainers are on this thread, but they aren't even the
problem. Adding Catalin and Will to this; I think they should know
if/how this would fit with the arm64 ASID allocator.
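To make the comparison concrete, here's a rough sketch of the two
models (completely untested, and every name in it - lazy_mm_hazard,
drop_lazy_mm_ipi, mm_asid_allocated, mm_asid_freed - is made up for
illustration; this is not the code from Andy's series or the actual
arm64 allocator):

/*
 * Hedged sketch only: the identifiers below are hypothetical.
 */
#include <linux/cpumask.h>
#include <linux/percpu.h>
#include <linux/sched/mm.h>
#include <linux/smp.h>

/* Per-CPU hazard pointer: set while a CPU uses this mm as its lazy mm. */
static DEFINE_PER_CPU(struct mm_struct *, lazy_mm_hazard);

/* IPI handler: get the target CPU off the dying mm. */
static void drop_lazy_mm_ipi(void *info)
{
	struct mm_struct *mm = info;

	if (this_cpu_read(lazy_mm_hazard) == mm)
		this_cpu_write(lazy_mm_hazard, NULL);	/* would switch to init_mm for real */
}

/*
 * The walk Andy describes: at mm teardown, check every CPU's hazard
 * pointer with a load-acquire and IPI only the CPUs still holding a
 * lazy reference.  On x86 the exit_mmap() TLB-flush IPI already does
 * this job; on arm64 (broadcast TLBI, no flush IPI) the walk plus any
 * IPI is pure added cost.
 */
static void shoot_down_lazy_users(struct mm_struct *mm)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		/* Pairs with a release store when a CPU takes mm as its lazy mm. */
		if (smp_load_acquire(&per_cpu(lazy_mm_hazard, cpu)) == mm)
			smp_call_function_single(cpu, drop_lazy_mm_ipi, mm, 1);
	}
}

/*
 * Option (b') above: tie the reference to ASID lifetime instead.  One
 * mmgrab() when the allocator hands the mm an ASID, one mmdrop() when
 * the ASID is reclaimed, and teardown needs no per-CPU walk at all.
 */
static void mm_asid_allocated(struct mm_struct *mm)
{
	mmgrab(mm);	/* hold mm_count while the ASID is live */
}

static void mm_asid_freed(struct mm_struct *mm)
{
	mmdrop(mm);	/* the lazy reference goes away with the ASID */
}

The point of (b') being that the refcount traffic then happens at ASID
allocation/free frequency rather than on every context switch, and mm
teardown never has to go poking at other CPUs at all.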
Will/Catalin, background here:

  https://lore.kernel.org/all/CAHk-=wj4LZaFB5HjZmzf7xLFSCcQri-WWqOEJHwQg0QmPRSdQA@mail.gmail.com/

for my objection to that special "keep a non-refcounted magic per-cpu
pointer to the lazy TLB mm" scheme.

               Linus