From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds <torvalds@linuxfoundation.org>
Date: Sun, 9 Jan 2022 12:48:42 -0800
Subject: Re: [PATCH 16/23] sched: Use lightweight hazard pointers to grab lazy mms
To: Andy Lutomirski
Cc: Will Deacon, Catalin Marinas, Andrew Morton, Linux-MM, Nicholas Piggin,
 Anton Blanchard, Benjamin Herrenschmidt, Paul Mackerras, Randy Dunlap,
 linux-arch, the arch/x86 maintainers, Rik van Riel, Dave Hansen,
 Peter Zijlstra (Intel), Nadav Amit, Mathieu Desnoyers
In-Reply-To: <484a7f37-ceed-44f6-8629-0e67a0860dc8@www.fastmail.com>

On Sun, Jan 9, 2022 at 12:20 PM Andy Lutomirski wrote:
>
> Are you *sure*? The ASID management code on x86 is (as mentioned
> before) completely unaware of whether an ASID is actually in use
> anywhere.

Right. But the ASID situation on x86 is very very different, exactly
because x86 doesn't have cross-CPU TLB invalidates.

Put another way: x86 TLB hardware is fundamentally per-cpu. As such,
any ASID management is also per-cpu.

That's fundamentally not true on arm64. And that's not some "arm64
implementation detail". That's fundamental to doing cross-CPU TLB
invalidates in hardware.

If your TLB invalidates act across CPU's, then the state they act on
is also obviously across CPU's.

So the ASID situation is fundamentally different depending on the
hardware usage. On x86, TLB's are per-core, and on arm64 they are not,
and that's reflected in our code too.

As a result, on x86, each mm has a per-cpu ASID, and there's a small
array of per-cpu "mm->asid" mappings. On arm64, each mm has an ASID,
and it's allocated from a global ASID space - so there is no need for
that "mm->asid" mapping, because the ASID is there in the mm, and it's
shared across cpus.

That said, I still don't actually know the arm64 ASID management code.
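[The contrast above can be sketched as a toy model. This is not the
kernel's actual code - the real structures are `tlb_state` /
`TLB_NR_DYN_ASIDS` on x86 and the arm64 ASID allocator - and the
names and sizes below are made up for illustration. The point is only
where the ASID lives: in a per-cpu slot array on x86, in the mm
itself on arm64.]

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS       4
#define NR_ASID_SLOTS 6   /* x86 keeps only a handful of slots per cpu */

struct mm {                /* hypothetical stand-in for struct mm_struct */
	int global_asid;   /* arm64-style: the one ASID, stored in the mm */
};

/*
 * x86-style: each CPU has its own small slot array mapping ASID -> mm.
 * The same mm can land in different slots on different CPUs; the
 * mapping is only meaningful locally.
 */
static struct mm *percpu_asid_slot[NR_CPUS][NR_ASID_SLOTS];

static int x86_style_asid(int cpu, struct mm *mm)
{
	for (int slot = 0; slot < NR_ASID_SLOTS; slot++) {
		if (percpu_asid_slot[cpu][slot] == mm)
			return slot;             /* already cached here */
		if (!percpu_asid_slot[cpu][slot]) {
			percpu_asid_slot[cpu][slot] = mm;
			return slot;             /* claim a free slot */
		}
	}
	/* the real code would evict a slot and flush; keep the toy simple */
	percpu_asid_slot[cpu][0] = mm;
	return 0;
}

/*
 * arm64-style: ASIDs come from one global space, the value lives in
 * the mm, and every CPU sees the same number for the same mm.
 */
static int next_global_asid = 1;

static int arm64_style_asid(struct mm *mm)
{
	if (!mm->global_asid)
		mm->global_asid = next_global_asid++;
	return mm->global_asid;
}
```

In the x86-style scheme the slot an mm gets depends on the order of
first use on each CPU, which is why the per-cpu "mm->asid" mapping
array is needed at all; in the arm64-style scheme there is nothing to
map, since the ASID travels with the mm.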
The thing about TLB flushes is that it's ok to do them spuriously (as
long as you don't do _too_ many of them and tank performance), so two
different mm's can have the same hw ASID on two different cores and
that just makes cross-CPU TLB invalidates too aggressive. You can't
share an ASID on the _same_ core without flushing in between context
switches, because then the TLB on that core might be re-used for a
different mm.

So the flushing rules aren't necessarily 100% 1:1 with the "in use"
rules, and who knows if the arm64 ASID management actually ends up
matching that whole "this lazy TLB is still in use on another CPU"
tracking.

So I don't really know the arm64 situation. And it's possible that
lazy TLB isn't even worth it on arm64 in the first place.

> > So I think that even for that hardware TLB shootdown case, your patch
> > only adds overhead.
>
> The overhead is literally:
>
>   exit_mmap();
>   for each cpu still in mm_cpumask:
>     smp_load_acquire
>
> That's it, unless the mm is actually in use

Ok, now do this for a machine with 1024 CPU's. And tell me it is
"scalable".

> On a very large arm64 system, I would believe there could be real
> overhead. But these very large systems are exactly the systems that
> currently ping-pong mm_count.

Right. But I think your arguments against mm_count are questionable.

I'd much rather have a *much* smaller patch that says "on x86 and
powerpc, we don't need this overhead at all".

And then the arm64 people can look at it and say "Yeah, we'll still do
the mm_count thing", or maybe say "Yeah, we can solve it another way".

And maybe the arm64 people actually say "Yeah, this hazard pointer
thing is perfect for us". That still doesn't necessarily argue for it
on an architecture that ends up serializing with an IPI anyway.

               Linus
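[Editorial note appended to the archived message: the exit-time scan
being debated above can be modeled in a few lines. This is a toy
model, not Andy's patch - `mm_still_lazily_used` and the array sizes
are invented for illustration - but it makes the scalability point
concrete: the acquire-load walk is O(number of CPUs that ever ran the
mm), which is exactly what grows on a 1024-CPU machine.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 1024

struct mm { int dummy; };  /* hypothetical stand-in for struct mm_struct */

/* hazard-pointer scheme: each cpu publishes which lazy mm it is using */
static _Atomic(struct mm *) lazy_mm_hazard[NR_CPUS];

/* cpus that ever ran this mm (stand-in for mm_cpumask) */
static bool mm_cpumask[NR_CPUS];

/*
 * Exit-time check: walk every cpu still set in mm_cpumask and
 * load-acquire its hazard slot.  Returns how many cpus still hold the
 * mm lazily; *cpus_checked counts the loads performed, i.e. the cost
 * of the walk that the "now do this for 1024 CPU's" objection is about.
 */
static int mm_still_lazily_used(struct mm *mm, int *cpus_checked)
{
	int users = 0;

	*cpus_checked = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!mm_cpumask[cpu])
			continue;
		(*cpus_checked)++;
		if (atomic_load_explicit(&lazy_mm_hazard[cpu],
					 memory_order_acquire) == mm)
			users++;
	}
	return users;
}
```

The competing mm_count scheme pays a single shared atomic per
grab/drop instead - cheap at exit time, but it is that one cache line
that "very large systems ... currently ping-pong".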