From: Jonathan Cameron
To: Peter Zijlstra
CC: "H. Peter Anvin", Catalin Marinas, Will Deacon, Dan Williams,
    Davidlohr Bueso, Yicong Yang, Yushan Wang, Lorenzo Pieralisi,
    Mark Rutland, Dave Hansen, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Andy Lutomirski
Subject: Re: [PATCH v2 0/8] Cache coherency management subsystem
Date: Wed, 25 Jun 2025 18:03:43 +0100
Message-ID: <20250625180343.000020de@huawei.com>
In-Reply-To: <20250625093152.GZ1613376@noisy.programming.kicks-ass.net>
References: <20250624154805.66985-1-Jonathan.Cameron@huawei.com>
    <20250625085204.GC1613200@noisy.programming.kicks-ass.net>
    <20250625093152.GZ1613376@noisy.programming.kicks-ass.net>

On Wed, 25 Jun 2025 11:31:52 +0200
Peter Zijlstra wrote:

> On Wed, Jun 25, 2025 at 02:12:39AM -0700, H. Peter Anvin wrote:
> > On June 25, 2025 1:52:04 AM PDT, Peter Zijlstra wrote:
> > >On Tue, Jun 24, 2025 at 04:47:56PM +0100, Jonathan Cameron wrote:
> > >
> > >> On x86 there is the much loved WBINVD instruction that causes a write back
> > >> and invalidate of all caches in the system. It is expensive but it is
> > >
> > >Expensive is not the only problem. It actively interferes with things
> > >like Cache-Allocation-Technology (RDT-CAT for the intel folks). Doing
> > >WBINVD utterly destroys the cache subsystem for everybody on the
> > >machine.
> > >
> > >> necessary in a few corner cases.
> > >
> > >Don't we have things like CLFLUSH/CLFLUSHOPT/CLWB exactly so that we can
> > >avoid doing dumb things like WBINVD ?!?
> > >
> > >> These are cases where the contents of
> > >> Physical Memory may change without any writes from the host. Whilst there
> > >> are a few reasons this might happen, the one I care about here is when
> > >> we are adding or removing mappings on CXL. So typically going from
> > >> there being actual memory at a host Physical Address to nothing there
> > >> (reads as zero, writes dropped) or visa-versa.
> > >
> > >> The
> > >> thing that makes it very hard to handle with CPU flushes is that the
> > >> instructions are normally VA based and not guaranteed to reach beyond
> > >> the Point of Coherence or similar. You might be able to (ab)use
> > >> various flush operations intended to ensure persistence memory but
> > >> in general they don't work either.
> > >
> > >Urgh so this. Dan, Dave, are we getting new instructions to deal with
> > >this? I'm really not keen on having WBINVD in active use.
> > >
> >
> > WBINVD is the nuclear weapon to use when you have lost all notion of
> > where the problematic data can be, and amounts to a full reset of the
> > cache system.
> >
> > WBINVD can block interrupts for many *milliseconds*, system wide, and
> > so is really only useful for once-per-boot type events, like MTRR
> > initialization.
>
> Right this... But that CXL thing sounds like that's semi 'regular' to
> the point that providing some infrastructure around it makes sense. This
> should not be.

I'm fully on board with the WBINVD issues (and hope for something new for
the X86 world).
However, this particular infrastructure (for those systems that can do so)
is about pushing the problem and information to where it can be handled in
a much less disruptive fashion. It can take 'a while' but we are flushing
only cache entries in the requested PA range. Other than some potential
excess snoop traffic if the coherency tracking isn't precise, there should
be limited effect on the rest of the system.

So, for the systems I particularly care about, the CXL case isn't that bad.

Just for giggles, if you want some horror stories, the (dropped) ARM PSCI
spec provides for approaches that require synchronization of calls across
all CPUs. "CPU Rendezvous" in the attributes of CLEAN_INV_MEMREGION
requires all CPUs to make a call within an impdef (discoverable) timeout.
https://developer.arm.com/documentation/den0022/falp1/?lang=en

I gather no one actually needs that on 'real' systems - that is, systems
where we actually need to do these flushes! The ACPI 'RFC' doesn't support
that delight.

Jonathan
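
For anyone who wants the range-versus-everything point above in concrete
form, here is a toy Python model. It is only a sketch under stated
assumptions: ToyCache, wbinv_all, wbinv_range and demo are made-up names
for illustration, not anything from the patch set, the kernel, or any
hardware interface, and the 64-byte line size is assumed. It shows nothing
about timing or snoop traffic, only which lines each style of flush evicts.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # bytes per cache line (assumed for this toy model)


@dataclass
class ToyCache:
    """Hypothetical cache model: line-aligned PA -> dirty flag."""
    lines: dict = field(default_factory=dict)
    written_back: list = field(default_factory=list)

    def touch(self, pa, dirty=False):
        """Bring the line containing pa into the cache, optionally dirtying it."""
        base = pa - pa % LINE_SIZE
        self.lines[base] = self.lines.get(base, False) or dirty

    def _evict(self, base):
        """Drop one line, recording a write back if it was dirty."""
        if self.lines.pop(base):
            self.written_back.append(base)

    def wbinv_all(self):
        """WBINVD-like: write back and invalidate every cached line."""
        victims = sorted(self.lines)
        for base in victims:
            self._evict(base)
        return len(victims)

    def wbinv_range(self, start, size):
        """Write back and invalidate only lines overlapping [start, start + size)."""
        first = start - start % LINE_SIZE
        victims = [b for b in sorted(self.lines) if first <= b < start + size]
        for base in victims:
            self._evict(base)
        return len(victims)


def demo():
    cache = ToyCache()
    # 16 clean lines standing in for everybody else's working set.
    for pa in range(0x1000, 0x1400, LINE_SIZE):
        cache.touch(pa)
    # 4 dirty lines in the region whose mapping is about to change.
    for pa in range(0x8000, 0x8100, LINE_SIZE):
        cache.touch(pa, dirty=True)
    evicted = cache.wbinv_range(0x8000, 0x100)
    return evicted, len(cache.lines), cache.written_back


if __name__ == "__main__":
    print(demo())
```

In this model the range flush evicts only the four lines in the requested
PA range and leaves the other sixteen alone, whereas wbinv_all would have
dropped all twenty.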