From: Yosry Ahmed
Date: Tue, 7 Jan 2025 17:36:15 -0800
Subject: Re: [PATCH v3 00/12] AMD broadcast TLB invalidation
To: Rik van Riel
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, akpm@linux-foundation.org, nadav.amit@gmail.com, zhengqi.arch@bytedance.com, linux-mm@kvack.org, Reiji Watanabe, Brendan Jackman
References: <20241230175550.4046587-1-riel@surriel.com> <95a7349f887e538b5e63f77da6b2a1d7efc9a43f.camel@surriel.com>
In-Reply-To: <95a7349f887e538b5e63f77da6b2a1d7efc9a43f.camel@surriel.com>

On Mon, Jan 6, 2025 at 7:25 PM Rik van Riel wrote:
>
> On Mon, 2025-01-06 at 14:49 -0800, Yosry Ahmed wrote:
> >
> > We briefly looked at using INVLPGB/TLBSYNC as part of the ASI work to
> > optimize away the async freeing logic which sends TLB flush IPIs.
> >
> > I have a high-level question about INVLPGB/TLBSYNC that I could not
> > immediately find the answer to in the AMD manual. Sorry if I missed
> > the answer or if I missed something obvious.
> >
> > Do we know what the underlying mechanism for delivering the TLB
> > flushes is? If a CPU has interrupts disabled, does it still receive
> > the broadcast TLB flush request and handle it?
>
> I assume TLB invalidation is probably handled similarly
> to how cache coherency is handled between CPUs.
>
> However, it probably does not need to be quite as fast,
> since cache coherency traffic is probably 2-6 orders of
> magnitude more common than TLB invalidation traffic.
> >
> > My main concern is that TLBSYNC is a single instruction that seems
> > like it will wait for an arbitrary amount of time, and IIUC interrupts
> > (and NMIs) will not be delivered to the running CPU until after the
> > instruction completes execution (only at an instruction boundary).
> >
> > Are there any guarantees about other CPUs handling the broadcast TLB
> > flush in a timely manner, or an explanation of how CPUs handle the
> > incoming requests in general?
>
> The performance numbers I got with the tlb_flush2_threads
> microbenchmark strongly suggest that INVLPGB flushes are
> handled by the receiving CPUs even while interrupts are
> disabled.
>
> CPU time spent in flush_tlb_mm_range goes down with
> INVLPGB, compared with IPI based TLB flushing, even when
> the IPIs only go to a subset of CPUs.
>
> I have no idea whether the invalidation is handled by
> something like microcode in the CPU, by the (more
> external?) logic that handles cache coherency, or
> something else entirely.
>
> I suspect AMD wouldn't tell us exactly ;)

Well, ideally they would just tell us the conditions under which CPUs
respond to the broadcast TLB flush or the expectations around latency.

I am also wondering whether a CPU can respond to an INVLPGB while it is
executing TLBSYNC. Specifically, is it possible for two CPUs to send
broadcasts to one another and then both execute TLBSYNC to wait for
each other? Could this lead to a deadlock?

I think the answer is no, but we understand too little about what is
going on under the hood to know for sure (or at least I do).

>
> --
> All Rights Reversed.
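
To make the deadlock question concrete, here is a toy user-space model in C. It is only a sketch of the two possibilities being asked about, and it says nothing about how AMD hardware actually implements INVLPGB/TLBSYNC. The function name model_all_cpus_finish(), the MAX_CPUS limit and the service_during_tlbsync flag are hypothetical and come from neither the patch series nor the AMD manual. Every modeled CPU broadcasts one invalidation to all the others and then waits for its acks, which stands in for executing TLBSYNC.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CPUS 8  /* arbitrary limit for this toy model */

/*
 * Toy model only: this does NOT describe real INVLPGB/TLBSYNC behavior.
 *
 * service_during_tlbsync == true:  a CPU that is waiting for its own acks
 *                                  still processes incoming invalidations.
 * service_during_tlbsync == false: a waiting CPU processes nothing until
 *                                  its own wait has finished.
 *
 * Returns true if every CPU eventually leaves its wait, false on deadlock.
 */
bool model_all_cpus_finish(int ncpus, bool service_during_tlbsync)
{
    int acks_needed[MAX_CPUS];          /* acks each sender still waits for */
    bool pending[MAX_CPUS][MAX_CPUS];   /* pending[dst][src]: unserviced request */

    assert(ncpus >= 1 && ncpus <= MAX_CPUS);

    /* Every CPU broadcasts to every other CPU, then starts waiting. */
    for (int cpu = 0; cpu < ncpus; cpu++) {
        acks_needed[cpu] = ncpus - 1;
        for (int src = 0; src < ncpus; src++)
            pending[cpu][src] = (src != cpu);
    }

    for (;;) {
        bool progress = false;
        bool all_done = true;

        for (int dst = 0; dst < ncpus; dst++) {
            bool waiting = acks_needed[dst] > 0;

            if (waiting)
                all_done = false;
            if (waiting && !service_during_tlbsync)
                continue;   /* stuck inside its own wait */

            /* Service every request addressed to dst and ack the sender. */
            for (int src = 0; src < ncpus; src++) {
                if (pending[dst][src]) {
                    pending[dst][src] = false;
                    acks_needed[src]--;
                    progress = true;
                }
            }
        }

        if (all_done)
            return true;
        if (!progress)
            return false;   /* someone is still waiting and nothing can move */
    }
}
```

In this model, model_all_cpus_finish(2, true) completes, while model_all_cpus_finish(2, false) reports a deadlock. So the scenario above is only a problem if a CPU sitting in TLBSYNC cannot service incoming invalidations, and which of the two cases matches real hardware is exactly the open question.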