From: Yosry Ahmed
Date: Mon, 6 Jan 2025 14:49:02 -0800
Subject: Re: [PATCH v3 00/12] AMD broadcast TLB invalidation
To: Rik van Riel
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
 akpm@linux-foundation.org, nadav.amit@gmail.com,
 zhengqi.arch@bytedance.com, linux-mm@kvack.org, Reiji Watanabe,
 Brendan Jackman
In-Reply-To: <20241230175550.4046587-1-riel@surriel.com>
References: <20241230175550.4046587-1-riel@surriel.com>

On Mon, Dec 30, 2024 at 9:57 AM Rik van Riel wrote:
>
> Subject: [RFC PATCH 00/10] AMD broadcast TLB invalidation
>
> Add support for broadcast TLB invalidation using AMD's INVLPGB instruction.
>
> This allows the kernel to invalidate TLB entries on remote CPUs without
> needing to send IPIs, without having to wait for remote CPUs to handle
> those interrupts, and with less interruption to what was running on
> those CPUs.
>
> Because x86 PCID space is limited, and there are some very large
> systems out there, broadcast TLB invalidation is only used for
> processes that are active on 3 or more CPUs, with the threshold
> being gradually increased the more the PCID space gets exhausted.
>
> Combined with the removal of unnecessary lru_add_drain calls
> (see https://lkml.org/lkml/2024/12/19/1388) this results in a
> nice performance boost for the will-it-scale tlb_flush2_threads
> test on an AMD Milan system with 36 cores:
>
> - vanilla kernel: 527k loops/second
> - lru_add_drain removal: 731k loops/second
> - only INVLPGB: 527k loops/second
> - lru_add_drain + INVLPGB: 1157k loops/second
>
> Profiling with only the INVLPGB changes showed while
> TLB invalidation went down from 40% of the total CPU
> time to only around 4% of CPU time, the contention
> simply moved to the LRU lock.

We briefly looked at using INVLPGB/TLBSYNC as part of the ASI work to
optimize away the async freeing logic which sends TLB flush IPIs.

I have a high-level question about INVLPGB/TLBSYNC that I could not
immediately find the answer to in the AMD manual. Sorry if I missed
the answer or if I missed something obvious.

Do we know what the underlying mechanism for delivering the TLB
flushes is? If a CPU has interrupts disabled, does it still receive
the broadcast TLB flush request and handle it?

My main concern is that TLBSYNC is a single instruction that seems
like it will wait for an arbitrary amount of time, and IIUC interrupts
(and NMIs) will not be delivered to the running CPU until after the
instruction completes execution (only at an instruction boundary).
Are there any guarantees about other CPUs handling the broadcast TLB
flush in a timely manner, or an explanation of how CPUs handle the
incoming requests in general?

>
> Fixing both at the same time about doubles the
> number of iterations per second from this case.
>
> v3:
> - Remove paravirt tlb_remove_table call (thank you Qi Zheng)
> - More suggested cleanups and changelog fixes by Peter and Nadav
> v2:
> - Apply suggestions by Peter and Borislav (thank you!)
> - Fix bug in arch_tlbbatch_flush, where we need to do both
>   the TLBSYNC, and flush the CPUs that are in the cpumask.
> - Some updates to comments and changelogs based on questions.
>
>