Subject: Re: Excessive TLB flush ranges
From: Nadav Amit
Date: Tue, 16 May 2023 10:56:08 -0700
To: Thomas Gleixner
Cc: Uladzislau Rezki, "Russell King (Oracle)", Andrew Morton, linux-mm, Christoph Hellwig, Lorenzo Stoakes, Peter Zijlstra, Baoquan He, John Ogness, linux-arm-kernel@lists.infradead.org, Mark Rutland, Marc Zyngier, x86@kernel.org
In-Reply-To: <87o7mk733x.ffs@tglx>
Message-Id: <7ED917BC-420F-47D4-8956-8984205A75F0@gmail.com>
References: <87a5y5a6kj.ffs@tglx> <87353x9y3l.ffs@tglx> <87zg658fla.ffs@tglx> <87r0rg93z5.ffs@tglx> <87cz308y3s.ffs@tglx> <87y1lo7a0z.ffs@tglx> <87o7mk733x.ffs@tglx>

> On May 16, 2023, at 7:38 AM, Thomas Gleixner wrote:
>
> There is a world outside of x86, but even on x86 it's borderline silly
> to take the whole TLB out when you can flush 3 TLB entries one by one
> with exactly the same number of IPIs, i.e. _one_. No?

I just want to re-raise points that were made in the past, including in the discussion that I sent before, and that match my experience. Feel free to reject them, but I think you should not ignore them.

In a nutshell, there is a non-trivial tradeoff. Controlling the exact ranges that need to be flushed might require, especially in IPI-based TLB invalidation systems, additional logic and more cache lines that need to travel between the caches.

The latter - the cache lines that hold the ranges that need to be flushed - are the main issue. They might induce overhead that negates the benefits if in most cases it turns out that many pages are flushed. Data structures such as linked lists might therefore not be suitable to hold the ranges that need to be flushed, as they are not cache-friendly. The data that is transferred between the cores to indicate which ranges should be flushed would ideally be cache-line aligned and fit into a single cache line.

It is possible that for kernel ranges, where the stride is always the base page size (4KB on x86), you might come up with a more condensed way of communicating TLB flushing ranges for kernel pages than for userspace pages. Perhaps the workload characteristics are different. But it should be noted that major parts of the rationale behind the changes that you suggest could also apply to TLB invalidations of userspace mappings, as done in the tlb_gather and UBC mechanisms. In those cases the rationale, at least for x86, was that since the CPU does TLB refills very efficiently, the extra complexity and overheads are likely not worth the trouble.

I hope my feedback is useful. Here is again a link to a discussion from 2015 about this subject:

https://lore.kernel.org/all/CA+55aFwVUkdaf0_rBk7uJHQjWXu+OcLTHc6FKuCn0Cb2Kvg9NA@mail.gmail.com/

There are several patches that showed the benefit of reducing cache contention during TLB shootdown.
Here is one for example: https://lore.kernel.org/all/20190423065706.15430-1-namit@vmware.com/
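
To make the single-cache-line point above concrete, here is a rough userspace sketch of what a compact batch of kernel flush ranges could look like. It is not taken from any posted patch. The names (flush_batch, flush_range, batch_init, batch_add), the 64-byte line size, the limit of 7 ranges and the "page offset from a base address" encoding are all assumptions made for illustration. When the batch overflows it simply degrades to a full flush, so the data a remote CPU has to read never exceeds one line.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u  /* assumption: 64-byte cache lines */
#define PAGE_SHIFT      12   /* 4KB base pages */
#define MAX_RANGES      7    /* 8-byte header + 7 * 8-byte ranges = 64 bytes */

/* Hypothetical encoding: a range is a page offset from a known base
 * address plus a page count. 32-bit page offsets cover 16TB of 4KB pages. */
struct flush_range {
    uint32_t page_off;
    uint32_t nr_pages;
};

/* Everything a remote CPU needs to read sits in one aligned cache line. */
struct flush_batch {
    _Alignas(CACHE_LINE_SIZE) uint32_t nr_ranges;
    uint32_t flush_all;      /* set once the batch overflows */
    struct flush_range ranges[MAX_RANGES];
};

_Static_assert(sizeof(struct flush_batch) == CACHE_LINE_SIZE,
               "batch must occupy exactly one cache line");

static inline void batch_init(struct flush_batch *b)
{
    b->nr_ranges = 0;
    b->flush_all = 0;
}

/* Record [start, end) relative to base. If there is no room left, give up
 * on precise ranges and request a full flush instead of growing the data. */
static inline void batch_add(struct flush_batch *b, uintptr_t base,
                             uintptr_t start, uintptr_t end)
{
    assert(start >= base && end > start);
    assert(((start | end) & ((1u << PAGE_SHIFT) - 1)) == 0);

    if (b->flush_all)
        return;
    if (b->nr_ranges == MAX_RANGES) {
        b->flush_all = 1;
        return;
    }
    b->ranges[b->nr_ranges].page_off = (uint32_t)((start - base) >> PAGE_SHIFT);
    b->ranges[b->nr_ranges].nr_pages = (uint32_t)((end - start) >> PAGE_SHIFT);
    b->nr_ranges++;
}
```

A fixed array is used rather than a linked list so that the receiver touches one line regardless of how many ranges were recorded. Whether 7 ranges is a sensible cutoff, or whether a page-count threshold should also force a full flush, would have to be measured on real workloads.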