Date: Mon, 15 May 2023 22:31:05 +0100
From: "Russell King (Oracle)"
To: Thomas Gleixner
Cc: Andrew Morton, linux-mm@kvack.org, Christoph Hellwig,
	Uladzislau Rezki, Lorenzo Stoakes, Peter Zijlstra, Baoquan He,
	John Ogness, linux-arm-kernel@lists.infradead.org, Mark Rutland,
	Marc Zyngier, x86@kernel.org
Subject: Re: Excessive TLB flush ranges
References: <87a5y5a6kj.ffs@tglx> <87353x9y3l.ffs@tglx> <87zg658fla.ffs@tglx>
In-Reply-To: <87zg658fla.ffs@tglx>

On Mon, May 15, 2023 at 11:11:45PM +0200, Thomas Gleixner wrote:
> On Mon, May 15 2023 at 21:46, Thomas Gleixner wrote:
> > On Mon, May 15 2023 at 17:59, Russell King wrote:
> >> On Mon, May 15, 2023 at 06:43:40PM +0200, Thomas Gleixner wrote:
> > That reproduces in a VM easily and has exactly the same behaviour:
> >
> >        Extra page[s] via            The actual allocation
> >        _vm_unmap_aliases()  Pages                     Pages   Flush start       Pages
> > alloc:                              ffffc9000058e000  2
> > free : ffff888144751000     1       ffffc9000058e000  2       ffff888144751000  17312759359
> >
> > alloc:                              ffffc90000595000  2
> > free : ffff8881424f0000     1       ffffc90000595000  2       ffff8881424f0000  17312768167
> >
> > .....
> >
> > seccomp seems to install 29 BPF programs for that process. So on exit()
> > this results in 29 full TLB flushes on x86, where each of them is used
> > to flush exactly three TLB entries.
> >
> > The actual two page allocation (ffffc9...) is in the vmalloc space, the
> > extra page (ffff88...) is in the direct mapping.
>
> I tried to flush them one by one, which is actually slightly slower.
> That's not surprising as there are 3 * 29 instead of 29 IPIs and the
> IPIs dominate the picture.
>
> But that's not necessarily true for ARM32 as there are no IPIs involved
> on the machine we are using, which is a dual-core Cortex-A9.
>
> So I came up with the hack below, which is equally fast as the full
> flush variant while the performance impact on the other CPUs is minimally
> lower according to perf.
>
> That probably should have another argument which tells how many TLBs
> this flush affects, i.e. 3 in this example, so an architecture can
> sensibly decide whether it wants to use flush all or not.
>
> Thanks,
>
>         tglx
> ---
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1728,6 +1728,7 @@ static bool __purge_vmap_area_lazy(unsig
>  	unsigned int num_purged_areas = 0;
>  	struct list_head local_purge_list;
>  	struct vmap_area *va, *n_va;
> +	struct vmap_area tmp = { .va_start = start, .va_end = end };
>  
>  	lockdep_assert_held(&vmap_purge_lock);
>  
> @@ -1747,7 +1748,12 @@ static bool __purge_vmap_area_lazy(unsig
>  		list_last_entry(&local_purge_list,
>  			struct vmap_area, list)->va_end);
>  
> -	flush_tlb_kernel_range(start, end);
> +	if (tmp.va_end > tmp.va_start)
> +		list_add(&tmp.list, &local_purge_list);
> +	flush_tlb_kernel_vas(&local_purge_list);
> +	if (tmp.va_end > tmp.va_start)
> +		list_del(&tmp.list);

So basically we end up iterating over each VA range, which seems
sensible if the range is large and we have to iterate over it page
by page.

In the case you have, are "start" and "end" set on function entry
to a range, or are they set to ULONG_MAX,0 ? What I'm wondering is
whether we could get away with just having flush_tlb_kernel_vas().
Whether that's acceptable to others is a different question :)

> +
>  	resched_threshold = lazy_max_pages() << 1;
>  
>  	spin_lock(&free_vmap_area_lock);
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -10,6 +10,7 @@
>  #include
>  #include
>  #include
> +#include
>  
>  #include
>  #include
> @@ -1081,6 +1082,24 @@ void flush_tlb_kernel_range(unsigned lon
>  	}
>  }
>  
> +static void do_flush_vas(void *arg)
> +{
> +	struct list_head *list = arg;
> +	struct vmap_area *va;
> +	unsigned long addr;
> +
> +	list_for_each_entry(va, list, list) {
> +		/* flush range by one by one 'invlpg' */
> +		for (addr = va->va_start; addr < va->va_end; addr += PAGE_SIZE)
> +			flush_tlb_one_kernel(addr);

Isn't this just the same as:

	flush_tlb_kernel_range(va->va_start, va->va_end);

at least on ARM32, it should be - the range will be iterated over in
assembly instead of C, although it'll be out of line but should be
slightly faster.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!