From: Uladzislau Rezki
Date: Tue, 16 May 2023 17:01:55 +0200
To: Thomas Gleixner
Cc: Uladzislau Rezki, "Russell King (Oracle)", Andrew Morton,
 linux-mm@kvack.org, Christoph Hellwig, Lorenzo Stoakes, Peter Zijlstra,
 Baoquan He, John Ogness, linux-arm-kernel@lists.infradead.org,
 Mark Rutland, Marc Zyngier, x86@kernel.org
Subject: Re: Excessive TLB flush ranges
References: <87353x9y3l.ffs@tglx> <87zg658fla.ffs@tglx> <87r0rg93z5.ffs@tglx>
 <87cz308y3s.ffs@tglx> <87y1lo7a0z.ffs@tglx> <87o7mk733x.ffs@tglx>
In-Reply-To: <87o7mk733x.ffs@tglx>

On Tue, May 16, 2023 at 04:38:58PM +0200, Thomas Gleixner wrote:
> On Tue, May 16 2023 at 15:42, Uladzislau Rezki wrote:
> >> _vm_unmap_aliases() collects dirty ranges from per cpu vmap_block_queue
> >> (what ever that is) and hands a start..end range to
> >> __purge_vmap_area_lazy().
> >>
> >> As I pointed out already, this can also end up being an excessive range
> >> because there is no guarantee that those individual collected ranges are
> >> consecutive. Though I have no idea how to cure that right now.
> >>
> >> AFAICT this was done to spare flush IPIs, but the mm folks should be
> >> able to explain that properly.
> >>
> > This is done to prevent generating IPIs. That is why the whole range is
> > calculated once and a flush occurs only once for all lazily registered VAs.
>
> Sure, but you pretty much enforced flush_tlb_all() by doing that, which
> is not even close to correct.
>
> This range calculation is only correct when the resulting coalesced
> range is consecutive, but if the resulting coalesced range is huge with
> large holes and only a few pages to flush, then it's actively wrong.
>
> The architecture has zero chance to decide whether it wants to flush
> single entries or all in one go.
>
It depends on what is a corner case and what is not. Usually all
allocations are done sequentially. On the other hand, that is not always
true. A good example is module loading/unloading (modules have a special
place in vmap space). In this scenario we are quite far in vmap space
from, for example, the VMALLOC_START point. So it will require a
flush_tlb_all(), yes.

>
> There is a world outside of x86, but even on x86 it's borderline silly
> to take the whole TLB out when you can flush 3 TLB entries one by one
> with exactly the same number of IPIs, i.e. _one_. No?
>
I meant if we invoke flush_tlb_kernel_range() on each VA's individual
range:

void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	if (tlb_ops_need_broadcast()) {
		struct tlb_args ta;
		ta.ta_start = start;
		ta.ta_end = end;
		on_each_cpu(ipi_flush_tlb_kernel_range, &ta, 1);
	} else
		local_flush_tlb_kernel_range(start, end);
	broadcast_tlb_a15_erratum();
}

we would IPI and wait for each range, no?

--
Uladzislau Rezki
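
To put rough numbers on the "huge range with large holes" concern quoted
above, here is a minimal user-space sketch. It is not kernel code:
struct range, coalesced_pages(), dirty_pages() and SKETCH_PAGE_SHIFT are
hypothetical names made up for illustration, and 4 KiB pages are assumed.
It only compares the size of one merged min(start)..max(end) span with
the number of pages that are actually dirty.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one lazily freed VA range, [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/* Pages covered when all ranges are merged into one min(start)..max(end) span. */
unsigned long coalesced_pages(const struct range *r, size_t n)
{
	unsigned long start, end;
	size_t i;

	assert(n > 0);
	start = r[0].start;
	end = r[0].end;
	for (i = 1; i < n; i++) {
		if (r[i].start < start)
			start = r[i].start;
		if (r[i].end > end)
			end = r[i].end;
	}
	return (end - start) >> SKETCH_PAGE_SHIFT;
}

/* Pages that actually need flushing: the sum of the individual ranges. */
unsigned long dirty_pages(const struct range *r, size_t n)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pages += (r[i].end - r[i].start) >> SKETCH_PAGE_SHIFT;
	return pages;
}
```

For two ranges of one and two pages that sit about 1 MiB apart
(0x1000..0x2000 and 0x100000..0x102000), dirty_pages() gives 3 while
coalesced_pages() gives 257. Whether an architecture would rather flush
those 3 entries one by one or take the whole TLB out is exactly the
decision the merged range hides from it.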