linux-mm.kvack.org archive mirror
From: Uladzislau Rezki <urezki@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Christoph Hellwig <hch@lst.de>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Baoquan He <bhe@redhat.com>, John Ogness <jogness@linutronix.de>,
	linux-arm-kernel@lists.infradead.org,
	Mark Rutland <mark.rutland@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	x86@kernel.org
Subject: Re: Excessive TLB flush ranges
Date: Wed, 17 May 2023 13:26:21 +0200	[thread overview]
Message-ID: <ZGS53cXQEQ8wZ1ab@pc636> (raw)
In-Reply-To: <87leho6wd9.ffs@tglx>

On Tue, May 16, 2023 at 07:04:34PM +0200, Thomas Gleixner wrote:
> On Tue, May 16 2023 at 17:01, Uladzislau Rezki wrote:
> > On Tue, May 16, 2023 at 04:38:58PM +0200, Thomas Gleixner wrote:
> >> There is a world outside of x86, but even on x86 it's borderline silly
> >> to take the whole TLB out when you can flush 3 TLB entries one by one
> >> with exactly the same number of IPIs, i.e. _one_. No?
> >> 
> > I meant if we invoke flush_tlb_kernel_range() on each VA's individual
> > range:
> >
> > <ARM>
> > void flush_tlb_kernel_range(unsigned long start, unsigned long end)
> > {
> > 	if (tlb_ops_need_broadcast()) {
> > 		struct tlb_args ta;
> > 		ta.ta_start = start;
> > 		ta.ta_end = end;
> > 		on_each_cpu(ipi_flush_tlb_kernel_range, &ta, 1);
> > 	} else
> > 		local_flush_tlb_kernel_range(start, end);
> > 	broadcast_tlb_a15_erratum();
> > }
> > <ARM>
> >
> > we should IPI and wait, no?
> 
> The else clause does not do an IPI, but that's irrelevant.
> 
> The proposed flush_tlb_kernel_vas(list, num_pages) mechanism
> achieves:
> 
>   1) It batches multiple ranges to _one_ invocation
> 
>   2) It lets the architecture decide based on the number of pages
>      whether it does a tlb_flush_all() or a flush of individual ranges.
> 
> Whether the architecture uses IPIs or flushes only locally and the
> hardware propagates that is completely irrelevant.
> 
> Right now any coalesced range, which is huge due to massive holes, takes
> decision #2 away.
> 
> If you want to flush individual VAs from the core vmalloc code then you
> lose #1, as the aggregated number of pages might justify a tlb_flush_all().
> 
> That's a pure architecture decision and all the core code needs to do is
> to provide appropriate information and not some completely bogus request
> to flush 17312759359 pages, i.e. a ~64.5 TB range, while in reality
> there are exactly _three_ distinct pages to flush.
> 
1.

I think both cases (the logic) should be moved into arch code, so the
decision of how to flush is made _not_ by the vmalloc code: either flush
everything, if the architecture supports that, or flush page by page,
which requires chasing the list.

As for the vmalloc interface, we can provide the list (we keep it short,
because of the merging property) plus the number of pages to flush.
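
Then the architecture could do something like below (just a sketch to
illustrate the idea; flush_tlb_kernel_vas() is the interface you proposed,
the threshold name is made up):

/*
 * Sketch only. "list" is the short list of vmap areas provided by the
 * vmalloc layer, "num_pages" is the total number of pages in all of them.
 */
void flush_tlb_kernel_vas(struct list_head *list, unsigned long num_pages)
{
	struct vmap_area *va;

	/* Arch decision: a full flush may be cheaper than many ranges. */
	if (num_pages > some_arch_threshold) {
		flush_tlb_all();
		return;
	}

	/* Otherwise chase the (short) list and flush each range. */
	list_for_each_entry(va, list, list)
		flush_tlb_kernel_range(va->va_start, va->va_end);
}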

2.

It looks like your problem is caused by this:

void vfree(const void *addr)
{
...
	if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
		vm_reset_perms(vm); <---- resets the direct map and flushes the TLB
...
}

so all the purged areas are drained in the caller's context, i.e. the
caller is blocked until the drain, including the flushing, is done. I am
not sure why this has to be done from the caller's context.

IMHO, it should be deferred in the same way as we do it in:

static void free_vmap_area_noflush(struct vmap_area *va)

unless I am missing the point of why vfree() has to do it directly.
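
I.e. something along these lines (only to illustrate the idea;
vm_defer_reset_perms() does not exist, the actual direct map reset and
TLB flush would then be done from the deferred purge path, batched with
whatever else is pending):

void vfree(const void *addr)
{
...
	if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
		vm_defer_reset_perms(vm); /* queue it, flush later */
...
}

That way the caller is not blocked on the flush.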

--
Uladzislau Rezki



Thread overview: 75+ messages
2023-05-15 16:43 Thomas Gleixner
2023-05-15 16:59 ` Russell King (Oracle)
2023-05-15 19:46   ` Thomas Gleixner
2023-05-15 21:11     ` Thomas Gleixner
2023-05-15 21:31       ` Russell King (Oracle)
2023-05-16  6:37         ` Thomas Gleixner
2023-05-16  6:46           ` Thomas Gleixner
2023-05-16  8:18           ` Thomas Gleixner
2023-05-16  8:20             ` Thomas Gleixner
2023-05-16  8:27               ` Russell King (Oracle)
2023-05-16  9:03                 ` Thomas Gleixner
2023-05-16 10:05                   ` Baoquan He
2023-05-16 14:21                     ` Thomas Gleixner
2023-05-16 19:03                       ` Thomas Gleixner
2023-05-17  9:38                         ` Thomas Gleixner
2023-05-17 10:52                           ` Baoquan He
2023-05-19 11:22                             ` Thomas Gleixner
2023-05-19 11:49                               ` Baoquan He
2023-05-19 14:13                                 ` Thomas Gleixner
2023-05-19 12:01                         ` [RFC PATCH 1/3] mm/vmalloc.c: try to flush vmap_area one by one Baoquan He
2023-05-19 14:16                           ` Thomas Gleixner
2023-05-19 12:02                         ` [RFC PATCH 2/3] mm/vmalloc.c: Only flush VM_FLUSH_RESET_PERMS area immediately Baoquan He
2023-05-19 12:03                         ` [RFC PATCH 3/3] mm/vmalloc.c: change _vm_unmap_aliases() to do purge firstly Baoquan He
2023-05-19 14:17                           ` Thomas Gleixner
2023-05-19 18:38                           ` Thomas Gleixner
2023-05-19 23:46                             ` Baoquan He
2023-05-21 23:10                               ` Thomas Gleixner
2023-05-22 11:21                                 ` Baoquan He
2023-05-22 12:02                                   ` Thomas Gleixner
2023-05-22 14:34                                     ` Baoquan He
2023-05-22 20:21                                       ` Thomas Gleixner
2023-05-22 20:44                                         ` Thomas Gleixner
2023-05-23  9:35                                         ` Baoquan He
2023-05-19 13:49                   ` Excessive TLB flush ranges Thomas Gleixner
2023-05-16  8:21             ` Russell King (Oracle)
2023-05-16  8:19           ` Russell King (Oracle)
2023-05-16  8:44             ` Thomas Gleixner
2023-05-16  8:48               ` Russell King (Oracle)
2023-05-16 12:09                 ` Thomas Gleixner
2023-05-16 13:42                   ` Uladzislau Rezki
2023-05-16 14:38                     ` Thomas Gleixner
2023-05-16 15:01                       ` Uladzislau Rezki
2023-05-16 17:04                         ` Thomas Gleixner
2023-05-17 11:26                           ` Uladzislau Rezki [this message]
2023-05-17 11:58                             ` Thomas Gleixner
2023-05-17 12:15                               ` Uladzislau Rezki
2023-05-17 16:32                                 ` Thomas Gleixner
2023-05-19 10:01                                   ` Uladzislau Rezki
2023-05-19 14:56                                     ` Thomas Gleixner
2023-05-19 15:14                                       ` Uladzislau Rezki
2023-05-19 16:32                                         ` Thomas Gleixner
2023-05-19 17:02                                           ` Uladzislau Rezki
2023-05-16 17:56                       ` Nadav Amit
2023-05-16 19:32                         ` Thomas Gleixner
2023-05-17  0:23                           ` Thomas Gleixner
2023-05-17  1:23                             ` Nadav Amit
2023-05-17 10:31                               ` Thomas Gleixner
2023-05-17 11:47                                 ` Thomas Gleixner
2023-05-17 22:41                                   ` Nadav Amit
2023-05-17 14:43                                 ` Mark Rutland
2023-05-17 16:41                                   ` Thomas Gleixner
2023-05-17 22:57                                 ` Nadav Amit
2023-05-19 11:49                                   ` Thomas Gleixner
2023-05-17 12:12                               ` Russell King (Oracle)
2023-05-17 23:14                                 ` Nadav Amit
2023-05-15 18:17 ` Uladzislau Rezki
2023-05-16  2:26   ` Baoquan He
2023-05-16  6:40     ` Thomas Gleixner
2023-05-16  8:07       ` Baoquan He
2023-05-16  8:10         ` Baoquan He
2023-05-16  8:45         ` Russell King (Oracle)
2023-05-16  9:13           ` Thomas Gleixner
2023-05-16  8:54         ` Thomas Gleixner
2023-05-16  9:48           ` Baoquan He
2023-05-15 20:02 ` Nadav Amit
