Re: scalable kmap (was Re: vm lock contention reduction)

From: "Martin J. Bligh" <fletch@aracnet.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Andrew Morton <akpm@zip.com.au>,
	Andrea Arcangeli <andrea@suse.de>,
	Rik van Riel <riel@conectiva.com.br>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: scalable kmap (was Re: vm lock contention reduction)
Date: Sun, 07 Jul 2002 09:00:27 -0700
Message-ID: <1083506661.1026032427@[10.10.2.3]>
In-Reply-To: <Pine.LNX.4.44.0207070041260.2262-100000@home.transmeta.com>

>> I think that might have been when Andrea was using persistent kmap
>> for highpte (since fixed), so we were really kicking the **** out
>> of it. Nonetheless, your point is perfectly correct, it's the
>> global invalidate that's really the expensive thing.
> 
> I suspect that there really aren't that many places that care about the
> persistent mappings, and the atomic per-cpu stuff is inherently scalable
> (but due to being harder to cache, slower). So I wonder how much of a
> problem the kmap stuff really is.
> 
> So if the main problem ends up being that some paths (a) really want the
> persistent version _and_ (b) you can make the paths hold them for long
> times (by writing to a blocking pipe/socket or similar) we may just have
> much simpler approaches - like a per-user kmap count.

I don't think they really want something that's persistent over a
long time; I think they want something they can hold over a potential
reschedule - that's why they're not using kmap_atomic.

> Which just guarantees that any user at any time can only hold 100
> concurrent persistent kmap's open. Problem solved.

We're not running out of kmaps in the pool, we're just churning them
(and dirtying them) at an unpleasant rate. Every time we exhaust the
pool, we do a global TLB flush on all CPUs, which sucks for performance.
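
To put a rough shape on the churn, here is a toy model (a sketch, not the
kernel's highmem code). It assumes every mapping is short-lived and is
released before the pool wraps, so the only cost counted is one global
flush each time the pool is exhausted. The struct and function names are
hypothetical.

```c
#include <assert.h>

/* Hypothetical model, not the kernel's data structures. */
struct kmap_pool_sketch {
    unsigned long nr_slots;  /* size of the persistent kmap pool */
    unsigned long next;      /* slots handed out since the last flush */
    unsigned long flushes;   /* global TLB flushes triggered so far */
};

/* Model one short-lived map/unmap pair against the pool. */
void sketch_map_and_unmap(struct kmap_pool_sketch *pool)
{
    assert(pool->nr_slots > 0);
    if (pool->next == pool->nr_slots) {
        /* Pool exhausted: all CPUs flush before any slot is reused. */
        pool->flushes++;
        pool->next = 0;
    }
    pool->next++;
}

/* Count the flushes caused by nr_maps short-lived mappings. */
unsigned long sketch_flushes_for(unsigned long nr_slots, unsigned long nr_maps)
{
    struct kmap_pool_sketch pool = { nr_slots, 0, 0 };
    unsigned long i;

    for (i = 0; i < nr_maps; i++)
        sketch_map_and_unmap(&pool);
    return pool.flushes;
}
```

Under this model the flush count falls roughly in proportion to the pool
size, which is why a bigger array helps even though nothing is actually
running out.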

> The _performance_ scalability concerns should be fairly easily solvable
> (as far as I can tell - feel free to correct me) by making the persistent
> array bigger 

Making the array bigger does help, but it consumes some more virtual
address space, which is the most critical resource on these machines ...
at the moment we use up 1024 entries, which is 4MB; I normally set
things to 4096, which uses 16MB - certainly that would be a better
default for larger machines. But if I make it much bigger than that,
I start to run out of vmalloc space ;-) Of course we could just add
the size of the kmap pool to _VMALLOC_RESERVE, which would be somewhat
better ...
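
The sizes above are just entries times page size. A minimal sketch of the
arithmetic, assuming 4KB pages (the macro and function names are made up
for illustration):

```c
#include <assert.h>

/* Assumption: each kmap entry maps one 4KB page. */
#define SKETCH_PAGE_SIZE 4096UL

/* Virtual address space, in bytes, taken by a pool of `entries` slots. */
unsigned long kmap_pool_bytes(unsigned long entries)
{
    /* Guard against overflow of the multiplication below. */
    assert(entries <= (~0UL) / SKETCH_PAGE_SIZE);
    return entries * SKETCH_PAGE_SIZE;
}

/* Same figure in megabytes (1MB = 2^20 bytes), rounded down. */
unsigned long kmap_pool_mb(unsigned long entries)
{
    return kmap_pool_bytes(entries) >> 20;
}
```

With these helpers, 1024 entries come out at 4MB and 4096 entries at 16MB,
matching the figures quoted above.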

> and finding things where persistency isn't needed (and
> possibly doesn't even help due to lack of locality), and just 
> making those places use the per-cpu atomic ones.

I'm kind of handwaving at this point because I don't have the stats
to hand. I had Keith gather some stats on this to see what was
actually calling kmap - I'll dig those out on Monday when I'm back
in the office, and send them along.

I was kind of hoping to find some elegant killer solution to all this,
but it's been kicked around for a while now, and every solution seems
to have its problems. If it can't be elegantly solved, we can probably
kill the performance issue by just tuning, as you say ...

M.

PS. One interesting thing Keith found was this: on NUMA-Q, I currently
do the IPI send for smp_call_function (amongst other things) as a
sequenced unicast (send a separate message to each CPU in turn),
rather than the normal broadcast, because broadcast is harder to do in
clustered APIC mode. Whilst trying to switch this back, he found it ran
faster as the sequenced unicast, not only for NUMA-Q, but also for
standard SMP boxes!!! I'm guessing the timing offset generated eases
cacheline or lock contention ... interesting anyway.

