From: Christoph Lameter <cl@linux-foundation.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>,
	"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
	Lin Ming <ming.m.lin@intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [patch] SLQB slab allocator
Date: Fri, 16 Jan 2009 15:25:05 -0600 (CST)
Message-ID: <Pine.LNX.4.64.0901161509160.27283@quilx.com>
In-Reply-To: <20090116034356.GM17810@wotan.suse.de>

On Fri, 16 Jan 2009, Nick Piggin wrote:

> > The application that is interrupted has no control over when SLQB runs its
> > expiration. The longer the queues the longer the holdoff. Look at the
> > changelogs for various queue expiration things in the kernel. I fixed up a
> > couple of those over the years for latency reasons.
>
> Interrupts and timers etc. as well as preemption by kernel threads happen
> everywhere in the kernel. I have not seen any reason why slab queue reaping
> in particular is a problem.

The slab queues are a particular problem because they are combined with
timers. The latency-insensitive phase of an HPC app completes and then the
latency-critical part starts to run. SLAB will still happily expire a
certain number of objects from all its various queues every 2 seconds,
whichever phase the application is in.
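
To make the interference concrete, here is a simplified userspace model
(not the actual SLAB code; the queue length, object size and timings are
made up). The only point it illustrates is that the reap work is
proportional to the queue length and fires on its own schedule, not the
application's:

/*
 * Simplified userspace model of timer-driven queue reaping. Purely
 * illustrative; queue length, object size and timings are assumed.
 * A background "reaper" fires every 2 seconds regardless of what the
 * application is doing, and the work it does under the lock grows
 * with the queue length, so longer queues mean longer holdoffs.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define QUEUE_LEN 4096                  /* assumed queue size */

static void *queue[QUEUE_LEN];
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

static void *reaper(void *arg)
{
        for (;;) {
                sleep(2);               /* the application has no say in this */
                pthread_mutex_lock(&queue_lock);
                for (int i = 0; i < QUEUE_LEN; i++) {   /* O(queue length) */
                        free(queue[i]);
                        queue[i] = NULL;
                }
                pthread_mutex_unlock(&queue_lock);
                fprintf(stderr, "reaped up to %d objects\n", QUEUE_LEN);
        }
        return NULL;
}

int main(void)
{
        pthread_t t;

        pthread_create(&t, NULL, reaper, NULL);

        /* "latency critical" phase: refills the queue and is held off
         * whenever the reaper holds the lock */
        for (long iter = 0; iter < 100000000L; iter++) {
                int slot = iter % QUEUE_LEN;

                pthread_mutex_lock(&queue_lock);
                if (!queue[slot])
                        queue[slot] = malloc(64);
                pthread_mutex_unlock(&queue_lock);
        }
        return 0;
}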

> Any slab allocator is going to have a whole lot of theoretical problems and
> you simply won't be able to fix them all because some require an oracle or
> others fundamentally conflict with another theoretical problem.

I agree there is no point in working on theoretical problems. We are
talking about practical problems.

> I concentrate on the main practical problems and the end result. If I see
> evidence of some problem caused, then I will do my best to fix it.

You concentrate on the problems that are given to you, I guess...

> > Well yes, with enterprise apps you are likely not going to see it. Run HPC
> > and other low-latency tests (Infiniband based and such).
>
> So do you have any results or not?

Of course. Do I need to repost them? I am no longer employed by the company
I did the work for, so the test data is no longer accessible to me. You
will have to rely on the material that was posted in the past.

> > It still will have to move objects between queues? Or does it adapt the
> > slub method of "queue" per page?
>
> It has several queues that objects can move between. You keep asserting
> that this is a problem.

> > SLUB obeys memory policies. It just uses the page allocator for this by
> > doing an allocation *without* specifying the node that memory has to come
> > from. SLAB manages memory strictly per node. So it always has to ask for
> > memory from a particular node. Hence the need to implement memory policies
> > in the allocator.
>
> You only go to the allocator when the percpu queue goes empty though, so
> if memory policy changes (eg context switch or something), then subsequent
> allocations will be of the wrong policy.

The per-cpu queue size in SLUB is bounded because a queue only contains
objects from the same page. If you have large queues, as SLAB and SLQB(?)
do, then this could be an issue.
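
A back-of-the-envelope illustration of that bound (numbers assumed,
nothing measured here):

/* SLUB's per-cpu "queue" is the free list of a single slab page, so its
 * length is capped by how many objects fit in that page. Array caches in
 * the SLAB/SLQB style are sized per cache and can hold pointers to
 * objects from many different pages. Numbers below are assumed.
 */
#include <stdio.h>

int main(void)
{
        unsigned int page_size = 4096;          /* typical x86 page */
        unsigned int object_size = 64;          /* some small kmem cache */

        printf("SLUB per-cpu bound: %u objects (all from one page)\n",
               page_size / object_size);
        printf("SLAB-style array cache: on the order of a hundred objects,\n"
               "potentially from as many different pages\n");
        return 0;
}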

> That is what I call a hack, which is made in order to solve a perceived
> performance problem. The SLAB/SLQB method of checking policy is simple,
> obviously correct, and until there is a *demonstrated* performance problem
> with that, then I'm not going to change it.

Well, so far it seems that your tests never even exercise that part of the
allocators.

> I don't think this is a problem. Anyway, rt systems that care about such
> tiny latencies can easily prioritise this. And ones that don't care so
> much have many other sources of interrupts and background processing by
> the kernel or hardware interrupts.

How do they prioritize this?

> If this actually *is* a problem, I will allow an option to turn off periodic
> trimming of queues, and allow objects to remain in queues (like the page
> allocator does with its queues). And just provide hooks to reap them at
> low memory time.

That means large amounts of memory are going to be caught in these queues.
If the queues are per cpu and one cpu does the allocation while another
does the freeing, then the allocating cpu will consume more and more memory
from the page allocator whereas the freeing cpu builds up huge per-cpu
lists.
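
A trivial single-threaded model of that alloc-on-one-cpu, free-on-another
pattern (purely illustrative, all sizes assumed) shows both effects: the
allocating side keeps going back to the page allocator while the freeing
side's queue only grows, unless something periodically drains it:

/* Model of "cpu A allocates, cpu B frees" with no queue trimming.
 * Objects freed on B land on B's local queue; A's queue never sees them,
 * so A keeps asking the page allocator for fresh pages. All numbers are
 * assumed for illustration.
 */
#include <stdio.h>

int main(void)
{
        long a_queue = 0, b_queue = 0, pages_pulled = 0;
        const long objs_per_page = 64;  /* assumed: 4096 / 64-byte objects */

        for (int second = 1; second <= 10; second++) {
                for (long i = 0; i < 100000; i++) {
                        if (a_queue == 0) {     /* A's queue empty: hit the page allocator */
                                pages_pulled++;
                                a_queue += objs_per_page;
                        }
                        a_queue--;              /* A allocates an object */
                        b_queue++;              /* B frees it into its own queue */
                }
                printf("t=%2ds  pages pulled: %6ld  objects stuck on B: %7ld\n",
                       second, pages_pulled, b_queue);
        }
        return 0;
}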

> It's strange. You perceive these theoretical problems with things that I
> actually consider to be a distinct *advantage* of SLAB/SLQB. order-0 allocations,
> queueing, strictly obeying NUMA policies...

These are issues that we encountered in practice on large systems.
Pointer-chasing performance in many apps is bounded by TLB faults and the
like. Strictly obeying NUMA policies causes performance problems in SLAB:
try MPOL_INTERLEAVE versus cpu-local allocations.
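
Here is roughly the kind of comparison I mean, done with libnuma from
userspace (illustrative only: it shows the cost of interleaved versus
node-local memory, not the slab policy path itself; link with -lnuma,
sizes and loop counts are assumed):

/* Walk the same amount of memory once allocated interleaved across all
 * nodes and once allocated node-local, and compare the walk times.
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

static double walk(volatile char *buf, size_t size)
{
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < size; i += 64)   /* touch every cache line */
                buf[i]++;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
        size_t size = 256UL << 20;              /* 256 MB, assumed */
        char *inter, *local;

        if (numa_available() < 0) {
                fprintf(stderr, "no NUMA support on this system\n");
                return 1;
        }

        inter = numa_alloc_interleaved(size);
        local = numa_alloc_local(size);
        if (!inter || !local) {
                fprintf(stderr, "allocation failed\n");
                return 1;
        }
        memset(inter, 0, size);
        memset(local, 0, size);

        printf("interleaved: %.3fs\n", walk(inter, size));
        printf("cpu-local:   %.3fs\n", walk(local, size));

        numa_free(inter, size);
        numa_free(local, size);
        return 0;
}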

> > I still don't see the problem that SLQB is addressing (aside from code
> > cleanup of SLAB). Seems that you feel that the queueing behavior of SLAB
> > is okay.
>
> It addresses O(NR_CPUS^2) memory consumption of kmem caches, and large
> constant consumption of array caches of SLAB. It addresses scalability
> eg in situations with lots of cores per node. It allows resizeable
> queues. It addresses the code complexity and bootstrap hoops of SLAB.
>
> It addresses performance and higher order allocation problems of SLUB.

It seems that on SMP systems SLQB will actually increase the number of
queues, since it needs two queues per cpu instead of SLAB's one. SLAB also
has resizable queues. Code simplification and bootstrap: great work on
that. Again, a good cleanup of SLAB.
