Re: [patch] SLQB slab allocator

From: Andi Kleen <andi@firstfloor.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lin Ming <ming.m.lin@intel.com>,
	"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
	Christoph Lameter <clameter@engr.sgi.com>
Subject: Re: [patch] SLQB slab allocator
Date: Fri, 23 Jan 2009 12:57:31 +0100
Message-ID: <20090123115731.GO15750@one.firstfloor.org>
In-Reply-To: <20090123112555.GF19986@wotan.suse.de>

On Fri, Jan 23, 2009 at 12:25:55PM +0100, Nick Piggin wrote:
> > > +#ifdef CONFIG_SLQB_SYSFS
> > > +	struct kobject kobj;	/* For sysfs */
> > > +#endif
> > > +#ifdef CONFIG_NUMA
> > > +	struct kmem_cache_node *node[MAX_NUMNODES];
> > > +#endif
> > > +#ifdef CONFIG_SMP
> > > +	struct kmem_cache_cpu *cpu_slab[NR_CPUS];
> > 
> > Those both really need to be dynamically allocated, otherwise
> > it wastes a lot of memory in the common case
> > (e.g. NR_CPUS==128 kernel on dual core system). And of course
> > on the proposed NR_CPUS==4096 kernels it becomes prohibitive.
> > 
> > You could use alloc_percpu? There's no alloc_pernode 
> > unfortunately, perhaps there should be one. 
> 
> cpu_slab is dynamically allocated, by just changing the size of
> the kmem_cache cache at boot time. 

You'll always have at least the MAX_NUMNODES waste because
you cannot tell the compiler that the cpu_slab field has 
moved.
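
To make the point concrete, here is a minimal userspace sketch (not kernel code). The struct is a hypothetical stand-in for the quoted kmem_cache layout, and the SKETCH_* limits are illustrative values, not taken from the patch. Shrinking the object only trims the tail of the last array; the offset of cpu_slab is fixed at compile time, so the full node[] array in front of it is always paid for.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative limits only; real values come from the kernel config. */
#define SKETCH_NR_CPUS      128
#define SKETCH_MAX_NUMNODES 64

struct kmem_cache_cpu;   /* opaque in this sketch */
struct kmem_cache_node;  /* opaque in this sketch */

/* Hypothetical stand-in for struct kmem_cache, same field order as quoted. */
struct cache_sketch {
	int objsize;
	struct kmem_cache_node *node[SKETCH_MAX_NUMNODES];
	struct kmem_cache_cpu *cpu_slab[SKETCH_NR_CPUS];	/* last member */
};

/*
 * Bytes needed per cache when the object is cut short after the first
 * nr_cpus entries of the trailing cpu_slab array.
 */
size_t truncated_size(size_t nr_cpus)
{
	assert(nr_cpus <= SKETCH_NR_CPUS);
	return offsetof(struct cache_sketch, cpu_slab) +
	       nr_cpus * sizeof(struct kmem_cache_cpu *);
}

/* Bytes of node[] that stay allocated but unused with nr_nodes real nodes. */
size_t node_array_waste(size_t nr_nodes)
{
	assert(nr_nodes <= SKETCH_MAX_NUMNODES);
	return (SKETCH_MAX_NUMNODES - nr_nodes) *
	       sizeof(struct kmem_cache_node *);
}
```

With these example limits, truncated_size(2) is far below sizeof(struct cache_sketch), but a two-node machine would still carry 62 unused node pointers in every cache.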

> Probably the best way would
> be to have dynamic cpu and node allocs for them, I agree.

It's really needed.

> Any plans for an alloc_pernode?

It shouldn't be very hard to implement. Or do you ask if I'm volunteering? @)

> > > + * - investiage performance with memoryless nodes. Perhaps CPUs can be given
> > > + *   a default closest home node via which it can use fastpath functions.
> > 
> > FWIW that is what x86-64 always did. Perhaps you can just fix ia64 to do 
> > that too and be happy.
> 
> What if the node is possible but not currently online?

Nobody should allocate on it then.
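
A rough userspace sketch of the "default closest home node" idea from the quoted TODO, with unusable (offline or memoryless) nodes skipped so that nobody allocates on them. The function name, the fixed node count and the distance table are assumptions made for illustration; this is not how any particular architecture implements it.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_NODES 4	/* hypothetical number of possible nodes */

/*
 * Return the usable node nearest to 'node' according to a distance table
 * (smaller means closer), or -1 if no node is usable. A node that is
 * itself usable is its own home node, assuming local distance is smallest.
 */
int closest_home_node(int node,
		      const int dist[SKETCH_NODES][SKETCH_NODES],
		      const bool usable[SKETCH_NODES])
{
	int best = -1;
	int best_dist = INT_MAX;

	assert(node >= 0 && node < SKETCH_NODES);
	for (int n = 0; n < SKETCH_NODES; n++) {
		if (!usable[n])
			continue;	/* offline or memoryless: never pick it */
		if (dist[node][n] < best_dist) {
			best_dist = dist[node][n];
			best = n;
		}
	}
	return best;
}
```

The result would be computed once per CPU and cached, so the fastpath only ever sees a node it is allowed to allocate from.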

> > > +/* Not all arches define cache_line_size */
> > > +#ifndef cache_line_size
> > > +#define cache_line_size()	L1_CACHE_BYTES
> > > +#endif
> > > +
> > 
> > They should. better fix them?
> 
> git grep -l -e cache_line_size arch/ | egrep '\.h$'
> 
> Only ia64, mips, powerpc, sparc, x86...

It's straightforward to define that everywhere.

> 
> > > +	if (unlikely(slab_poison(s)))
> > > +		memset(start, POISON_INUSE, PAGE_SIZE << s->order);
> > > +
> > > +	start += colour;
> > 
> > One thing i was wondering. Did you try to disable the colouring and see
> > if it makes much difference on modern systems? They tend to have either
> > larger caches or higher associativity caches.
> 
> I have tried, but I don't think I found a test where it made a
> statistically significant difference. It is not very costly to
> implement, though.

How about the memory usage?

Also, this is all so complicated already that every simplification helps.

> > > +#endif
> > > +
> > > +#ifdef CONFIG_NUMA
> > > +static struct kmem_cache kmem_node_cache;
> > > +static struct kmem_cache_cpu kmem_node_cpus[NR_CPUS];
> > > +static struct kmem_cache_node kmem_node_nodes[MAX_NUMNODES];
> > > +#endif
> > 
> > That all needs fixing too of course.
> 
> Hmm. I was hoping it could stay simple as it is just a static constant
> (for a given NR_CPUS) overhead. 

The issue is that distro kernels typically run with NR_CPUS >>> num_possible_cpus().
And we'll likely see higher NR_CPUS (and MAX_NUMNODES) in the future,
but we still want to run the same kernels on really small systems (e.g.
Atom based) without wasting their memory.

So for anything sized by NR_CPUS you should use per-cpu data -- that is
sized correctly automatically.

For MAX_NUMNODES we don't have anything equivalent currently, so
you would also need alloc_pernode(), I guess.

OK, you could just use per-cpu data for them too and only use the first
entry in each node. That's cheating, but not too bad.
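
A userspace sketch of that cheat, under stated assumptions: the per-cpu area is modelled as a plain array with one slot per CPU, and the CPU-to-node map is a made-up topology. All names here are hypothetical; real code would use the kernel's per-cpu and topology helpers instead.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 8	/* hypothetical: CPUs actually present */

struct node_data {
	int nr_partial;
};

/* Stand-in for per-cpu storage: one slot per CPU. */
static struct node_data percpu_slot[SKETCH_NR_CPUS];

/* Made-up topology: CPUs 0-3 sit on node 0, CPUs 4-7 on node 1. */
static const int cpu_to_node_map[SKETCH_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Lowest-numbered CPU on 'node', or -1 if the node has no CPUs. */
int first_cpu_of_node(int node)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (cpu_to_node_map[cpu] == node)
			return cpu;
	return -1;
}

/* Per-node data lives in the per-cpu slot of the node's first CPU. */
struct node_data *node_data_of(int node)
{
	int cpu = first_cpu_of_node(node);

	return cpu < 0 ? NULL : &percpu_slot[cpu];
}
```

The slots of the other CPUs on each node go unused, which is the "cheating" part, and a node without any CPUs gets no slot at all in this scheme.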


> I wonder if bootmem is still up here?

bootmem is finished when slab comes up.

> 
> Could bite the bullet and do a multi-stage bootstap like SLUB, but I
> want to try avoiding that (but init code is also of course much less
> important than core code and total overheads). 

For DEFINE_PER_CPU you don't need special allocation.

You'd probably want a DEFINE_PER_NODE() for this, or see above.

> 
> > > +static ssize_t align_show(struct kmem_cache *s, char *buf)
> > > +{
> > > +	return sprintf(buf, "%d\n", s->align);
> > > +}
> > > +SLAB_ATTR_RO(align);
> > > +
> > 
> > When you map back to the attribute you can use a index into a table
> > for the field, saving that many functions?
> > 
> > > +STAT_ATTR(CLAIM_REMOTE_LIST, claim_remote_list);
> > > +STAT_ATTR(CLAIM_REMOTE_LIST_OBJECTS, claim_remote_list_objects);
> > 
> > This really should be table driven, shouldn't it? That would give much
> > smaller code.
> 
> Tables probably would help. I will keep it close to SLUB for now,
> though.

Hmm, then fix slub? 
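
For what it's worth, a minimal userspace sketch of what "table driven" could look like: one table of (name, offset) pairs and a single show function, instead of one generated function per attribute. The struct, table and function names are hypothetical, every field is assumed to be an int, and snprintf stands in for the sysfs show path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the cache structure; int fields only. */
struct cache_sketch {
	int align;
	int objsize;
	int order;
};

struct attr_entry {
	const char *name;	/* attribute file name */
	size_t offset;		/* offset of the int field in cache_sketch */
};

static const struct attr_entry attr_table[] = {
	{ "align",   offsetof(struct cache_sketch, align) },
	{ "objsize", offsetof(struct cache_sketch, objsize) },
	{ "order",   offsetof(struct cache_sketch, order) },
};

/*
 * One show routine for every attribute: look the name up in the table,
 * then format the field found at the recorded offset. Returns the
 * snprintf result, or -1 for an unknown attribute.
 */
int attr_show(const struct cache_sketch *s, const char *name,
	      char *buf, size_t len)
{
	for (size_t i = 0; i < sizeof(attr_table) / sizeof(attr_table[0]); i++) {
		if (strcmp(attr_table[i].name, name) == 0) {
			const int *field = (const int *)
				((const char *)s + attr_table[i].offset);
			return snprintf(buf, len, "%d\n", *field);
		}
	}
	return -1;
}
```

Adding an attribute then costs one table row rather than one more function.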

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.