From: Manfred Spraul <manfred@colorfullife.com>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>,
linux-mm@kvack.org, akpm@osdl.org, dgc@sgi.com,
dipankar@in.ibm.com, mbligh@mbligh.org
Subject: Re: [PATCH] per-page SLAB freeing (only dcache for now)
Date: Tue, 04 Oct 2005 19:04:35 +0200
Message-ID: <4342B623.3060007@colorfullife.com>
In-Reply-To: <20051003221743.GB29091@logos.cnet>

Marcelo Tosatti wrote:
>Hi Manfred,
>
>On Mon, Oct 03, 2005 at 10:37:26PM +0200, Manfred Spraul wrote:
>
>>Christoph Lameter wrote:
>>
>>>On Sat, 1 Oct 2005, Marcelo wrote:
>>>
>>>>I thought about having a mini-API for this such as "struct
>>>>slab_reclaim_ops" implemented by each reclaimable cache, invoked by a
>>>>generic SLAB function.
>>>>
>>Which functions would be needed?
>>- lock_cache(): No more alive/dead changes
>>- objp_is_alive()
>>- objp_is_killable()
>>- objp_kill()
>
>Yep something along that line. I'll come up with something more precise
>tomorrow.
>
>>I think it would be simpler if the caller must mark the objects as
>>alive/dead before/after calling kmem_cache_alloc/free: I don't think
>>it's a good idea to add special case code and branches to the normal
>>kmem_cache_alloc codepath. And especially: It would mean that
>>kmem_cache_alloc must perform a slab lookup in each alloc call, this
>>could be slow.
>>The slab users could store the alive status somewhere in the object. And
>>they could set the flag early, e.g. disable alive as soon as an object
>>is put on the rcu aging list.
>
>The "i_am_alive" flag purpose at the moment is to avoid interpreting
>uninitialized data (in the dentry cache, the reference counter is bogus
>in such case). It was just a quick hack to watch it work, it seemed to
>me it could be done within SLAB code.
>
>This information ("liveness" of objects) is managed inside the SLAB
>generic code, and it seems to be available already through the
>kmembufctl array which is part of the management data, right?
>

Not really. The array is only updated when the free status reaches the
slab structure, which is quite late.

kmem_cache_free
- puts the object into a per-cpu array. No locking at all, each cpu can
  only read its own array.
- when that array is full, then it's put into a global array (->shared).
- when the global array is full, then the object is marked as free in
  the slab structure.
- when all objects from a slab are free, then the slab is placed on the
  free slab list.
- when there is memory pressure, then the pages from the free slab list
  are reclaimed.
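
As a rough illustration of the tiers listed above, here is a toy
user-space model in C. It is only a sketch: the struct, the two limits
and the function name are made up for this example and are not the
mm/slab.c code, and unlike the real allocator it moves whole arrays
instead of batches.

```c
#include <stddef.h>

#define CPU_ARRAY_LIMIT 4   /* hypothetical per-cpu array capacity */
#define SHARED_LIMIT    8   /* hypothetical ->shared array capacity */

/* Toy counters only; no real objects are tracked. */
struct toy_cache {
    size_t cpu_avail;     /* objects parked in this cpu's array */
    size_t shared_avail;  /* objects parked in the shared array */
    size_t slab_free;     /* objects marked free in the slab structure */
};

/*
 * Free one object. Returns the number of objects that the slab
 * structure (the bufctl level) knows to be free after this call.
 */
static size_t toy_cache_free(struct toy_cache *c)
{
    if (c->cpu_avail == CPU_ARRAY_LIMIT) {
        size_t batch = c->cpu_avail;

        /* Shared array cannot take the batch: push it down first. */
        if (c->shared_avail + batch > SHARED_LIMIT) {
            c->slab_free += c->shared_avail;
            c->shared_avail = 0;
        }
        /* Per-cpu array is full: move its contents to ->shared. */
        c->shared_avail += batch;
        c->cpu_avail = 0;
    }
    /* Fast path: the object only lands in the per-cpu array. */
    c->cpu_avail++;
    return c->slab_free;
}
```

With these made-up limits, the slab structure sees no free object at
all during the first twelve frees, which is the "quite late" part.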

>Suppose there's no need for the cache specific functions to be aware of
>liveness, i.e. it's SLAB-specific information.
>

What about RCU? We have dying objects: Still alive, because someone
might have a pointer to it, but already on the rcu list and will be
released after the next quiescent state. slab can't know that.
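
A minimal sketch of what "the slab user stores the state in the object"
could look like, assuming a three-state flag. All names here
(toy_dentry, TOY_DYING, toy_queue_for_rcu and so on) are hypothetical
and only illustrate the idea; the real code would hand the object to
call_rcu() where the comment says so.

```c
/* Hypothetical per-object state, owned by the cache user, not by slab. */
enum toy_obj_state { TOY_FREE, TOY_ALIVE, TOY_DYING };

struct toy_dentry {
    enum toy_obj_state state;
    int refcount;
};

/* Called when the object is put on the rcu list (call_rcu() in real code). */
static void toy_queue_for_rcu(struct toy_dentry *d)
{
    /* Readers may still hold a pointer, but reclaim must skip it now. */
    d->state = TOY_DYING;
}

/* Called after the quiescent state, just before kmem_cache_free(). */
static void toy_rcu_callback(struct toy_dentry *d)
{
    d->state = TOY_FREE;
}

/* Only fully alive, unreferenced objects are candidates for reclaim. */
static int toy_is_killable(const struct toy_dentry *d)
{
    return d->state == TOY_ALIVE && d->refcount == 0;
}
```

The point of the sketch is that the DYING state exists only in the
object itself; nothing in the slab management data reflects it.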

>Another issue is synchronization between multiple threads in this
>level of the reclaim path. Can be dealt with PageLock: if the bit is set,
>don't bother checking the page, someone else is already doing
>so.
>
>You mention
>
>>- lock_cache(): No more alive/dead changes
>
>With the PageLock bit, you can instruct kmem_cache_alloc() to skip partial
>but Locked pages (thus avoiding any object allocations within that page).
>Hum, what about higher order SLABs?
>

You have misunderstood my question: I was thinking about object
dead/alive changes.
There are two questions: First figure out how many objects from a
certain slab are alive. Then, if it's below a threshold, try to free
them. With this approach, you need lock(), is_objp_alive(), release_objp().
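
The two steps can be sketched as follows. This is a user-space sketch
under stated assumptions, not a proposed patch: struct slab_reclaim_ops
is the name Marcelo suggested, but the member signatures, the unlock()
hook, slab_try_reclaim() and the toy_* callbacks are invented here for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cache hooks; signatures are assumptions. */
struct slab_reclaim_ops {
    void (*lock)(void *cache_priv);    /* no more alive/dead changes */
    void (*unlock)(void *cache_priv);
    bool (*is_objp_alive)(void *cache_priv, void *objp);
    bool (*release_objp)(void *cache_priv, void *objp);
};

/*
 * Step 1: count the alive objects of one slab.
 * Step 2: if the count is below the threshold, try to release them.
 * Returns the number of objects that were actually released.
 */
static size_t slab_try_reclaim(const struct slab_reclaim_ops *ops,
                               void *cache_priv, void **objs,
                               size_t nr_objs, size_t threshold)
{
    size_t alive = 0, released = 0, i;

    ops->lock(cache_priv);
    for (i = 0; i < nr_objs; i++)
        if (ops->is_objp_alive(cache_priv, objs[i]))
            alive++;
    if (alive < threshold)
        for (i = 0; i < nr_objs; i++)
            if (ops->is_objp_alive(cache_priv, objs[i]) &&
                ops->release_objp(cache_priv, objs[i]))
                released++;
    ops->unlock(cache_priv);
    return released;
}

/* Toy cache used only to exercise the sketch. */
struct toy_obj { bool alive; int refcount; };

static void toy_lock(void *priv)   { *(int *)priv += 1; }
static void toy_unlock(void *priv) { *(int *)priv -= 1; }

static bool toy_is_alive(void *priv, void *objp)
{
    (void)priv;
    return ((struct toy_obj *)objp)->alive;
}

static bool toy_release(void *priv, void *objp)
{
    struct toy_obj *o = objp;

    (void)priv;
    if (o->refcount > 0)
        return false;       /* still referenced, cannot be killed */
    o->alive = false;
    return true;
}

static const struct slab_reclaim_ops toy_ops = {
    .lock = toy_lock,
    .unlock = toy_unlock,
    .is_objp_alive = toy_is_alive,
    .release_objp = toy_release,
};
```

Note that all of this sits outside kmem_cache_alloc/kmem_cache_free:
the caller keeps the alive flag up to date, and the generic code only
calls the hooks from the reclaim path.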

>Well, kmem_cache_alloc() can be a little bit smarter at this point, since
>it's already a slow path, no? It's refill time, per-CPU cache is exhausted...
>

Definitely. Fast path is only kmem_cache_alloc and kmem_cache_free. No
global cache line writes in these functions. They were down to 1
conditional branch and 2-3 cachelines: one of them read-only, the
other(s) read/write, but per-cpu. I'm not sure how much changed with
the NUMA patches, but the non-numa case should try to remain simple. And
e.g. looking up the bufctl means an integer division. Just that
instruction could nearly double the runtime of kmem_cache_free().
The shared_array parts of cache_flusharray and cache_alloc_refill are
partially fast path: If we slow that down, then it will affect packet
routing. The rest is slow path.
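
To show where the division comes from: finding the bufctl entry of an
object means turning its address into an index within the slab. A
minimal sketch, assuming a slab that records the address of its first
object and a fixed object size (toy_slab and toy_obj_index are
illustrative names, only loosely modeled on the slab code):

```c
#include <stddef.h>

/* Illustrative stand-in for the per-slab management data. */
struct toy_slab {
    char *s_mem;      /* address of the first object in the slab */
    size_t objsize;   /* size of one object, not a compile-time constant */
};

/* Index of objp within its slab, i.e. the slot in the bufctl array. */
static size_t toy_obj_index(const struct toy_slab *slabp, const void *objp)
{
    /* objsize is a runtime value, so this is a real divide instruction. */
    return (size_t)((const char *)objp - slabp->s_mem) / slabp->objsize;
}
```

Because the object size differs per cache and is only known at run
time, the compiler cannot replace the division with a shift or a
multiply, which is why it does not belong in kmem_cache_free().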
--
Manfred