From: Eric Dumazet <dada1@cosmosbay.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [RFC] non-refcounted pages, application to slab?
Date: Wed, 25 Jan 2006 12:10:48 +0100 [thread overview]
Message-ID: <43D75CB8.9090101@cosmosbay.com> (raw)
In-Reply-To: <20060125105737.GB30421@wotan.suse.de>
Nick Piggin a ecrit :
> On Wed, Jan 25, 2006 at 11:26:01AM +0100, Eric Dumazet wrote:
>> Nick Piggin a ecrit :
>>> If an allocator knows exactly the lifetime of its page, then there is no
>>> need to do refcounting or the final put_page_zestzero (atomic op + mem
>>> barriers).
>>>
>>> This is probably not worthwhile for most cases, but slab did strike me
>>> as a potential candidate (however the complication here is that some
>>> code I think uses the refcount of underlying pages of slab allocations
>>> eg nommu code). So it is not a complete patch, but I wonder if anyone
>>> thinks the savings might be worth the complexity?
>>>
>>> Is there any particular code that is really heavy on slab allocations?
>>> That isn't mostly handled by the slab's internal freelists?
>> Hi Nick
>>
>> After reading your patch, I have some crazy idea.
>>
>> The atomic op + mem barrier you want to avoid could be avoided more
>> generally just by changing atomic_dec_and_test(atomic_t *v).
>>
>> If the current thread is the last referer (refcnt = 1), then it can safely
>> set the value to 0 because no other CPU can be touching the value (or else
>> there must be a bug somewhere, as the 'other cpu' could touch the value
>> just after us and we could free an object still in use by 'other cpu'
>>
>
> I think that would work for this case, but you change the semantics
> of the function for all users which is bad.
Yes :) I did a test with a patched kernel and I got :
BUG: atomic counter underflow at:
<c0103a3a> show_trace+0x20/0x22 <c0103b5b> dump_stack+0x1e/0x20
<c01d6934> _atomic_dec_and_lock+0x78/0x88 <c0177599> dput+0xbf/0x187
<c016dc96> path_release+0x14/0x30 <c016e540> __link_path_walk+0x36d/0xd5f
<c016ef84> link_path_walk+0x52/0xd6 <c016f2ec> do_path_lookup+0xfc/0x220
<c016f467> __path_lookup_intent_open+0x3e/0x73 <c016f4d1>
path_lookup_open+0x35/0x37
<c016fc79> open_namei+0x83/0x631 <c015f811> do_filp_open+0x38/0x56
<c015fb83> do_sys_open+0x5c/0x99 <c015fbe7> sys_open+0x27/0x29
<c0102bb3> sysenter_past_esp+0x54/0x75
So we cannot change atomic_dec_and_test(atomic_t *v) but introduce a new
function like :
int atomic_dec_refcount(atomic_t *v)
{
#ifdef CONFIG_SMP
/* avoid an atomic op if we are the last user of this refcount */
if (atomic_read(v) == 1) {
atomic_set(v, 0); /* not a real atomic op on most machines */
return 1;
}
#endif
return atomic_dec_and_test(v);
}
The cost of the extra conditional branch is worth, if it can avoid an atomic op.
>
> Such a test could be open coded in __free_page, although that does
> add a branch + some icache, but that might also be an option. (and
> my patch does also add to total icache footprint and is much uglier ;))
>
> Thanks,
> Nick
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-01-25 11:10 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-01-25 9:39 Nick Piggin
2006-01-25 9:54 ` Eric Dumazet
2006-01-25 9:56 ` Nick Piggin
2006-01-25 10:26 ` Eric Dumazet
2006-01-25 10:57 ` Nick Piggin
2006-01-25 11:10 ` Eric Dumazet [this message]
2006-01-25 11:18 ` Nick Piggin
2006-01-25 10:30 ` Pekka Enberg
2006-01-25 11:00 ` Nick Piggin
2006-01-25 11:19 ` Pekka Enberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=43D75CB8.9090101@cosmosbay.com \
--to=dada1@cosmosbay.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=npiggin@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox