From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <43D75CB8.9090101@cosmosbay.com>
Date: Wed, 25 Jan 2006 12:10:48 +0100
From: Eric Dumazet
MIME-Version: 1.0
Subject: Re: [RFC] non-refcounted pages, application to slab?
References: <20060125093909.GE32653@wotan.suse.de> <43D75239.90907@cosmosbay.com> <20060125105737.GB30421@wotan.suse.de>
In-Reply-To: <20060125105737.GB30421@wotan.suse.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Nick Piggin
Cc: Linux Kernel Mailing List, Linux Memory Management List
List-ID:

Nick Piggin wrote:
> On Wed, Jan 25, 2006 at 11:26:01AM +0100, Eric Dumazet wrote:
>> Nick Piggin wrote:
>>> If an allocator knows exactly the lifetime of its page, then there is no
>>> need to do refcounting or the final put_page_testzero (atomic op + mem
>>> barriers).
>>>
>>> This is probably not worthwhile for most cases, but slab did strike me
>>> as a potential candidate (however the complication here is that some
>>> code, I think, uses the refcount of underlying pages of slab allocations,
>>> eg nommu code). So it is not a complete patch, but I wonder if anyone
>>> thinks the savings might be worth the complexity?
>>>
>>> Is there any particular code that is really heavy on slab allocations
>>> that isn't mostly handled by the slab's internal freelists?
>>
>> Hi Nick
>>
>> After reading your patch, I have a crazy idea.
>>
>> The atomic op + mem barrier you want to avoid could be avoided more
>> generally just by changing atomic_dec_and_test(atomic_t *v).
>>
>> If the current thread is the last referrer (refcnt = 1), then it can safely
>> set the value to 0, because no other CPU can be touching the value (or else
>> there must be a bug somewhere, as the 'other cpu' could touch the value
>> just after us and we could free an object still in use by 'other cpu').
>
> I think that would work for this case, but you change the semantics
> of the function for all users, which is bad.

Yes :) I did a test with a patched kernel and I got:

BUG: atomic counter underflow at:
 show_trace+0x20/0x22
 dump_stack+0x1e/0x20
 _atomic_dec_and_lock+0x78/0x88
 dput+0xbf/0x187
 path_release+0x14/0x30
 __link_path_walk+0x36d/0xd5f
 link_path_walk+0x52/0xd6
 do_path_lookup+0xfc/0x220
 __path_lookup_intent_open+0x3e/0x73
 path_lookup_open+0x35/0x37
 open_namei+0x83/0x631
 do_filp_open+0x38/0x56
 do_sys_open+0x5c/0x99
 sys_open+0x27/0x29
 sysenter_past_esp+0x54/0x75

So we cannot change atomic_dec_and_test(atomic_t *v) itself, but we can
introduce a new function like:

int atomic_dec_refcount(atomic_t *v)
{
#ifdef CONFIG_SMP
	/* avoid an atomic op if we are the last user of this refcount */
	if (atomic_read(v) == 1) {
		atomic_set(v, 0); /* not a real atomic op on most machines */
		return 1;
	}
#endif
	return atomic_dec_and_test(v);
}

The cost of the extra conditional branch is worth it if it lets us avoid an
atomic op.

> Such a test could be open coded in __free_page, although that does
> add a branch + some icache, but that might also be an option. (and
> my patch does also add to total icache footprint and is much uglier ;))
>
> Thanks,
> Nick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
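
As an illustration of the last-reference fast path discussed above, here is a
minimal self-contained userspace sketch, with C11 atomics standing in for the
kernel's atomic_t; the function and variable names are illustrative only, not
kernel API:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for the proposed atomic_dec_refcount():
 * returns true when this call dropped the count to zero. */
static bool refcount_dec_fastpath(atomic_int *v)
{
	/* Fast path: if we hold the last reference, no other thread may
	 * legally touch the counter, so a plain store is enough. */
	if (atomic_load_explicit(v, memory_order_relaxed) == 1) {
		atomic_store_explicit(v, 0, memory_order_relaxed);
		return true;
	}
	/* Slow path: the contended case still needs the real atomic RMW. */
	return atomic_fetch_sub(v, 1) == 1;
}

int main(void)
{
	atomic_int refs = 1;
	if (refcount_dec_fastpath(&refs))
		printf("last reference dropped, object can be freed\n");
	return 0;
}

A contended counter still pays the full atomic read-modify-write in the slow
path; the saving only applies to the uncontended last-reference case, which is
the case the quoted slab/__free_page discussion is about.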