From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Fri, 10 Aug 2007 00:40:59 -0700 From: Andrew Morton Subject: Re: SLUB: Fix dynamic dma kmalloc cache creation Message-Id: <20070810004059.8aa2aadb.akpm@linux-foundation.org> In-Reply-To: <200708100559.l7A5x3r2019930@hera.kernel.org> References: <200708100559.l7A5x3r2019930@hera.kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Christoph Lameter Cc: linux-mm@kvack.org List-ID: On Fri, 10 Aug 2007 05:59:03 GMT Linux Kernel Mailing List wrote: > Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1ceef40249f21eceabf8633934d94962e7d8e1d7 > Commit: 1ceef40249f21eceabf8633934d94962e7d8e1d7 > Parent: fcda3d89bf1366f6801447eab2d8a75ac5b9c4ce > Author: Christoph Lameter > AuthorDate: Tue Aug 7 15:11:48 2007 -0700 > Committer: Christoph Lameter > CommitDate: Thu Aug 9 21:57:16 2007 -0700 > > SLUB: Fix dynamic dma kmalloc cache creation > > The dynamic dma kmalloc creation can run into trouble if a > GFP_ATOMIC allocation is the first one performed for a certain size > of dma kmalloc slab. > > - Move the adding of the slab to sysfs into a workqueue > (sysfs does GFP_KERNEL allocations) > - Do not call kmem_cache_destroy() (uses slub_lock) > - Only acquire the slub_lock once and--if we cannot wait--do a trylock. > > This introduces a slight risk of the first kmalloc(x, GFP_DMA|GFP_ATOMIC) > for a range of sizes failing due to another process holding the slub_lock. > However, we only need to acquire the spinlock once in order to establish > each power of two DMA kmalloc cache. The possible conflict is with the > slub_lock taken during slab management actions (create / remove slab cache). > > It is rather typical that a driver will first fill its buffers using > GFP_KERNEL allocations which will wait until the slub_lock can be acquired. > Drivers will also create its slab caches first outside of an atomic > context before starting to use atomic kmalloc from an interrupt context. > > If there are any failures then they will occur early after boot or when > loading of multiple drivers concurrently. Drivers can already accomodate > failures of GFP_ATOMIC for other reasons. Retries will then create the slab. > Well that was fairly foul. What was wrong wih turning slub_lock into a spinlock? > static noinline struct kmem_cache *dma_kmalloc_cache(int index, gfp_t flags) > { > struct kmem_cache *s; > - struct kmem_cache *x; > char *text; > size_t realsize; > > @@ -2289,22 +2306,36 @@ static noinline struct kmem_cache *dma_kmalloc_cache(int index, gfp_t flags) > return s; > > /* Dynamically create dma cache */ > - x = kmalloc(kmem_size, flags & ~SLUB_DMA); > - if (!x) > - panic("Unable to allocate memory for dma cache\n"); > + if (flags & __GFP_WAIT) > + down_write(&slub_lock); > + else { > + if (!down_write_trylock(&slub_lock)) > + goto out; > + } > + > + if (kmalloc_caches_dma[index]) > + goto unlock_out; > > realsize = kmalloc_caches[index].objsize; > - text = kasprintf(flags & ~SLUB_DMA, "kmalloc_dma-%d", > - (unsigned int)realsize); > - s = create_kmalloc_cache(x, text, realsize, flags); > - down_write(&slub_lock); > - if (!kmalloc_caches_dma[index]) { > - kmalloc_caches_dma[index] = s; > - up_write(&slub_lock); > - return s; > + text = kasprintf(flags & ~SLUB_DMA, "kmalloc_dma-%d", (unsigned int)realsize), > + s = kmalloc(kmem_size, flags & ~SLUB_DMA); > + > + if (!s || !text || !kmem_cache_open(s, flags, text, > + realsize, ARCH_KMALLOC_MINALIGN, > + SLAB_CACHE_DMA|__SYSFS_ADD_DEFERRED, NULL)) { > + kfree(s); > + kfree(text); > + goto unlock_out; > } > + > + list_add(&s->list, &slab_caches); > + kmalloc_caches_dma[index] = s; > + > + schedule_work(&sysfs_add_work); sysfs_add_work could be already pending, or running. boom. > +unlock_out: > up_write(&slub_lock); > - kmem_cache_destroy(s); > +out: > return kmalloc_caches_dma[index]; > } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org