From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f44.google.com (mail-qg0-f44.google.com [209.85.192.44]) by kanga.kvack.org (Postfix) with ESMTP id 430C36B0038 for ; Thu, 10 Sep 2015 06:43:04 -0400 (EDT) Received: by qgev79 with SMTP id v79so31144939qge.0 for ; Thu, 10 Sep 2015 03:43:04 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id 87si12730165qkx.83.2015.09.10.03.43.03 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 10 Sep 2015 03:43:03 -0700 (PDT) Date: Thu, 10 Sep 2015 12:42:53 +0200 From: Jesper Dangaard Brouer Subject: Re: Is it OK to pass non-acquired objects to kfree? Message-ID: <20150910124253.6000cc77@redhat.com> In-Reply-To: References: <20150909184415.GJ4029@linux.vnet.ibm.com> <20150909203642.GO4029@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Dmitry Vyukov Cc: brouer@redhat.com, Christoph Lameter , "Paul E. McKenney" , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , "linux-mm@kvack.org" , Andrey Konovalov , Alexander Potapenko , Eric Dumazet On Thu, 10 Sep 2015 11:55:35 +0200 Dmitry Vyukov wrote: > On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter wrote: > > On Wed, 9 Sep 2015, Paul E. McKenney wrote: > > > >> Either way, Dmitry's tool got a hit on real code using the slab > >> allocators. If that hit is a false positive, then clearly Dmitry > >> needs to fix his tool, however, I am not (yet) convinced that it is a > >> false positive. If it is not a false positive, we might well need to > >> articulate the rules for use of the slab allocators. > > > > Could I get a clear definiton as to what exactly is positive? Was this > > using SLAB, SLUB or SLOB? > > > >> > This would all use per cpu data. As soon as a handoff is required within > >> > the allocators locks are being used. So I would say no. > >> > >> As in "no, it is not necessary for the caller of kfree() to invoke barrier() > >> in this example", right? > > > > Actually SLUB contains a barrier already in kfree(). Has to be there > > because of the way the per cpu pointer is being handled. > > The positive was reporting of data races in the following code: > > // kernel/pid.c > if ((atomic_read(&pid->count) == 1) || > atomic_dec_and_test(&pid->count)) { > kmem_cache_free(ns->pid_cachep, pid); > put_pid_ns(ns); > } > > //drivers/tty/tty_buffer.c > while ((next = buf->head->next) != NULL) { > tty_buffer_free(port, buf->head); > buf->head = next; > } > > Namely, the tool reported data races between usage of the object in > other threads before they released the object and kfree. > > I am not sure why we are so concentrated on details like SLAB vs SLUB > vs SLOB or cache coherency protocols. This looks like waste of time to > me. General kernel code should not be safe only when working with SLxB > due to current implementation details of SLxB, it should be safe > according to memory allocator contract. And this contract seem to be: > memory allocator can do arbitrary reads and writes to the object > inside of kmalloc and kfree. > Similarly for memory model. There is officially documented kernel > memory model, which all general kernel code must adhere to. Reasoning > about whether a particular piece of code works on architecture X, or > how exactly it can break on architecture Y in unnecessary in such > context. In the end, there can be memory allocator implementation and > new architectures. > > My question is about contracts, not about current implementation > details or specific architectures. > > There are memory allocator implementations that do reads and writes of > the object, and there are memory allocator implementations that do not > do any barriers on fast paths. From this follows that objects must be > passed in quiescent state to kfree. > Now, kernel memory model says "A load-load control dependency requires > a full read memory barrier". > From this follows that the following code is broken: > > // kernel/pid.c > if ((atomic_read(&pid->count) == 1) || > atomic_dec_and_test(&pid->count)) { > kmem_cache_free(ns->pid_cachep, pid); > put_pid_ns(ns); > } > > and it should be: > > // kernel/pid.c > if ((smp_load_acquire(&pid->count) == 1) || > atomic_dec_and_test(&pid->count)) { > kmem_cache_free(ns->pid_cachep, pid); > put_pid_ns(ns); > } > This reminds me of some code in the network stack[1] in kfree_skb() where we have a smp_rmb(). Should we have used smp_load_acquire() ? void kfree_skb(struct sk_buff *skb) { if (unlikely(!skb)) return; if (likely(atomic_read(&skb->users) == 1)) smp_rmb(); else if (likely(!atomic_dec_and_test(&skb->users))) return; trace_kfree_skb(skb, __builtin_return_address(0)); __kfree_skb(skb); } EXPORT_SYMBOL(kfree_skb); [1] https://github.com/torvalds/linux/blob/v4.2-rc8/net/core/skbuff.c#L690 -- Best regards, Jesper Dangaard Brouer MSc.CS, Sr. Network Kernel Developer at Red Hat Author of http://www.iptv-analyzer.org LinkedIn: http://www.linkedin.com/in/brouer -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org