linux-mm.kvack.org archive mirror
From: Hao Li <hao.li@linux.dev>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Alan Stern <stern@rowland.harvard.edu>,
	linux-mm@kvack.org,  Dmitry Vyukov <dvyukov@google.com>,
	lkmm@lists.linux.dev, linux-arch@vger.kernel.org,
	 linux-kernel@vger.kernel.org,
	Joel Fernandes <joelagnelf@nvidia.com>,
	 Daniel Lustig <dlustig@nvidia.com>,
	Akira Yokosawa <akiyks@gmail.com>,
	 "Paul E. McKenney" <paulmck@kernel.org>,
	Luc Maranget <luc.maranget@inria.fr>,
	 Jade Alglave <j.alglave@ucl.ac.uk>,
	David Howells <dhowells@redhat.com>,
	 Nicholas Piggin <npiggin@gmail.com>,
	Boqun Feng <boqun@kernel.org>,
	 Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>,
	 Andrea Parri <parri.andrea@gmail.com>,
	Pedro Falcato <pfalcato@suse.de>,
	 Vlastimil Babka <vbabka@suse.cz>,
	Christoph Lameter <cl@gentwo.org>,
	 David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	 Shakeel Butt <shakeel.butt@linux.dev>,
	Venkat Rao Bagalkote <venkat88@linux.ibm.com>,
	 Mateusz Guzik <mjguzik@gmail.com>,
	Suren Baghdasaryan <surenb@google.com>,
	 Marco Elver <elver@google.com>
Subject: Re: [BUG] Memory ordering between kmalloc() and kfree()? it's confusing!
Date: Fri, 27 Feb 2026 16:06:37 +0800
Message-ID: <yf7gon3s3efwiwpsdytujnhohnxjc3fgu7oabmia4tbhwqgcs7@rzb7sfy2wrbu>
In-Reply-To: <aaByMLSAIJM8HdbO@hyeyoo>

On Fri, Feb 27, 2026 at 01:17:52AM +0900, Harry Yoo wrote:
> On Thu, Feb 26, 2026 at 10:45:55AM -0500, Alan Stern wrote:
> > On Thu, Feb 26, 2026 at 03:35:08PM +0900, Harry Yoo wrote:
> > > Hello, SLAB, LKMM, and KCSAN folks!
> > > 
> > > I'd like to discuss slab's assumption on users regarding memory ordering.
> > > 
> > > Recently, I've been investigating an interesting slab memory ordering
> > > issue [3] [4] in v7.0-rc1, which made me think about memory ordering
> > > for slab objects.
> > > 
> > > But without answering "What does slab expect users to do for correct
> > > operation?", I kept getting puzzled, and my brain hurt too much :/
> > > I'm writing things down to stop getting confused :)
> > > 
> > > Since I have never thought about this before, my reasoning could be
> > > partially or entirely incorrect. If so, please kindly let me know.
> > > 
> > > # Slab's assumption: Stores to object, its metadata, or struct slab
> > > # must be visible to the CPU that frees the object, when it is
> > > # passed to kfree(). It's users' responsibility to guarantee that.
> > > 
> > > When the slab allocator allocates an object, it updates its metadata and
> > > struct slab fields. After allocation, the user of slab updates object's
> > > content. As long as the object is freed on the same CPU that it was
> > > allocated, kfree() can see those stores (A CPU must be able to see
> > > what's in its store buffer), so no problem!
> > > 
> > > However, when e.g.) the pointer to object is stored in a shared variable
> > > and then freed on a different CPU, things become trickier.
> > > 
> > > In this case, I think it's fair for the slab allocator to assume that:
> > > 
> > >   1) Such stores must involve _at least_ a release barrier
> > >      (for example, via {cmp,}xchg{,_release}, or smp_store_release())
> > >      to ensure preceding stores are visible to other CPUs before
> > >      the pointer store becomes visible, and
> > > 
> > >   2) The CPU that frees an object must invoke at least an acquire
> > >      barrier to ensure that stores to object content / metadata, etc.,
> > >      are visible to the freeing CPU when it calls kfree().
> > > 
> > > Because the slab allocator itself doesn't guarantee that such
> > > barriers are invoked within the allocator, it relies on users to
> > > do this when needed.
> > 
> > It doesn't?  Then how does the slab allocator guarantee that two 
> > different CPUs won't try to perform allocations or deallocations from 
> > the same slab at the same time, messing everything up?
> 
> Ah, alloc/free slowpaths do use cmpxchg128 or spinlock and
> don't mess things up.
> 
> But fastpath allocs/frees are served from percpu array that is protected
> by a local_lock. local_lock has a compiler barrier in it, but that's
> not enough.

Hmm, this memory-ordering issue is indeed pretty mind-bending. I'd like to
share a few thoughts as well, and I'm happy to be corrected!

For the problem at hand, I think the key is the relative ordering between
two variables, stride and obj_exts. To address it, the writer must assign
stride before obj_exts, and the reader must be guaranteed that if it
observes the latest value of obj_exts, it also observes the latest value of
stride.

If this understanding is correct, then a memory barrier that the slab API
caller inserts between alloc and free (or a spinlock, or any other statement
with an equivalent memory-barrier effect) only ensures that the writes to
the pair {stride, obj_exts} as a whole happen before the reads of
{stride, obj_exts} as a whole. It still cannot guarantee the ordering
between the two variables themselves, stride and obj_exts.
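
To make the writer/reader requirement above concrete, here is a minimal
userspace sketch using C11 atomics. It is only a model, not the actual slab
code: struct slab_model, model_publish() and model_lookup() are hypothetical
names, the field types are guesses, and the release/acquire pair is just one
way to get the ordering (loosely analogous to smp_store_release() and
smp_load_acquire() in the kernel). It assumes a single writer, that 0 means
"obj_exts not set yet", and that stride is not modified after publication.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields discussed; not the real struct slab. */
struct slab_model {
	unsigned int stride;             /* plain field, written first */
	_Atomic(unsigned long) obj_exts; /* published last; 0 means "not set yet" */
};

/*
 * Writer side: store stride, then publish obj_exts with release semantics,
 * so the stride store cannot become visible after the obj_exts store.
 */
static void model_publish(struct slab_model *s, unsigned int stride,
			  unsigned long exts)
{
	assert(exts != 0); /* 0 is reserved for "not published" in this model */
	s->stride = stride;
	atomic_store_explicit(&s->obj_exts, exts, memory_order_release);
}

/*
 * Reader side: load obj_exts with acquire semantics. If the published value
 * is observed, the earlier store to stride is guaranteed to be visible too.
 * Returns 1 and fills *exts and *stride on success, 0 if nothing is published.
 */
static int model_lookup(struct slab_model *s, unsigned long *exts,
			unsigned int *stride)
{
	unsigned long v = atomic_load_explicit(&s->obj_exts,
					       memory_order_acquire);

	if (!v)
		return 0;

	*exts = v;
	*stride = s->stride; /* ordered after the acquire load above */
	return 1;
}
```

The point of the sketch is that the ordering lives inside the code that
touches the two fields. A barrier placed by the caller around alloc/free is
outside both functions and so does not order the two accesses against each
other.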

-- 
Thanks,
Hao


Thread overview: 10+ messages
2026-02-26  6:35 Harry Yoo
2026-02-26 15:45 ` Alan Stern
2026-02-26 16:17   ` Harry Yoo
2026-02-26 16:42     ` Alan Stern
2026-02-26 17:11       ` Harry Yoo
2026-02-26 18:06         ` Alan Stern
2026-02-26 17:59     ` Christoph Lameter (Ampere)
2026-02-27  8:06     ` Hao Li [this message]
2026-02-27  9:03       ` Harry Yoo
2026-02-27  9:14 ` Akira Yokosawa