Re: [PATCH] mm/slub: add missing TID updates on slab deactivation

linux-mm.kvack.org archive mirror

From: Jann Horn <jannh@google.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	 David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/slub: add missing TID updates on slab deactivation
Date: Tue, 14 Jun 2022 17:54:24 +0200
Message-ID: <CAG48ez2grcFO9FUApstmmr6QcVxbx=68MDXRtP_hQqAmYWn17w@mail.gmail.com>
In-Reply-To: <95a9f679-93d9-548a-fc26-985ec605e7f8@suse.cz>

On Tue, Jun 14, 2022 at 10:23 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> On 6/8/22 20:22, Jann Horn wrote:
> > The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
> > the TID stays the same. However, two places in __slab_alloc() currently
> > don't update the TID when deactivating the CPU slab.
> >
> > If multiple operations race the right way, this could lead to an object
> > getting lost; or, in an even more unlikely situation, it could even lead to
> > an object being freed onto the wrong slab's freelist, messing up the
> > `inuse` counter and eventually causing a page to be freed to the page
> > allocator while it still contains slab objects.
[...]
> > Fixes: c17dda40a6a4e ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
> > Fixes: 03e404af26dc2 ("slub: fast release on full slab")
> > Cc: stable@vger.kernel.org
>
> Hmm these are old commits, and currently oldest LTS is 4.9, so this will be
> fun. Worth doublechecking if it's not recent changes that actually
> introduced the bug... but seems not, AFAICS.
[...]
> > @@ -2936,6 +2936,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> >
> >       if (!freelist) {
> >               c->slab = NULL;
> > +             c->tid = next_tid(c->tid);
> >               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
>
> So this immediate unlock after setting NULL is new from the 5.15 preempt-rt
> changes. However even in older versions we could goto new_slab,
> new_slab_objects(), new_slab(), allocate_slab(), where if
> (gfpflags_allow_blocking()) local_irq_enable(); (there's no extra disabled
> preemption besides the irq disable) so I'd say the bug was possible before
> too, but less often?

Yeah, I think so too.

> >               stat(s, DEACTIVATE_BYPASS);
> >               goto new_slab;
> > @@ -2968,6 +2969,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> >       freelist = c->freelist;
> >       c->slab = NULL;
> >       c->freelist = NULL;
>
> Previously these were part of deactivate_slab(), which does that at the very
> end, but also without bumping tid.
> I just wonder if it's necessary too, because IIUC the scenario you described
> relies on the missing bump above. This alone doesn't cause the c->slab vs
> c->freelist mismatch?

It's a different scenario, but at least in the current version, the
ALLOC_NODE_MISMATCH case jumps straight to the deactivate_slab label,
which takes the local_lock, grabs the old c->freelist, NULLs out
->slab and ->freelist, then drops the local_lock again. If the
c->freelist was non-NULL, then this will prevent concurrent cmpxchg
success; but there is no reason why c->freelist has to be non-NULL
here. So if c->freelist is already NULL, we basically just take the
local_lock, set c->slab to NULL, and drop the local_lock. And IIUC the
local_lock is the only protection we have here against concurrency,
since the slub_get_cpu_ptr() in __slab_alloc() only disables
migration? So again a concurrent fastpath free should be able to set
c->freelist to non-NULL after c->slab has been set to NULL.

So I think this TID bump is also necessary for correctness in the
current version.

And looking back at older kernels, back to at least 4.9, the
ALLOC_NODE_MISMATCH case looks similarly broken - except that again,
as you pointed out, we don't have the fine-grained locking, so it only
becomes racy if we hit new_slab_objects() -> new_slab() ->
allocate_slab() and then either we do local_irq_enable() or the
allocation fails.

> Thanks. Applying to slab/for-5.19-rc3/fixes branch.

Thanks!

Thread overview: 8+ messages
2022-06-08 18:22 Jann Horn
2022-06-09 11:58 ` Christoph Lameter
2022-06-12 22:45 ` David Rientjes
2022-06-13  3:19 ` Muchun Song
2022-06-13 12:49 ` Hyeonggon Yoo
2022-06-14  8:23 ` Vlastimil Babka
2022-06-14 15:54   ` Jann Horn [this message]
2022-06-15  7:18     ` Vlastimil Babka
