From: Kees Cook <keescook@chromium.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Vegard Nossum <vegard.nossum@oracle.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Robert Moore <robert.moore@intel.com>,
Erik Kaneda <erik.kaneda@intel.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Christoph Lameter <cl@linux.com>,
Andrew Morton <akpm@linux-foundation.org>,
Marco Elver <elver@google.com>, Waiman Long <longman@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
Len Brown <lenb@kernel.org>, Steven Rostedt <rostedt@goodmis.org>,
Roman Gushchin <guro@fb.com>
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018
Date: Fri, 5 Jun 2020 11:46:36 -0700 [thread overview]
Message-ID: <202006051053.A61A42374C@keescook> (raw)
In-Reply-To: <92d994be-e4f5-b186-4ad7-21828de44967@suse.cz>
On Fri, Jun 05, 2020 at 06:55:27PM +0200, Vlastimil Babka wrote:
>
> On 6/5/20 5:44 PM, Kees Cook wrote:
> > On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
> >> On 2020-06-05 16:08, Vlastimil Babka wrote:
> >> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> >> > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <vegard.nossum@oracle.com> wrote:
> >> > > >
> >> > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> >> > > > >
> >> > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> >> > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even more
> >> > > > > > specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> >> > > > > > wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> >> > > > > > further in history. So it's not new at all, and likely very specific to
> >> > > > > > your config+QEMU? (and related to the ACPI error messages that precede
> >> > > > > > it?).
> >> > [...]
> >> > [ 0.140408] ------------[ cut here ]------------
> >> > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> >> > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> >
> > Ah yes! Good. I had improved this check recently too, and I was worried
> > the freelist pointer patch was somehow blocking it, but I see now that
> > the failing config didn't have CONFIG_SLAB_FREELIST_HARDENED=y. Once
> > SLAB_CONSISTENCY_CHECKS was enabled ("slub_debug=F"), it started
> > tripping. Whew.
> >
> > I wonder if that entire test block should just be removed from
> > cache_from_obj():
> >
> > if (!memcg_kmem_enabled() &&
> > !IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
> > !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
> > return s;
> >
> > and make this test unconditional? It's mostly only called during free(),
> > and shouldn't be too expensive to be made unconditional. Hmm.
>
> Hmm I have a different idea. The whole cache_from_obj() was added because of
> kmemcg (commit b9ce5ef49f00d) where per-memcg cache can be different from the
> root one. And I just realized this usecase can go away with Roman's series [1].
> But cache_from_obj() also kept the original SLUB consistency check case, and you
> added the freelist hardening case. If kmemcg use case went away it would be nice
> to avoid the virt_to_cache() and check completely again, unless in debugging or
> hardened kernel.
Is it that expensive? (I'm fine with it staying behind debug/hardening,
but if we can make it on by default, that'd be safer.)
> Furthermore, the original SLUB debugging case was an unconditional pr_err() plus
> WARN_ON_ONCE(1), which was kept by commit b9ce5ef49f00d. With freelist
> hardening this all changed to WARN_ONCE. So the second and later cases are not
> reported at all for hardening and also not for explicitly enabled debugging like
> in this case, which is IMHO not ideal.
Oh, I have no problem with WARN vs WARN_ONCE -- there's no reason to
split this. And I'd love the hardening side to gain the tracking call
too, if it's available.
I had just used WARN_ONCE() since sometimes it can be very noisy to keep
warning for some condition that might not be correctable.
> So I propose the following - the freelist hardening case keeps the WARN_ONCE,
> but also a one-line pr_err() for each case so they are not silent. The SLUB
> debugging case is always a full warning, and printing the tracking info if
> enabled and available. Pure kmemcg case does virt_to_cache() for now (until
> hopefully removed by Roman's series) but no checking at all. Would that work for
> everyone?
> [...]
> @@ -520,9 +528,18 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
> return s;
>
> cachep = virt_to_cache(x);
> - WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
> - "%s: Wrong slab cache. %s but object is from %s\n",
> - __func__, s->name, cachep->name);
> + if (unlikely(s->flags & SLAB_CONSISTENCY_CHECKS)) {
> + if (WARN(cachep && !slab_equal_or_root(cachep, s),
> + "%s: Wrong slab cache. %s but object is from %s\n",
> + __func__, s->name, cachep->name))
> + slab_print_tracking(cachep, x);
> + } else if (IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED)) {
> + if (unlikely(cachep && !slab_equal_or_root(cachep, s))) {
> + pr_err("%s: Wrong slab cache. %s but object is from %s\n",
> + __func__, s->name, cachep->name);
> + WARN_ON_ONCE(1);
> + }
> + }
How about just this (in addition to your slab_print_tracking() refactor):
diff --git a/mm/slab.h b/mm/slab.h
index 207c83ef6e06..107b7f6db3c3 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -520,9 +520,10 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
return s;
cachep = virt_to_cache(x);
- WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
+ if (WARN(cachep && !slab_equal_or_root(cachep, s),
"%s: Wrong slab cache. %s but object is from %s\n",
- __func__, s->name, cachep->name);
+ __func__, s->name, cachep->name))
+ slab_print_tracking(cachep, x);
return cachep;
}
--
Kees Cook
next prev parent reply other threads:[~2020-06-05 18:46 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-04 17:14 Vegard Nossum
2020-06-04 17:18 ` Vlastimil Babka
2020-06-04 17:20 ` Vegard Nossum
2020-06-04 17:51 ` Kees Cook
2020-06-04 17:57 ` Kees Cook
2020-06-04 18:46 ` Vlastimil Babka
2020-06-05 9:11 ` Vlastimil Babka
2020-06-05 9:36 ` Vegard Nossum
2020-06-05 12:47 ` Vegard Nossum
2020-06-05 13:12 ` Rafael J. Wysocki
2020-06-05 14:08 ` Vlastimil Babka
2020-06-05 14:24 ` Rafael J. Wysocki
2020-06-05 14:44 ` Vegard Nossum
2020-06-05 15:44 ` Kees Cook
2020-06-05 16:37 ` Vegard Nossum
2020-06-05 17:51 ` Kees Cook
2020-06-05 16:55 ` Vlastimil Babka
2020-06-05 18:46 ` Kees Cook [this message]
2020-06-08 10:51 ` Vlastimil Babka
2020-06-06 6:46 ` Rafael J. Wysocki
2020-06-05 21:45 ` Kaneda, Erik
2020-06-11 1:40 ` Kaneda, Erik
2020-06-11 10:54 ` Vlastimil Babka
2020-06-12 12:26 ` Rafael J. Wysocki
2021-03-23 18:32 ` Kirill A. Shutemov
2021-03-23 18:58 ` Vegard Nossum
2021-03-23 19:03 ` Rafael J. Wysocki
2021-03-23 21:54 ` Kaneda, Erik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=202006051053.A61A42374C@keescook \
--to=keescook@chromium.org \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=elver@google.com \
--cc=erik.kaneda@intel.com \
--cc=guro@fb.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=longman@redhat.com \
--cc=rafael.j.wysocki@intel.com \
--cc=rafael@kernel.org \
--cc=robert.moore@intel.com \
--cc=rostedt@goodmis.org \
--cc=vbabka@suse.cz \
--cc=vegard.nossum@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox