From: Suren Baghdasaryan
Date: Fri, 11 Apr 2025 10:45:06 -0700
Subject: Re: [PATCH 1/1] slab: ensure slab->obj_exts is clear in a newly allocated slab page
To: Vlastimil Babka
Cc: akpm@linux-foundation.org, kent.overstreet@linux.dev, roman.gushchin@linux.dev, cl@linux.com, rientjes@google.com, harry.yoo@oracle.com, souravpanda@google.com, pasha.tatashin@soleen.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
References: <20250411155737.1360746-1-surenb@google.com>

On Fri, Apr 11, 2025 at 9:59 AM Suren Baghdasaryan wrote:
>
> On Fri, Apr 11, 2025 at 9:27 AM Vlastimil Babka wrote:
> >
> > On 4/11/25 17:57, Suren Baghdasaryan wrote:
> > > ktest recently reported crashes while running several buffered io tests
> > > with __alloc_tagging_slab_alloc_hook() at the top of the crash call stack.
> > > The signature indicates an invalid address dereference with low bits of
> > > slab->obj_exts being set.
> > > The bits were outside of the range used by
> > > page_memcg_data_flags and objext_flags and hence were not masked out
> > > by slab_obj_exts() when obtaining the pointer stored in slab->obj_exts.
> > > The typical crash log looks like this:
> > >
> > > 00510 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> > > 00510 Mem abort info:
> > > 00510 ESR = 0x0000000096000045
> > > 00510 EC = 0x25: DABT (current EL), IL = 32 bits
> > > 00510 SET = 0, FnV = 0
> > > 00510 EA = 0, S1PTW = 0
> > > 00510 FSC = 0x05: level 1 translation fault
> > > 00510 Data abort info:
> > > 00510 ISV = 0, ISS = 0x00000045, ISS2 = 0x00000000
> > > 00510 CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> > > 00510 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > > 00510 user pgtable: 4k pages, 39-bit VAs, pgdp=0000000104175000
> > > 00510 [0000000000000010] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
> > > 00510 Internal error: Oops: 0000000096000045 [#1] SMP
> > > 00510 Modules linked in:
> > > 00510 CPU: 10 UID: 0 PID: 7692 Comm: cat Not tainted 6.15.0-rc1-ktest-g189e17946605 #19327 NONE
> > > 00510 Hardware name: linux,dummy-virt (DT)
> > > 00510 pstate: 20001005 (nzCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
> > > 00510 pc : __alloc_tagging_slab_alloc_hook+0xe0/0x190
> > > 00510 lr : __kmalloc_noprof+0x150/0x310
> > > 00510 sp : ffffff80c87df6c0
> > > 00510 x29: ffffff80c87df6c0 x28: 000000000013d1ff x27: 000000000013d200
> > > 00510 x26: ffffff80c87df9e0 x25: 0000000000000000 x24: 0000000000000001
> > > 00510 x23: ffffffc08041953c x22: 000000000000004c x21: ffffff80c0002180
> > > 00510 x20: fffffffec3120840 x19: ffffff80c4821000 x18: 0000000000000000
> > > 00510 x17: fffffffec3d02f00 x16: fffffffec3d02e00 x15: fffffffec3d00700
> > > 00510 x14: fffffffec3d00600 x13: 0000000000000200 x12: 0000000000000006
> > > 00510 x11: ffffffc080bb86c0 x10: 0000000000000000 x9 : ffffffc080201e58
> > > 00510 x8 : ffffff80c4821060 x7 : 0000000000000000 x6 : 0000000055555556
> > > 00510 x5 : 0000000000000001 x4 : 0000000000000010 x3 : 0000000000000060
> > > 00510 x2 : 0000000000000000 x1 : ffffffc080f50cf8 x0 : ffffff80d801d000
> > > 00510 Call trace:
> > > 00510 __alloc_tagging_slab_alloc_hook+0xe0/0x190 (P)
> > > 00510 __kmalloc_noprof+0x150/0x310
> > > 00510 __bch2_folio_create+0x5c/0xf8
> > > 00510 bch2_folio_create+0x2c/0x40
> > > 00510 bch2_readahead+0xc0/0x460
> > > 00510 read_pages+0x7c/0x230
> > > 00510 page_cache_ra_order+0x244/0x3a8
> > > 00510 page_cache_async_ra+0x124/0x170
> > > 00510 filemap_readahead.isra.0+0x58/0xa0
> > > 00510 filemap_get_pages+0x454/0x7b0
> > > 00510 filemap_read+0xdc/0x418
> > > 00510 bch2_read_iter+0x100/0x1b0
> > > 00510 vfs_read+0x214/0x300
> > > 00510 ksys_read+0x6c/0x108
> > > 00510 __arm64_sys_read+0x20/0x30
> > > 00510 invoke_syscall.constprop.0+0x54/0xe8
> > > 00510 do_el0_svc+0x44/0xc8
> > > 00510 el0_svc+0x18/0x58
> > > 00510 el0t_64_sync_handler+0x104/0x130
> > > 00510 el0t_64_sync+0x154/0x158
> > > 00510 Code: d5384100 f9401c01 b9401aa3 b40002e1 (f8227881)
> > > 00510 ---[ end trace 0000000000000000 ]---
> > > 00510 Kernel panic - not syncing: Oops: Fatal exception
> > > 00510 SMP: stopping secondary CPUs
> > > 00510 Kernel Offset: disabled
> > > 00510 CPU features: 0x0000,000000e0,00000410,8240500b
> > > 00510 Memory Limit: none
> > >
> > > Investigation indicates that these bits are already set when we allocate
> > > slab page and are not zeroed out after allocation. We are not yet sure
> > > why these crashes start happening only recently but regardless of the
> > > reason, not initializing a field that gets used later is wrong. Fix it
> > > by initializing slab->obj_exts during slab page allocation.
> >
> > slab->obj_exts overlays page->memcg_data and the checks on page alloc and
> > page free should catch any non-zero values, i.e.
> > page_expected_state()
> > page_bad_reason() so if anyone is e.g. UAF-writing there or leaving garbage
> > there while freeing the page it's a bug.
> >
> > Perhaps CONFIG_MEMCG is disabled in the ktests and thus the checks are not
> > happening? We could extend them for CONFIG_SLAB_OBJ_EXT checking
> > _unused_slab_obj_exts perhaps. But it would be a short lived change, see below.
>
> Correct, CONFIG_MEMCG was disabled during these tests. We added
> BUG_ON() in the slab allocation path to trigger on these low bits and
> it did trigger but the same assertion in the freeing path did not
> catch anything. We suspected 4996fc547f5b ("mm: let _folio_nr_pages
> overlay memcg_data in first tail page") to cause this but Kent's
> bisection did not confirm that.
>
> >
> > > Fixes: 21c690a349ba ("mm: introduce slabobj_ext to support slab object extensions")
> > > Reported-by: Kent Overstreet
> > > Tested-by: Kent Overstreet
> > > Signed-off-by: Suren Baghdasaryan
> > > Acked-by: Kent Overstreet
> > > Cc:
> >
> > We'll need this anyway for the not so far future when struct slab is
> > separated from struct page so it's fine but it would still be great to find
> > the underlying buggy code which this is going to hide.
>
> Yeah, we will try to find the culprit. For now to prevent others from
> stepping on this mine I would like to get this in. Thanks!

Kent asked me to forward this (his email is misbehaving for some reason):

Yes, ktest doesn't flip on CONFIG_MEMCG.

Those checks you're talking about are also behind CONFIG_DEBUG_VM, which
isn't normally on. I did do some runs with it on and it didn't fire - only
additional asserts Suren and I added - so something's missing.

In the meantime, this needs to go in quickly as a hotfix because it's a
6.15-rc1 regression, and I've been getting distros to enable memory
allocation profiling and I'd be shocked if it doesn't cause memcg crashes
as well.

>
> >
> > > ---
> > >  mm/slub.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index b46f87662e71..dc9e729e1d26 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -1973,6 +1973,11 @@ static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
> > >  #define OBJCGS_CLEAR_MASK	(__GFP_DMA | __GFP_RECLAIMABLE | \
> > >  				__GFP_ACCOUNT | __GFP_NOFAIL)
> > >
> > > +static inline void init_slab_obj_exts(struct slab *slab)
> > > +{
> > > +	slab->obj_exts = 0;
> > > +}
> > > +
> > >  int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> > >  			gfp_t gfp, bool new_slab)
> > >  {
> > > @@ -2058,6 +2063,10 @@ static inline bool need_slab_obj_ext(void)
> > >
> > >  #else /* CONFIG_SLAB_OBJ_EXT */
> > >
> > > +static inline void init_slab_obj_exts(struct slab *slab)
> > > +{
> > > +}
> > > +
> > >  static int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> > >  			       gfp_t gfp, bool new_slab)
> > >  {
> > > @@ -2637,6 +2646,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> > >  	slab->objects = oo_objects(oo);
> > >  	slab->inuse = 0;
> > >  	slab->frozen = 0;
> > > +	init_slab_obj_exts(slab);
> > >
> > >  	account_slab(slab, oo_order(oo), s, flags);
> > >
> > >
> > > base-commit: c496b37f9061db039b413c03ccd33506175fe6ec
> >
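
For readers following the thread, the failure mode and the fix can be
sketched in user space. This is only an illustration under stated
assumptions, not the kernel code: demo_slab, demo_ext, DEMO_FLAGS_MASK,
demo_slab_obj_exts(), demo_init_slab_obj_exts() and demo_looks_allocated()
are hypothetical names, and the two-bit flag mask is an assumed layout (the
real flag bits are defined by page_memcg_data_flags and objext_flags). The
point it shows: a stale bit outside the flag mask survives masking and is
treated as a pointer, while zeroing the field at slab allocation makes the
lookup return NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: only the two lowest bits carry flags (not the kernel's real layout). */
#define DEMO_FLAGS_MASK ((uintptr_t)0x3)

/* Hypothetical stand-in for a per-object extension entry. */
struct demo_ext {
	int tag;
};

/* Hypothetical stand-in for struct slab: pointer and flag bits share one word. */
struct demo_slab {
	uintptr_t obj_exts;
};

/* Strip the known flag bits and interpret the rest as a pointer. */
static struct demo_ext *demo_slab_obj_exts(const struct demo_slab *slab)
{
	return (struct demo_ext *)(slab->obj_exts & ~DEMO_FLAGS_MASK);
}

/* Mirrors the idea of the patch: clear the field when the slab is set up. */
static void demo_init_slab_obj_exts(struct demo_slab *slab)
{
	slab->obj_exts = 0;
}

/* True when the masked value is non-NULL, i.e. it would be dereferenced. */
static bool demo_looks_allocated(const struct demo_slab *slab)
{
	return demo_slab_obj_exts(slab) != NULL;
}
```

With a stale value such as 0x10 (chosen here only because it lies outside
the assumed mask), demo_looks_allocated() reports true before
demo_init_slab_obj_exts() is called and false afterwards.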