From: Suren Baghdasaryan <surenb@google.com>
Date: Thu, 29 Aug 2024 10:03:56 -0700
Subject: Re: [PATCH 5/5] slab: Allocate and use per-call-site caches
To: Kees Cook
Cc: Vlastimil Babka, Kent Overstreet, Christoph Lameter, Pekka Enberg,
    David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
    Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org,
    "GONG, Ruiqi", Jann Horn, Matteo Rizzo, jvoisin, Xiu Jianfeng,
    linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org
In-Reply-To: <20240809073309.2134488-5-kees@kernel.org>
References: <20240809072532.work.266-kees@kernel.org> <20240809073309.2134488-5-kees@kernel.org>

On Fri, Aug 9, 2024 at 12:33 AM Kees Cook wrote:
>
> Use separate per-call-site kmem_cache or kmem_buckets. These are
> allocated on demand to avoid wasting memory for unused caches.
>
> A few caches need to be allocated very early to support allocating the
> caches themselves: kstrdup(), kvasprintf(), and pcpu_mem_zalloc(). Any
> GFP_ATOMIC allocations are currently left to be allocated from
> KMALLOC_NORMAL.
>
> With a distro config, /proc/slabinfo grows from ~400 entries to ~2200.
>
> Since this feature (CONFIG_SLAB_PER_SITE) is redundant to
> CONFIG_RANDOM_KMALLOC_CACHES, mark it as incompatible. Add Kconfig help
> text that compares the features.
>
> Improvements needed:
> - Retain call site gfp flags in alloc_tag meta field to:
>   - pre-allocate all GFP_ATOMIC caches (since their caches cannot
>     be allocated on demand unless we want them to be GFP_ATOMIC
>     themselves...)

I'm currently working on a feature to identify allocations with
__GFP_ACCOUNT known at compile time (similar to how you handle the
size in the previous patch). Might be something you can reuse/extend.
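Roughly what I have in mind, mirroring how ALLOC_META_INIT() already
captures the size (untested sketch; the extra _gfp parameter and the
.accounted field are invented for illustration, not part of this
series):

  #define ALLOC_META_INIT(_size, _gfp)  {                                \
          .sized = (__builtin_constant_p(_size) ? (_size) : SIZE_MAX),   \
          .accounted = (__builtin_constant_p(_gfp) &&                    \
                        ((_gfp) & __GFP_ACCOUNT)),                       \
  }

That would record at tag-init time which sites are always accounted,
the same way .sized already records which sites are fixed-size.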
> - Separate MEMCG allocations as well

Do you mean allocations with __GFP_ACCOUNT or something else?

> - Allocate individual caches within kmem_buckets on demand to
>   further reduce memory usage overhead.
>
> Signed-off-by: Kees Cook
> ---
> Cc: Suren Baghdasaryan
> Cc: Kent Overstreet
> Cc: Vlastimil Babka
> Cc: Christoph Lameter
> Cc: Pekka Enberg
> Cc: David Rientjes
> Cc: Joonsoo Kim
> Cc: Andrew Morton
> Cc: Roman Gushchin
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: linux-mm@kvack.org
> ---
>  include/linux/alloc_tag.h |   8 +++
>  lib/alloc_tag.c           | 121 +++++++++++++++++++++++++++++++++++---
>  mm/Kconfig                |  19 +++++-
>  mm/slab_common.c          |   1 +
>  mm/slub.c                 |  31 +++++++++-
>  5 files changed, 170 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
> index f5d8c5849b82..c95628f9b049 100644
> --- a/include/linux/alloc_tag.h
> +++ b/include/linux/alloc_tag.h
> @@ -24,6 +24,7 @@ struct alloc_tag_counters {
>  struct alloc_meta {
>         /* 0 means non-slab, SIZE_MAX means dynamic, and everything else is fixed-size. */
>         size_t sized;
> +       void *cache;

I see now where that meta.cache in the previous patch came from...
That part should be moved here.

>  };
>  #define ALLOC_META_INIT(_size)  {                                      \
>                 .sized = (__builtin_constant_p(_size) ? (_size) : SIZE_MAX), \
> @@ -216,6 +217,13 @@ static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes) {}
>
>  #endif /* CONFIG_MEM_ALLOC_PROFILING */
>
> +#ifdef CONFIG_SLAB_PER_SITE
> +void alloc_tag_early_walk(void);
> +void alloc_tag_site_init(struct codetag *ct, bool ondemand);
> +#else
> +static inline void alloc_tag_early_walk(void) {}
> +#endif
> +
>  #define alloc_hooks_tag(_tag, _do_alloc)                               \
>  ({                                                                     \
>         struct alloc_tag * __maybe_unused _old = alloc_tag_save(_tag);  \
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index 6d2cb72bf269..e8a66a7c4a6b 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -157,6 +157,89 @@ static void __init procfs_init(void)
>         proc_create_seq("allocinfo", 0400, NULL, &allocinfo_seq_op);
>  }
>
> +#ifdef CONFIG_SLAB_PER_SITE
> +static bool ondemand_ready;
> +
> +void alloc_tag_site_init(struct codetag *ct, bool ondemand)
> +{
> +       struct alloc_tag *tag = ct_to_alloc_tag(ct);
> +       char *name;
> +       void *p, *old;
> +
> +       /* Only handle kmalloc allocations. */
> +       if (!tag->meta.sized)
> +               return;
> +
> +       /* Must be ready for on-demand allocations. */
> +       if (ondemand && !ondemand_ready)
> +               return;
> +
> +       old = READ_ONCE(tag->meta.cache);
> +       /* Already allocated? */
> +       if (old)
> +               return;
> +
> +       if (tag->meta.sized < SIZE_MAX) {
> +               /* Fixed-size allocations. */
> +               name = kasprintf(GFP_KERNEL, "f:%zu:%s:%d", tag->meta.sized, ct->function, ct->lineno);
> +               if (WARN_ON_ONCE(!name))
> +                       return;
> +               /*
> +                * As with KMALLOC_NORMAL, the entire allocation needs to be
> +                * open to usercopy access. :(
> +                */
> +               p = kmem_cache_create_usercopy(name, tag->meta.sized, 0,
> +                                              SLAB_NO_MERGE, 0, tag->meta.sized,
> +                                              NULL);
> +       } else {
> +               /* Dynamically-sized allocations. */
> +               name = kasprintf(GFP_KERNEL, "d:%s:%d", ct->function, ct->lineno);
> +               if (WARN_ON_ONCE(!name))
> +                       return;
> +               p = kmem_buckets_create(name, SLAB_NO_MERGE, 0, UINT_MAX, NULL);
> +       }
> +       if (p) {
> +               if (unlikely(!try_cmpxchg(&tag->meta.cache, &old, p))) {
> +                       /* We lost the allocation race; clean up. */
> +                       if (tag->meta.sized < SIZE_MAX)
> +                               kmem_cache_destroy(p);
> +                       else
> +                               kmem_buckets_destroy(p);
> +               }
> +       }
> +       kfree(name);
> +}
> +
> +static void alloc_tag_site_init_early(struct codetag *ct)
> +{
> +       /* Explicitly initialize the caches needed to initialize caches. */
> +       if (strcmp(ct->function, "kstrdup") == 0 ||
> +           strcmp(ct->function, "kvasprintf") == 0 ||
> +           strcmp(ct->function, "pcpu_mem_zalloc") == 0)

I hope we can find a better way to distinguish these allocations.
Maybe have a specialized hook for them, like alloc_hooks_early(),
which sets a bit inside ct->flags to distinguish them?
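Something along these lines is what I mean (untested sketch;
CODETAG_FLAG_EARLY_SLAB and alloc_hooks_early() are invented names):

  /* include/linux/codetag.h: reserve a bit for "needed before slab is up". */
  #define CODETAG_FLAG_EARLY_SLAB        BIT(0)

  /* lib/alloc_tag.c: match on the flag instead of on function names. */
  static void alloc_tag_site_init_early(struct codetag *ct)
  {
          if (ct->flags & CODETAG_FLAG_EARLY_SLAB)
                  alloc_tag_site_init(ct, false);
  }

The few early call sites (kstrdup(), kvasprintf(), pcpu_mem_zalloc())
would then go through an alloc_hooks_early() wrapper that sets this
bit in the tag's static initializer, instead of being listed by name
here.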
> +               alloc_tag_site_init(ct, false);
> +
> +       /* TODO: pre-allocate GFP_ATOMIC caches here. */

You could pre-allocate GFP_ATOMIC caches during
alloc_tag_module_load(), but only when the gfp flags are known at
compile time, I think. I guess for the dynamic case choose_slab()
will fall back to kmalloc_slab()?

> +}
> +#endif
> +
> +static void alloc_tag_module_load(struct codetag_type *cttype,
> +                                 struct codetag_module *cmod)
> +{
> +#ifdef CONFIG_SLAB_PER_SITE
> +       struct codetag_iterator iter;
> +       struct codetag *ct;
> +
> +       iter = codetag_get_ct_iter(cttype);
> +       for (ct = codetag_next_ct(&iter); ct; ct = codetag_next_ct(&iter)) {
> +               if (iter.cmod != cmod)
> +                       continue;
> +
> +               /* TODO: pre-allocate GFP_ATOMIC caches here. */
> +               //alloc_tag_site_init(ct, false);
> +       }
> +#endif
> +}
> +
>  static bool alloc_tag_module_unload(struct codetag_type *cttype,
>                                     struct codetag_module *cmod)
>  {
> @@ -175,8 +258,21 @@ static bool alloc_tag_module_unload(struct codetag_type *cttype,
>
>                 if (WARN(counter.bytes,
>                          "%s:%u module %s func:%s has %llu allocated at module unload",
> -                        ct->filename, ct->lineno, ct->modname, ct->function, counter.bytes))
> +                        ct->filename, ct->lineno, ct->modname, ct->function, counter.bytes)) {
>                         module_unused = false;
> +               }
> +#ifdef CONFIG_SLAB_PER_SITE
> +               else if (tag->meta.sized) {
> +                       /* Remove the allocated caches, if possible. */
> +                       void *p = READ_ONCE(tag->meta.cache);
> +
> +                       WRITE_ONCE(tag->meta.cache, NULL);

I'm guessing you are not using try_cmpxchg() the same way you did in
alloc_tag_site_init() because a race with any other user is
impossible at module unload time? If so, a comment mentioning that
would be good.
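Something as short as this above the WRITE_ONCE() would do (wording is
only a suggestion):

  /*
   * No try_cmpxchg() needed: nothing else can race with us on this
   * tag at module unload time.
   */
  WRITE_ONCE(tag->meta.cache, NULL);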
> +                       if (tag->meta.sized < SIZE_MAX)
> +                               kmem_cache_destroy(p);
> +                       else
> +                               kmem_buckets_destroy(p);
> +               }
> +#endif
>         }
>
>         return module_unused;
> @@ -260,15 +356,16 @@ static void __init sysctl_init(void)
>  static inline void sysctl_init(void) {}
>  #endif /* CONFIG_SYSCTL */
>
> +static const struct codetag_type_desc alloc_tag_desc = {
> +       .section        = "alloc_tags",
> +       .tag_size       = sizeof(struct alloc_tag),
> +       .module_load    = alloc_tag_module_load,
> +       .module_unload  = alloc_tag_module_unload,
> +};
> +
>  static int __init alloc_tag_init(void)
>  {
> -       const struct codetag_type_desc desc = {
> -               .section        = "alloc_tags",
> -               .tag_size       = sizeof(struct alloc_tag),
> -               .module_unload  = alloc_tag_module_unload,
> -       };
> -
> -       alloc_tag_cttype = codetag_register_type(&desc);
> +       alloc_tag_cttype = codetag_register_type(&alloc_tag_desc);
>         if (IS_ERR(alloc_tag_cttype))
>                 return PTR_ERR(alloc_tag_cttype);
>
> @@ -278,3 +375,11 @@ static int __init alloc_tag_init(void)
>         return 0;
>  }
>  module_init(alloc_tag_init);
> +
> +#ifdef CONFIG_SLAB_PER_SITE
> +void alloc_tag_early_walk(void)
> +{
> +       codetag_early_walk(&alloc_tag_desc, alloc_tag_site_init_early);
> +       ondemand_ready = true;
> +}
> +#endif
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 855c63c3270d..4f01cb6dd32e 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -302,7 +302,20 @@ config SLAB_PER_SITE
>         default SLAB_FREELIST_HARDENED
>         select SLAB_BUCKETS
>         help
> -         Track sizes of kmalloc() call sites.
> +         As a defense against shared-cache "type confusion" use-after-free
> +         attacks, every kmalloc()-family call allocates from a separate
> +         kmem_cache (or when dynamically sized, kmem_buckets). Attackers
> +         will no longer be able to groom malicious objects via similarly
> +         sized allocations that share the same cache as the target object.
> +
> +         This increases the "at rest" kmalloc slab memory usage by
> +         roughly 5x (around 7MiB), and adds the potential for greater
> +         long-term memory fragmentation. However, some workloads
> +         actually see performance improvements when single allocation
> +         sites are hot.

I hope you provide the performance and overhead data in the cover
letter when you post v1.

> +
> +         For a similar defense, see CONFIG_RANDOM_KMALLOC_CACHES, which
> +         has less memory usage overhead, but is probabilistic.
>
>  config SLUB_STATS
>         default n
> @@ -331,6 +344,7 @@ config SLUB_CPU_PARTIAL
>  config RANDOM_KMALLOC_CACHES
>         default n
>         depends on !SLUB_TINY
> +       depends on !SLAB_PER_SITE
>         bool "Randomize slab caches for normal kmalloc"
>         help
>           A hardening feature that creates multiple copies of slab caches for
> @@ -345,6 +359,9 @@ config RANDOM_KMALLOC_CACHES
>           limited degree of memory and CPU overhead that relates to hardware and
>           system workload.
>
> +         For a similar defense, see CONFIG_SLAB_PER_SITE, which is
> +         deterministic, but has greater memory usage overhead.
> +
>  endmenu # Slab allocator options
>
>  config SHUFFLE_PAGE_ALLOCATOR
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index fc698cba0ebe..09506bfa972c 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -1040,6 +1040,7 @@ void __init create_kmalloc_caches(void)
>         kmem_buckets_cache = kmem_cache_create("kmalloc_buckets",
>                                                sizeof(kmem_buckets),
>                                                0, SLAB_NO_MERGE, NULL);
> +       alloc_tag_early_walk();
>  }
>
>  /**
> diff --git a/mm/slub.c b/mm/slub.c
> index 3520acaf9afa..d14102c4b4d7 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4135,6 +4135,35 @@ void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node)
>  }
>  EXPORT_SYMBOL(__kmalloc_large_node_noprof);
>
> +static __always_inline
> +struct kmem_cache *choose_slab(size_t size, kmem_buckets *b, gfp_t flags,
> +                              unsigned long caller)
> +{
> +#ifdef CONFIG_SLAB_PER_SITE
> +       struct alloc_tag *tag = current->alloc_tag;
> +
> +       if (!b && tag && tag->meta.sized &&
> +           kmalloc_type(flags, caller) == KMALLOC_NORMAL &&
> +           (flags & GFP_ATOMIC) != GFP_ATOMIC) {

What if the allocation is GFP_ATOMIC, but a previous allocation from
the same location (same tag) happened without GFP_ATOMIC and
tag->meta.cache was already allocated? Why not use that existing
cache? Same if tag->meta.cache was pre-allocated. (A rough sketch of
what I mean is at the end of this mail.)

> +               void *p = READ_ONCE(tag->meta.cache);
> +
> +               if (!p && slab_state >= UP) {
> +                       alloc_tag_site_init(&tag->ct, true);
> +                       p = READ_ONCE(tag->meta.cache);
> +               }
> +
> +               if (tag->meta.sized < SIZE_MAX) {
> +                       if (p)
> +                               return p;
> +                       /* Otherwise continue with default buckets. */
> +               } else {
> +                       b = p;
> +               }
> +       }
> +#endif
> +       return kmalloc_slab(size, b, flags, caller);
> +}
> +
>  static __always_inline
>  void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
>                         unsigned long caller)
> @@ -4152,7 +4181,7 @@ void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
>         if (unlikely(!size))
>                 return ZERO_SIZE_PTR;
>
> -       s = kmalloc_slab(size, b, flags, caller);
> +       s = choose_slab(size, b, flags, caller);
>
>         ret = slab_alloc_node(s, NULL, flags, node, caller, size);
>         ret = kasan_kmalloc(s, ret, size, flags);
> --
> 2.34.1
>
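On the GFP_ATOMIC check in choose_slab() above: what I have in mind is
roughly the following (untested sketch). Only the on-demand cache
creation needs to avoid atomic context; an already existing or
pre-allocated cache could be used for atomic allocations too:

  static __always_inline
  struct kmem_cache *choose_slab(size_t size, kmem_buckets *b, gfp_t flags,
                                 unsigned long caller)
  {
  #ifdef CONFIG_SLAB_PER_SITE
          struct alloc_tag *tag = current->alloc_tag;

          if (!b && tag && tag->meta.sized &&
              kmalloc_type(flags, caller) == KMALLOC_NORMAL) {
                  void *p = READ_ONCE(tag->meta.cache);

                  /*
                   * Creating the cache can sleep, so only do the
                   * on-demand creation for non-atomic allocations;
                   * a cache that already exists can be used either way.
                   */
                  if (!p && slab_state >= UP &&
                      (flags & GFP_ATOMIC) != GFP_ATOMIC) {
                          alloc_tag_site_init(&tag->ct, true);
                          p = READ_ONCE(tag->meta.cache);
                  }

                  if (tag->meta.sized < SIZE_MAX) {
                          if (p)
                                  return p;
                          /* Otherwise continue with default buckets. */
                  } else {
                          b = p;
                  }
          }
  #endif
          return kmalloc_slab(size, b, flags, caller);
  }

That keeps the "no cache creation in atomic context" property of your
version, but a GFP_ATOMIC kmalloc() from a site whose cache was already
created (or pre-allocated) would still get the per-site cache.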