Re: [RFC PATCH 1/1] lib/stackdepot: replace CONFIG_STACK_HASH_ORDER with automatic sizing

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Dmitry Vyukov <dvyukov@google.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Marco Elver <elver@google.com>,
	Alexander Potapenko <glider@google.com>,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	Andrey Konovalov <andreyknvl@gmail.com>,
	 kasan-dev@googlegroups.com
Subject: Re: [RFC PATCH 1/1] lib/stackdepot: replace CONFIG_STACK_HASH_ORDER with automatic sizing
Date: Fri, 27 May 2022 14:02:38 +0200	[thread overview]
Message-ID: <CACT4Y+Y4GZfXOru2z5tFPzFdaSUd+GFc6KVL=bsa0+1m197cQQ@mail.gmail.com> (raw)
In-Reply-To: <20220527113706.24870-2-vbabka@suse.cz>

On Fri, 27 May 2022 at 13:37, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> As Linus explained [1], setting the stackdepot hash table size as a
> config option is suboptimal, especially as stackdepot becomes a
> dependency of less specialized subsystems than initially (e.g. DRM,
> networking, SLUB_DEBUG):
>
> : (a) it introduces a new compile-time question that isn't sane to ask
> : a regular user, but is now exposed to regular users.
>
> : (b) this by default uses 1MB of memory for a feature that didn't in
> : the past, so now if you have small machines you need to make sure you
> : make a special kernel config for them.
>
> Ideally we would employ rhashtable for fully automatic resizing, which
> should be feasible for many of the new users, but problematic for the
> original users with restricted context that call __stack_depot_save()
> with can_alloc == false, i.e. KASAN.
>
> However we can easily remove the config option and scale the hash table
> automatically with system memory. The STACK_HASH_MASK constant becomes
> stack_hash_mask variable and is used only in one mask operation, so the
> overhead should be negligible to none. For early allocation we can
> employ the existing alloc_large_system_hash() function and perform
> similar scaling for the late allocation.
>
> The existing limits of the config option (between 4k and 1M buckets)
> are preserved, and scaling factor is set to one bucket per 16kB memory
> so on 64bit the max 1M buckets (8MB memory) is achieved with 16GB
> system, while a 1GB system will use 512kB.

Hi Vlastimil,

We use KASAN with VMs with 2GB of memory.
If I did the math correctly this will result in 128K entries, while
currently we have CONFIG_STACK_HASH_ORDER=20 even for arm32.
I am actually not sure how full the table gets, but we can fuzz a
large kernel for up to an hour, so we can get lots of stacks (we were
the only known users who routinely overflowed default LOCKDEP tables
:)).

I am not opposed to this in general. And I understand that KASAN Is
different from the other users.
What do you think re allowing CONFIG_STACK_HASH_ORDER=0/is not set
which will mean auto-size, but keeping ability to set exact size as
well?
Or alternatively auto-size if KASAN is not enabled and use a large
table otherwise? But I am not sure if anybody used
CONFIG_STACK_HASH_ORDER to reduce the default size with KASAN...



> If needed, the automatic scaling could be complemented with a boot-time
> kernel parameter, but it feels pointless to add it without a specific
> use case.
>
> [1] https://lore.kernel.org/all/CAHk-=wjC5nS+fnf6EzRD9yQRJApAhxx7gRB87ZV+pAWo9oVrTg@mail.gmail.com/
>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  lib/Kconfig      |  9 ---------
>  lib/stackdepot.c | 47 ++++++++++++++++++++++++++++++++++++-----------
>  2 files changed, 36 insertions(+), 20 deletions(-)
>
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 6a843639814f..1e7cf7c76ae6 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -682,15 +682,6 @@ config STACKDEPOT_ALWAYS_INIT
>         bool
>         select STACKDEPOT
>
> -config STACK_HASH_ORDER
> -       int "stack depot hash size (12 => 4KB, 20 => 1024KB)"
> -       range 12 20
> -       default 20
> -       depends on STACKDEPOT
> -       help
> -        Select the hash size as a power of 2 for the stackdepot hash table.
> -        Choose a lower value to reduce the memory impact.
> -
>  config REF_TRACKER
>         bool
>         depends on STACKTRACE_SUPPORT
> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> index 5ca0d086ef4a..f7b73ddfca77 100644
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -145,10 +145,15 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
>         return stack;
>  }
>
> -#define STACK_HASH_SIZE (1L << CONFIG_STACK_HASH_ORDER)
> -#define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
> +/* one hash table bucket entry per 16kB of memory */
> +#define STACK_HASH_SCALE       14
> +/* limited between 4k and 1M buckets */
> +#define STACK_HASH_ORDER_MIN   12
> +#define STACK_HASH_ORDER_MAX   20
>  #define STACK_HASH_SEED 0x9747b28c
>
> +static unsigned int stack_hash_mask;
> +
>  static bool stack_depot_disable;
>  static struct stack_record **stack_table;
>
> @@ -175,8 +180,6 @@ void __init stack_depot_want_early_init(void)
>
>  int __init stack_depot_early_init(void)
>  {
> -       size_t size;
> -
>         /* This is supposed to be called only once, from mm_init() */
>         if (WARN_ON(__stack_depot_early_init_passed))
>                 return 0;
> @@ -186,10 +189,15 @@ int __init stack_depot_early_init(void)
>         if (!__stack_depot_want_early_init || stack_depot_disable)
>                 return 0;
>
> -       size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
> -       pr_info("Stack Depot early init allocating hash table with memblock_alloc, %zu bytes\n",
> -               size);
> -       stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
> +       stack_table = alloc_large_system_hash("stackdepot",
> +                                               sizeof(struct stack_record *),
> +                                               0,
> +                                               STACK_HASH_SCALE,
> +                                               HASH_EARLY | HASH_ZERO,
> +                                               NULL,
> +                                               &stack_hash_mask,
> +                                               1UL << STACK_HASH_ORDER_MIN,
> +                                               1UL << STACK_HASH_ORDER_MAX);
>
>         if (!stack_table) {
>                 pr_err("Stack Depot hash table allocation failed, disabling\n");
> @@ -207,13 +215,30 @@ int stack_depot_init(void)
>
>         mutex_lock(&stack_depot_init_mutex);
>         if (!stack_depot_disable && !stack_table) {
> -               pr_info("Stack Depot allocating hash table with kvcalloc\n");
> -               stack_table = kvcalloc(STACK_HASH_SIZE, sizeof(struct stack_record *), GFP_KERNEL);
> +               unsigned long entries;
> +
> +               entries = nr_free_buffer_pages();
> +               entries = roundup_pow_of_two(entries);
> +
> +               if (STACK_HASH_SCALE > PAGE_SHIFT)
> +                       entries >>= (STACK_HASH_SCALE - PAGE_SHIFT);
> +               else
> +                       entries <<= (PAGE_SHIFT - STACK_HASH_SCALE);
> +
> +               if (entries < 1UL << STACK_HASH_ORDER_MIN)
> +                       entries = 1UL << STACK_HASH_ORDER_MIN;
> +               if (entries > 1UL << STACK_HASH_ORDER_MAX)
> +                       entries = 1UL << STACK_HASH_ORDER_MAX;
> +
> +               pr_info("Stack Depot allocating hash table of %lu entries with kvcalloc\n",
> +                               entries);
> +               stack_table = kvcalloc(entries, sizeof(struct stack_record *), GFP_KERNEL);
>                 if (!stack_table) {
>                         pr_err("Stack Depot hash table allocation failed, disabling\n");
>                         stack_depot_disable = true;
>                         ret = -ENOMEM;
>                 }
> +               stack_hash_mask = entries - 1;
>         }
>         mutex_unlock(&stack_depot_init_mutex);
>         return ret;
> @@ -386,7 +411,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
>                 goto fast_exit;
>
>         hash = hash_stack(entries, nr_entries);
> -       bucket = &stack_table[hash & STACK_HASH_MASK];
> +       bucket = &stack_table[hash & stack_hash_mask];
>
>         /*
>          * Fast path: look the stack trace up without locking.
> --
> 2.36.1
>

next prev parent reply	other threads:[~2022-05-27 12:02 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-27 11:37 [RFC PATCH 0/1] stackdepot hash table autosizing Vlastimil Babka
2022-05-27 11:37 ` [RFC PATCH 1/1] lib/stackdepot: replace CONFIG_STACK_HASH_ORDER with automatic sizing Vlastimil Babka
2022-05-27 12:02   ` Dmitry Vyukov [this message]
2022-06-20 12:00     ` Vlastimil Babka
2022-06-20 12:02       ` Dmitry Vyukov
2022-06-20 15:02 ` [PATCH] " Vlastimil Babka
2022-06-21  7:37   ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACT4Y+Y4GZfXOru2z5tFPzFdaSUd+GFc6KVL=bsa0+1m197cQQ@mail.gmail.com' \
    --to=dvyukov@google.com \
    --cc=andreyknvl@gmail.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox