From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>,
Jianfeng Wang <jianfeng.w.wang@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cl@linux.com,
akpm@linux-foundation.org, penberg@kernel.org
Subject: Re: [PATCH v3 1/2] slub: introduce count_partial_free_approx()
Date: Mon, 22 Apr 2024 09:49:04 +0200
Message-ID: <e1a06ea3-57b2-4562-895b-a2fb5d5667cc@suse.cz>
In-Reply-To: <3e5d2937-76ab-546b-9ce8-7e7140424278@google.com>
On 4/20/24 2:18 AM, David Rientjes wrote:
> On Fri, 19 Apr 2024, Jianfeng Wang wrote:
>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 1bb2a93cf7b6..993cbbdd2b6c 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -3213,6 +3213,43 @@ static inline bool free_debug_processing(struct kmem_cache *s,
>> #endif /* CONFIG_SLUB_DEBUG */
>>
>> #if defined(CONFIG_SLUB_DEBUG) || defined(SLAB_SUPPORTS_SYSFS)
>> +#define MAX_PARTIAL_TO_SCAN 10000
>> +
>> +static unsigned long count_partial_free_approx(struct kmem_cache_node *n)
>> +{
>> + unsigned long flags;
>> + unsigned long x = 0;
>> + struct slab *slab;
>> +
>> + spin_lock_irqsave(&n->list_lock, flags);
>> + if (n->nr_partial <= MAX_PARTIAL_TO_SCAN) {
>> + list_for_each_entry(slab, &n->partial, slab_list)
>> + x += slab->objects - slab->inuse;
>> + } else {
>> + /*
>> + * For a long list, approximate the total count of objects in
>> + * it to meet the limit on the number of slabs to scan.
>> + * Scan from both the list's head and tail for better accuracy.
>> + */
>> + unsigned long scanned = 0;
>> +
>> + list_for_each_entry(slab, &n->partial, slab_list) {
>> + x += slab->objects - slab->inuse;
>> + if (++scanned == MAX_PARTIAL_TO_SCAN / 2)
>> + break;
>> + }
>> + list_for_each_entry_reverse(slab, &n->partial, slab_list) {
>> + x += slab->objects - slab->inuse;
>> + if (++scanned == MAX_PARTIAL_TO_SCAN)
>> + break;
>> + }
>> + x = mult_frac(x, n->nr_partial, scanned);
>> + x = min(x, node_nr_objs(n));
>> + }
>> + spin_unlock_irqrestore(&n->list_lock, flags);
>> + return x;
>> +}
>
> Creative :)
>
> The default value of MAX_PARTIAL_TO_SCAN seems to work well in practice
> while being large enough to bias for actual values?
>
> I can't think of a better way to avoid the disruption that very long
> partial lists cause. If the actual value is needed, it will need to be
> read from the sysfs file for that slab cache.
>
> It does beg the question of whether we want to extend slabinfo to indicate
> that some fields are approximations, however. Adding a suffix such as
> " : approx" to a slab cache line may be helpful if the disparity in the
> estimates would actually make a difference in practice.

I'm afraid that changing the layout of /proc/slabinfo has a much higher
chance of breaking some consumer than the imprecision due to the
approximation does. So I would rather not risk it.
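
Just to illustrate what I mean (a hypothetical sketch, not taken from
slabtop or any real tool): a strict consumer that tokenizes each cache
line and insists on the exact field count promised by the
"slabinfo - version: 2.1" header would silently drop any line that
gained a trailing " : approx" marker:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* name, 5 counters, ": tunables" + 3 values, ": slabdata" + 3 values */
#define SLABINFO_V21_TOKENS 16

int main(void)
{
	char line[512];
	FILE *f = fopen("/proc/slabinfo", "r");

	if (!f)
		return 1;

	while (fgets(line, sizeof(line), f)) {
		char *fields[32];
		char *copy = strdup(line);
		char *save = NULL;
		char *tok;
		int n = 0;

		if (!copy)
			break;
		for (tok = strtok_r(copy, " \t\n", &save);
		     tok && n < 32;
		     tok = strtok_r(NULL, " \t\n", &save))
			fields[n++] = tok;

		/*
		 * Strict field-count check: an appended " : approx" would
		 * change the token count and make this skip that cache.
		 */
		if (n == SLABINFO_V21_TOKENS && fields[0][0] != '#')
			printf("%s: %s/%s objects active\n",
			       fields[0], fields[1], fields[2]);
		free(copy);
	}
	fclose(f);
	return 0;
}
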
> I have a hard time believing that this approximation will not be "close
> enough" for all practical purposes, given that the value could very well
> substantially change the instant after the iteration is done anyway.
>
> So for that reason, this sounds good to me!
>
> Acked-by: David Rientjes <rientjes@google.com>
>
>> +
>> static unsigned long count_partial(struct kmem_cache_node *n,
>> int (*get_count)(struct slab *))
>> {
>> @@ -7089,7 +7126,7 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
>> for_each_kmem_cache_node(s, node, n) {
>> nr_slabs += node_nr_slabs(n);
>> nr_objs += node_nr_objs(n);
>> - nr_free += count_partial(n, count_free);
>> + nr_free += count_partial_free_approx(n);
>> }
>>
>> sinfo->active_objs = nr_objs - nr_free;
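
To make the extrapolation concrete (illustrative numbers only): with
n->nr_partial == 100000, the two loops sample the first and the last
5000 slabs on the list. If those 10000 slabs happen to hold 37000 free
objects, mult_frac() scales that to 37000 * 100000 / 10000 = 370000,
and the min() with node_nr_objs(n) keeps the estimate from ever
exceeding the node's total object count.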