From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>,
	Jianfeng Wang <jianfeng.w.wang@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cl@linux.com,
	akpm@linux-foundation.org, penberg@kernel.org
Subject: Re: [PATCH v3 1/2] slub: introduce count_partial_free_approx()
Date: Mon, 22 Apr 2024 09:49:04 +0200
Message-ID: <e1a06ea3-57b2-4562-895b-a2fb5d5667cc@suse.cz>
In-Reply-To: <3e5d2937-76ab-546b-9ce8-7e7140424278@google.com>

On 4/20/24 2:18 AM, David Rientjes wrote:
> On Fri, 19 Apr 2024, Jianfeng Wang wrote:
> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 1bb2a93cf7b6..993cbbdd2b6c 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -3213,6 +3213,43 @@ static inline bool free_debug_processing(struct kmem_cache *s,
>>  #endif /* CONFIG_SLUB_DEBUG */
>>  
>>  #if defined(CONFIG_SLUB_DEBUG) || defined(SLAB_SUPPORTS_SYSFS)
>> +#define MAX_PARTIAL_TO_SCAN 10000
>> +
>> +static unsigned long count_partial_free_approx(struct kmem_cache_node *n)
>> +{
>> +	unsigned long flags;
>> +	unsigned long x = 0;
>> +	struct slab *slab;
>> +
>> +	spin_lock_irqsave(&n->list_lock, flags);
>> +	if (n->nr_partial <= MAX_PARTIAL_TO_SCAN) {
>> +		list_for_each_entry(slab, &n->partial, slab_list)
>> +			x += slab->objects - slab->inuse;
>> +	} else {
>> +		/*
>> +		 * For a long list, approximate the total count of objects in
>> +		 * it to meet the limit on the number of slabs to scan.
>> +		 * Scan from both the list's head and tail for better accuracy.
>> +		 */
>> +		unsigned long scanned = 0;
>> +
>> +		list_for_each_entry(slab, &n->partial, slab_list) {
>> +			x += slab->objects - slab->inuse;
>> +			if (++scanned == MAX_PARTIAL_TO_SCAN / 2)
>> +				break;
>> +		}
>> +		list_for_each_entry_reverse(slab, &n->partial, slab_list) {
>> +			x += slab->objects - slab->inuse;
>> +			if (++scanned == MAX_PARTIAL_TO_SCAN)
>> +				break;
>> +		}
>> +		x = mult_frac(x, n->nr_partial, scanned);
>> +		x = min(x, node_nr_objs(n));
>> +	}
>> +	spin_unlock_irqrestore(&n->list_lock, flags);
>> +	return x;
>> +}
> 
> Creative :)
> 
> The default value of MAX_PARTIAL_TO_SCAN seems to work well in practice 
> while being large enough to bias toward the actual values?
> 
> I can't think of a better way to avoid the disruption that very long 
> partial lists cause.  If the actual value is needed, it will need to be 
> read from the sysfs file for that slab cache.
> 
> It does raise the question of whether we want to extend slabinfo to indicate 
> that some fields are approximations, however.  Adding a suffix such as 
> " : approx" to a slab cache line may be helpful if the disparity in the 
> estimates would actually make a difference in practice.

I'm afraid that changing the layout of /proc/slabinfo has a much higher
chance of breaking some consumer than the imprecision due to the
approximation does. So I would rather not risk it.

> I have a hard time believing that this approximation will not be "close 
> enough" for all practical purposes, given that the value could very well 
> substantially change the instant after the iteration is done anyway.
> 
> So for that reason, this sounds good to me!
> 
> Acked-by: David Rientjes <rientjes@google.com>
> 
>> +
>>  static unsigned long count_partial(struct kmem_cache_node *n,
>>  					int (*get_count)(struct slab *))
>>  {
>> @@ -7089,7 +7126,7 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
>>  	for_each_kmem_cache_node(s, node, n) {
>>  		nr_slabs += node_nr_slabs(n);
>>  		nr_objs += node_nr_objs(n);
>> -		nr_free += count_partial(n, count_free);
>> +		nr_free += count_partial_free_approx(n);
>>  	}
>>  
>>  	sinfo->active_objs = nr_objs - nr_free;
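
For readers who want to experiment with the sampling scheme outside the
kernel, below is a minimal standalone C sketch of the same idea (the
struct, the list construction, and the 50000-entry list length are
illustrative stand-ins, not kernel code): sample a fixed budget of slabs
from both ends of a long partial list, extrapolate the sampled free count
by nr_partial / scanned (which is what mult_frac() computes without
intermediate overflow), and clamp the result to the node's total object
count.

#include <stdio.h>
#include <stdlib.h>

#define MAX_PARTIAL_TO_SCAN 10000UL

/* Illustrative stand-in for a slab on a node's partial list. */
struct fake_slab {
	unsigned long objects;		/* slots in this slab */
	unsigned long inuse;		/* slots currently allocated */
	struct fake_slab *next, *prev;
};

int main(void)
{
	unsigned long nr_partial = 50000, total_objs = 0, exact_free = 0;
	struct fake_slab *head = NULL, *tail = NULL;

	srand(42);

	/* Build a long doubly-linked "partial list" with known free counts. */
	for (unsigned long i = 0; i < nr_partial; i++) {
		struct fake_slab *s = malloc(sizeof(*s));

		s->objects = 32;
		s->inuse = 1 + rand() % 31;	/* partial: >= 1 used, >= 1 free */
		s->next = NULL;
		s->prev = tail;
		if (tail)
			tail->next = s;
		else
			head = s;
		tail = s;
		total_objs += s->objects;
		exact_free += s->objects - s->inuse;
	}

	/* Spend half the scan budget at the head, half at the tail. */
	unsigned long x = 0, scanned = 0;
	struct fake_slab *s;

	for (s = head; s && scanned < MAX_PARTIAL_TO_SCAN / 2; s = s->next) {
		x += s->objects - s->inuse;
		scanned++;
	}
	for (s = tail; s && scanned < MAX_PARTIAL_TO_SCAN; s = s->prev) {
		x += s->objects - s->inuse;
		scanned++;
	}

	/* Extrapolate to the full list length, then clamp, as the patch does. */
	unsigned long est = x * nr_partial / scanned;

	if (est > total_objs)
		est = total_objs;

	printf("exact free %lu, estimated free %lu\n", exact_free, est);
	return 0;
}

With inuse drawn uniformly at random, the estimate typically lands within
about one percent of the exact count; the head-and-tail split matters most
when free objects are distributed unevenly along the list, where sampling
only one end would skew the extrapolation.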




Thread overview: 7+ messages
2024-04-19 17:56 [PATCH v3 0/2] " Jianfeng Wang
2024-04-19 17:56 ` [PATCH v3 1/2] " Jianfeng Wang
2024-04-20  0:18   ` David Rientjes
2024-04-22  7:49     ` Vlastimil Babka [this message]
2024-04-19 17:56 ` [PATCH v3 2/2] slub: use count_partial_free_approx() in slab_out_of_memory() Jianfeng Wang
2024-04-20  0:18   ` David Rientjes
2024-04-22  7:56 ` [PATCH v3 0/2] slub: introduce count_partial_free_approx() Vlastimil Babka
