From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: Re: [PATCH] slub: limit count of partial slabs scanned to gather statistics
Date: Tue, 5 May 2020 08:46:41 +0300
Message-ID: <fa0a6e28-0b68-c6d0-eb5d-8b180b86230f@yandex-team.ru>
In-Reply-To: <20200504125656.e3d04b350c807aba8a2a7271@linux-foundation.org>

On 04/05/2020 22.56, Andrew Morton wrote:
> On Mon, 04 May 2020 19:07:39 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
> 
>> To get an exact count of free and used objects, slub has to scan the list of
>> partial slabs. This may take a long time. Scanning holds the spinlock and
>> blocks allocations that move partial slabs to per-cpu lists and back.
>>
>> Example found in the wild:
>>
>> # cat /sys/kernel/slab/dentry/partial
>> 14478538 N0=7329569 N1=7148969
>> # time cat /sys/kernel/slab/dentry/objects
>> 286225471 N0=136967768 N1=149257703
>>
>> real	0m1.722s
>> user	0m0.001s
>> sys	0m1.721s
> 
> I assume this could trigger the softlockup detector or even NMI
> watchdog in some situations?

Yes, irqs are disabled here. But the loop itself is pretty fast.
It would take terabytes of RAM to reach the common watchdog thresholds.
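
A rough back-of-the-envelope check of that estimate, using only the figures from the report quoted above. This is a sketch, not a measurement: the 20 s softlockup threshold and the 4 KiB slab size are assumptions, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Nanoseconds spent per partial slab, derived from one measured scan. */
double ns_per_slab(double slabs, double seconds)
{
	assert(slabs > 0.0 && seconds > 0.0);
	return seconds * 1e9 / slabs;
}

/* Partial slabs needed for one scan to last threshold_s seconds. */
double slabs_for_threshold(double threshold_s, double slabs, double seconds)
{
	assert(slabs > 0.0 && seconds > 0.0);
	return threshold_s * slabs / seconds;
}

/* Print the estimate for the figures quoted in the report. */
void print_watchdog_estimate(void)
{
	const double slabs = 14478538.0;  /* /sys/kernel/slab/dentry/partial */
	const double seconds = 1.721;     /* sys time of reading .../objects */
	const double threshold_s = 20.0;  /* ASSUMPTION: softlockup threshold */
	const double slab_bytes = 4096.0; /* ASSUMPTION: one 4 KiB page per slab */
	double need = slabs_for_threshold(threshold_s, slabs, seconds);

	printf("%.0f ns per slab\n", ns_per_slab(slabs, seconds));
	printf("%.0f partial slabs for a %.0f s scan\n", need, threshold_s);
	printf("%.0f GiB in partial slabs alone\n",
	       need * slab_bytes / (1024.0 * 1024.0 * 1024.0));
}
```

The measured 1.7 s covers two nodes, each scanned under its own list_lock, so a single lock hold is roughly half of that, and the count above would have to be reached on one node. Full slabs are not included in the memory figure, so the real footprint would be larger still.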

> 
>> The same problem in slab was addressed in commit f728b0a5d72a ("mm, slab:
>> faster active and free stats") by adding more kmem cache statistics.
>> For slub the same approach would require an atomic op on the fast path whenever an object is freed.
>>
>> Let's simply limit the count of scanned slabs and print a warning.
>> The limit is set in /sys/module/slub/parameters/max_partial_to_count.
>> The default is 10000, which should be enough for most sane cases.
>>
>> Return a linear approximation if the list of partials is longer than the limit.
>> Nobody should notice the difference.
> 
> That's a pretty sad "solution" :(
> 
> But I guess it's better than nothing at all, unless there are
> alternative ideas?

Running this loop to the end causes more problems than the information is worth.
Adding new percpu or atomic counters to fast paths seems excessive even for debugging.

Actually, there is not much sense in accurate statistics for the count of objects
when there are millions of them.

Memory consumption here is determined by the count and size of slabs.

> 
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2407,16 +2407,29 @@ static inline unsigned long node_nr_objs(struct kmem_cache_node *n)
>>   #endif /* CONFIG_SLUB_DEBUG */
>>   
>>   #if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
>> +
>> +static unsigned long max_partial_to_count __read_mostly = 10000;
>> +module_param(max_partial_to_count, ulong, 0644);
>> +
>>   static unsigned long count_partial(struct kmem_cache_node *n,
>>   					int (*get_count)(struct page *))
>>   {
>> +	unsigned long counted = 0;
>>   	unsigned long flags;
>>   	unsigned long x = 0;
>>   	struct page *page;
>>   
>>   	spin_lock_irqsave(&n->list_lock, flags);
>> -	list_for_each_entry(page, &n->partial, slab_list)
>> +	list_for_each_entry(page, &n->partial, slab_list) {
>>   		x += get_count(page);
>> +
>> +		if (++counted > max_partial_to_count) {
>> +			pr_warn_once("SLUB: too much partial slabs to count all objects, increase max_partial_to_count.\n");
>> +			/* Approximate total count of objects */
>> +			x = mult_frac(x, n->nr_partial, counted);
>> +			break;
>> +		}
>> +	}
>>   	spin_unlock_irqrestore(&n->list_lock, flags);
>>   	return x;
>>   }
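
For anyone who wants to play with the approximation outside the kernel, here is a minimal user-space sketch of the same bounded scan. It is not the patch itself: struct slab_stub and count_partial_bounded() are hypothetical stand-ins, there is no locking, and the scaling step only mirrors what the kernel's mult_frac() does (quotient and remainder are scaled separately to avoid overflow in the multiplication).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a partial slab: only the per-slab count matters. */
struct slab_stub {
	unsigned long free_objects;
	struct slab_stub *next;
};

/*
 * Sum free_objects over the list, but stop once more than `limit` entries
 * have been visited and extrapolate linearly to nr_partial entries, the
 * same way the loop in the patch does.
 */
unsigned long count_partial_bounded(const struct slab_stub *head,
				    unsigned long nr_partial,
				    unsigned long limit)
{
	unsigned long counted = 0;
	unsigned long x = 0;
	const struct slab_stub *s;

	assert(limit > 0);

	for (s = head; s != NULL; s = s->next) {
		x += s->free_objects;

		if (++counted > limit) {
			/* x * nr_partial / counted without overflowing x * nr_partial */
			unsigned long q = x / counted;
			unsigned long r = x % counted;

			x = q * nr_partial + r * nr_partial / counted;
			break;
		}
	}
	return x;
}
```

With uniformly filled slabs the extrapolated result matches the exact sum. With skewed fill levels it can be off by a lot, because only the head of the list is sampled; that is the trade-off being discussed in this thread.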


