Re: [PATCH] slub: limit number of slabs to scan in count_partial()


From: Jianfeng Wang <jianfeng.w.wang@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>,
	"Christoph Lameter (Ampere)" <cl@linux.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"penberg@kernel.org" <penberg@kernel.org>,
	"rientjes@google.com" <rientjes@google.com>,
	"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	Junxiao Bi <junxiao.bi@oracle.com>
Subject: Re: [PATCH] slub: limit number of slabs to scan in count_partial()
Date: Sat, 13 Apr 2024 01:17:05 +0000
Message-ID: <5552D041-8549-4E76-B3EC-03C76C117077@oracle.com>
In-Reply-To: <e348dfcd-6944-4500-bf84-c58b8c2e657f@oracle.com>


> On Apr 12, 2024, at 1:44 PM, Jianfeng Wang <jianfeng.w.wang@oracle.com> wrote:
> 
> On 4/12/24 1:20 PM, Vlastimil Babka wrote:
>> On 4/12/24 7:29 PM, Jianfeng Wang wrote:
>>> 
>>> 
>>> On 4/12/24 12:48 AM, Vlastimil Babka wrote:
>>>> On 4/11/24 7:02 PM, Christoph Lameter (Ampere) wrote:
>>>>> On Thu, 11 Apr 2024, Jianfeng Wang wrote:
>>>>> 
>>>>>> So, the fix is to limit the number of slabs to scan in
>>>>>> count_partial(), and output an approximated result if the list is too
>>>>>> long. Default to 10000 which should be enough for most sane cases.
>>>>> 
>>>>> 
>>>>> That is a creative approach. The problem though is that objects on the 
>>>>> partial lists are kind of sorted. The partial slabs with only a few 
>>>>> objects available are at the start of the list so that allocations cause 
>>>>> them to be removed from the partial list fast. Full slabs do not need to 
>>>>> be tracked on any list.
>>>>> 
>>>>> The partial slabs with few objects are put at the end of the partial list 
>>>>> in the hope that the few objects remaining will also be freed which would 
>>>>> allow the freeing of the slab folio.
>>>>> 
>>>>> So the object density may be higher at the beginning of the list.
>>>>> 
>>>>> kmem_cache_shrink() will explicitly sort the partial lists to put the 
>>>>> partial pages in that order.
>>>>> 

I realized that I should run "echo 1 > /sys/kernel/slab/dentry/shrink" to sort
the list explicitly. After that, the numbers become:
N = 10000 -> diff = 7.1%
N = 20000 -> diff = 5.7%
N = 25000 -> diff = 5.4%
So, I expect a ~5-7% difference between the estimate and the real count after
shrinking.
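
A minimal sketch of how such a diff can be computed, in user-space Python
rather than kernel C. It assumes the estimate is a proportional extrapolation
from the first N slabs on the partial list; that formula, the function names,
and the toy list below are illustrative assumptions, not taken from the patch.

```python
def count_free_exact(partial_list):
    """Sum the free objects of every slab on the (modelled) partial list."""
    return sum(partial_list)


def count_free_estimated(partial_list, max_scan):
    """Scan at most max_scan slabs from the head, then scale up.

    Assumption: the scanned prefix is treated as representative of the
    whole list, so its sum is scaled by total_slabs / max_scan.
    """
    total_slabs = len(partial_list)
    if total_slabs <= max_scan:
        return sum(partial_list)  # short list: the count is exact
    scanned_free = sum(partial_list[:max_scan])
    return scanned_free * total_slabs // max_scan


def relative_diff_percent(exact, estimate):
    """Difference between estimate and exact count, as a percentage."""
    return abs(exact - estimate) * 100.0 / exact


# Toy list (made-up data): each entry is the free-object count of one slab,
# with the slabs that have the fewest free objects at the head.
demo = [1] * 50 + [3] * 50
exact = count_free_exact(demo)
estimate = count_free_estimated(demo, max_scan=50)
print(exact, estimate, relative_diff_percent(exact, estimate))
```

On this deliberately skewed toy list, scanning only the head undercounts by
half. That is the effect a sorted list could cause in principle; the measured
numbers above show how large it was on the real dentry cache.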

>>>>> Can you run some tests showing the difference between the estimation and 
>>>>> the real count?
>>> 
>>> Yes.
>>> On a server with one NUMA node, I create a case that uses many dentry objects.
>> 
>> Could you describe in more detail how you make the dentry cache grow such
>> a large partial slab list? Thanks.
>> 
> 
> I utilized the fact that creating a folder creates a new dentry object, and
> that deleting a folder deletes the dentry objects of all its sub-folders.
> 
> Then, I created N folders, each with M empty sub-folders. I assumed that
> these operations would consume a large number of dentry objects in
> sequential order, so their slabs were very likely to be full slabs.
> After all folders were created, I deleted a subset of the N folders (i.e.,
> one out of every two folders). This created many holes, which turned a
> subset of full slabs into partial slabs.
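
The workload quoted above can be sketched roughly as follows. This is only an
illustration of the create-then-delete pattern: the function names, the folder
naming scheme, and the values of N and M are hypothetical, and the script only
performs the filesystem operations. How the dentry slabs actually fill and
empty is as described in the message, not something the script checks.

```python
import os
import shutil
import tempfile


def build_tree(root, n, m):
    """Create n folders under root, each holding m empty sub-folders."""
    for i in range(n):
        top = os.path.join(root, f"dir{i}")
        os.mkdir(top)
        for j in range(m):
            os.mkdir(os.path.join(top, f"sub{j}"))


def punch_holes(root, n, step=2):
    """Delete one out of every `step` top-level folders and their contents.

    Returns the number of top-level folders removed.
    """
    removed = 0
    for i in range(0, n, step):
        shutil.rmtree(os.path.join(root, f"dir{i}"))
        removed += 1
    return removed


if __name__ == "__main__":
    # Small values for a dry run; a real test would use much larger N and M.
    with tempfile.TemporaryDirectory() as root:
        build_tree(root, n=4, m=3)
        print("removed", punch_holes(root, n=4), "folders")
        print("remaining:", sorted(os.listdir(root)))
```

After a large run, the shrink command mentioned earlier in this message can be
used to sort the partial list before the estimate is compared with the real
count.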


