From: Vlastimil Babka <vbabka@suse.cz>
To: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Miaohe Lin <linmiaohe@huawei.com>,
Song Liu <songliubraving@fb.com>, Rik van Riel <riel@surriel.com>,
Matthew Wilcox <willy@infradead.org>, Zi Yan <ziy@nvidia.com>,
Theodore Ts'o <tytso@mit.edu>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [v3 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent
Date: Tue, 10 May 2022 09:35:11 +0200
Message-ID: <0da1c63b-5cc3-7fc9-1fb4-fdc385539bbc@suse.cz>
In-Reply-To: <CAHbLzkrZb6r1r6xFaEFvvJzwvVgDgeZWfjhq-SFu_mQZ0j5tTQ@mail.gmail.com>

On 5/9/22 22:34, Yang Shi wrote:
> On Mon, May 9, 2022 at 9:05 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 4/4/22 22:02, Yang Shi wrote:
>> > include/linux/huge_mm.h | 14 ++++++++++++
>> > include/linux/khugepaged.h | 59 ++++++++++++---------------------------------------
>> > include/linux/sched/coredump.h | 3 ++-
>> > kernel/fork.c | 4 +---
>> > mm/huge_memory.c | 15 ++++---------
>> > mm/khugepaged.c | 76 +++++++++++++++++++++++++++++++++++++-----------------------------
>> > mm/mmap.c | 14 ++++++++----
>> > mm/shmem.c | 12 -----------
>> > 8 files changed, 88 insertions(+), 109 deletions(-)
>>
>> Resending my general feedback from mm-commits thread to include the
>> public ML's:
>>
>> There are modestly fewer lines in the end, some duplicate code removed,
>> special casing in shmem.c removed, that's all good as it is. Also patch 8/8
>> became quite boring in v3, no need to change individual filesystems and also
>> no hook in fault path, just the common mmap path. So I would just handle
>> patch 6 differently as I just replied to it, and acked the rest.
>>
>> That said it's still unfortunately rather a mess of functions that have
>> similar names: transhuge_vma_enabled(vma), hugepage_vma_check(vma),
>> transparent_hugepage_active(vma), transhuge_vma_suitable(vma, addr)?
>> So maybe still some space for further cleanups. But the series is fine as it
>> is so we don't have to wait for it now.
>
> Yeah, I agree that we do have a lot of THP checks. Will find some time to
> look into it deeper later.
Thanks.
>>
>> We could also consider that the tracking of which mm is to be scanned is
>> modelled after ksm which has its own madvise flag, but also no "always"
>> mode. What if for THP we only tracked actual THP madvised mm's, and in
>> "always" mode just scanned all vm's, would that allow ripping out some code
>> perhaps, while not adding too many unnecessary scans? If some processes are
>
> Do you mean add all mm(s) to the scan list unconditionally? I don't
> think it will scale.
It might be interesting to find out how many mm's (percentage of all mm's)
are typically in the list with "always" enabled. I wouldn't be surprised if
it was nearly all of them. Having at least one large enough anonymous area
sounds like something all processes would have these days?
>> being scanned without any effect, maybe track success separately, and scan
>> them less frequently etc. That could be ultimately more efficient than
>> painfully tracking just *eligibility* for scanning in "always" mode?
>
> Sounds like we need a couple of different lists, for example, inactive
> and active? And promote or demote mm(s) between the two lists? TBH I
> don't see too many benefits at the moment. Or did I misunderstand you?
Yeah, something like that. It would of course require finding out whether
khugepaged is wasting too much CPU these days on mm's where it achieves
nothing, while not getting quickly enough to the mm's where it succeeds more.
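
To make the "track success separately, and scan them less frequently" idea
a bit more concrete, here is a rough userspace sketch. It is purely
illustrative and not khugepaged code: struct mm_scan_state, the
mm_scan_*() helpers and the SCAN_INTERVAL_* limits are all hypothetical
names, and the doubling back-off is just one assumed policy instead of
explicit active/inactive lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mm scan state; not a real kernel structure. */
struct mm_scan_state {
	unsigned int interval;   /* scan this mm once every 'interval' passes */
	unsigned int countdown;  /* passes to skip before the next scan */
	unsigned long collapsed; /* total successful collapses so far */
};

#define SCAN_INTERVAL_MIN 1u
#define SCAN_INTERVAL_MAX 64u

static void mm_scan_init(struct mm_scan_state *s)
{
	s->interval = SCAN_INTERVAL_MIN;
	s->countdown = 0;
	s->collapsed = 0;
}

/* Returns true if this mm should be scanned in the current pass. */
static bool mm_scan_due(struct mm_scan_state *s)
{
	if (s->countdown > 0) {
		s->countdown--;
		return false;
	}
	return true;
}

/* Record a scan result: back off on failure, reset on success. */
static void mm_scan_done(struct mm_scan_state *s, unsigned int nr_collapsed)
{
	assert(s->interval >= SCAN_INTERVAL_MIN);

	if (nr_collapsed > 0) {
		s->collapsed += nr_collapsed;
		s->interval = SCAN_INTERVAL_MIN;
	} else if (s->interval < SCAN_INTERVAL_MAX) {
		s->interval *= 2;
	}
	s->countdown = s->interval - 1;
}
```

With something like this, an mm where nothing ever collapses would end up
being visited only once every 64 passes, while a single success puts it
back on the every-pass schedule. Whether that is actually cheaper than
tracking eligibility is exactly what would need measuring.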
>>
>> Even more radical thing to consider (maybe that's a LSF/MM level topic, too
>> bad :) is that we scan pagetables in ksm, khugepaged, numa balancing, soon
>> in MGLRU, and I probably forgot something else. Maybe time to think about
>> unifying those scanners?
>
> We do have pagewalk (walk_page_range()) which is used by a couple of
> mm stuff, for example, mlock, mempolicy, mprotect, etc. I'm not sure
> whether it is feasible for khugepaged, ksm, etc, or not since I didn't
> look that hard. But I agree it should be worth looking at.
pagewalk is a framework to simplify writing code that processes page tables
for a given one-off task, yeah. But this would be something a bit different,
e.g. a kernel thread that does the sum of what khugepaged/ksm/etc do. NUMA
balancing uses task_work instead of a kthread, so we would need to consider
which mechanism the unified daemon would use.
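
As a very rough illustration of the "unified scanner" shape, here is a
userspace sketch where a single pass over the entries feeds several
registered consumers. Everything in it is made up for the example:
struct pte_info, struct scan_consumer, unified_scan() and count_visit()
are hypothetical and do not correspond to any existing kernel interface,
and the sketch says nothing about the kthread vs task_work question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one page table entry seen by the walker. */
struct pte_info {
	unsigned long addr;
	int present;
};

/* Hypothetical consumer: what khugepaged/ksm/etc. would each register. */
struct scan_consumer {
	const char *name;
	void (*visit)(const struct pte_info *pte, void *priv);
	void *priv;
};

/* One pass over the entries, showing each present one to every consumer. */
static size_t unified_scan(const struct pte_info *ptes, size_t nr_ptes,
			   const struct scan_consumer *consumers,
			   size_t nr_consumers)
{
	size_t visited = 0;

	assert(ptes || nr_ptes == 0);

	for (size_t i = 0; i < nr_ptes; i++) {
		if (!ptes[i].present)
			continue;
		for (size_t c = 0; c < nr_consumers; c++)
			consumers[c].visit(&ptes[i], consumers[c].priv);
		visited++;
	}
	return visited;
}

/* Example consumer: just counts the entries it was shown. */
static void count_visit(const struct pte_info *pte, void *priv)
{
	(void)pte;
	(*(size_t *)priv)++;
}
```

The point is only that the page tables are walked once and the per-feature
logic hangs off callbacks. The real scanners differ a lot in what state
they keep and how often they want to run, which is where the hard part
would be.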
>>
>>