From: Ning Zhang <ningzhang@linux.alibaba.com>
To: "Alex Zhu (Kernel)" <alexlzhu@meta.com>, Yu Zhao <yuzhao@google.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
Kernel Team <Kernel-team@fb.com>,
Matthew Wilcox <willy@infradead.org>,
Rik van Riel <riel@surriel.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH v4 2/3] mm: changes to split_huge_page() to free zero filled tail pages
Date: Tue, 25 Oct 2022 14:21:18 +0800 [thread overview]
Message-ID: <351a87a2-3bb1-f66b-af95-34ec15a9af54@linux.alibaba.com> (raw)
In-Reply-To: <A59014C8-2318-42EA-A4CA-80B76D9197AA@fb.com>
On 2022/10/20 02:48, Alex Zhu (Kernel) wrote:
>
>
>> On Oct 18, 2022, at 10:12 PM, Yu Zhao <yuzhao@google.com> wrote:
>>
>> On Tue, Oct 18, 2022 at 9:42 PM <alexlzhu@fb.com> wrote:
>>>
>>> From: Alexander Zhu <alexlzhu@fb.com>
>>>
>>> Currently, when /sys/kernel/mm/transparent_hugepage/enabled=always is set,
>>> there are a large number of transparent hugepages that are almost entirely
>>> zero filled. This is mentioned in a number of previous patchsets, including:
>>> https://lore.kernel.org/all/20210731063938.1391602-1-yuzhao@google.com/
>>> https://lore.kernel.org/all/1635422215-99394-1-git-send-email-ningzhang@linux.alibaba.com/
>>>
>>> Currently, split_huge_page() does not have a way to identify zero filled
>>> pages within the THP, so these zero pages get remapped and continue to
>>> waste memory. In this patch, we identify and free tail pages that are
>>> zero filled in split_huge_page(). In this way, we avoid mapping these
>>> pages back into page table entries and can free up unused memory within
>>> THPs. This is based on the previously mentioned patchset by Yu Zhao.
>>
>> Hi Alex,
>>
>> Generally the process [1] to follow is that you keep my patches
>> separate from yours, rather than squash them into one, e.g., [2].
>>
>> [1] https://www.kernel.org/doc/html/latest/process/submitting-patches.html
>> [2] https://lore.kernel.org/linux-mm/cover.1665568707.git.christophe.leroy@csgroup.eu/
>>
>> Also it's a courtesy to cc Ning, since his approach is (very) similar
>> to yours. Naturally he would wonder if you are reinventing the wheel,
>> so you'd have to address it in your cover letter.
>
> Sorry about that. Will cc Ning as well in future iterations. I will
> split out the second patch into a few patches as well.
>
> This patchset differs from Ning's RFC in that we make use of list_lru
> and a shrinker, as discussed previously:
> https://lore.kernel.org/linux-mm/CAOUHufYeuMN9As58BVwMKSN6viOZKReXNeCBgGeeL6ToWGsEKw@mail.gmail.com/
>
> The approach is different, but we are fundamentally still cleaning up
> underutilized THPs (those that contain a large number of zero pages).
>
I used a shrinker in a previous version (see
https://gitee.com/anolis/cloud-kernel/commit/62f8852885cc7f23063886d36fd36d94b48d3982).
But a shrinker has the problem that it cannot control the number of splits
accurately. For example, if I only want to split two THPs to avoid OOM, the
shrinker may split many more.
>>
>>> However, we chose to free anonymous zero tail pages whenever they are
>>> encountered instead of only on reclaim or migration.
>>
>> What are cases that are not on reclaim or migration?
>
> It would be any case where split_huge_page() is called on anonymous
> memory. split_huge_page() is also called from KSM and madvise, and it
> can also be called from debugfs, which is what the self-test relies on.
> We thought this implementation would be more generic. As far as I can
> tell, there is no reason to keep zero pages around in anonymous THPs
> that have been split.
>
> We also handled remapping to a shared zero page on userfaultfd in a
> previous iteration. That is the only use case I am aware of where we
> do not want to zap the zero pages.
>>
>> As I've explained off the mailing list, it's likely a bug if you
>> really have one. And I don't think you do. I'm currently under the
>> impression that you have a slab shrinker, and slab shrinkers are on
>> the reclaim path.
>>
>> Thanks.
>
> This shrinker is not only for slabs; it covers all anonymous THPs in
> physical memory. That's why we needed to add list_lru_add_page() and
> list_lru_delete_page() as well, since list_lru_add()/list_lru_delete()
> assume slab objects.
>
>
Thread overview: 11+ messages
2022-10-19 3:42 [PATCH v4 0/3] THP Shrinker alexlzhu
2022-10-19 3:42 ` [PATCH v4 1/3] mm: add thp_utilization metrics to debugfs alexlzhu
2022-10-25 3:21 ` Huang, Ying
2022-10-19 3:42 ` [PATCH v4 2/3] mm: changes to split_huge_page() to free zero filled tail pages alexlzhu
2022-10-19 5:12 ` Yu Zhao
2022-10-19 18:48 ` Alex Zhu (Kernel)
2022-10-25 6:21 ` Ning Zhang [this message]
2022-10-26 19:43 ` Alex Zhu (Kernel)
2022-10-20 21:57 ` kernel test robot
2022-10-19 3:42 ` [PATCH v4 3/3] mm: THP low utilization shrinker alexlzhu
2022-10-19 15:16 ` kernel test robot