From: Barry Song <21cnbao@gmail.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, chrisl@kernel.org,
david@redhat.com, hannes@cmpxchg.org, kasong@tencent.com,
linux-kernel@vger.kernel.org, mhocko@suse.com,
nphamcs@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com,
surenb@google.com, kaleshsingh@google.com, hughd@google.com,
v-songbaohua@oppo.com, willy@infradead.org, xiang@kernel.org,
yosryahmed@google.com, baolin.wang@linux.alibaba.com,
shakeel.butt@linux.dev, senozhatsky@chromium.org,
minchan@kernel.org
Subject: Re: [PATCH RFC v4 0/2] mm: support mTHP swap-in for zRAM-like swapfile
Date: Thu, 4 Jul 2024 22:23:20 +1200
Message-ID: <CAGsJ_4yKNzZTzfj-deN=cLkgNxb6sbgj75NVH3UYjXVbHykvhg@mail.gmail.com>
In-Reply-To: <8734oqhr4c.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Thu, Jul 4, 2024 at 1:42 PM Huang, Ying <ying.huang@intel.com> wrote:
>
> Barry Song <21cnbao@gmail.com> writes:
>
> > On Wed, Jul 3, 2024 at 6:33 PM Huang, Ying <ying.huang@intel.com> wrote:
> >>
> >
> > Ying, thanks!
> >
> >> Barry Song <21cnbao@gmail.com> writes:
>
> [snip]
>
> >> > This patch introduces mTHP swap-in support. For now, we limit mTHP
> >> > swap-ins to contiguous swaps that were likely swapped out from mTHP as
> >> > a whole.
> >> >
> >> > Additionally, the current implementation only covers the SWAP_SYNCHRONOUS
> >> > case. This is the simplest and most common use case, benefiting millions
> >>
> >> I admit that Android is an important target platform of Linux kernel.
> >> But I will not advocate that it's MOST common ...
> >
> > Okay, I understand that there are still many embedded systems similar
> > to Android, even if they are not Android :-)
> >
> >>
> >> > of Android phones and similar devices with minimal implementation
> >> > cost. In this straightforward scenario, large folios are always exclusive,
> >> > eliminating the need to handle complex rmap and swapcache issues.
> >> >
> >> > It offers several benefits:
> >> > 1. Enables bidirectional mTHP swapping, allowing retrieval of mTHP after
> >> > swap-out and swap-in.
> >> > 2. Eliminates fragmentation in swap slots and supports successful THP_SWPOUT
> >> > without fragmentation. Based on the observed data [1] on Chris's and Ryan's
> >> > THP swap allocation optimization, aligned swap-in plays a crucial role
> >> > in the success of THP_SWPOUT.
> >> > 3. Enables zRAM/zsmalloc to compress and decompress mTHP, reducing CPU usage
> >> > and enhancing compression ratios significantly. We have another patchset
> >> > to enable mTHP compression and decompression in zsmalloc/zRAM[2].
> >> >
> >> > Using the readahead mechanism to decide whether to swap in mTHP doesn't seem
> >> > to be an optimal approach. There's a critical distinction between pagecache
> >> > and anonymous pages: pagecache can be evicted and later retrieved from disk,
> >> > potentially becoming a mTHP upon retrieval, whereas anonymous pages must
> >> > always reside in memory or swapfile. If we swap in small folios and identify
> >> > adjacent memory suitable for swapping in as mTHP, those pages that have been
> >> > converted to small folios may never transition to mTHP. The process of
> >> > converting mTHP into small folios remains irreversible. This introduces
> >> > the risk of losing all mTHP through several swap-out and swap-in cycles,
> >> > let alone losing the benefits of defragmentation, improved compression
> >> > ratios, and reduced CPU usage based on mTHP compression/decompression.
> >>
> >> I understand that the optimal policy in your use cases may be to
> >> always swap in mTHP at the highest order. But it may not be optimal in
> >> some other use cases: for example, relatively slow swap devices, or
> >> non-faulting sub-pages being swapped out again before they are used.
> >>
> >> So, IMO, the default policy should be the one that can adapt to the
> >> requirements automatically. For example, if most non-fault sub-pages
> >> will be read/written before being swapped out again, we should swap-in
> >> in larger order, otherwise in smaller order. Swap readahead is one
> >> possible way to do that. But, I admit that this may not work perfectly
> >> in your use cases.
> >>
> >> Previously I hoped that we could start with this automatic policy that
> >> helps everyone, then check whether it can satisfy your requirements
> >> before implementing the optimal policy for you. But it appears that you
> >> don't agree with this.
> >>
> >> Based on the above, IMO, we should not use your policy as the default,
> >> at least for now. A user space interface can be implemented to select
> >> different swap-in order policies, similar to that for the mTHP
> >> allocation order policy. We need a different policy because the
> >> performance characteristics of memory allocation are quite different
> >> from those of swap-in. For example, SSD reads could be much slower than
> >> memory allocation. With the policy selection, I think that we can
> >> implement mTHP swap-in for non-SWAP_SYNCHRONOUS too. Users need to know
> >> what they are doing.
> >
> > Agreed. Ryan also suggested something similar before.
> > Could we add this user policy by:
> >
> > /sys/kernel/mm/transparent_hugepage/hugepages-<size>/swapin_enabled
> > which could be 0 or 1? I assume we don't need as many values as
> > "always inherit madvise never".
> >
> > Do you have any suggestions regarding the user interface?
>
> /sys/kernel/mm/transparent_hugepage/hugepages-<size>/swapin_enabled
>
> looks good to me. To be consistent with "enabled" in the same
> directory, and more importantly, to be extensible, I think that it's
> better to start with at least "always never". I believe that we will
> add "auto" in the future to tune automatically, which can eventually
> be used as the default.
Sounds good to me. Thanks!
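
For illustration only, here is a rough user-space sketch of how a tool
might read such a knob. It assumes swapin_enabled ends up reporting its
value in the same bracketed style as the existing "enabled" file (for
example "always [never]"). The swapin_enabled file is only the proposal
discussed above and does not exist yet; the helper names and the default
"64kB" size are hypothetical.

```python
from pathlib import Path

# Proposed in this thread only; not an existing kernel interface.
SWAPIN_KNOB = "/sys/kernel/mm/transparent_hugepage/hugepages-{size}/swapin_enabled"


def active_policy(text):
    """Return the bracketed (currently selected) token.

    'always [never]' yields 'never'. Returns None when no token is bracketed.
    """
    for token in text.split():
        if len(token) > 2 and token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_swapin_policy(size="64kB", template=SWAPIN_KNOB):
    """Read the selected policy for one mTHP size.

    Returns None when the knob is absent or unreadable, which is the
    expected result on any kernel without the proposed file.
    """
    path = Path(template.format(size=size))
    try:
        return active_policy(path.read_text())
    except OSError:
        return None
```

Starting with "always never" keeps this kind of parsing unchanged if
"auto" is added later, since the selected value is still the bracketed
token.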
>
> --
> Best Regards,
> Huang, Ying
Barry