From: Barry Song <21cnbao@gmail.com>
To: Yang Shi <shy828301@gmail.com>
Cc: lsf-pc@lists.linux-foundation.org, Linux-MM <linux-mm@kvack.org>
Subject: Re: [LSF/MM/BPF TOPIC]mTHP reliable allocation and reclamation
Date: Tue, 14 May 2024 21:20:25 +1200
Message-ID: <CAGsJ_4yGRkCwr8xSVF8RgSFoxMfQK_MSNdubsfRnVuUZCXU4Hw@mail.gmail.com>
In-Reply-To: <CAHbLzkqt+xXViE13c+P1xg0=6M8anR8T3uY8i2=MLvwfK2CoKw@mail.gmail.com>

On Sat, May 11, 2024 at 9:18 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Thu, May 9, 2024 at 7:22 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > Hi,
> >
> > I'd like to propose a session about the allocation and reclamation of
> > mTHP. This is related to Yu Zhao's TAO[1] but not the same.
> >
> > OPPO has implemented mTHP-like large folios across thousands of
> > genuine Android devices, utilizing ARM64 CONT-PTE. However, we've
> > encountered challenges:
> >
> > - The allocation of mTHP isn't consistently reliable; after
> >   prolonged use, obtaining large folios becomes uncertain.
> >   For instance, after a few hours of operation, the likelihood of
> >   successfully allocating large folios on a phone may decrease to
> >   just 2%.
> >
> > - Mixing large and small folios in the same LRU list can lead to
> >   mutual blocking and unpredictable latency during
> >   reclamation/allocation.
>
> I'm also curious how much large folios can improve reclamation
> efficiency. Having large folios is supposed to reduce the scan time
> since there should be fewer folios on LRU. But IIRC I haven't seen too
> much data or benchmark (particularly real life workloads) regarding
> this.

Hi Yang,

We lack direct data on this, but Ryan's THP_SWPOUT series [1] offers
some insight:

| alloc size |                baseline |           + this series |
|            | mm-unstable (~v6.9-rc1) |                         |
|:-----------|------------------------:|------------------------:|
| 4K Page    |                    0.0% |                    1.3% |
| 64K THP    |                  -13.6% |                   46.3% |
| 2M THP     |                   91.4% |                   89.6% |

I suspect the -13.6% regression in the baseline is due to the split
operation. Once the split is eliminated, the series shows a 46.3%
improvement. Presumably, reclaiming one 64K folio costs less than
reclaiming 16 * 4K folios.
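
As a rough illustration of the "fewer folios on LRU" point, here is a
minimal Python sketch that only counts LRU entries for a given amount
of memory. The 1 GiB working set is an assumed example value, and the
helper name lru_entries is hypothetical; the sketch does not model the
per-folio reclaim cost that the numbers above measure.

```python
# Illustrative arithmetic only. Folio sizes (4K, 64K) come from the table;
# the 1 GiB working set is an assumed example value.
KIB = 1024


def lru_entries(total_bytes: int, folio_bytes: int) -> int:
    """Number of folios (LRU entries) needed to cover total_bytes."""
    return -(-total_bytes // folio_bytes)  # ceiling division


working_set = 1024 * 1024 * KIB  # 1 GiB (assumption)
small = lru_entries(working_set, 4 * KIB)
large = lru_entries(working_set, 64 * KIB)

# With 64K folios the scanner walks 16x fewer entries for the same memory.
print(small, large, small // large)
```

This counts list entries only; whether the scan-time saving shows up in
practice depends on the split and swap behaviour discussed here.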

However, on real Android devices we currently observe
anon_thp_swpout_fallback on nearly 100% of attempts once the device
has been running for several hours [2]. Hence, without measures to
mitigate swap fragmentation, we are likely to see a regression rather
than an improvement.
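
To make "nearly 100%" concrete, the fallback rate can be derived from
two counters: whole-folio swap-outs and swap-out fallbacks. The sketch
below assumes the two counters are disjoint, so that their sum is the
number of attempts. The function name and the sample values are
hypothetical and are not measurements from [2].

```python
def swpout_fallback_rate(swpout: int, swpout_fallback: int) -> float:
    """Fraction of large-folio swap-out attempts that fell back to a split.

    Assumes each attempt increments exactly one of the two counters.
    """
    attempts = swpout + swpout_fallback
    if attempts == 0:
        return 0.0  # nothing attempted yet, so no fallback to report
    return swpout_fallback / attempts


# Made-up sample values for illustration only.
rate = swpout_fallback_rate(swpout=3, swpout_fallback=997)
print(f"{rate:.1%}")
```

Sampling both counters before and after a test run and passing the
deltas gives the rate for that run rather than since boot.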

[1] https://lore.kernel.org/all/20240408183946.2991168-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/lkml/CAGsJ_4zAcJkuW016Cfi6wicRr8N9X+GJJhgMQdSMp+Ah+NSgNQ@mail.gmail.com/

>
> >
> > For instance, if you require large folios, the LRU list's tail could
> > be filled with small folios.
> > LRU (LF = large folio, SF = small folio):
> >
> > LF - LF - LF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF
> >
> > You might end up reclaiming many small folios yet still struggle to
> > allocate large folios. Conversely, the inverse scenario can occur
> > when the LRU list's tail is populated with large folios.
> >
> > SF - SF - SF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF
> >
> > In OPPO's products, we allocate dedicated pageblocks solely for large
> > folio allocation, and we've fine-tuned the LRU mechanism to support
> > a dual LRU: one for small folios and another for large ones.
> > Dedicated pageblocks offer a fundamental guarantee of allocating
> > large folios. Additionally, segregating small and large folios into
> > two LRUs ensures that both can be efficiently reclaimed for their
> > respective users' requests. However, the implementation lacks
> > aesthetic appeal and is primarily tailored for product purposes, so
> > it isn't fully upstreamable.
> >
> > You can obtain the architectural diagram of OPPO's approach from link[2].
> >
> > Therefore, my plan is to:
> >
> > - Introduce the architecture of OPPO's mTHP-like approach, which
> >   encompasses additional optimizations we've made to address swap
> >   fragmentation issues and improve swap performance, such as
> >   dual-zRAM and compression/decompression of large folios [3].
> >
> > - Present OPPO's method of utilizing dedicated pageblocks and a
> >   dual-LRU system for mTHP.
> >
> > - Share our observations from employing Yu Zhao's TAO on Pixel 6 phones.
> >
> > - Discuss our future direction: are we leaning towards TAO or
> >   dedicated pageblocks? If we opt for pageblocks, how do we plan to
> >   resolve the LRU issue?
> >
> > [1] https://lore.kernel.org/linux-mm/20240229183436.4110845-1-yuzhao@google.com/
> > [2] https://github.com/21cnbao/mTHP/blob/main/largefoliosarch.png
> > [3] https://lore.kernel.org/linux-mm/20240327214816.31191-1-21cnbao@gmail.com/
> >
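
The mixed-LRU scenario quoted above can be sketched with a toy model in
Python. It walks a list from the tail and counts how many small folios
get reclaimed before one large folio is freed. This is a deliberate
simplification and not OPPO's implementation: the function name is
hypothetical, and freeing a large folio is used only as a stand-in for
being able to satisfy a large allocation.

```python
def reclaim_until_large(tail_first, want_large):
    """Walk folios from the LRU tail until want_large large folios are freed.

    Returns (small_reclaimed, large_reclaimed).
    """
    small = large = 0
    for kind in tail_first:
        if large >= want_large:
            break
        if kind == "LF":
            large += 1
        else:
            small += 1
    return small, large


# First diagram: head on the left, tail on the right (3 LF, then 13 SF).
mixed_lru = ["LF"] * 3 + ["SF"] * 13

# Single mixed LRU: every small folio at the tail is reclaimed first.
print(reclaim_until_large(reversed(mixed_lru), want_large=1))

# Dual LRU: the large-folio list contains only large folios.
large_lru = [kind for kind in mixed_lru if kind == "LF"]
print(reclaim_until_large(reversed(large_lru), want_large=1))
```

In the mixed case all 13 small folios go before the first large one; with
a separate large-folio list none do. The inverse diagram behaves the same
way with the roles swapped.
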
Thanks,
Barry