From: Ryan Roberts <ryan.roberts@arm.com>
To: Matthew Wilcox <willy@infradead.org>, Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
ying.huang@intel.com, baolin.wang@linux.alibaba.com,
chrisl@kernel.org, david@redhat.com, hannes@cmpxchg.org,
hughd@google.com, kaleshsingh@google.com, kasong@tencent.com,
linux-kernel@vger.kernel.org, mhocko@suse.com,
minchan@kernel.org, nphamcs@gmail.com, senozhatsky@chromium.org,
shakeel.butt@linux.dev, shy828301@gmail.com, surenb@google.com,
v-songbaohua@oppo.com, xiang@kernel.org, yosryahmed@google.com
Subject: Re: [PATCH v5 4/4] mm: Introduce per-thpsize swapin control policy
Date: Tue, 30 Jul 2024 09:36:39 +0100
Message-ID: <f0c7f061-6284-4fe5-8cbf-93281070895b@arm.com>
In-Reply-To: <ZqcR_oZmVpi2TrHO@casper.infradead.org>

On 29/07/2024 04:52, Matthew Wilcox wrote:
> On Fri, Jul 26, 2024 at 09:46:18PM +1200, Barry Song wrote:
>> A user space interface can be implemented to select different swap-in
>> order policies, similar to the mTHP allocation order policy. We need
>> a distinct policy because the performance characteristics of memory
>> allocation differ significantly from those of swap-in. For example,
>> SSD read speeds can be much slower than memory allocation. With
>> policy selection, I believe we can implement mTHP swap-in for
>> non-SWAP_SYNCHRONOUS scenarios as well. However, users need to understand
>> the implications of their choices. I think that it's better to start
>> with at least "always" and "never". I believe that we will add "auto" in
>> the future to tune automatically, which can eventually become the default.
>
> I strongly disagree. Use the same sysctl as the other anonymous memory
> allocations.

I vaguely recall arguing in the past that just because the user has requested 2M
THP, that doesn't mean it's the right thing to do for performance to swap-in the
whole 2M in one go. That's potentially a pretty huge latency, depending on where
the backend is, and it could be a waste of IO if the application never touches
most of the 2M. That said, the fact that the application hinted for a 2M THP in
the first place hopefully means that it is storing objects that need to be
accessed at similar times. Today it will be swapped in page-by-page then
eventually collapsed by khugepaged.

But I think those arguments become weaker as the THP size gets smaller. 16K/64K
swap-in will likely yield significant performance improvements, and I think
Barry has numbers for this?
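
To put rough numbers on the IO argument above, here is a small sketch of the
arithmetic. It assumes 4K base pages; `pages_per_swapin()` and `wasted_pages()`
are made-up helper names for illustration, not kernel functions.

```c
#include <assert.h>

/* Assumption for this sketch: 4K base pages. */
#define SKETCH_BASE_PAGE_SIZE 4096UL

/* Number of base pages read when a whole folio of folio_bytes is swapped in. */
static unsigned long pages_per_swapin(unsigned long folio_bytes)
{
	assert(folio_bytes % SKETCH_BASE_PAGE_SIZE == 0);
	return folio_bytes / SKETCH_BASE_PAGE_SIZE;
}

/*
 * Base pages read from the backend but never used, if the application
 * only ever touches `touched` pages of the folio.
 */
static unsigned long wasted_pages(unsigned long folio_bytes,
				  unsigned long touched)
{
	unsigned long total = pages_per_swapin(folio_bytes);

	assert(touched <= total);
	return total - touched;
}
```

With these assumptions, a fault that touches a single page reads 511 unused
pages for a 2M folio, but only 15 for 64K and 3 for 16K, which is why the
concern shrinks with the THP size.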

So I guess we have a few options:

 - Just use the same sysfs interface as for anon allocation, and see if anyone
   reports performance regressions. Investigate one of the options below if an
   issue is raised. That's the simplest and cleanest approach, I think.

 - New sysfs interface as Barry has implemented; nobody really wants more
   controls if it can be helped.

 - Hardcode a size limit (e.g. 64K); I've tried this in a few different contexts
   and never got any traction.

 - Secret option 4: Can we allocate a full-size folio but only choose to swap-in
   to it bit-by-bit? You would need a way to mark which pages of the folio are
   valid (e.g. per-page flag) but I guess that's a non-starter given the strategy
   to remove per-page flags?
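
To make the "hardcode a size limit" option concrete, here is a minimal sketch of
how a swap-in order could be picked from the orders already enabled for anon
allocation, capped at a fixed size. It is an illustration only and not what the
patch does: `order_for_bytes()`, `pick_swapin_order()`, the bitmask
representation of enabled orders and the 4K page shift are all assumptions.

```c
#include <assert.h>
#include <limits.h>

/* Assumption for this sketch: 4K base pages. */
#define SKETCH_PAGE_SHIFT 12

/* Largest order whose folio size does not exceed `bytes`. */
static unsigned int order_for_bytes(unsigned long bytes)
{
	unsigned int order = 0;

	assert(bytes >= (1UL << SKETCH_PAGE_SHIFT));
	/* Grow while the next order up still fits within `bytes`. */
	while ((2UL << (SKETCH_PAGE_SHIFT + order)) <= bytes)
		order++;
	return order;
}

/*
 * Pick the highest order that is both enabled (bit N set in
 * enabled_orders means order N is enabled) and no larger than
 * max_order. Falls back to order 0, i.e. a single page.
 */
static unsigned int pick_swapin_order(unsigned long enabled_orders,
				      unsigned int max_order)
{
	assert(max_order < sizeof(unsigned long) * CHAR_BIT);
	for (unsigned int order = max_order; order > 0; order--)
		if (enabled_orders & (1UL << order))
			return order;
	return 0;
}
```

For example, with a 64K cap `order_for_bytes(64UL * 1024)` gives order 4, so a
system that has only 2M (order 9) enabled would fall back to page-by-page
swap-in, while one that also has 64K enabled would swap in 16 pages at a time.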

Thanks,
Ryan