From: Chengming Zhou <chengming.zhou@linux.dev>
To: Yosry Ahmed <yosryahmed@google.com>, Nhat Pham <nphamcs@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
David Hildenbrand <david@redhat.com>,
Barry Song <21cnbao@gmail.com>, Chris Li <chrisl@kernel.org>,
Ryan Roberts <ryan.roberts@arm.com>,
Kairui Song <kasong@tencent.com>
Subject: Re: [PATCH 0/3] mm: zswap: trivial folio conversions
Date: Mon, 3 Jun 2024 14:19:17 +0800
Message-ID: <9de0ce63-3815-4c1a-91a2-11cb3d526672@linux.dev>
In-Reply-To: <CAJD7tkYz1-nsoDrjLfNoYaKp5R5QShpzPirKWrY-PSqRtXswtg@mail.gmail.com>
On 2024/5/29 03:32, Yosry Ahmed wrote:
> On Tue, May 28, 2024 at 12:08 PM Nhat Pham <nphamcs@gmail.com> wrote:
>>
>> On Fri, May 24, 2024 at 4:13 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>>>
>>> On Fri, May 24, 2024 at 12:53 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>>>>
>>>> On Thu, May 23, 2024 at 8:59 PM Matthew Wilcox <willy@infradead.org> wrote:
>>>>>
>>>>> On Fri, May 24, 2024 at 03:38:15AM +0000, Yosry Ahmed wrote:
>>>>>> Some trivial folio conversions in zswap code.
>>>>>
>>>>> The three patches themselves look good.
>>>>>
>>>>>> The main reason I included a cover letter is that I wanted to get
>>>>>> feedback on what other trivial conversions can/should be done in
>>>>>> mm/zswap.c (keeping in mind that only order-0 folios are supported
>>>>>> anyway). These are the things I came across while searching for 'page'
>>>>>> in mm/zswap.c, and chose not to do anything about for now:
>>>>>
>>>>> I think there's a deeper question to answer before answering these
>>>>> questions, which is what we intend to do with large folios and zswap in
>>>>> the future. Do we intend to split them? Compress them as a large
>>>>> folio? Compress each page in a large folio separately? I can see an
>>>>> argument for choices 2 and 3, but I think choice 1 is going to be
>>>>> increasingly untenable.
>>>>
>>>> Yeah I was kinda getting the small things out of the way so that zswap
>>>> is fully folio-ized, before we think about large folios. I haven't
>>>> given it a lot of thought, but here's what I have in mind.
>>>>
>>>> Right now, I think most configs that enable zswap will disable
>>>> CONFIG_THP_SWAP (otherwise all THPs will go straight to disk), so
>>>> let's assume that today we are splitting large folios before they go
>>>> to zswap (i.e. choice 1).
>>>>
>>>> What we do next depends on how the core swap intends to deal with
>>>> large folios. My understanding based on recent developments is that we
>>>> intend to swapout large folios as a whole, but I saw some discussions
>>>> about splitting all large folios before swapping them out, or leaving
>>>> them whole but swapping them out in order-0 chunks.
>>>>
>>>> I assume the rationale is that there is little benefit to keeping the
>>>> folios whole because they will most likely be freed soon anyway, but I
>>>> understand not wanting to spend time on splitting them, so swapping
>>>> them out in order-0 chunks makes some sense to me. It also dodges the
>>>> whole fragmentation issue.
>>>>
>>>> If we do either of these things in the core swap code, then I think
>>>> zswap doesn't need to do anything to support large folios. If not,
>>>> then we need to make a choice between choice 2 (compress large folios)
>>>> and choice 3 (compress each page separately), as you mentioned.
>>>>
>>>> Compressing large folios as a whole means that we need to decompress
>>>> them as a whole to read a single page, which I think could be very
>>>> inefficient in some cases or force us to swapin large folios. Unless
>>>> of course we end up in a world where we mostly swapin the same large
>>>> folios that we swapped out. That said, compressing large folios as a
>>>> whole can yield additional compression savings.
>>>>
>>>> Hence, I think choice 3 is the most reasonable one, at least for the
>>>> short-term. I also think this is what zram does, but I haven't
>>>> checked. Even if we all agree on this, there are still questions that
>>>> we need to answer. For example, do we allocate zswap_entry's for each
>>>> order-0 chunk right away, or do we allocate a single zswap_entry for
>>>> the entire folio, and then "split" it during swapin if we only need to
>>>> read part of the folio?
>>>>
>>>> Wondering what others think here.
>>>
>>> More thoughts that came to mind here:
>>>
>>> - Whether we go with choice 2 or 3, we may face a latency issue. Zswap
>>> compression happens synchronously in the context of reclaim, so if we
>>> start handling large folios in zswap, it may be more efficient to do
>>> it asynchronously like swap to disk.
>>
>> We've been discussing this in private as well :)
>>
>> It doesn't have to be these two extremes right? I'm perfectly happy
>> with starting with compressing each subpage separately, but perhaps we
>> can consider managing larger folios in bigger chunks (say 64KB). That
>> way, on swap-in, we just have to bring a whole chunk in, not the
>> entire folio, and still take advantage of compression efficiencies on
>> bigger-than-one-page chunks. I'd also check with other filesystems
>> that leverage compression, to see what their unit of compression is.
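To get a rough feel for the compression side of this tradeoff, here is a
small userspace sketch. It is not zswap code: zlib is only a stand-in for
the kernel compressors, the helper names compressed_size() and make_folio()
are made up for illustration, and the synthetic folio (one incompressible
page repeated) is a deliberately extreme case where all redundancy sits
across page boundaries. Real workloads will land somewhere in between.

```python
import random
import zlib

PAGE_SIZE = 4096  # assumes 4 KiB base pages


def compressed_size(data: bytes, unit: int) -> int:
    """Total size after compressing `data` in independent `unit`-byte pieces."""
    return sum(
        len(zlib.compress(data[off:off + unit]))
        for off in range(0, len(data), unit)
    )


def make_folio(nr_pages: int, seed: int = 0) -> bytes:
    """Synthetic folio: one random (incompressible) page repeated nr_pages times."""
    rng = random.Random(seed)
    page = bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE))
    return page * nr_pages


folio = make_folio(64)  # 256 KiB, i.e. 64 base pages

per_page = compressed_size(folio, PAGE_SIZE)        # choice 3
per_chunk = compressed_size(folio, 16 * PAGE_SIZE)  # 64 KiB chunks
whole = compressed_size(folio, len(folio))          # choice 2

for name, size in (("per-page", per_page), ("64K chunks", per_chunk), ("whole", whole)):
    print(f"{name:>10}: {size} bytes")
```

On this input the per-page variant cannot see the cross-page redundancy at
all, while the 64 KiB and whole-folio variants can, so it mostly shows the
upper bound of what larger units might buy.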
>
> Right. But I think it will be a clearer win to start with compressing
> each subpage separately, and it avoids splitting folios during reclaim
> to zswap. It also doesn't depend on the zsmalloc work.
>
> Once we have that, we can experiment with compressing folios in larger
> chunks. The tradeoffs become less clear at that point, and the number
> of variables you can tune goes up :)
Agreed, it's a good approach! It also doesn't have any decompression
amplification problem.
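By "decompression amplification" I mean the extra data that has to be
decompressed to read back a single page. A minimal userspace sketch of the
idea, again with zlib as a stand-in compressor and hypothetical helper
names (store(), load_page()), assuming 4 KiB pages and a compression unit
that is a multiple of the page size:

```python
import zlib
from typing import List, Tuple

PAGE_SIZE = 4096  # assumes 4 KiB base pages


def store(folio: bytes, unit: int) -> List[bytes]:
    """Compress the folio in independent `unit`-byte pieces."""
    return [
        zlib.compress(folio[off:off + unit])
        for off in range(0, len(folio), unit)
    ]


def load_page(chunks: List[bytes], unit: int, index: int) -> Tuple[bytes, int]:
    """Return (contents of page `index`, number of bytes decompressed)."""
    offset = index * PAGE_SIZE
    raw = zlib.decompress(chunks[offset // unit])  # the whole unit must be inflated
    start = offset % unit
    return raw[start:start + PAGE_SIZE], len(raw)


# 32-page (128 KiB) folio where page n is filled with byte value n.
folio = b"".join(bytes([n]) * PAGE_SIZE for n in range(32))

results = {}
for unit in (PAGE_SIZE, 16 * PAGE_SIZE, len(folio)):
    page, work = load_page(store(folio, unit), unit, index=5)
    assert page == bytes([5]) * PAGE_SIZE
    results[unit] = work // PAGE_SIZE
    print(f"unit={unit:>6}  decompressed={work:>6}  amplification={results[unit]}x")
```

With per-page compression, reading one page decompresses exactly one page;
with larger units the work grows with the unit size even though only one
page is wanted.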
Thanks.