From: Kairui Song <ryncsn@gmail.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: "Huang, Ying" <ying.huang@intel.com>,
linux-mm@kvack.org, Chris Li <chrisl@kernel.org>,
Minchan Kim <minchan@kernel.org>,
Barry Song <v-songbaohua@oppo.com>, Yu Zhao <yuzhao@google.com>,
SeongJae Park <sj@kernel.org>,
David Hildenbrand <david@redhat.com>,
Yosry Ahmed <yosryahmed@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Wilcox <willy@infradead.org>,
Nhat Pham <nphamcs@gmail.com>,
Chengming Zhou <zhouchengming@bytedance.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/10] mm/swap: always use swap cache for synchronization
Date: Wed, 27 Mar 2024 19:04:45 +0800
Message-ID: <CAMgjq7D0XA=rd5CuonrYnHYeSJR3tqC1O49vj7Cr+su0MgOQ8Q@mail.gmail.com>
In-Reply-To: <58e4f0c2-99d1-42b9-ab70-907cf35ac1a7@arm.com>
On Wed, Mar 27, 2024 at 4:27 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> [...]
>
> >>> Test 1, sequential swapin/out of 30G zero page on ZRAM:
> >>>
> >>>                   Before (us)   After (us)
> >>> Swapout:          33619409      33886008
> >>> Swapin:           32393771      32465441 (- 0.2%)
> >>> Swapout (THP):    7817909       6899938  (+11.8%)
> >>> Swapin (THP):     32452387      33193479 (- 2.2%)
> >>
> >> If my understanding were correct, we don't have swapin (THP) support,
> >> yet. Right?
> >
> > Yes, this series doesn't change how swapin/swapout works with THP in
> > general, but now THP swapout will leave shadows with large order, so
> > it needs to be splitted upon swapin, that will slow down later swapin
> > by a little bit but I think that's worth it.
> >
> > If we can do THP swapin in the future, this split on swapin can be
> > saved to make the performance even better.
>
> I'm confused by this (clearly my understanding of how this works is incorrect).
> Perhaps you can help me understand:
>
> When you talk about "shadows" I assume you are referring to the swap cache? It
> was my understanding that swapping out a THP would always leave the large folio
> in the swap cache, so this is nothing new?
>
> And on swap-in, if the target page is in the swap cache, even if part of a large
> folio, why does it need to be split? I assumed the single page would just be
> mapped? (and if all the other pages subsequently fault, then you end up with a
> fully mapped large folio back in the process)?
>
> Perhaps I'm misunderstanding what "shadows" are?

Hi Ryan,

Sorry, I didn't make this clear.

Ying has posted the link to the commit that added "shadow" support
for anon pages; shadows have become a very important part of LRU
activation / workingset tracking. Basically, when folios are removed
from the cache xarray (e.g. after swap writeback is done), instead of
releasing the xarray slot, an unsigned long / void * is stored in it,
recording some info that will be used when a refault happens, to decide
how to handle the folio from the LRU / workingset side.
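
To make the idea concrete, here is a minimal userspace sketch in Python
(not kernel code). The field layout, EVICTION_BITS, ToySwapCache and the
refault distance calculation are all hypothetical and only illustrate
the mechanism; the only detail borrowed from the kernel is that a value
entry is tagged by shifting the payload left and setting the low bit, in
the spirit of xa_mk_value().

```python
# Toy model of a swap cache slot that keeps a "shadow" after the folio is gone.
# Field layout and names are illustrative assumptions, not the kernel's.

EVICTION_BITS = 32  # hypothetical width of the eviction counter field
EVICTION_MASK = (1 << EVICTION_BITS) - 1


def pack_shadow(node_id: int, eviction: int) -> int:
    """Pack refault info into one integer, tagged like an xarray value entry."""
    payload = (node_id << EVICTION_BITS) | (eviction & EVICTION_MASK)
    return (payload << 1) | 1  # low bit set marks a value, not a pointer


def is_shadow(entry) -> bool:
    """In this model folios are non-integer objects, shadows are odd integers."""
    return isinstance(entry, int) and (entry & 1) == 1


def unpack_shadow(shadow: int):
    """Return (node_id, eviction) stored in a shadow."""
    payload = shadow >> 1
    return payload >> EVICTION_BITS, payload & EVICTION_MASK


class ToySwapCache:
    def __init__(self):
        self.slots = {}  # swap offset -> folio object or shadow integer
        self.clock = 0   # stand-in for an eviction counter

    def add(self, offset, folio):
        """Insert a folio; return the shadow it replaced, if any."""
        old = self.slots.get(offset)
        self.slots[offset] = folio
        return old if is_shadow(old) else None

    def delete(self, offset, node_id=0):
        """Drop the folio but keep the slot, storing a shadow in its place."""
        self.clock += 1
        self.slots[offset] = pack_shadow(node_id, self.clock)

    def refault_distance(self, shadow):
        """How many evictions happened after this entry was evicted."""
        _, eviction = unpack_shadow(shadow)
        return self.clock - eviction
```

In this model, add() hands back the shadow found in the slot, and the
caller can use refault_distance() to decide how to treat the refaulting
folio.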

And about large folios in the swap cache: if you look at the current
version of add_to_swap_cache in mainline (it adds a folio of any order
into the swap cache), it calls xas_create_range(&xas), which fills all
xarray slots in the entire range covered by the folio. But xarray
supports multi-index storing, making use of the nature of the radix
tree to save a lot of slots. E.g. for a 2M THP, previously 8 + 512
slots (8 extra xa nodes) are needed to store it; after this series it
only needs 8 slots by using a multi-index store (not sure if I did the
math right).

Same for shadows: when a folio is being deleted,
__delete_from_swap_cache currently walks the xarray with xas_next and
updates all 8 + 512 slots one by one; after this series only 8 stores
are needed (ignoring fragmentation).
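
A quick way to check the slot counts in the two paragraphs above,
written as a small Python sketch. It assumes the default XA_CHUNK_SHIFT
of 6 (64 slots per node), a naturally aligned folio, and it only counts
leaf slots plus the parent slots that point at fully covered leaf nodes.
The function names are made up for this example.

```python
XA_CHUNK_SHIFT = 6  # 64 slots per xa_node in the default configuration


def per_index_slots(order: int) -> int:
    """Slots touched when every index of the folio gets its own slot.

    Counts the leaf slots plus the parent slots pointing at the leaf
    nodes that the folio fills completely (zero for order < 6).
    """
    nr_pages = 1 << order
    full_leaf_nodes = nr_pages >> XA_CHUNK_SHIFT
    return full_leaf_nodes + nr_pages


def multi_index_slots(order: int) -> int:
    """Slots touched by one multi-index entry of the given order.

    The entry sits at the node level whose shift is the largest multiple
    of XA_CHUNK_SHIFT not above the order, and occupies one canonical
    slot plus (2 ** (order % XA_CHUNK_SHIFT)) - 1 sibling slots.
    """
    return 1 << (order % XA_CHUNK_SHIFT)


if __name__ == "__main__":
    order = 9  # 2M THP with 4K pages: 512 pages
    print("per-index  :", per_index_slots(order))
    print("multi-index:", multi_index_slots(order))
```

For order 9 this gives 8 + 512 = 520 slots for the per-index layout and
8 slots for the multi-index layout, which matches the numbers above.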

And upon swapin, I was talking about swapping in 1 subpage of a THP
folio when the folio is gone, leaving a few multi-index shadow slots.
The multi-index slots need to be split first (a multi-index slot has to
be updated as a whole or split first; __filemap_add_folio handles such
a split). I optimized and reused the routine from __filemap_add_folio
in this series, so without too much work it works perfectly for the
swap cache.
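
The split-then-store step can be sketched with a toy Python model (not
the xarray implementation). ToyMultiIndexCache and its methods are
hypothetical; the model keeps one entry per naturally aligned range and
assumes the covering entry is at least as large as the folio being
added, which is the case described above.

```python
class ToyMultiIndexCache:
    """Maps naturally aligned index ranges to one entry: {(start, order): entry}."""

    def __init__(self):
        self.ranges = {}

    def store(self, start, order, entry):
        assert start % (1 << order) == 0, "range must be naturally aligned"
        self.ranges[(start, order)] = entry

    def lookup(self, index):
        """Return (start, order, entry) of the range covering index, or None."""
        for (start, order), entry in self.ranges.items():
            if start <= index < start + (1 << order):
                return start, order, entry
        return None

    def split(self, index, new_order):
        """Split the range covering index into pieces of new_order.

        Every piece keeps the old entry (here: the shadow), so the
        indices that are not being swapped in still carry their info.
        """
        start, order, entry = self.lookup(index)
        if order <= new_order:
            return
        del self.ranges[(start, order)]
        step = 1 << new_order
        for piece in range(start, start + (1 << order), step):
            self.ranges[(piece, new_order)] = entry

    def add_folio(self, index, order, folio):
        """Insert a folio of the given order, splitting a larger entry first.

        Returns the shadow found at that position, if any.
        """
        found = self.lookup(index)
        shadow = None
        if found is not None:
            assert found[1] >= order, "model only handles a covering entry"
            shadow = found[2]
            self.split(index, order)
        aligned = index & ~((1 << order) - 1)
        self.store(aligned, order, folio)  # overwrites the piece at this range
        return shadow
```

For example, after store(0, 9, shadow), a call to add_folio(5, 0, folio)
returns the shadow, leaves the folio at index 5, and leaves the other
511 indices still holding the shadow. If THP swapin becomes possible,
add_folio(0, 9, folio) would replace the entry as a whole and skip the
split.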