From: Usama Arif <usamaarif642@gmail.com>
To: Yosry Ahmed <yosryahmed@google.com>, 21cnbao@gmail.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, david@redhat.com,
ying.huang@intel.com, hughd@google.com, willy@infradead.org,
nphamcs@gmail.com, chengming.zhou@linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
Shakeel Butt <shakeel.butt@linux.dev>
Subject: Re: [PATCH v3 1/2] mm: store zero pages to be swapped out in a bitmap
Date: Tue, 11 Jun 2024 12:49:27 +0100 [thread overview]
Message-ID: <9ddfe544-636d-4638-ae0e-053674e47322@gmail.com> (raw)
In-Reply-To: <CAJD7tkZoaM=dWim7GPqEERDZyxp3PqMNctzXQCPDMO=8kQSwfw@mail.gmail.com>
@@ -515,8 +600,11 @@ void swap_read_folio(struct folio *folio, bool
synchronous,
>>>> psi_memstall_enter(&pflags);
>>>> }
>>>> delayacct_swapin_start();
>>>> -
>>>> - if (zswap_load(folio)) {
>>>> + if (swap_zeromap_folio_test(folio)) {
>>>> + folio_zero_fill(folio);
>>>> + folio_mark_uptodate(folio);
>>>> + folio_unlock(folio);
>>> We don't currently support swapping in large folios, but it is a work
>>> in progress, and this will break once we have it.
>>> swap_zeromap_folio_test() will return false even if parts of the folio
>>> are in fact zero-filled. Then, we will go read those from disk swap,
>>> essentially corrupting data.
>> So yes, with this patch I tested swap out of large zero folio, but when
>> swapping in it was page by page. My assumption was that swap_read_folio
>> (when support is added) would only pass a large folio that was earlier
>> swapped out as a large folio. So if a zero filled large folio was
>> swapped out, the zeromap will be set for all the pages in that folio,
>> and at large folio swap in (when it is supported), it will see that all
>> the bits in the zeromap for that folio are set, and will just
>> folio_zero_fill.
>>
>> If even a single page in large folio has non-zero data then zeromap will
>> not store it and it will go to either zswap or disk, and at read time as
>> all the bits in zeromap are not set, it will still goto either zswap or
>> disk, so I think this works?
>>
>> Is my assumption wrong that only large folios can be swapped in only if
>> they were swapped out as large? I think this code works in that case.
> I think the assumption is incorrect. I think we would just check if
> contiguous PTEs have contiguous swap entries and swapin the folio as a
> large folio in this case. It is likely that the swap entries are
> contiguous because it was swapped out as a large folio, but I don't
> think it's guaranteed.
Yes, makes sense. Thanks for explaining this.
>
> For example, here is a patch that implements large swapin support for
> the synchronous swapin case, and I think it just checks that the PTEs
> have contiguous swap entries:
> https://lore.kernel.org/linux-mm/20240304081348.197341-6-21cnbao@gmail.com/
>
> This makes a lot of sense because otherwise you'd have to keep track
> of how the folios were composed at the time of swapout, to be able to
> swap the same folios back in.
I think the solution to large folio swap-in for this optimization and
zswap is not in swap_read_folio in this patch-series or any call further
down the stack, as its too late by the time you reach here, but in
Barrys' patch. This needs to happen much earlier when deciding the size
of the folio at folio creation in alloc_swap_folio, after Barry checks
if (is_zswap_enabled())
goto fallback;
once the order is decided, we would need to check the indexes in the
zeromap array starting from swap_offset(entry) and ending at
swap_offset(entry) + 2^order are set. If no bits are set, or all bits
are set, then we allocate large folio. Otherwise, we goto fallback.
I think its better to handle this in Barrys patch. I feel this series is
close to its final state, i.e. the only diff I have for the next
revision is below to remove start/end_writeback for zer_filled case. I
will comment on Barrys patch once the I send out the next revision of this.
diff --git a/mm/page_io.c b/mm/page_io.c
index 2cac1e11fb85..08a3772ddcf7 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -274,9 +274,7 @@ int swap_writepage(struct page *page, struct
writeback_control *wbc)
if (is_folio_zero_filled(folio)) {
swap_zeromap_folio_set(folio);
- folio_start_writeback(folio);
folio_unlock(folio);
- folio_end_writeback(folio);
return 0;
}
swap_zeromap_folio_clear(folio);
next prev parent reply other threads:[~2024-06-11 11:49 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-10 12:15 [PATCH v3 0/2] " Usama Arif
2024-06-10 12:15 ` [PATCH v3 1/2] " Usama Arif
2024-06-10 13:07 ` Matthew Wilcox
2024-06-10 13:56 ` Usama Arif
2024-06-10 14:06 ` Matthew Wilcox
2024-06-10 14:14 ` Usama Arif
2024-06-10 14:33 ` Usama Arif
2024-06-10 17:57 ` Yosry Ahmed
2024-06-10 18:36 ` Usama Arif
2024-06-10 18:47 ` Yosry Ahmed
2024-06-11 11:49 ` Usama Arif [this message]
2024-06-11 15:42 ` Yosry Ahmed
2024-06-11 16:52 ` Usama Arif
2024-06-11 17:51 ` Yosry Ahmed
2024-06-11 18:43 ` Usama Arif
2024-06-11 18:39 ` Nhat Pham
2024-06-11 18:46 ` Yosry Ahmed
2024-06-11 18:53 ` Nhat Pham
2024-06-11 18:50 ` Usama Arif
2024-06-11 19:33 ` Nhat Pham
2024-06-12 10:42 ` Usama Arif
2024-06-10 12:16 ` [PATCH v3 2/2] mm: remove code to handle same filled pages Usama Arif
2024-06-13 21:21 ` [PATCH v3 0/2] mm: store zero pages to be swapped out in a bitmap Yosry Ahmed
2024-06-14 9:22 ` Usama Arif
2024-06-14 9:28 ` Yosry Ahmed
2024-06-13 21:50 ` Yosry Ahmed
2024-06-13 22:41 ` Shakeel Butt
2024-06-13 22:59 ` Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9ddfe544-636d-4638-ae0e-053674e47322@gmail.com \
--to=usamaarif642@gmail.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=chengming.zhou@linux.dev \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nphamcs@gmail.com \
--cc=shakeel.butt@linux.dev \
--cc=willy@infradead.org \
--cc=ying.huang@intel.com \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox