From: David Hildenbrand <david@redhat.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
baolin.wang@linux.alibaba.com, chrisl@kernel.org,
hanchuanhua@oppo.com, hannes@cmpxchg.org, hughd@google.com,
kasong@tencent.com, linux-kernel@vger.kernel.org,
surenb@google.com, v-songbaohua@oppo.com, willy@infradead.org,
xiang@kernel.org, ying.huang@intel.com, yosryahmed@google.com,
yuzhao@google.com, ziy@nvidia.com
Subject: Re: [PATCH v3 6/6] mm: swap: entirely map large folios found in swapcache
Date: Tue, 7 May 2024 10:59:01 +0200 [thread overview]
Message-ID: <099a2c9e-f85e-4fe7-99dd-81b61115935c@redhat.com> (raw)
In-Reply-To: <CAGsJ_4w4vDu4p-ALrTSKMZhGKDK6TpyDLrAyvY4vkPJEJ3N5Lw@mail.gmail.com>
>> Let's assume a single subpage of a large folio is no longer mapped.
>> Then, we'd have:
>>
>> nr_pages == folio_nr_pages(folio) - 1.
>>
>> You could simply map+reuse most of the folio without COWing.
>
> yes. This is good but the pte which is no longer mapped could be
> anyone within the nr_pages PTEs. so it could be quite tricky for
> set_ptes.
The swap batching logic should take care of that, otherwise it would be
buggy.
>
>>
>> Once we support COW reuse of PTE-mapped THP we'd do the same. Here, it's
>> just easy to detect that the folio is exclusive (folio_ref_count(folio)
>> == 1 before mapping anything).
>>
>> If you really want to mimic what do_wp_page() currently does, you should
>> have:
>>
>> exclusive || (folio_ref_count(folio) == 1 && !folio_test_large(folio))
>
> I actually dislike the part that do_wp_page() handles the reuse of a large
> folio which is entirely mapped. For example, A forks B, B exit, we write
> A's large folio, we get nr_pages CoW of small folios. Ideally, we can
> reuse the whole folios for writing.
Yes, see the link I shared to what I am working on. There isn't really a
question if what we do right now needs to be improved and all these
scenarios are pretty obvious clear.
>
>>
>> Personally, I think we should keep it simple here and use:
>>
>> exclusive || folio_ref_count(folio) == 1
>
> I feel this is still better than
> exclusive || (folio_ref_count(folio) == 1 && !folio_test_large(folio))
> as we reuse the whole large folio. the do_wp_page() behaviour
> doesn't have this.
Yes, but there is the comment "Same logic as in do_wp_page();". We
already ran into issues having different COW reuse logic all over the
place. For this case here, I don't care if we leave it as
"exclusive || folio_ref_count(folio) == 1"
But let's not try inventing new stuff here.
--
Cheers,
David / dhildenb
next prev parent reply other threads:[~2024-05-07 8:59 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-03 0:50 [PATCH v3 0/6] large folios swap-in: handle refault cases first Barry Song
2024-05-03 0:50 ` [PATCH v3 1/6] mm: swap: introduce swap_free_nr() for batched swap_free() Barry Song
2024-05-03 9:26 ` Ryan Roberts
2024-05-03 20:25 ` Chris Li
2024-05-08 7:35 ` Huang, Ying
2024-05-03 0:50 ` [PATCH v3 2/6] mm: remove swap_free() and always use swap_free_nr() Barry Song
2024-05-03 9:31 ` Ryan Roberts
2024-05-03 20:37 ` Chris Li
2024-05-04 4:03 ` Christoph Hellwig
2024-05-04 4:27 ` Barry Song
2024-05-04 4:28 ` Christoph Hellwig
2024-05-04 4:47 ` Barry Song
2024-05-08 7:56 ` Huang, Ying
2024-05-08 8:30 ` Barry Song
2024-05-08 9:10 ` Ryan Roberts
2024-05-03 0:50 ` [PATCH v3 3/6] mm: introduce pte_move_swp_offset() helper which can move offset bidirectionally Barry Song
2024-05-03 9:41 ` Ryan Roberts
2024-05-03 23:40 ` Barry Song
2024-05-06 8:06 ` David Hildenbrand
2024-05-06 8:20 ` Barry Song
2024-05-06 8:31 ` David Hildenbrand
2024-05-07 8:14 ` Ryan Roberts
2024-05-07 8:24 ` Barry Song
2024-05-07 9:39 ` Ryan Roberts
2024-05-03 20:51 ` Chris Li
2024-05-03 23:07 ` Barry Song
2024-05-08 8:08 ` Huang, Ying
2024-05-03 0:50 ` [PATCH v3 4/6] mm: introduce arch_do_swap_page_nr() which allows restore metadata for nr pages Barry Song
2024-05-03 10:02 ` Ryan Roberts
2024-05-06 16:51 ` Khalid Aziz
2024-05-03 0:50 ` [PATCH v3 5/6] mm: swap: make should_try_to_free_swap() support large-folio Barry Song
2024-05-03 0:50 ` [PATCH v3 6/6] mm: swap: entirely map large folios found in swapcache Barry Song
2024-05-03 10:50 ` Ryan Roberts
2024-05-03 23:23 ` Barry Song
2024-05-06 12:07 ` David Hildenbrand
2024-05-06 12:38 ` Barry Song
2024-05-06 12:58 ` Barry Song
2024-05-06 13:16 ` David Hildenbrand
2024-05-06 22:58 ` Barry Song
2024-05-07 8:24 ` David Hildenbrand
2024-05-07 8:43 ` Barry Song
2024-05-07 8:59 ` David Hildenbrand [this message]
2024-05-07 9:24 ` Barry Song
2024-05-07 10:39 ` David Hildenbrand
2024-05-07 10:48 ` Barry Song
2024-05-07 8:17 ` Ryan Roberts
2024-05-06 12:05 ` David Hildenbrand
2024-05-06 12:27 ` Barry Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=099a2c9e-f85e-4fe7-99dd-81b61115935c@redhat.com \
--to=david@redhat.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=chrisl@kernel.org \
--cc=hanchuanhua@oppo.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kasong@tencent.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ryan.roberts@arm.com \
--cc=surenb@google.com \
--cc=v-songbaohua@oppo.com \
--cc=willy@infradead.org \
--cc=xiang@kernel.org \
--cc=ying.huang@intel.com \
--cc=yosryahmed@google.com \
--cc=yuzhao@google.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox