linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Liu Shixin <liushixin2@huawei.com>,
	 Baolin Wang <baolin.wang@linux.alibaba.com>,
	linux-mm@kvack.org,  Andrew Morton <akpm@linux-foundation.org>,
	Barry Song <baohua@kernel.org>,
	 David Hildenbrand <david@redhat.com>,
	 Kefeng Wang <wangkefeng.wang@huawei.com>,
	Lance Yang <ioworker0@gmail.com>,
	 Ryan Roberts <ryan.roberts@arm.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Hugh Dickins <hughd@google.com>,
	 Charan Teja Kalla <quic_charante@quicinc.com>,
	 linux-kernel@vger.kernel.org, Shivank Garg <shivankg@amd.com>
Subject: Re: [PATCH v2] mm/migrate: fix shmem xarray update during migration
Date: Tue, 4 Mar 2025 01:47:46 -0800 (PST)	[thread overview]
Message-ID: <23d65532-859a-e88f-9c24-06a6c7ff4006@google.com> (raw)
In-Reply-To: <20250228174953.2222831-1-ziy@nvidia.com>

On Fri, 28 Feb 2025, Zi Yan wrote:

> Pagecache uses multi-index entries for large folio, so does shmem. Only
> swap cache still stores multiple entries for a single large folio.
> Commit fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> fixed swap cache but got shmem wrong by storing multiple entries for
> a large shmem folio. Fix it by storing a single entry for a shmem
> folio.
> 
> Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> Reported-by: Liu Shixin <liushixin2@huawei.com>
> Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Shivank Garg <shivankg@amd.com>

It's a great find (I think), and your commit message is okay:
but unless I'm much mistaken, NAK to the patch itself.

First, I say "(I think)" there, because I don't actually know what the
loop writing the same folio nr times to the multi-index entry does to
the xarray: I can imagine it as being completely harmless, just nr
times more work than was needed.

But I guess it does something bad, since Matthew was horrified,
and we have all found that your patch appears to improve behaviour
(or at least improve behaviour in the context of your folio_split()
series: none of us noticed a problem before that, but it may be
that your new series is widening our exposure to existing bugs).

Maybe your orginal patch, with the shmem_mapping(mapping) check there,
was good, and it's only wrong when changed to !folio_test_anon(folio);
but TBH I find it too confusing, with the conditionals the way they are.
See my preferred alternative below.

The vital point is that multi-index entries are not used in swap cache:
whether the folio in question orginates from anon or from shmem.  And
it's easier to understand once you remember that a shmem folio is never
in both page cache and swap cache at the same time (well, there may be an
instant of transition from one to other while that folio is held locked) -
once it's in swap cache, folio->mapping is NULL and it's no longer
recognizable as from a shmem mapping.

The way I read your patch originally, I thought it meant that shmem
folios go into the swap cache as multi-index, but anon folios do not;
which seemed a worrying mixture to me.  But crashes on the
VM_BUG_ON_PAGE(entry != folio, entry) in __delete_from_swap_cache()
yesterday (with your patch in) led me to see how add_to_swap_cache()
inserts multiple non-multi-index entries, whether for anon or for shmem.

If this patch really is needed in old releases, then I suspect that
mm/huge_memory.c needs correction there too; but let me explain in
a response to your folio_split() series.

> ---
>  mm/migrate.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 365c6daa8d1b..2c9669135a38 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -524,7 +524,11 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>  			folio_set_swapcache(newfolio);
>  			newfolio->private = folio_get_private(folio);
>  		}
> -		entries = nr;
> +		/* shmem uses high-order entry */
> +		if (!folio_test_anon(folio))
> +			entries = 1;
> +		else
> +			entries = nr;
>  	} else {
>  		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
>  		entries = 1;
> -- 
> 2.47.2

NAK to that patch above, here's how I think it should be:

Signed-off-by: Hugh Dickins <hughd@google.com>
---
 mm/migrate.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index fb19a18892c8..822776819ca6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -518,12 +518,12 @@ static int __folio_migrate_mapping(struct address_space *mapping,
 	if (folio_test_anon(folio) && folio_test_large(folio))
 		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
 	folio_ref_add(newfolio, nr); /* add cache reference */
-	if (folio_test_swapbacked(folio)) {
+	if (folio_test_swapbacked(folio))
 		__folio_set_swapbacked(newfolio);
-		if (folio_test_swapcache(folio)) {
-			folio_set_swapcache(newfolio);
-			newfolio->private = folio_get_private(folio);
-		}
+
+	if (folio_test_swapcache(folio)) {
+		folio_set_swapcache(newfolio);
+		newfolio->private = folio_get_private(folio);
 		entries = nr;
 	} else {
 		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
-- 
2.43.0


  parent reply	other threads:[~2025-03-04  9:48 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-28 17:49 Zi Yan
2025-03-01  1:59 ` Baolin Wang
2025-03-04  2:03 ` Zi Yan
2025-03-04  5:30   ` Greg KH
2025-03-04 17:00     ` Zi Yan
2025-03-04  9:47 ` Hugh Dickins [this message]
2025-03-04 17:18   ` Zi Yan
2025-03-04 20:07     ` Zi Yan
2025-03-05  3:22       ` Zi Yan
2025-03-05 19:55   ` Zi Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=23d65532-859a-e88f-9c24-06a6c7ff4006@google.com \
    --to=hughd@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=david@redhat.com \
    --cc=ioworker0@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liushixin2@huawei.com \
    --cc=quic_charante@quicinc.com \
    --cc=ryan.roberts@arm.com \
    --cc=shivankg@amd.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox