From: Muchun Song <songmuchun@bytedance.com>
To: Zi Yan <ziy@nvidia.com>
Cc: David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Xiongchun duan <duanxiongchun@bytedance.com>,
Lars Persson <lars.persson@axis.com>,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH v2 2/2] mm: fix missing cache flush for all tail pages of THP
Date: Tue, 25 Jan 2022 14:01:59 +0800 [thread overview]
Message-ID: <CAMZfGtVCUbTa6xv2R9x+_a5_dK4ZLUX8ECNkF8mkJgEy4kdnfQ@mail.gmail.com> (raw)
In-Reply-To: <7D7EB27F-DEA7-41AA-B24E-B61A2A1A5F07@nvidia.com>
On Tue, Jan 25, 2022 at 10:42 AM Zi Yan <ziy@nvidia.com> wrote:
>
> On 24 Jan 2022, at 20:55, Muchun Song wrote:
>
> > On Tue, Jan 25, 2022 at 3:22 AM Zi Yan <ziy@nvidia.com> wrote:
> >>
> >> On 24 Jan 2022, at 13:11, David Rientjes wrote:
> >>
> >>> On Mon, 24 Jan 2022, Muchun Song wrote:
> >>>
> >>>> The D-cache maintenance inside move_to_new_page() only consider one page,
> >>>> there is still D-cache maintenance issue for tail pages of THP. Fix this
> >>>> by not using flush_dcache_folio() since it is not backportable.
> >>>>
> >>>
> >>> The mention of being backportable suggests that we should backport this,
> >>> likely to 4.14+. So should it be marked as stable?
> >>
> >> Hmm, after more digging, I am not sure if the bug exists. For THP migration,
> >> flush_cache_range() is used in remove_migration_pmd(). The flush_dcache_page()
> >> was added by Lars Persson (cc’d) to solve the data corruption on MIPS[1],
> >> but THP migration is only enabled on x86_64, PPC_BOOK3S_64, and ARM64.
> >
> > I only mention the THP case. After some more thinking, I think the HugeTLB
> > should also be considered, Right? The HugeTLB is enabled on arm, arm64,
> > mips, parisc, powerpc, riscv, s390 and sh.
> >
>
> +Mike for HugeTLB
>
> If HugeTLB page migration also misses flush_dcache_page() on its tail pages,
> you will need a different patch for the commit introducing hugetlb page migration.
Agree. I think arm (see the following commit) has handled this issue, while most
others do not.
commit 0b19f93351dd ("ARM: mm: Add support for flushing HugeTLB pages.")
But I do not have any real devices to test if this issue exists on other archs.
In theory, it exists.
>
> >>
> >> To make code more consistent, I guess flush_cache_range() in remove_migration_pmd()
> >> can be removed, since it is superseded by the flush_dcache_page() below.
> >
> > From my point of view, flush_cache_range() in remove_migration_pmd() is
> > a wrong usage, which cannot replace flush_dcache_page(). I think the commit
> > c2cc499c5bcf ("mm compaction: fix of improper cache flush in migration code")
> > , which is similar to the situation here, can offer more infos.
> >
>
> Thanks for the information. That helps. But remove_migration_pmd() did not cause
> any issue at the commit pointed by Fixes but at the commit which enabled THP
> migration on IBM and ARM64, whichever came first.
>
> IIUC, there will be different versions of the fix targeting different stable
> trees:
>
> 1. pre-4.14, THP migration did not exist: you will need to fix the use of
> flush_dcache_page() at that time for HugeTLB page migration. Both flushing
> dcache page for all subpages and moving flush_dcache_page from
> remove_migration_pte() to move_to_new_page(). 4.9 and 4.4 are affected.
> But EOL of 4.4 is next month, so you might skip it.
>
> 2. 4.14 to before device public page is removed: your current fix will not
> apply directly, but the for loop works. flush_cache_range() in
> remove_migration_pmd() should be removed, since it is dead code based on
> the commit you mentioned. It might not be worth the effort to find when
> IBM and ARM64 enable THP migration.
>
> 3. after device public page is removed: your current fix will apply cleanly
> and the removal of flush_cache_range() in remove_migration_pmd() should
> be added.
>
> Let me know if it makes sense.
Make sense.
Thanks.
next prev parent reply other threads:[~2022-01-25 6:27 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-24 5:17 [PATCH v2 1/2] mm: thp: fix wrong cache flush in remove_migration_pmd() Muchun Song
2022-01-24 5:17 ` [PATCH v2 2/2] mm: fix missing cache flush for all tail pages of THP Muchun Song
2022-01-24 16:07 ` Zi Yan
2022-01-24 18:11 ` David Rientjes
2022-01-24 19:22 ` Zi Yan
2022-01-25 0:41 ` David Rientjes
2022-01-25 1:55 ` Muchun Song
2022-01-25 2:42 ` Zi Yan
2022-01-25 6:01 ` Muchun Song [this message]
2022-01-25 21:24 ` Mike Kravetz
2022-01-26 3:29 ` Muchun Song
2022-01-26 23:26 ` Mike Kravetz
2022-01-27 1:55 ` Muchun Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAMZfGtVCUbTa6xv2R9x+_a5_dK4ZLUX8ECNkF8mkJgEy4kdnfQ@mail.gmail.com \
--to=songmuchun@bytedance.com \
--cc=akpm@linux-foundation.org \
--cc=duanxiongchun@bytedance.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=lars.persson@axis.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
--cc=rientjes@google.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox