linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Zi Yan <ziy@nvidia.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Song Liu <songliubraving@fb.com>
Cc: Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Lance Yang <lance.yang@linux.dev>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Shuah Khan <shuah@kernel.org>,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH 7.2 v2 02/12] mm/khugepaged: add folio dirty check after try_to_unmap_flush()
Date: Tue, 14 Apr 2026 12:38:39 +0200	[thread overview]
Message-ID: <10aadfeb-86c6-47b2-b6ab-b86657aafb88@kernel.org> (raw)
In-Reply-To: <20260413192030.3275825-3-ziy@nvidia.com>

On 4/13/26 21:20, Zi Yan wrote:
> This check ensures the correctness of collapse read-only THPs for FSes
> after READ_ONLY_THP_FOR_FS is enabled by default for all FSes supporting
> PMD THP pagecache.
> 
> READ_ONLY_THP_FOR_FS only supports read-only fd and uses mapping->nr_thps
> and inode->i_writecount to prevent any write to read-only to-be-collapsed
> folios. In upcoming commits, READ_ONLY_THP_FOR_FS will be removed and the
> aforementioned mechanism will go away too. To ensure khugepaged functions
> as expected after the changes, rollback if any folio is dirty after
> try_to_unmap_flush() to , since a dirty folio means this read-only folio
> got some writes via mmap can happen between try_to_unmap() and
> try_to_unmap_flush() via cached TLB entries and khugepaged does not support
> collapse writable pagecache folios.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/khugepaged.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index d2f0acd2dac2..ec609e53082e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2121,6 +2121,24 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
>  	 */
>  	try_to_unmap_flush();
>  
> +	/*
> +	 * At this point, all folios are locked, unmapped, and all cached
> +	 * mappings in TLBs are flushed. No one else is able to write to these
> +	 * folios, since
> +	 * 1. writes via FS ops require folio locks (see write_begin_get_folio());
> +	 * 2. writes via mmap require taking a fault and locking folio locks.
> +	 *

maybe simplify to "folios, since that would require taking the folio lock first."

> +	 * khugepaged only works for read-only fd, make sure all folios are
> +	 * clean, since writes via mmap can happen between try_to_unmap() and
> +	 * try_to_unmap_flush() via cached TLB entries.


IIRC, after successful try_to_unmap() the PTE dirty bit would be synced to the 
folio. That's what you care about, not about any stale TLB entries.

The important part is that the 

So can't we simply test for dirty folios after the refcount check (where
we made sure the folio is no longer mapped)?



diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b2ac28ddd480..920e16067134 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2089,6 +2089,14 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
                        goto out_unlock;
                }
 
+               /* ... */
+               if (!is_shmem && folio_test_dirty(folio)) {
+                       result = SCAN_PAGE_DIRTY_OR_WRITEBACK;
+                       xas_unlock_irq(&xas);
+                       folio_putback_lru(folio);
+                       goto out_unlock;
+               }
+
                /*
                 * Accumulate the folios that are being collapsed.


I guess we don't have to recheck folio_test_writeback() ?

-- 
Cheers,

David


  parent reply	other threads:[~2026-04-14 10:38 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-13 19:20 [PATCH 7.2 v2 00/12] Remove read-only THP support for FSes without large folio support Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 01/12] mm/khugepaged: remove READ_ONLY_THP_FOR_FS check Zi Yan
2026-04-13 20:20   ` Matthew Wilcox
2026-04-13 20:34     ` Zi Yan
2026-04-14 10:19       ` David Hildenbrand (Arm)
2026-04-14 10:20       ` David Hildenbrand (Arm)
2026-04-14 10:29   ` David Hildenbrand (Arm)
2026-04-14 15:37     ` Lance Yang
2026-04-14 15:43       ` Lance Yang
2026-04-14 15:59         ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 02/12] mm/khugepaged: add folio dirty check after try_to_unmap_flush() Zi Yan
2026-04-13 20:23   ` Matthew Wilcox
2026-04-13 20:28     ` Zi Yan
2026-04-14 10:38   ` David Hildenbrand (Arm) [this message]
2026-04-14 15:55     ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 03/12] mm/huge_memory: remove READ_ONLY_THP_FOR_FS from file_thp_enabled() Zi Yan
2026-04-14 10:40   ` David Hildenbrand (Arm)
2026-04-14 15:59     ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 04/12] mm: remove READ_ONLY_THP_FOR_FS Kconfig option Zi Yan
2026-04-14 10:40   ` David Hildenbrand (Arm)
2026-04-13 19:20 ` [PATCH 7.2 v2 05/12] mm/khugepaged: remove READ_ONLY_THP_FOR_FS check in hugepage_pmd_enabled() Zi Yan
2026-04-13 20:33   ` Matthew Wilcox
2026-04-13 20:42     ` Zi Yan
2026-04-14 11:02       ` David Hildenbrand (Arm)
2026-04-14 16:30         ` Zi Yan
2026-04-14 18:14           ` David Hildenbrand (Arm)
2026-04-14 18:25             ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 06/12] mm: fs: remove filemap_nr_thps*() functions and their users Zi Yan
2026-04-13 20:35   ` Matthew Wilcox
2026-04-14 11:02   ` David Hildenbrand (Arm)
2026-04-13 19:20 ` [PATCH 7.2 v2 07/12] fs: remove nr_thps from struct address_space Zi Yan
2026-04-13 20:38   ` Matthew Wilcox
2026-04-13 19:20 ` [PATCH 7.2 v2 08/12] mm/huge_memory: remove folio split check for READ_ONLY_THP_FOR_FS Zi Yan
2026-04-13 20:41   ` Matthew Wilcox
2026-04-13 20:46     ` Zi Yan
2026-04-14 11:03       ` David Hildenbrand (Arm)
2026-04-13 19:20 ` [PATCH 7.2 v2 09/12] mm/truncate: use folio_split() in truncate_inode_partial_folio() Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 10/12] fs/btrfs: remove a comment referring to READ_ONLY_THP_FOR_FS Zi Yan
2026-04-14 11:06   ` David Hildenbrand (Arm)
2026-04-13 19:20 ` [PATCH 7.2 v2 11/12] selftests/mm: remove READ_ONLY_THP_FOR_FS in khugepaged Zi Yan
2026-04-14 11:06   ` David Hildenbrand (Arm)
2026-04-13 19:20 ` [PATCH 7.2 v2 12/12] selftests/mm: remove READ_ONLY_THP_FOR_FS from comments in guard-regions Zi Yan
2026-04-13 20:47   ` Matthew Wilcox
2026-04-13 20:51     ` Zi Yan
2026-04-13 22:28       ` Matthew Wilcox
2026-04-14 11:09         ` David Hildenbrand (Arm)
2026-04-14 16:45           ` Zi Yan
2026-04-14 17:40             ` Matthew Wilcox
2026-04-14 17:53               ` Zi Yan
2026-04-14 11:07   ` David Hildenbrand (Arm)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=10aadfeb-86c6-47b2-b6ab-b86657aafb88@kernel.org \
    --to=david@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=brauner@kernel.org \
    --cc=clm@fb.com \
    --cc=dev.jain@arm.com \
    --cc=dsterba@suse.com \
    --cc=jack@suse.cz \
    --cc=lance.yang@linux.dev \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=mhocko@suse.com \
    --cc=npache@redhat.com \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=shuah@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox