linux-mm.kvack.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: David Stevens <stevensd@chromium.org>
Cc: linux-mm@kvack.org, Peter Xu <peterx@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Yang Shi <shy828301@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/khugepaged: skip shmem with userfaultfd
Date: Mon, 6 Feb 2023 19:01:39 +0000
Message-ID: <Y+FOk+ty7OKmkwLL@casper.infradead.org>
In-Reply-To: <20230206112856.1802547-1-stevensd@google.com>

On Mon, Feb 06, 2023 at 08:28:56PM +0900, David Stevens wrote:
> This change first makes sure that the intermediate page cache state
> during collapse is not visible by moving when gaps are filled to after
> the page cache lock is acquired for the final time. This is necessary
> because the synchronization provided by locking hpage is insufficient
> for functions which operate on the page cache without actually locking
> individual pages to examine their content (e.g. shmem_mfill_atomic_pte).

I've been a little scared of touching khugepaged because, well, look at
that function.  But if we are going to touch it, how about this patch
first?  It does _part_ of what you need by not filling in the holes,
but obviously not the part that looks at uffd.  

It leaves the old pages in place and frozen.  I think this should be
safe, but I haven't booted it (I'm not entirely sure what test I'd run
to prove that it's not broken).
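
To make the rollback argument concrete, here is a userspace toy model.
It is not kernel code; toy_page, cache, toy_freeze_range() and
toy_rollback() are names made up for illustration, and the expected
refcount of 2 is a simplification that only echoes the
page_ref_unfreeze(page, 2) call in the patch.  The point it tries to
show: if the scan never overwrites a slot, rollback only has to
unfreeze what was frozen, and there are no holes to put back.

```c
/* Userspace toy model only; every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 8

struct toy_page {
	int refcount;	/* 0 models a frozen page */
	bool locked;
};

/* A NULL slot models a hole in the page cache. */
static struct toy_page *cache[NR_SLOTS];

/*
 * Freeze every present page where it sits.  Holes are counted but the
 * slot is left untouched.  Returns 0 on success, -1 if a page has an
 * unexpected reference (the caller is then expected to roll back).
 */
static int toy_freeze_range(int *nr_none)
{
	*nr_none = 0;
	for (size_t i = 0; i < NR_SLOTS; i++) {
		struct toy_page *page = cache[i];

		if (!page) {
			(*nr_none)++;
			continue;
		}
		if (page->refcount != 2)
			return -1;
		page->locked = true;
		page->refcount = 0;
	}
	return 0;
}

/* Rollback: unfreeze and unlock.  No slot is ever rewritten. */
static void toy_rollback(void)
{
	for (size_t i = 0; i < NR_SLOTS; i++) {
		struct toy_page *page = cache[i];

		if (!page || !page->locked)
			continue;
		assert(page->refcount == 0);
		page->refcount = 2;
		page->locked = false;
	}
}
```

In this model a reader that only inspects cache[i] sees the same
pointers and the same holes before the freeze, during it, and after a
rollback.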

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index eb38bd1b1b2f..cfd33dff7253 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1845,15 +1845,14 @@ static int retract_page_tables(struct address_space *mapping, pgoff_t pgoff,
  *  - allocate and lock a new huge page;
  *  - scan page cache replacing old pages with the new one
  *    + swap/gup in pages if necessary;
- *    + fill in gaps;
+ *    + freeze the old pages;
  *    + keep old pages around in case rollback is required;
  *  - if replacing succeeds:
  *    + copy data over;
  *    + free old pages;
  *    + unlock huge page;
  *  - if replacing failed;
- *    + put all pages back and unfreeze them;
- *    + restore gaps in the page cache;
+ *    + unfreeze old pages;
  *    + unlock and free huge page;
  */
 static int collapse_file(struct mm_struct *mm, unsigned long addr,
@@ -1930,7 +1929,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 					result = SCAN_FAIL;
 					goto xa_locked;
 				}
-				xas_store(&xas, hpage);
 				nr_none++;
 				continue;
 			}
@@ -2081,8 +2079,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 		 */
 		list_add_tail(&page->lru, &pagelist);
 
-		/* Finally, replace with the new page. */
-		xas_store(&xas, hpage);
 		continue;
 out_unlock:
 		unlock_page(page);
@@ -2195,32 +2191,14 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 			shmem_uncharge(mapping->host, nr_none);
 		}
 
-		xas_set(&xas, start);
-		xas_for_each(&xas, page, end - 1) {
+		list_for_each_entry_safe(page, tmp, &pagelist, lru) {
-			page = list_first_entry_or_null(&pagelist,
-					struct page, lru);
-			if (!page || xas.xa_index < page->index) {
-				if (!nr_none)
-					break;
-				nr_none--;
-				/* Put holes back where they were */
-				xas_store(&xas, NULL);
-				continue;
-			}
-
-			VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
 
 			/* Unfreeze the page. */
 			list_del(&page->lru);
 			page_ref_unfreeze(page, 2);
-			xas_store(&xas, page);
-			xas_pause(&xas);
-			xas_unlock_irq(&xas);
 			unlock_page(page);
 			putback_lru_page(page);
-			xas_lock_irq(&xas);
 		}
-		VM_BUG_ON(nr_none);
 		/*
 		 * Undo the updates of filemap_nr_thps_inc for non-SHMEM file only.
 		 * This undo is not needed unless failure is due to SCAN_COPY_MC.
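
The rollback loop unlinks each page from pagelist while walking it,
which is only safe if the next entry is saved before the current one
is unlinked.  Below is a minimal userspace sketch of that pattern.
toy_node and the toy_* helpers are invented for this example; they
mimic the shape of the kernel's list_for_each_entry_safe() and
list_del() but are not the kernel implementations.

```c
/* Userspace sketch only; toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

struct toy_node {
	struct toy_node *prev, *next;
	int frozen;
};

static void toy_list_init(struct toy_node *head)
{
	head->prev = head->next = head;
}

static void toy_list_add_tail(struct toy_node *n, struct toy_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void toy_list_del(struct toy_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	/* Poison the links so any use after unlinking is obvious. */
	n->prev = n->next = NULL;
}

/*
 * Unlink and unfreeze every node.  tmp always holds the successor
 * fetched before n is unlinked, so the walk never touches n's
 * poisoned links.  Returns the number of nodes processed.
 */
static int toy_unfreeze_all(struct toy_node *head)
{
	struct toy_node *n, *tmp;
	int count = 0;

	for (n = head->next, tmp = n->next; n != head;
	     n = tmp, tmp = n->next) {
		toy_list_del(n);
		n->frozen = 0;
		count++;
	}
	return count;
}
```

Re-reading the list head inside the loop body and unlinking again
would skip or double-unlink entries, which is why the loop body here
touches only the current node.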

