From: Matthew Wilcox <willy@infradead.org>
To: David Stevens <stevensd@chromium.org>
Cc: linux-mm@kvack.org, Peter Xu <peterx@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Kirill A . Shutemov" <kirill@shutemov.name>,
Yang Shi <shy828301@gmail.com>,
David Hildenbrand <david@redhat.com>,
Hugh Dickins <hughd@google.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/khugepaged: skip shmem with userfaultfd
Date: Mon, 6 Feb 2023 19:01:39 +0000
Message-ID: <Y+FOk+ty7OKmkwLL@casper.infradead.org>
In-Reply-To: <20230206112856.1802547-1-stevensd@google.com>
On Mon, Feb 06, 2023 at 08:28:56PM +0900, David Stevens wrote:
> This change first makes sure that the intermediate page cache state
> during collapse is not visible by moving when gaps are filled to after
> the page cache lock is acquired for the final time. This is necessary
> because the synchronization provided by locking hpage is insufficient
> for functions which operate on the page cache without actually locking
> individual pages to examine their content (e.g. shmem_mfill_atomic_pte).
I've been a little scared of touching khugepaged because, well, look at
that function. But if we are going to touch it, how about this patch
first? It does _part_ of what you need by not filling in the holes,
but obviously not the part that looks at uffd.

It leaves the old pages in-place and frozen. I think this should be
safe, but I haven't booted it (not entirely sure what test I'd run
to prove that it's not broken).
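
For anyone who wants the idea without reading collapse_file(), here is a
rough userspace model of "leave the old pages in place, frozen, and just
unfreeze them on failure". It is only a sketch: struct fake_page,
fake_ref_freeze(), fake_ref_unfreeze(), try_collapse() and EXPECTED_REFS
are made-up stand-ins, not kernel code, and the refcount values are
illustrative rather than the ones khugepaged actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct fake_page {
	int refcount;            /* models the page reference count */
	bool locked;             /* models the page lock */
	struct fake_page *next;  /* models linkage on the local pagelist */
};

/* Illustrative value only; the real expected count differs. */
#define EXPECTED_REFS 2

/* Models page_ref_freeze(): succeeds only if nobody else holds a ref. */
static bool fake_ref_freeze(struct fake_page *p, int expected)
{
	if (p->refcount != expected)
		return false;
	p->refcount = 0;
	return true;
}

/* Models page_ref_unfreeze(): make the page usable again. */
static void fake_ref_unfreeze(struct fake_page *p, int count)
{
	p->refcount = count;
}

/* Rollback: the old pages never left their slots, so just thaw them. */
static void rollback(struct fake_page **pagelist)
{
	struct fake_page *p = *pagelist;

	while (p) {
		/* Save the successor before unlinking, as a _safe walk does. */
		struct fake_page *next = p->next;

		assert(p->refcount == 0); /* only frozen pages are listed */
		p->next = NULL;
		fake_ref_unfreeze(p, EXPECTED_REFS);
		p->locked = false;
		p = next;
	}
	*pagelist = NULL;
}

/*
 * Freeze every page in turn. On the first failure, undo everything and
 * return false. On success the pages stay frozen; a real collapse would
 * then copy the data over and free them.
 */
static bool try_collapse(struct fake_page *pages, size_t n)
{
	struct fake_page *pagelist = NULL;

	for (size_t i = 0; i < n; i++) {
		pages[i].locked = true;
		if (!fake_ref_freeze(&pages[i], EXPECTED_REFS)) {
			pages[i].locked = false;
			rollback(&pagelist);
			return false;
		}
		pages[i].next = pagelist;
		pagelist = &pages[i];
	}
	return true;
}
```

The point of the model is that rollback() never has to touch the lookup
structure: nothing was stored over the old pages or the holes, so there
is nothing to put back.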
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index eb38bd1b1b2f..cfd33dff7253 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1845,15 +1845,14 @@ static int retract_page_tables(struct address_space *mapping, pgoff_t pgoff,
* - allocate and lock a new huge page;
* - scan page cache replacing old pages with the new one
* + swap/gup in pages if necessary;
- * + fill in gaps;
+ * + freeze the old pages
* + keep old pages around in case rollback is required;
* - if replacing succeeds:
* + copy data over;
* + free old pages;
* + unlock huge page;
* - if replacing failed;
- * + put all pages back and unfreeze them;
- * + restore gaps in the page cache;
+ * + unfreeze old pages;
* + unlock and free huge page;
*/
static int collapse_file(struct mm_struct *mm, unsigned long addr,
@@ -1930,7 +1929,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
result = SCAN_FAIL;
goto xa_locked;
}
- xas_store(&xas, hpage);
nr_none++;
continue;
}
@@ -2081,8 +2079,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
*/
list_add_tail(&page->lru, &pagelist);
- /* Finally, replace with the new page. */
- xas_store(&xas, hpage);
continue;
out_unlock:
unlock_page(page);
@@ -2195,32 +2191,14 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
shmem_uncharge(mapping->host, nr_none);
}
- xas_set(&xas, start);
- xas_for_each(&xas, page, end - 1) {
- page = list_first_entry_or_null(&pagelist,
- struct page, lru);
- if (!page || xas.xa_index < page->index) {
- if (!nr_none)
- break;
- nr_none--;
- /* Put holes back where they were */
- xas_store(&xas, NULL);
- continue;
- }
-
- VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
+ list_for_each_entry_safe(page, tmp, &pagelist, lru) {
/* Unfreeze the page. */
list_del(&page->lru);
page_ref_unfreeze(page, 2);
- xas_store(&xas, page);
- xas_pause(&xas);
- xas_unlock_irq(&xas);
unlock_page(page);
putback_lru_page(page);
- xas_lock_irq(&xas);
}
- VM_BUG_ON(nr_none);
/*
* Undo the updates of filemap_nr_thps_inc for non-SHMEM file only.
* This undo is not needed unless failure is due to SCAN_COPY_MC.