* [RFC PATCH 0/3] snapshot_page()
@ 2025-02-10 21:21 Matthew Wilcox (Oracle)
2025-02-10 21:21 ` [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping() Matthew Wilcox (Oracle)
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-02-10 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: Matthew Wilcox (Oracle), David Hildenbrand
This compiles fine, but I have not tested it in any meaningful way.
Anyone got a good test case to be sure I didn't break anything?
Matthew Wilcox (Oracle) (3):
mm: Constify folio_mapping() and swapcache_mapping()
mm: Create snapshot_page()
proc: Use snapshot_page() in kpageflags
fs/proc/page.c | 24 ++++++++++--------
include/linux/pagemap.h | 4 +--
mm/debug.c | 55 ++++++++---------------------------------
mm/internal.h | 2 ++
mm/swapfile.c | 2 +-
mm/util.c | 52 +++++++++++++++++++++++++++++++++++++-
6 files changed, 80 insertions(+), 59 deletions(-)
--
2.47.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping()
2025-02-10 21:21 [RFC PATCH 0/3] snapshot_page() Matthew Wilcox (Oracle)
@ 2025-02-10 21:21 ` Matthew Wilcox (Oracle)
2025-02-11 20:04 ` David Hildenbrand
2025-02-12 8:56 ` Shivank Garg
2025-02-10 21:21 ` [RFC PATCH 2/3] mm: Create snapshot_page() Matthew Wilcox (Oracle)
2025-02-10 21:21 ` [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags Matthew Wilcox (Oracle)
2 siblings, 2 replies; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-02-10 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: Matthew Wilcox (Oracle), David Hildenbrand
Neither of these functions modify their argument; make it const
so the compiler knows this and can optimise accordingly.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/pagemap.h | 4 ++--
mm/debug.c | 2 +-
mm/swapfile.c | 2 +-
mm/util.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 47bfc6b1b632..8a9b2d201706 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -532,8 +532,8 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping)
#endif
}
-struct address_space *folio_mapping(struct folio *);
-struct address_space *swapcache_mapping(struct folio *);
+struct address_space *folio_mapping(const struct folio *);
+struct address_space *swapcache_mapping(const struct folio *);
/**
* folio_file_mapping - Find the mapping this folio belongs to.
diff --git a/mm/debug.c b/mm/debug.c
index 8d2acf432385..fa3d9686034c 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -67,7 +67,7 @@ static const char *page_type_name(unsigned int page_type)
return page_type_names[i];
}
-static void __dump_folio(struct folio *folio, struct page *page,
+static void __dump_folio(const struct folio *folio, struct page *page,
unsigned long pfn, unsigned long idx)
{
struct address_space *mapping = folio_mapping(folio);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6e867c16ea93..ccaed8c2f761 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3658,7 +3658,7 @@ struct swap_info_struct *swp_swap_info(swp_entry_t entry)
/*
* out-of-line methods to avoid include hell.
*/
-struct address_space *swapcache_mapping(struct folio *folio)
+struct address_space *swapcache_mapping(const struct folio *folio)
{
return swp_swap_info(folio->swap)->swap_file->f_mapping;
}
diff --git a/mm/util.c b/mm/util.c
index b6b9684a1438..682ecdb1b1c2 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -845,7 +845,7 @@ struct anon_vma *folio_anon_vma(const struct folio *folio)
* You can call this for folios which aren't in the swap cache or page
* cache and it will return NULL.
*/
-struct address_space *folio_mapping(struct folio *folio)
+struct address_space *folio_mapping(const struct folio *folio)
{
struct address_space *mapping;
--
2.47.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC PATCH 2/3] mm: Create snapshot_page()
2025-02-10 21:21 [RFC PATCH 0/3] snapshot_page() Matthew Wilcox (Oracle)
2025-02-10 21:21 ` [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping() Matthew Wilcox (Oracle)
@ 2025-02-10 21:21 ` Matthew Wilcox (Oracle)
2025-02-12 8:54 ` Shivank Garg
2025-02-14 21:18 ` David Hildenbrand
2025-02-10 21:21 ` [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags Matthew Wilcox (Oracle)
2 siblings, 2 replies; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-02-10 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: Matthew Wilcox (Oracle), David Hildenbrand
Move the guts of __dump_page() into a new function called
snapshot_page(). With that done, __dump_page() becomes trivial so
inline it into dump_page().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/debug.c | 53 +++++++++------------------------------------------
mm/internal.h | 2 ++
mm/util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 61 insertions(+), 44 deletions(-)
diff --git a/mm/debug.c b/mm/debug.c
index fa3d9686034c..1ea5dacb1524 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -120,54 +120,19 @@ static void __dump_folio(const struct folio *folio, struct page *page,
2 * sizeof(struct page), false);
}
-static void __dump_page(const struct page *page)
-{
- struct folio *foliop, folio;
- struct page precise;
- unsigned long head;
- unsigned long pfn = page_to_pfn(page);
- unsigned long idx, nr_pages = 1;
- int loops = 5;
-
-again:
- memcpy(&precise, page, sizeof(*page));
- head = precise.compound_head;
- if ((head & 1) == 0) {
- foliop = (struct folio *)&precise;
- idx = 0;
- if (!folio_test_large(foliop))
- goto dump;
- foliop = (struct folio *)page;
- } else {
- foliop = (struct folio *)(head - 1);
- idx = folio_page_idx(foliop, page);
- }
-
- if (idx < MAX_FOLIO_NR_PAGES) {
- memcpy(&folio, foliop, 2 * sizeof(struct page));
- nr_pages = folio_nr_pages(&folio);
- foliop = &folio;
- }
-
- if (idx > nr_pages) {
- if (loops-- > 0)
- goto again;
- pr_warn("page does not match folio\n");
- precise.compound_head &= ~1UL;
- foliop = (struct folio *)&precise;
- idx = 0;
- }
-
-dump:
- __dump_folio(foliop, &precise, pfn, idx);
-}
-
void dump_page(const struct page *page, const char *reason)
{
+ struct folio stack_folio;
+ struct page stack_page;
+ const struct folio *folio;
+ unsigned long idx;
+
if (PagePoisoned(page))
pr_warn("page:%p is uninitialized and poisoned", page);
- else
- __dump_page(page);
+ else {
+ folio = snapshot_page(&stack_folio, &stack_page, &idx, page);
+ __dump_folio(folio, &stack_page, page_to_pfn(page), idx);
+ }
if (reason)
pr_warn("page dumped because: %s\n", reason);
dump_page_owner(page);
diff --git a/mm/internal.h b/mm/internal.h
index 109ef30fee11..8fdfea104068 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -856,6 +856,8 @@ static inline bool free_area_empty(struct free_area *area, int migratetype)
/* mm/util.c */
struct anon_vma *folio_anon_vma(const struct folio *folio);
+const struct folio *snapshot_page(struct folio *foliop, struct page *precise,
+ unsigned long *idxp, const struct page *unstable);
#ifdef CONFIG_MMU
void unmap_mapping_folio(struct folio *folio);
diff --git a/mm/util.c b/mm/util.c
index 682ecdb1b1c2..9f9cf3933eb1 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1239,3 +1239,53 @@ void flush_dcache_folio(struct folio *folio)
}
EXPORT_SYMBOL(flush_dcache_folio);
#endif
+
+/*
+ * If you have an unstable reference to a page, use this to get a
+ * somewhat-consistent (potentially outdated) snapshot. The consistency
+ * is limited to the page being contained in the folio. You need to pass in
+ * a scratch folio and scratch page, probably allocated on the stack.
+ * You get back a pointer to the scratch folio you passed in, marked
+ * as const to remind you not to modify this.
+ */
+const struct folio *snapshot_page(struct folio *foliop, struct page *precise,
+ unsigned long *idxp, const struct page *unstable)
+{
+ struct folio *folio;
+ unsigned long head;
+ unsigned long idx, nr_pages = 1;
+ int loops = 5;
+
+again:
+ memcpy(precise, unstable, sizeof(struct page));
+ head = precise->compound_head;
+ /* Open-coded !PageTail because page_is_fake_head() doesn't work here */
+ if ((head & 1) == 0) {
+ folio = (struct folio *)precise;
+ *idxp = 0;
+ /* Not a tail, not a head, we have a single page */
+ if (!folio_test_large(folio))
+ goto out;
+ folio = (struct folio *)unstable;
+ } else {
+ folio = (struct folio *)(head - 1);
+ *idxp = folio_page_idx(folio, unstable);
+ }
+
+ if (idx < MAX_FOLIO_NR_PAGES || folio_test_hugetlb(folio)) {
+ memcpy(foliop, folio, sizeof(struct folio));
+ nr_pages = folio_nr_pages(foliop);
+ folio = foliop;
+ }
+
+ if (idx > nr_pages) {
+ if (loops-- > 0)
+ goto again;
+ pr_warn("page does not match folio\n");
+ precise->compound_head &= ~1UL;
+ folio = (struct folio *)precise;
+ *idxp = 0;
+ }
+out:
+ return folio;
+}
--
2.47.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags
2025-02-10 21:21 [RFC PATCH 0/3] snapshot_page() Matthew Wilcox (Oracle)
2025-02-10 21:21 ` [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping() Matthew Wilcox (Oracle)
2025-02-10 21:21 ` [RFC PATCH 2/3] mm: Create snapshot_page() Matthew Wilcox (Oracle)
@ 2025-02-10 21:21 ` Matthew Wilcox (Oracle)
2025-02-11 21:17 ` Zi Yan
2025-02-12 8:57 ` Shivank Garg
2 siblings, 2 replies; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2025-02-10 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: Matthew Wilcox (Oracle), David Hildenbrand
syzbot has reported a number of oopses caused by reading
/proc/kpageflags racing with a folio being split / freed / allocated
and thus a page which starts out as a hed page becomes a tail page
during the read, which our assertions catch as an error.
To solve this problem, snapshot the page like dump_page() does.
Link: the syzbot reports
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/proc/page.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/fs/proc/page.c b/fs/proc/page.c
index a55f5acefa97..9ebbb1e963b4 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -17,6 +17,7 @@
#include <linux/kernel-page-flags.h>
#include <linux/uaccess.h>
#include "internal.h"
+#include "../mm/internal.h" /* snapshot_page() */
#define KPMSIZE sizeof(u64)
#define KPMMASK (KPMSIZE - 1)
@@ -106,9 +107,12 @@ static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
return ((kflags >> kbit) & 1) << ubit;
}
-u64 stable_page_flags(const struct page *page)
+u64 stable_page_flags(const struct page *unstable)
{
+ struct folio stack_folio;
+ struct page stack_page;
const struct folio *folio;
+ unsigned long idx;
unsigned long k;
unsigned long mapping;
bool is_anon;
@@ -118,9 +122,9 @@ u64 stable_page_flags(const struct page *page)
* pseudo flag: KPF_NOPAGE
* it differentiates a memory hole from a page with no flags
*/
- if (!page)
+ if (!unstable)
return 1 << KPF_NOPAGE;
- folio = page_folio(page);
+ folio = snapshot_page(&stack_folio, &stack_page, &idx, unstable);
k = folio->flags;
mapping = (unsigned long)folio->mapping;
@@ -129,7 +133,7 @@ u64 stable_page_flags(const struct page *page)
/*
* pseudo flags for the well known (anonymous) memory mapped pages
*/
- if (page_mapped(page))
+ if (page_mapped(&stack_page))
u |= 1 << KPF_MMAP;
if (is_anon) {
u |= 1 << KPF_ANON;
@@ -141,7 +145,7 @@ u64 stable_page_flags(const struct page *page)
* compound pages: export both head/tail info
* they together define a compound page's start/end pos and order
*/
- if (page == &folio->page)
+ if (idx == 0)
u |= kpf_copy_bit(k, KPF_COMPOUND_HEAD, PG_head);
else
u |= 1 << KPF_COMPOUND_TAIL;
@@ -162,14 +166,14 @@ u64 stable_page_flags(const struct page *page)
* Caveats on high order pages: PG_buddy and PG_slab will only be set
* on the head page.
*/
- if (PageBuddy(page))
+ if (PageBuddy(unstable))
u |= 1 << KPF_BUDDY;
- else if (page_count(page) == 0 && is_free_buddy_page(page))
+ else if (folio_ref_count(folio) == 0 && is_free_buddy_page(unstable))
u |= 1 << KPF_BUDDY;
- if (PageOffline(page))
+ if (folio_test_offline(folio))
u |= 1 << KPF_OFFLINE;
- if (PageTable(page))
+ if (folio_test_pgtable(folio))
u |= 1 << KPF_PGTABLE;
if (folio_test_slab(folio))
u |= 1 << KPF_SLAB;
@@ -203,7 +207,7 @@ u64 stable_page_flags(const struct page *page)
if (u & (1 << KPF_HUGE))
u |= kpf_copy_bit(k, KPF_HWPOISON, PG_hwpoison);
else
- u |= kpf_copy_bit(page->flags, KPF_HWPOISON, PG_hwpoison);
+ u |= kpf_copy_bit(stack_page.flags, KPF_HWPOISON, PG_hwpoison);
#endif
u |= kpf_copy_bit(k, KPF_RESERVED, PG_reserved);
--
2.47.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping()
2025-02-10 21:21 ` [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping() Matthew Wilcox (Oracle)
@ 2025-02-11 20:04 ` David Hildenbrand
2025-02-12 8:56 ` Shivank Garg
1 sibling, 0 replies; 11+ messages in thread
From: David Hildenbrand @ 2025-02-11 20:04 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-mm
On 10.02.25 22:21, Matthew Wilcox (Oracle) wrote:
> Neither of these functions modify their argument; make it const
> so the compiler knows this and can optimise accordingly.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags
2025-02-10 21:21 ` [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags Matthew Wilcox (Oracle)
@ 2025-02-11 21:17 ` Zi Yan
2025-02-12 8:57 ` Shivank Garg
1 sibling, 0 replies; 11+ messages in thread
From: Zi Yan @ 2025-02-11 21:17 UTC (permalink / raw)
To: Matthew Wilcox (Oracle); +Cc: linux-mm, David Hildenbrand
On 10 Feb 2025, at 16:21, Matthew Wilcox (Oracle) wrote:
> syzbot has reported a number of oopses caused by reading
> /proc/kpageflags racing with a folio being split / freed / allocated
> and thus a page which starts out as a hed page becomes a tail page
s/hed/head/
> during the read, which our assertions catch as an error.
>
> To solve this problem, snapshot the page like dump_page() does.
>
> Link: the syzbot reports
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> fs/proc/page.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 2/3] mm: Create snapshot_page()
2025-02-10 21:21 ` [RFC PATCH 2/3] mm: Create snapshot_page() Matthew Wilcox (Oracle)
@ 2025-02-12 8:54 ` Shivank Garg
2025-02-14 21:18 ` David Hildenbrand
1 sibling, 0 replies; 11+ messages in thread
From: Shivank Garg @ 2025-02-12 8:54 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-mm; +Cc: David Hildenbrand
Hi Willy,
On 2/11/2025 2:51 AM, Matthew Wilcox (Oracle) wrote:
< snip >
...
> +const struct folio *snapshot_page(struct folio *foliop, struct page *precise,
> + unsigned long *idxp, const struct page *unstable)
> +{
> + struct folio *folio;
> + unsigned long head;
> + unsigned long idx, nr_pages = 1;
> + int loops = 5;
> +
> +again:
> + memcpy(precise, unstable, sizeof(struct page));
> + head = precise->compound_head;
> + /* Open-coded !PageTail because page_is_fake_head() doesn't work here */
> + if ((head & 1) == 0) {
> + folio = (struct folio *)precise;
> + *idxp = 0;
> + /* Not a tail, not a head, we have a single page */
> + if (!folio_test_large(folio))
> + goto out;
> + folio = (struct folio *)unstable;
> + } else {
> + folio = (struct folio *)(head - 1);
> + *idxp = folio_page_idx(folio, unstable);
> + }
> +
> + if (idx < MAX_FOLIO_NR_PAGES || folio_test_hugetlb(folio)) {
idx is not initialized before use. I think you meant *idxp here.
> + memcpy(foliop, folio, sizeof(struct folio));
> + nr_pages = folio_nr_pages(foliop);
> + folio = foliop;
> + }
> +
> + if (idx > nr_pages) {
> + if (loops-- > 0)
> + goto again;
> + pr_warn("page does not match folio\n");
> + precise->compound_head &= ~1UL;
> + folio = (struct folio *)precise;
> + *idxp = 0;
> + }
> +out:
> + return folio;
> +}
Please consider adding my
Reviewed-by: Shivank Garg <shivankg@amd.com>
with this fix:
diff --git a/mm/util.c b/mm/util.c
index 9f9cf3933eb1..155493b71d28 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1253,7 +1253,7 @@ const struct folio *snapshot_page(struct folio *foliop, struct page *precise,
{
struct folio *folio;
unsigned long head;
- unsigned long idx, nr_pages = 1;
+ unsigned long nr_pages = 1;
int loops = 5;
again:
@@ -1272,13 +1272,13 @@ const struct folio *snapshot_page(struct folio *foliop, struct page *precise,
*idxp = folio_page_idx(folio, unstable);
}
- if (idx < MAX_FOLIO_NR_PAGES || folio_test_hugetlb(folio)) {
+ if (*idxp < MAX_FOLIO_NR_PAGES || folio_test_hugetlb(folio)) {
memcpy(foliop, folio, sizeof(struct folio));
nr_pages = folio_nr_pages(foliop);
folio = foliop;
}
- if (idx > nr_pages) {
+ if (*idxp > nr_pages) {
if (loops-- > 0)
goto again;
pr_warn("page does not match folio\n");
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping()
2025-02-10 21:21 ` [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping() Matthew Wilcox (Oracle)
2025-02-11 20:04 ` David Hildenbrand
@ 2025-02-12 8:56 ` Shivank Garg
1 sibling, 0 replies; 11+ messages in thread
From: Shivank Garg @ 2025-02-12 8:56 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-mm; +Cc: David Hildenbrand
On 2/11/2025 2:51 AM, Matthew Wilcox (Oracle) wrote:
> Neither of these functions modify their argument; make it const
> so the compiler knows this and can optimise accordingly.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> include/linux/pagemap.h | 4 ++--
> mm/debug.c | 2 +-
> mm/swapfile.c | 2 +-
> mm/util.c | 2 +-
> 4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 47bfc6b1b632..8a9b2d201706 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -532,8 +532,8 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping)
> #endif
> }
>
> -struct address_space *folio_mapping(struct folio *);
> -struct address_space *swapcache_mapping(struct folio *);
> +struct address_space *folio_mapping(const struct folio *);
> +struct address_space *swapcache_mapping(const struct folio *);
>
> /**
> * folio_file_mapping - Find the mapping this folio belongs to.
> diff --git a/mm/debug.c b/mm/debug.c
> index 8d2acf432385..fa3d9686034c 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -67,7 +67,7 @@ static const char *page_type_name(unsigned int page_type)
> return page_type_names[i];
> }
>
> -static void __dump_folio(struct folio *folio, struct page *page,
> +static void __dump_folio(const struct folio *folio, struct page *page,
> unsigned long pfn, unsigned long idx)
> {
> struct address_space *mapping = folio_mapping(folio);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 6e867c16ea93..ccaed8c2f761 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3658,7 +3658,7 @@ struct swap_info_struct *swp_swap_info(swp_entry_t entry)
> /*
> * out-of-line methods to avoid include hell.
> */
> -struct address_space *swapcache_mapping(struct folio *folio)
> +struct address_space *swapcache_mapping(const struct folio *folio)
> {
> return swp_swap_info(folio->swap)->swap_file->f_mapping;
> }
> diff --git a/mm/util.c b/mm/util.c
> index b6b9684a1438..682ecdb1b1c2 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -845,7 +845,7 @@ struct anon_vma *folio_anon_vma(const struct folio *folio)
> * You can call this for folios which aren't in the swap cache or page
> * cache and it will return NULL.
> */
> -struct address_space *folio_mapping(struct folio *folio)
> +struct address_space *folio_mapping(const struct folio *folio)
> {
> struct address_space *mapping;
>
This patch looks good to me.
Please consider
Reviewed-by: Shivank Garg <shivankg@amd.com>
Thanks,
Shivank
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags
2025-02-10 21:21 ` [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags Matthew Wilcox (Oracle)
2025-02-11 21:17 ` Zi Yan
@ 2025-02-12 8:57 ` Shivank Garg
1 sibling, 0 replies; 11+ messages in thread
From: Shivank Garg @ 2025-02-12 8:57 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-mm; +Cc: David Hildenbrand
On 2/11/2025 2:51 AM, Matthew Wilcox (Oracle) wrote:
> syzbot has reported a number of oopses caused by reading
> /proc/kpageflags racing with a folio being split / freed / allocated
> and thus a page which starts out as a hed page becomes a tail page
> during the read, which our assertions catch as an error.
>
> To solve this problem, snapshot the page like dump_page() does.
>
> Link: the syzbot reports
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> fs/proc/page.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index a55f5acefa97..9ebbb1e963b4 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -17,6 +17,7 @@
> #include <linux/kernel-page-flags.h>
> #include <linux/uaccess.h>
> #include "internal.h"
> +#include "../mm/internal.h" /* snapshot_page() */
>
> #define KPMSIZE sizeof(u64)
> #define KPMMASK (KPMSIZE - 1)
> @@ -106,9 +107,12 @@ static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
> return ((kflags >> kbit) & 1) << ubit;
> }
>
> -u64 stable_page_flags(const struct page *page)
> +u64 stable_page_flags(const struct page *unstable)
> {
> + struct folio stack_folio;
> + struct page stack_page;
> const struct folio *folio;
> + unsigned long idx;
> unsigned long k;
> unsigned long mapping;
> bool is_anon;
> @@ -118,9 +122,9 @@ u64 stable_page_flags(const struct page *page)
> * pseudo flag: KPF_NOPAGE
> * it differentiates a memory hole from a page with no flags
> */
> - if (!page)
> + if (!unstable)
> return 1 << KPF_NOPAGE;
> - folio = page_folio(page);
> + folio = snapshot_page(&stack_folio, &stack_page, &idx, unstable);
>
> k = folio->flags;
> mapping = (unsigned long)folio->mapping;
> @@ -129,7 +133,7 @@ u64 stable_page_flags(const struct page *page)
> /*
> * pseudo flags for the well known (anonymous) memory mapped pages
> */
> - if (page_mapped(page))
> + if (page_mapped(&stack_page))
> u |= 1 << KPF_MMAP;
> if (is_anon) {
> u |= 1 << KPF_ANON;
> @@ -141,7 +145,7 @@ u64 stable_page_flags(const struct page *page)
> * compound pages: export both head/tail info
> * they together define a compound page's start/end pos and order
> */
> - if (page == &folio->page)
> + if (idx == 0)
> u |= kpf_copy_bit(k, KPF_COMPOUND_HEAD, PG_head);
> else
> u |= 1 << KPF_COMPOUND_TAIL;
> @@ -162,14 +166,14 @@ u64 stable_page_flags(const struct page *page)
> * Caveats on high order pages: PG_buddy and PG_slab will only be set
> * on the head page.
> */
> - if (PageBuddy(page))
> + if (PageBuddy(unstable))
> u |= 1 << KPF_BUDDY;
> - else if (page_count(page) == 0 && is_free_buddy_page(page))
> + else if (folio_ref_count(folio) == 0 && is_free_buddy_page(unstable))
> u |= 1 << KPF_BUDDY;
>
> - if (PageOffline(page))
> + if (folio_test_offline(folio))
> u |= 1 << KPF_OFFLINE;
> - if (PageTable(page))
> + if (folio_test_pgtable(folio))
> u |= 1 << KPF_PGTABLE;
> if (folio_test_slab(folio))
> u |= 1 << KPF_SLAB;
> @@ -203,7 +207,7 @@ u64 stable_page_flags(const struct page *page)
> if (u & (1 << KPF_HUGE))
> u |= kpf_copy_bit(k, KPF_HWPOISON, PG_hwpoison);
> else
> - u |= kpf_copy_bit(page->flags, KPF_HWPOISON, PG_hwpoison);
> + u |= kpf_copy_bit(stack_page.flags, KPF_HWPOISON, PG_hwpoison);
> #endif
>
> u |= kpf_copy_bit(k, KPF_RESERVED, PG_reserved);
LGTM!
Please consider
Reviewed-by: Shivank Garg <shivankg@amd.com>
Thanks,
Shivank
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 2/3] mm: Create snapshot_page()
2025-02-10 21:21 ` [RFC PATCH 2/3] mm: Create snapshot_page() Matthew Wilcox (Oracle)
2025-02-12 8:54 ` Shivank Garg
@ 2025-02-14 21:18 ` David Hildenbrand
2025-04-17 13:52 ` David Hildenbrand
1 sibling, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2025-02-14 21:18 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-mm
On 10.02.25 22:21, Matthew Wilcox (Oracle) wrote:
> Move the guts of __dump_page() into a new function called
> snapshot_page(). With that done, __dump_page() becomes trivial so
> inline it into dump_page().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> mm/debug.c | 53 +++++++++------------------------------------------
> mm/internal.h | 2 ++
> mm/util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 61 insertions(+), 44 deletions(-)
>
> diff --git a/mm/debug.c b/mm/debug.c
> index fa3d9686034c..1ea5dacb1524 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -120,54 +120,19 @@ static void __dump_folio(const struct folio *folio, struct page *page,
> 2 * sizeof(struct page), false);
> }
>
> -static void __dump_page(const struct page *page)
> -{
> - struct folio *foliop, folio;
> - struct page precise;
> - unsigned long head;
> - unsigned long pfn = page_to_pfn(page);
> - unsigned long idx, nr_pages = 1;
> - int loops = 5;
> -
> -again:
> - memcpy(&precise, page, sizeof(*page));
> - head = precise.compound_head;
> - if ((head & 1) == 0) {
> - foliop = (struct folio *)&precise;
> - idx = 0;
> - if (!folio_test_large(foliop))
> - goto dump;
> - foliop = (struct folio *)page;
> - } else {
> - foliop = (struct folio *)(head - 1);
> - idx = folio_page_idx(foliop, page);
> - }
> -
> - if (idx < MAX_FOLIO_NR_PAGES) {
> - memcpy(&folio, foliop, 2 * sizeof(struct page));
> - nr_pages = folio_nr_pages(&folio);
> - foliop = &folio;
> - }
> -
> - if (idx > nr_pages) {
> - if (loops-- > 0)
> - goto again;
> - pr_warn("page does not match folio\n");
> - precise.compound_head &= ~1UL;
> - foliop = (struct folio *)&precise;
> - idx = 0;
> - }
> -
> -dump:
> - __dump_folio(foliop, &precise, pfn, idx);
> -}
> -
> void dump_page(const struct page *page, const char *reason)
> {
> + struct folio stack_folio;
> + struct page stack_page;
> + const struct folio *folio;
> + unsigned long idx;
> +
> if (PagePoisoned(page))
> pr_warn("page:%p is uninitialized and poisoned", page);
> - else
> - __dump_page(page);
> + else {
> + folio = snapshot_page(&stack_folio, &stack_page, &idx, page);
> + __dump_folio(folio, &stack_page, page_to_pfn(page), idx);
> + }
> if (reason)
> pr_warn("page dumped because: %s\n", reason);
> dump_page_owner(page);
> diff --git a/mm/internal.h b/mm/internal.h
> index 109ef30fee11..8fdfea104068 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -856,6 +856,8 @@ static inline bool free_area_empty(struct free_area *area, int migratetype)
>
> /* mm/util.c */
> struct anon_vma *folio_anon_vma(const struct folio *folio);
> +const struct folio *snapshot_page(struct folio *foliop, struct page *precise,
> + unsigned long *idxp, const struct page *unstable);
>
> #ifdef CONFIG_MMU
> void unmap_mapping_folio(struct folio *folio);
> diff --git a/mm/util.c b/mm/util.c
> index 682ecdb1b1c2..9f9cf3933eb1 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -1239,3 +1239,53 @@ void flush_dcache_folio(struct folio *folio)
> }
> EXPORT_SYMBOL(flush_dcache_folio);
> #endif
> +
> +/*
> + * If you have an unstable reference to a page, use this to get a
> + * somewhat-consistent (potentially outdated) snapshot. The consistency
> + * is limited to the page being contained in the folio. You need to pass in
> + * a scratch folio and scratch page, probably allocated on the stack.
> + * You get back a pointer to the scratch folio you passed in, marked
> + * as const to remind you not to modify this.
> + */
> +const struct folio *snapshot_page(struct folio *foliop, struct page *precise,
> + unsigned long *idxp, const struct page *unstable)
> +{
A better interface might be something like:
struct page_snapshot {
struct folio folio_snapshot;
struct page page_snapshot;
/* page_to_pfn(page_snapshot) etc. don't work. */
unsigned long pfn;
/* folio_page_idx(folio_snapshot, ...) etc. don't work. */
unsigned long idx;
};
void snapshot_page(struct page_snapshot *ps, struct page *page);
Or even (likely performance is less relevant)
struct page_snapshot snapshot_page(struct page *page);
__dump_folio() would then only accept that structure.
In the future, we could indicate what kind of memdesc the page was a
part of, and have "struct folio folio_snapshot;" be in a union with
other types we support to snapshot etc..
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 2/3] mm: Create snapshot_page()
2025-02-14 21:18 ` David Hildenbrand
@ 2025-04-17 13:52 ` David Hildenbrand
0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand @ 2025-04-17 13:52 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), linux-mm
> A better interface might be something like:
>
> struct page_snapshot {
> struct folio folio_snapshot;
> struct page page_snapshot;
> /* page_to_pfn(page_snapshot) etc. don't work. */
> unsigned long pfn;
> /* folio_page_idx(folio_snapshot, ...) etc. don't work. */
> unsigned long idx;
> };
>
> void snapshot_page(struct page_snapshot *ps, struct page *page);
>
> Or even (likely performance is less relevant)
>
> struct page_snapshot snapshot_page(struct page *page);
>
>
> __dump_folio() would then only accept that structure.
>
>
> In the future, we could indicate what kind of memdesc the page was a
> part of, and have "struct folio folio_snapshot;" be in a union with
> other types we support to snapshot etc..
Hi Willy,
either I or somebody from our team might have time to look into that.
Are you planning on following up on this here soon, or would it help you
if we take care of this?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-04-17 13:52 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-10 21:21 [RFC PATCH 0/3] snapshot_page() Matthew Wilcox (Oracle)
2025-02-10 21:21 ` [RFC PATCH 1/3] mm: Constify folio_mapping() and swapcache_mapping() Matthew Wilcox (Oracle)
2025-02-11 20:04 ` David Hildenbrand
2025-02-12 8:56 ` Shivank Garg
2025-02-10 21:21 ` [RFC PATCH 2/3] mm: Create snapshot_page() Matthew Wilcox (Oracle)
2025-02-12 8:54 ` Shivank Garg
2025-02-14 21:18 ` David Hildenbrand
2025-04-17 13:52 ` David Hildenbrand
2025-02-10 21:21 ` [RFC PATCH 3/3] proc: Use snapshot_page() in kpageflags Matthew Wilcox (Oracle)
2025-02-11 21:17 ` Zi Yan
2025-02-12 8:57 ` Shivank Garg
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox