* [PATCH 01/28] mm: Remove folio_pincount_ptr() and head_compound_pincount()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-12 3:05 ` John Hubbard
2023-01-11 14:28 ` [PATCH 02/28] mm: Convert head_subpages_mapcount() into folio_nr_pages_mapped() Matthew Wilcox (Oracle)
` (26 subsequent siblings)
27 siblings, 1 reply; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
We can use folio->_pincount directly, since all users are guarded by
tests of compound/large.
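As a sketch of that pattern (the helper below is hypothetical, not part of
this patch), every remaining user tests for a large folio before touching
the field, which is what makes the direct read safe:

static inline int folio_exact_pincount(struct folio *folio)
{
	/* _pincount only exists in large (order > 0) folios */
	if (folio_test_large(folio))
		return atomic_read(&folio->_pincount);
	/* small folios fold pins into the refcount via
	 * GUP_PIN_COUNTING_BIAS, so there is no exact count */
	return 0;
}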
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
Documentation/core-api/pin_user_pages.rst | 29 +++++++++++------------
include/linux/mm.h | 14 ++---------
include/linux/mm_types.h | 5 ----
mm/debug.c | 4 ++--
mm/gup.c | 8 +++----
mm/huge_memory.c | 4 ++--
mm/hugetlb.c | 4 ++--
mm/page_alloc.c | 9 ++++---
8 files changed, 32 insertions(+), 45 deletions(-)
diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index facafbdecb95..9fb0b1080d3b 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -55,18 +55,17 @@ flags the caller provides. The caller is required to pass in a non-null struct
pages* array, and the function then pins pages by incrementing each by a special
value: GUP_PIN_COUNTING_BIAS.
-For compound pages, the GUP_PIN_COUNTING_BIAS scheme is not used. Instead,
-an exact form of pin counting is achieved, by using the 2nd struct page
-in the compound page. A new struct page field, compound_pincount, has
-been added in order to support this.
-
-This approach for compound pages avoids the counting upper limit problems that
-are discussed below. Those limitations would have been aggravated severely by
-huge pages, because each tail page adds a refcount to the head page. And in
-fact, testing revealed that, without a separate compound_pincount field,
-page overflows were seen in some huge page stress tests.
-
-This also means that huge pages and compound pages do not suffer
+For large folios, the GUP_PIN_COUNTING_BIAS scheme is not used. Instead,
+the extra space available in the struct folio is used to store the
+pincount directly.
+
+This approach for large folios avoids the counting upper limit problems
+that are discussed below. Those limitations would have been aggravated
+severely by huge pages, because each tail page adds a refcount to the
+head page. And in fact, testing revealed that, without a separate pincount
+field, refcount overflows were seen in some huge page stress tests.
+
+This also means that huge pages and large folios do not suffer
from the false positives problem that is mentioned below.::
Function
@@ -264,9 +263,9 @@ place.)
Other diagnostics
=================
-dump_page() has been enhanced slightly, to handle these new counting
-fields, and to better report on compound pages in general. Specifically,
-for compound pages, the exact (compound_pincount) pincount is reported.
+dump_page() has been enhanced slightly to handle these new counting
+fields, and to better report on large folios in general. Specifically,
+for large folios, the exact pincount is reported.
References
==========
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 49e40766adb6..5683a25ce08e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1011,11 +1011,6 @@ static inline void folio_set_compound_dtor(struct folio *folio,
void destroy_large_folio(struct folio *folio);
-static inline int head_compound_pincount(struct page *head)
-{
- return atomic_read(compound_pincount_ptr(head));
-}
-
static inline void set_compound_order(struct page *page, unsigned int order)
{
page[1].compound_order = order;
@@ -1641,11 +1636,6 @@ static inline struct folio *pfn_folio(unsigned long pfn)
return page_folio(pfn_to_page(pfn));
}
-static inline atomic_t *folio_pincount_ptr(struct folio *folio)
-{
- return &folio_page(folio, 1)->compound_pincount;
-}
-
/**
* folio_maybe_dma_pinned - Report if a folio may be pinned for DMA.
* @folio: The folio.
@@ -1663,7 +1653,7 @@ static inline atomic_t *folio_pincount_ptr(struct folio *folio)
* expected to be able to deal gracefully with a false positive.
*
* For large folios, the result will be exactly correct. That's because
- * we have more tracking data available: the compound_pincount is used
+ * we have more tracking data available: the _pincount field is used
* instead of the GUP_PIN_COUNTING_BIAS scheme.
*
* For more information, please see Documentation/core-api/pin_user_pages.rst.
@@ -1674,7 +1664,7 @@ static inline atomic_t *folio_pincount_ptr(struct folio *folio)
static inline bool folio_maybe_dma_pinned(struct folio *folio)
{
if (folio_test_large(folio))
- return atomic_read(folio_pincount_ptr(folio)) > 0;
+ return atomic_read(&folio->_pincount) > 0;
/*
* folio_ref_count() is signed. If that refcount overflows, then
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 2b0a0595fc9e..c225d81eae83 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -443,11 +443,6 @@ static inline atomic_t *subpages_mapcount_ptr(struct page *page)
return &page[1].subpages_mapcount;
}
-static inline atomic_t *compound_pincount_ptr(struct page *page)
-{
- return &page[1].compound_pincount;
-}
-
/*
* Used for sizing the vmemmap region on some architectures
*/
diff --git a/mm/debug.c b/mm/debug.c
index 7f8e5f744e42..893c9dbf76ca 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -94,11 +94,11 @@ static void __dump_page(struct page *page)
page, page_ref_count(head), mapcount, mapping,
page_to_pgoff(page), page_to_pfn(page));
if (compound) {
- pr_warn("head:%p order:%u compound_mapcount:%d subpages_mapcount:%d compound_pincount:%d\n",
+ pr_warn("head:%p order:%u compound_mapcount:%d subpages_mapcount:%d pincount:%d\n",
head, compound_order(head),
head_compound_mapcount(head),
head_subpages_mapcount(head),
- head_compound_pincount(head));
+ atomic_read(&folio->_pincount));
}
#ifdef CONFIG_MEMCG
diff --git a/mm/gup.c b/mm/gup.c
index f45a3a5be53a..38ba1697dd61 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -111,7 +111,7 @@ static inline struct folio *try_get_folio(struct page *page, int refs)
* FOLL_GET: folio's refcount will be incremented by @refs.
*
* FOLL_PIN on large folios: folio's refcount will be incremented by
- * @refs, and its compound_pincount will be incremented by @refs.
+ * @refs, and its pincount will be incremented by @refs.
*
* FOLL_PIN on single-page folios: folio's refcount will be incremented by
* @refs * GUP_PIN_COUNTING_BIAS.
@@ -157,7 +157,7 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
* try_get_folio() is left intact.
*/
if (folio_test_large(folio))
- atomic_add(refs, folio_pincount_ptr(folio));
+ atomic_add(refs, &folio->_pincount);
else
folio_ref_add(folio,
refs * (GUP_PIN_COUNTING_BIAS - 1));
@@ -182,7 +182,7 @@ static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
if (flags & FOLL_PIN) {
node_stat_mod_folio(folio, NR_FOLL_PIN_RELEASED, refs);
if (folio_test_large(folio))
- atomic_sub(refs, folio_pincount_ptr(folio));
+ atomic_sub(refs, &folio->_pincount);
else
refs *= GUP_PIN_COUNTING_BIAS;
}
@@ -232,7 +232,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
*/
if (folio_test_large(folio)) {
folio_ref_add(folio, 1);
- atomic_add(1, folio_pincount_ptr(folio));
+ atomic_add(1, &folio->_pincount);
} else {
folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c13b1f67d14e..9570f03cdee4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2477,9 +2477,9 @@ static void __split_huge_page_tail(struct page *head, int tail,
* of swap cache pages that store the swp_entry_t in tail pages.
* Fix up and warn once if private is unexpectedly set.
*
- * What of 32-bit systems, on which head[1].compound_pincount overlays
+ * What of 32-bit systems, on which folio->_pincount overlays
* head[1].private? No problem: THP_SWAP is not enabled on 32-bit, and
- * compound_pincount must be 0 for folio_ref_freeze() to have succeeded.
+ * pincount must be 0 for folio_ref_freeze() to have succeeded.
*/
if (!folio_test_swapcache(page_folio(head))) {
VM_WARN_ON_ONCE_PAGE(page_tail->private != 0, page_tail);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 273a6522aa4c..15b2707c1600 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1476,7 +1476,7 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
atomic_set(folio_mapcount_ptr(folio), 0);
atomic_set(folio_subpages_mapcount_ptr(folio), 0);
- atomic_set(folio_pincount_ptr(folio), 0);
+ atomic_set(&folio->_pincount, 0);
for (i = 1; i < nr_pages; i++) {
p = folio_page(folio, i);
@@ -1998,7 +1998,7 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
}
atomic_set(folio_mapcount_ptr(folio), -1);
atomic_set(folio_subpages_mapcount_ptr(folio), 0);
- atomic_set(folio_pincount_ptr(folio), 0);
+ atomic_set(&folio->_pincount, 0);
return true;
out_error:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4d9afa1048ea..d1e5ec875fd0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -775,11 +775,13 @@ void free_compound_page(struct page *page)
static void prep_compound_head(struct page *page, unsigned int order)
{
+ struct folio *folio = (struct folio *)page;
+
set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
set_compound_order(page, order);
atomic_set(compound_mapcount_ptr(page), -1);
atomic_set(subpages_mapcount_ptr(page), 0);
- atomic_set(compound_pincount_ptr(page), 0);
+ atomic_set(&folio->_pincount, 0);
}
static void prep_compound_tail(struct page *head, int tail_idx)
@@ -1291,6 +1293,7 @@ static inline bool free_page_is_bad(struct page *page)
static int free_tail_pages_check(struct page *head_page, struct page *page)
{
+ struct folio *folio = (struct folio *)head_page;
int ret = 1;
/*
@@ -1314,8 +1317,8 @@ static int free_tail_pages_check(struct page *head_page, struct page *page)
bad_page(page, "nonzero subpages_mapcount");
goto out;
}
- if (unlikely(head_compound_pincount(head_page))) {
- bad_page(page, "nonzero compound_pincount");
+ if (unlikely(atomic_read(&folio->_pincount))) {
+ bad_page(page, "nonzero pincount");
goto out;
}
break;
--
2.35.1
* Re: [PATCH 01/28] mm: Remove folio_pincount_ptr() and head_compound_pincount()
2023-01-11 14:28 ` [PATCH 01/28] mm: Remove folio_pincount_ptr() and head_compound_pincount() Matthew Wilcox (Oracle)
@ 2023-01-12 3:05 ` John Hubbard
2023-01-12 12:40 ` Matthew Wilcox
0 siblings, 1 reply; 31+ messages in thread
From: John Hubbard @ 2023-01-12 3:05 UTC (permalink / raw)
To: Matthew Wilcox (Oracle), Andrew Morton; +Cc: linux-mm, Hugh Dickins
On 1/11/23 06:28, Matthew Wilcox (Oracle) wrote:
> We can use folio->_pincount directly, since all users are guarded by
> tests of compound/large.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> Documentation/core-api/pin_user_pages.rst | 29 +++++++++++------------
> include/linux/mm.h | 14 ++---------
> include/linux/mm_types.h | 5 ----
> mm/debug.c | 4 ++--
> mm/gup.c | 8 +++----
> mm/huge_memory.c | 4 ++--
> mm/hugetlb.c | 4 ++--
> mm/page_alloc.c | 9 ++++---
> 8 files changed, 32 insertions(+), 45 deletions(-)
Looks very nice, just a couple of questions about casts, below.
...
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4d9afa1048ea..d1e5ec875fd0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -775,11 +775,13 @@ void free_compound_page(struct page *page)
>
> static void prep_compound_head(struct page *page, unsigned int order)
> {
> + struct folio *folio = (struct folio *)page;
Casting, eh? I wonder if prep_compound_head() should just take a folio?
There are only a few callers of that.
> +
> set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
> set_compound_order(page, order);
> atomic_set(compound_mapcount_ptr(page), -1);
> atomic_set(subpages_mapcount_ptr(page), 0);
> - atomic_set(compound_pincount_ptr(page), 0);
> + atomic_set(&folio->_pincount, 0);
> }
>
> static void prep_compound_tail(struct page *head, int tail_idx)
> @@ -1291,6 +1293,7 @@ static inline bool free_page_is_bad(struct page *page)
>
> static int free_tail_pages_check(struct page *head_page, struct page *page)
> {
> + struct folio *folio = (struct folio *)head_page;
Similar to above: should the function just accept a folio instead of a
page?
Anyway, the patch as-is is good, and those points could be follow-up
changes if they are done, so either way, please feel free to add:
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH 01/28] mm: Remove folio_pincount_ptr() and head_compound_pincount()
2023-01-12 3:05 ` John Hubbard
@ 2023-01-12 12:40 ` Matthew Wilcox
0 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2023-01-12 12:40 UTC (permalink / raw)
To: John Hubbard; +Cc: Andrew Morton, linux-mm, Hugh Dickins
On Wed, Jan 11, 2023 at 07:05:29PM -0800, John Hubbard wrote:
> > +++ b/mm/page_alloc.c
> > @@ -775,11 +775,13 @@ void free_compound_page(struct page *page)
> > static void prep_compound_head(struct page *page, unsigned int order)
> > {
> > + struct folio *folio = (struct folio *)page;
>
> Casting, eh? I wonder if prep_compound_head() should just take a folio?
> There are only a few callers of that.
Yes, I think both of these functions should take a folio. They're both
static, so I was delaying that for later. At some point this year, I
intend to convert all of page_alloc.c away from using pages, but it's
not high on my todo list just yet.
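For reference, a rough sketch of the folio-taking variant being discussed
(hypothetical; the body matches this patch, only the signature changes and
the cast disappears):

static void prep_compound_head(struct folio *folio, unsigned int order)
{
	set_compound_page_dtor(&folio->page, COMPOUND_PAGE_DTOR);
	set_compound_order(&folio->page, order);
	atomic_set(compound_mapcount_ptr(&folio->page), -1);
	atomic_set(subpages_mapcount_ptr(&folio->page), 0);
	atomic_set(&folio->_pincount, 0);
}

Callers would then pass page_folio(page), or a folio they already hold,
instead of a bare head page.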
* [PATCH 02/28] mm: Convert head_subpages_mapcount() into folio_nr_pages_mapped()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 01/28] mm: Remove folio_pincount_ptr() and head_compound_pincount() Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 03/28] doc: Clarify refcount section by referring to folios & pages Matthew Wilcox (Oracle)
` (25 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Calling this 'mapcount' is confusing since mapcount is usually the number
of times something is mapped; instead this is the number of mapped pages.
It's also better to enforce that this is a folio rather than a head page.
Move folio_nr_pages_mapped() into mm/internal.h since this is not
something we want device drivers or filesystems poking at. Get rid of
folio_subpages_mapcount_ptr() and use folio->_nr_pages_mapped directly.
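A worked example of the encoding, using the values defined in this patch
(the scenario is illustrative, not from the source):

/*
 * A PMD-mapped THP that also has three of its pages mapped by PTE
 * stores both facts in the one atomic_t:
 *
 *	_nr_pages_mapped = COMPOUND_MAPPED + 3
 *			 = 0x800000 + 3 = 0x800003
 *
 * folio_nr_pages_mapped() masks with FOLIO_PAGES_MAPPED (0x7fffff)
 * and recovers 3, the number of PTE-mapped pages.
 */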
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm.h | 22 ++--------------------
include/linux/mm_types.h | 12 +++---------
mm/debug.c | 4 ++--
mm/hugetlb.c | 4 ++--
mm/internal.h | 18 ++++++++++++++++++
mm/rmap.c | 9 +++++----
6 files changed, 32 insertions(+), 37 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5683a25ce08e..aa151e69416d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -843,24 +843,6 @@ static inline int head_compound_mapcount(struct page *head)
return atomic_read(compound_mapcount_ptr(head)) + 1;
}
-/*
- * If a 16GB hugetlb page were mapped by PTEs of all of its 4kB sub-pages,
- * its subpages_mapcount would be 0x400000: choose the COMPOUND_MAPPED bit
- * above that range, instead of 2*(PMD_SIZE/PAGE_SIZE). Hugetlb currently
- * leaves subpages_mapcount at 0, but avoid surprise if it participates later.
- */
-#define COMPOUND_MAPPED 0x800000
-#define SUBPAGES_MAPPED (COMPOUND_MAPPED - 1)
-
-/*
- * Number of sub-pages mapped by PTE, does not include compound mapcount.
- * Must be called only on head of compound page.
- */
-static inline int head_subpages_mapcount(struct page *head)
-{
- return atomic_read(subpages_mapcount_ptr(head)) & SUBPAGES_MAPPED;
-}
-
/*
* The atomic page->_mapcount, starts from -1: so that transitions
* both from it and to it can be tracked, using atomic_inc_and_test
@@ -920,9 +902,9 @@ static inline bool folio_large_is_mapped(struct folio *folio)
{
/*
* Reading folio_mapcount_ptr() below could be omitted if hugetlb
- * participated in incrementing subpages_mapcount when compound mapped.
+ * participated in incrementing nr_pages_mapped when compound mapped.
*/
- return atomic_read(folio_subpages_mapcount_ptr(folio)) > 0 ||
+ return atomic_read(&folio->_nr_pages_mapped) > 0 ||
atomic_read(folio_mapcount_ptr(folio)) >= 0;
}
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c225d81eae83..aa2039f18f4d 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -307,7 +307,7 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
* @_folio_dtor: Which destructor to use for this folio.
* @_folio_order: Do not use directly, call folio_order().
* @_compound_mapcount: Do not use directly, call folio_entire_mapcount().
- * @_subpages_mapcount: Do not use directly, call folio_mapcount().
+ * @_nr_pages_mapped: Do not use directly, call folio_mapcount().
* @_pincount: Do not use directly, call folio_maybe_dma_pinned().
* @_folio_nr_pages: Do not use directly, call folio_nr_pages().
* @_flags_2: For alignment. Do not use.
@@ -361,7 +361,7 @@ struct folio {
unsigned char _folio_dtor;
unsigned char _folio_order;
atomic_t _compound_mapcount;
- atomic_t _subpages_mapcount;
+ atomic_t _nr_pages_mapped;
atomic_t _pincount;
#ifdef CONFIG_64BIT
unsigned int _folio_nr_pages;
@@ -404,7 +404,7 @@ FOLIO_MATCH(compound_head, _head_1);
FOLIO_MATCH(compound_dtor, _folio_dtor);
FOLIO_MATCH(compound_order, _folio_order);
FOLIO_MATCH(compound_mapcount, _compound_mapcount);
-FOLIO_MATCH(subpages_mapcount, _subpages_mapcount);
+FOLIO_MATCH(subpages_mapcount, _nr_pages_mapped);
FOLIO_MATCH(compound_pincount, _pincount);
#ifdef CONFIG_64BIT
FOLIO_MATCH(compound_nr, _folio_nr_pages);
@@ -427,12 +427,6 @@ static inline atomic_t *folio_mapcount_ptr(struct folio *folio)
return &tail->compound_mapcount;
}
-static inline atomic_t *folio_subpages_mapcount_ptr(struct folio *folio)
-{
- struct page *tail = &folio->page + 1;
- return &tail->subpages_mapcount;
-}
-
static inline atomic_t *compound_mapcount_ptr(struct page *page)
{
return &page[1].compound_mapcount;
diff --git a/mm/debug.c b/mm/debug.c
index 893c9dbf76ca..8e58e8dab0b2 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -94,10 +94,10 @@ static void __dump_page(struct page *page)
page, page_ref_count(head), mapcount, mapping,
page_to_pgoff(page), page_to_pfn(page));
if (compound) {
- pr_warn("head:%p order:%u compound_mapcount:%d subpages_mapcount:%d pincount:%d\n",
+ pr_warn("head:%p order:%u compound_mapcount:%d nr_pages_mapped:%d pincount:%d\n",
head, compound_order(head),
head_compound_mapcount(head),
- head_subpages_mapcount(head),
+ folio_nr_pages_mapped(folio),
atomic_read(&folio->_pincount));
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 15b2707c1600..c9702224931c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1475,7 +1475,7 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
struct page *p;
atomic_set(folio_mapcount_ptr(folio), 0);
- atomic_set(folio_subpages_mapcount_ptr(folio), 0);
+ atomic_set(&folio->_nr_pages_mapped, 0);
atomic_set(&folio->_pincount, 0);
for (i = 1; i < nr_pages; i++) {
@@ -1997,7 +1997,7 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
set_compound_head(p, &folio->page);
}
atomic_set(folio_mapcount_ptr(folio), -1);
- atomic_set(folio_subpages_mapcount_ptr(folio), 0);
+ atomic_set(&folio->_nr_pages_mapped, 0);
atomic_set(&folio->_pincount, 0);
return true;
diff --git a/mm/internal.h b/mm/internal.h
index f04b8fb57d90..57b4a6992ed4 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -52,6 +52,24 @@ struct folio_batch;
void page_writeback_init(void);
+/*
+ * If a 16GB hugetlb folio were mapped by PTEs of all of its 4kB pages,
+ * its nr_pages_mapped would be 0x400000: choose the COMPOUND_MAPPED bit
+ * above that range, instead of 2*(PMD_SIZE/PAGE_SIZE). Hugetlb currently
+ * leaves nr_pages_mapped at 0, but avoid surprise if it participates later.
+ */
+#define COMPOUND_MAPPED 0x800000
+#define FOLIO_PAGES_MAPPED (COMPOUND_MAPPED - 1)
+
+/*
+ * How many individual pages have an elevated _mapcount. Excludes
+ * the folio's entire_mapcount.
+ */
+static inline int folio_nr_pages_mapped(struct folio *folio)
+{
+ return atomic_read(&folio->_nr_pages_mapped) & FOLIO_PAGES_MAPPED;
+}
+
static inline void *folio_raw_mapping(struct folio *folio)
{
unsigned long mapping = (unsigned long)folio->mapping;
diff --git a/mm/rmap.c b/mm/rmap.c
index 7bc3b1fa7bc7..f08685054d0a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1080,12 +1080,13 @@ int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
int total_compound_mapcount(struct page *head)
{
+ struct folio *folio = (struct folio *)head;
int mapcount = head_compound_mapcount(head);
int nr_subpages;
int i;
/* In the common case, avoid the loop when no subpages mapped by PTE */
- if (head_subpages_mapcount(head) == 0)
+ if (folio_nr_pages_mapped(folio) == 0)
return mapcount;
/*
* Add all the PTE mappings of those subpages mapped by PTE.
@@ -1233,7 +1234,7 @@ void page_add_anon_rmap(struct page *page,
nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
nr_pmdmapped = thp_nr_pages(page);
- nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+ nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED);
/* Raced ahead of a remove and another add? */
if (unlikely(nr < 0))
nr = 0;
@@ -1337,7 +1338,7 @@ void page_add_file_rmap(struct page *page,
nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
nr_pmdmapped = thp_nr_pages(page);
- nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+ nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED);
/* Raced ahead of a remove and another add? */
if (unlikely(nr < 0))
nr = 0;
@@ -1399,7 +1400,7 @@ void page_remove_rmap(struct page *page,
nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
if (likely(nr < COMPOUND_MAPPED)) {
nr_pmdmapped = thp_nr_pages(page);
- nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+ nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED);
/* Raced ahead of another remove and an add? */
if (unlikely(nr < 0))
nr = 0;
--
2.35.1
* [PATCH 03/28] doc: Clarify refcount section by referring to folios & pages
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 01/28] mm: Remove folio_pincount_ptr() and head_compound_pincount() Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 02/28] mm: Convert head_subpages_mapcount() into folio_nr_pages_mapped() Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 04/28] mm: Convert total_compound_mapcount() to folio_total_mapcount() Matthew Wilcox (Oracle)
` (24 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Include the rename of subpages_mapcount to _nr_pages_mapped.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
Documentation/mm/transhuge.rst | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst
index ec3dc5b04226..03bbd0a19041 100644
--- a/Documentation/mm/transhuge.rst
+++ b/Documentation/mm/transhuge.rst
@@ -112,20 +112,20 @@ Refcounts and transparent huge pages
Refcounting on THP is mostly consistent with refcounting on other compound
pages:
- - get_page()/put_page() and GUP operate on head page's ->_refcount.
+ - get_page()/put_page() and GUP operate on the folio->_refcount.
- ->_refcount in tail pages is always zero: get_page_unless_zero() never
succeeds on tail pages.
- - map/unmap of PMD entry for the whole compound page increment/decrement
- ->compound_mapcount, stored in the first tail page of the compound page;
- and also increment/decrement ->subpages_mapcount (also in the first tail)
- by COMPOUND_MAPPED when compound_mapcount goes from -1 to 0 or 0 to -1.
+ - map/unmap of a PMD entry for the whole THP increment/decrement
+ folio->_entire_mapcount and also increment/decrement
+ folio->_nr_pages_mapped by COMPOUND_MAPPED when _entire_mapcount
+ goes from -1 to 0 or 0 to -1.
- - map/unmap of sub-pages with PTE entry increment/decrement ->_mapcount
- on relevant sub-page of the compound page, and also increment/decrement
- ->subpages_mapcount, stored in first tail page of the compound page, when
- _mapcount goes from -1 to 0 or 0 to -1: counting sub-pages mapped by PTE.
+ - map/unmap of individual pages with PTE entry increment/decrement
+ page->_mapcount and also increment/decrement folio->_nr_pages_mapped
+ when page->_mapcount goes from -1 to 0 or 0 to -1 as this counts
+ the number of pages mapped by PTE.
split_huge_page internally has to distribute the refcounts in the head
page to the tail pages before clearing all PG_head/tail bits from the page
--
2.35.1
* [PATCH 04/28] mm: Convert total_compound_mapcount() to folio_total_mapcount()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (2 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 03/28] doc: Clarify refcount section by referring to folios & pages Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 05/28] mm: Convert page_remove_rmap() to use a folio internally Matthew Wilcox (Oracle)
` (23 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Instead of enforcing that the argument must be a head page by naming,
enforce it with the compiler by making it a folio. Also rename the
counter in struct folio from _compound_mapcount to _entire_mapcount.
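A worked example of what folio_total_mapcount() computes (illustrative
numbers, following the loop in this patch):

/*
 * A PMD-mapped THP of 512 pages with two pages also mapped by PTE:
 *
 *	folio_entire_mapcount()				=    1
 *	sum of per-page _mapcount (2 at 0, 510 at -1)	= -510
 *	correction for the -1 bias: + nr_pages		= +512
 *
 *	total mapcount = 1 - 510 + 512 = 3
 */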
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm.h | 6 +++---
include/linux/mm_types.h | 6 +++---
mm/rmap.c | 21 ++++++++++-----------
3 files changed, 16 insertions(+), 17 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index aa151e69416d..8bddc7810f78 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -871,7 +871,7 @@ static inline int page_mapcount(struct page *page)
return head_compound_mapcount(page) + mapcount;
}
-int total_compound_mapcount(struct page *head);
+int folio_total_mapcount(struct folio *folio);
/**
* folio_mapcount() - Calculate the number of mappings of this folio.
@@ -888,14 +888,14 @@ static inline int folio_mapcount(struct folio *folio)
{
if (likely(!folio_test_large(folio)))
return atomic_read(&folio->_mapcount) + 1;
- return total_compound_mapcount(&folio->page);
+ return folio_total_mapcount(folio);
}
static inline int total_mapcount(struct page *page)
{
if (likely(!PageCompound(page)))
return atomic_read(&page->_mapcount) + 1;
- return total_compound_mapcount(compound_head(page));
+ return folio_total_mapcount(page_folio(page));
}
static inline bool folio_large_is_mapped(struct folio *folio)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index aa2039f18f4d..a66054a9f0b6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -306,7 +306,7 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
* @_head_1: Points to the folio. Do not use.
* @_folio_dtor: Which destructor to use for this folio.
* @_folio_order: Do not use directly, call folio_order().
- * @_compound_mapcount: Do not use directly, call folio_entire_mapcount().
+ * @_entire_mapcount: Do not use directly, call folio_entire_mapcount().
* @_nr_pages_mapped: Do not use directly, call folio_mapcount().
* @_pincount: Do not use directly, call folio_maybe_dma_pinned().
* @_folio_nr_pages: Do not use directly, call folio_nr_pages().
@@ -360,7 +360,7 @@ struct folio {
unsigned long _head_1;
unsigned char _folio_dtor;
unsigned char _folio_order;
- atomic_t _compound_mapcount;
+ atomic_t _entire_mapcount;
atomic_t _nr_pages_mapped;
atomic_t _pincount;
#ifdef CONFIG_64BIT
@@ -403,7 +403,7 @@ FOLIO_MATCH(flags, _flags_1);
FOLIO_MATCH(compound_head, _head_1);
FOLIO_MATCH(compound_dtor, _folio_dtor);
FOLIO_MATCH(compound_order, _folio_order);
-FOLIO_MATCH(compound_mapcount, _compound_mapcount);
+FOLIO_MATCH(compound_mapcount, _entire_mapcount);
FOLIO_MATCH(subpages_mapcount, _nr_pages_mapped);
FOLIO_MATCH(compound_pincount, _pincount);
#ifdef CONFIG_64BIT
diff --git a/mm/rmap.c b/mm/rmap.c
index f08685054d0a..675d8401c2da 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1078,27 +1078,26 @@ int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
return page_vma_mkclean_one(&pvmw);
}
-int total_compound_mapcount(struct page *head)
+int folio_total_mapcount(struct folio *folio)
{
- struct folio *folio = (struct folio *)head;
- int mapcount = head_compound_mapcount(head);
- int nr_subpages;
+ int mapcount = folio_entire_mapcount(folio);
+ int nr_pages;
int i;
- /* In the common case, avoid the loop when no subpages mapped by PTE */
+ /* In the common case, avoid the loop when no pages mapped by PTE */
if (folio_nr_pages_mapped(folio) == 0)
return mapcount;
/*
- * Add all the PTE mappings of those subpages mapped by PTE.
- * Limit the loop, knowing that only subpages_mapcount are mapped?
+ * Add all the PTE mappings of those pages mapped by PTE.
+ * Limit the loop to folio_nr_pages_mapped()?
* Perhaps: given all the raciness, that may be a good or a bad idea.
*/
- nr_subpages = thp_nr_pages(head);
- for (i = 0; i < nr_subpages; i++)
- mapcount += atomic_read(&head[i]._mapcount);
+ nr_pages = folio_nr_pages(folio);
+ for (i = 0; i < nr_pages; i++)
+ mapcount += atomic_read(&folio_page(folio, i)->_mapcount);
/* But each of those _mapcounts was based on -1 */
- mapcount += nr_subpages;
+ mapcount += nr_pages;
return mapcount;
}
--
2.35.1
* [PATCH 05/28] mm: Convert page_remove_rmap() to use a folio internally
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (3 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 04/28] mm: Convert total_compound_mapcount() to folio_total_mapcount() Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 06/28] mm: Convert page_add_anon_rmap() " Matthew Wilcox (Oracle)
` (22 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
The API for page_remove_rmap() needs to be page-based, because we can
remove mappings of pages individually. But inside the function, we want
to only call compound_head() once and then use the folio APIs instead
of the page APIs that each call compound_head().
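The shape of the conversion, as a hedged sketch (a made-up caller, not
code from this patch):

static void example_caller(struct page *page)
{
	/* Resolve the head page exactly once ... */
	struct folio *folio = page_folio(page);

	/* ... then use folio APIs, which do not re-run compound_head()
	 * the way PageAnon()/PageTransHuge() do on a bare page. */
	if (folio_test_anon(folio) && folio_test_pmd_mappable(folio))
		deferred_split_huge_page(&folio->page);
}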
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/rmap.c | 47 ++++++++++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 21 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 675d8401c2da..d137bd8e5309 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1365,19 +1365,21 @@ void page_add_file_rmap(struct page *page,
*
* The caller needs to hold the pte lock.
*/
-void page_remove_rmap(struct page *page,
- struct vm_area_struct *vma, bool compound)
+void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
+ bool compound)
{
- atomic_t *mapped;
+ struct folio *folio = page_folio(page);
+ atomic_t *mapped = &folio->_nr_pages_mapped;
int nr = 0, nr_pmdmapped = 0;
bool last;
+ enum node_stat_item idx;
VM_BUG_ON_PAGE(compound && !PageHead(page), page);
/* Hugetlb pages are not counted in NR_*MAPPED */
- if (unlikely(PageHuge(page))) {
+ if (unlikely(folio_test_hugetlb(folio))) {
/* hugetlb pages are always mapped with pmds */
- atomic_dec(compound_mapcount_ptr(page));
+ atomic_dec(&folio->_entire_mapcount);
return;
}
@@ -1385,20 +1387,18 @@ void page_remove_rmap(struct page *page,
if (likely(!compound)) {
last = atomic_add_negative(-1, &page->_mapcount);
nr = last;
- if (last && PageCompound(page)) {
- mapped = subpages_mapcount_ptr(compound_head(page));
+ if (last && folio_test_large(folio)) {
nr = atomic_dec_return_relaxed(mapped);
nr = (nr < COMPOUND_MAPPED);
}
- } else if (PageTransHuge(page)) {
+ } else if (folio_test_pmd_mappable(folio)) {
/* That test is redundant: it's for safety or to optimize out */
- last = atomic_add_negative(-1, compound_mapcount_ptr(page));
+ last = atomic_add_negative(-1, &folio->_entire_mapcount);
if (last) {
- mapped = subpages_mapcount_ptr(page);
nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
if (likely(nr < COMPOUND_MAPPED)) {
- nr_pmdmapped = thp_nr_pages(page);
+ nr_pmdmapped = folio_nr_pages(folio);
nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED);
/* Raced ahead of another remove and an add? */
if (unlikely(nr < 0))
@@ -1411,21 +1411,26 @@ void page_remove_rmap(struct page *page,
}
if (nr_pmdmapped) {
- __mod_lruvec_page_state(page, PageAnon(page) ? NR_ANON_THPS :
- (PageSwapBacked(page) ? NR_SHMEM_PMDMAPPED :
- NR_FILE_PMDMAPPED), -nr_pmdmapped);
+ if (folio_test_anon(folio))
+ idx = NR_ANON_THPS;
+ else if (folio_test_swapbacked(folio))
+ idx = NR_SHMEM_PMDMAPPED;
+ else
+ idx = NR_FILE_PMDMAPPED;
+ __lruvec_stat_mod_folio(folio, idx, -nr_pmdmapped);
}
if (nr) {
- __mod_lruvec_page_state(page, PageAnon(page) ? NR_ANON_MAPPED :
- NR_FILE_MAPPED, -nr);
+ idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED;
+ __lruvec_stat_mod_folio(folio, idx, -nr);
+
/*
- * Queue anon THP for deferred split if at least one small
- * page of the compound page is unmapped, but at least one
- * small page is still mapped.
+ * Queue anon THP for deferred split if at least one
+ * page of the folio is unmapped and at least one page
+ * is still mapped.
*/
- if (PageTransCompound(page) && PageAnon(page))
+ if (folio_test_pmd_mappable(folio) && folio_test_anon(folio))
if (!compound || nr < nr_pmdmapped)
- deferred_split_huge_page(compound_head(page));
+ deferred_split_huge_page(&folio->page);
}
/*
--
2.35.1
* [PATCH 06/28] mm: Convert page_add_anon_rmap() to use a folio internally
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (4 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 05/28] mm: Convert page_remove_rmap() to use a folio internally Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 07/28] mm: Convert page_add_file_rmap() " Matthew Wilcox (Oracle)
` (21 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
The API for page_add_anon_rmap() needs to be page-based, because we can
add mappings of individual pages. But inside the function, we want to
only call compound_head() once and then use the folio APIs instead of
the page APIs that each call compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/rmap.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index d137bd8e5309..187c7c832111 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1207,10 +1207,11 @@ static void __page_check_anon_rmap(struct page *page,
* and to ensure that PageAnon is not being upgraded racily to PageKsm
* (but PageKsm is never downgraded to PageAnon).
*/
-void page_add_anon_rmap(struct page *page,
- struct vm_area_struct *vma, unsigned long address, rmap_t flags)
+void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
+ unsigned long address, rmap_t flags)
{
- atomic_t *mapped;
+ struct folio *folio = page_folio(page);
+ atomic_t *mapped = &folio->_nr_pages_mapped;
int nr = 0, nr_pmdmapped = 0;
bool compound = flags & RMAP_COMPOUND;
bool first = true;
@@ -1219,20 +1220,18 @@ void page_add_anon_rmap(struct page *page,
if (likely(!compound)) {
first = atomic_inc_and_test(&page->_mapcount);
nr = first;
- if (first && PageCompound(page)) {
- mapped = subpages_mapcount_ptr(compound_head(page));
+ if (first && folio_test_large(folio)) {
nr = atomic_inc_return_relaxed(mapped);
nr = (nr < COMPOUND_MAPPED);
}
- } else if (PageTransHuge(page)) {
+ } else if (folio_test_pmd_mappable(folio)) {
/* That test is redundant: it's for safety or to optimize out */
- first = atomic_inc_and_test(compound_mapcount_ptr(page));
+ first = atomic_inc_and_test(&folio->_entire_mapcount);
if (first) {
- mapped = subpages_mapcount_ptr(page);
nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
- nr_pmdmapped = thp_nr_pages(page);
+ nr_pmdmapped = folio_nr_pages(folio);
nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED);
/* Raced ahead of a remove and another add? */
if (unlikely(nr < 0))
@@ -1248,11 +1247,11 @@ void page_add_anon_rmap(struct page *page,
VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page);
if (nr_pmdmapped)
- __mod_lruvec_page_state(page, NR_ANON_THPS, nr_pmdmapped);
+ __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr_pmdmapped);
if (nr)
- __mod_lruvec_page_state(page, NR_ANON_MAPPED, nr);
+ __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr);
- if (likely(!PageKsm(page))) {
+ if (likely(!folio_test_ksm(folio))) {
/* address might be in next vma when migration races vma_adjust */
if (first)
__page_set_anon_rmap(page, vma, address,
--
2.35.1
* [PATCH 07/28] mm: Convert page_add_file_rmap() to use a folio internally
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (5 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 06/28] mm: Convert page_add_anon_rmap() " Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 08/28] mm: Add folio_add_new_anon_rmap() Matthew Wilcox (Oracle)
` (20 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
The API for page_add_file_rmap() needs to be page-based, because we can
add mappings of individual pages. But inside the function, we want to
only call compound_head() once and then use the folio APIs instead of
the page APIs that each call compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/rmap.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 187c7c832111..7b83b56d5603 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1309,10 +1309,11 @@ void page_add_new_anon_rmap(struct page *page,
*
* The caller needs to hold the pte lock.
*/
-void page_add_file_rmap(struct page *page,
- struct vm_area_struct *vma, bool compound)
+void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
+ bool compound)
{
- atomic_t *mapped;
+ struct folio *folio = page_folio(page);
+ atomic_t *mapped = &folio->_nr_pages_mapped;
int nr = 0, nr_pmdmapped = 0;
bool first;
@@ -1322,20 +1323,18 @@ void page_add_file_rmap(struct page *page,
if (likely(!compound)) {
first = atomic_inc_and_test(&page->_mapcount);
nr = first;
- if (first && PageCompound(page)) {
- mapped = subpages_mapcount_ptr(compound_head(page));
+ if (first && folio_test_large(folio)) {
nr = atomic_inc_return_relaxed(mapped);
nr = (nr < COMPOUND_MAPPED);
}
- } else if (PageTransHuge(page)) {
+ } else if (folio_test_pmd_mappable(folio)) {
/* That test is redundant: it's for safety or to optimize out */
- first = atomic_inc_and_test(compound_mapcount_ptr(page));
+ first = atomic_inc_and_test(&folio->_entire_mapcount);
if (first) {
- mapped = subpages_mapcount_ptr(page);
nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
- nr_pmdmapped = thp_nr_pages(page);
+ nr_pmdmapped = folio_nr_pages(folio);
nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED);
/* Raced ahead of a remove and another add? */
if (unlikely(nr < 0))
@@ -1348,10 +1347,10 @@ void page_add_file_rmap(struct page *page,
}
if (nr_pmdmapped)
- __mod_lruvec_page_state(page, PageSwapBacked(page) ?
+ __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
if (nr)
- __mod_lruvec_page_state(page, NR_FILE_MAPPED, nr);
+ __lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, nr);
mlock_vma_page(page, vma, compound);
}
--
2.35.1
* [PATCH 08/28] mm: Add folio_add_new_anon_rmap()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (6 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 07/28] mm: Convert page_add_file_rmap() " Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 09/28] page_alloc: Use folio fields directly Matthew Wilcox (Oracle)
` (19 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
In contrast to other rmap functions, page_add_new_anon_rmap() is always
called with a freshly allocated page. That means it can't be called with
a tail page. Turn page_add_new_anon_rmap() into folio_add_new_anon_rmap()
and add a page_add_new_anon_rmap() wrapper. Callers can be converted
individually.
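A hedged usage sketch (a hypothetical call site; vma_alloc_folio() is an
existing allocator, but this fragment is not from the series):

	struct folio *folio;

	/* A freshly allocated folio can never be a tail page,
	 * so the new folio API applies directly. */
	folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, addr, false);
	if (folio)
		folio_add_new_anon_rmap(folio, vma, addr);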
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/rmap.h | 2 ++
mm/folio-compat.c | 8 ++++++++
mm/rmap.c | 37 ++++++++++++++++++-------------------
3 files changed, 28 insertions(+), 19 deletions(-)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index bd3504d11b15..aa682a2a93ce 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -194,6 +194,8 @@ void page_add_anon_rmap(struct page *, struct vm_area_struct *,
unsigned long address, rmap_t flags);
void page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
unsigned long address);
+void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *,
+ unsigned long address);
void page_add_file_rmap(struct page *, struct vm_area_struct *,
bool compound);
void page_remove_rmap(struct page *, struct vm_area_struct *,
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 69ed25790c68..92f53adc0dd9 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -123,3 +123,11 @@ void putback_lru_page(struct page *page)
{
folio_putback_lru(page_folio(page));
}
+
+void page_add_new_anon_rmap(struct page *page, struct vm_area_struct *vma,
+ unsigned long address)
+{
+ VM_BUG_ON_PAGE(PageTail(page), page);
+
+ return folio_add_new_anon_rmap((struct folio *)page, vma, address);
+}
diff --git a/mm/rmap.c b/mm/rmap.c
index 7b83b56d5603..2749e1466b09 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1264,41 +1264,40 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
}
/**
- * page_add_new_anon_rmap - add mapping to a new anonymous page
- * @page: the page to add the mapping to
+ * folio_add_new_anon_rmap - Add mapping to a new anonymous folio.
+ * @folio: The folio to add the mapping to.
* @vma: the vm area in which the mapping is added
* @address: the user virtual address mapped
*
- * If it's a compound page, it is accounted as a compound page. As the page
- * is new, it's assume to get mapped exclusively by a single process.
- *
- * Same as page_add_anon_rmap but must only be called on *new* pages.
+ * Like page_add_anon_rmap() but must only be called on *new* folios.
* This means the inc-and-test can be bypassed.
- * Page does not have to be locked.
+ * The folio does not have to be locked.
+ *
+ * If the folio is large, it is accounted as a THP. As the folio
+ * is new, it's assumed to be mapped exclusively by a single process.
*/
-void page_add_new_anon_rmap(struct page *page,
- struct vm_area_struct *vma, unsigned long address)
+void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
+ unsigned long address)
{
int nr;
VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
- __SetPageSwapBacked(page);
+ __folio_set_swapbacked(folio);
- if (likely(!PageCompound(page))) {
+ if (likely(!folio_test_pmd_mappable(folio))) {
/* increment count (starts at -1) */
- atomic_set(&page->_mapcount, 0);
+ atomic_set(&folio->_mapcount, 0);
nr = 1;
} else {
- VM_BUG_ON_PAGE(!PageTransHuge(page), page);
/* increment count (starts at -1) */
- atomic_set(compound_mapcount_ptr(page), 0);
- atomic_set(subpages_mapcount_ptr(page), COMPOUND_MAPPED);
- nr = thp_nr_pages(page);
- __mod_lruvec_page_state(page, NR_ANON_THPS, nr);
+ atomic_set(&folio->_entire_mapcount, 0);
+ atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
+ nr = folio_nr_pages(folio);
+ __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr);
}
- __mod_lruvec_page_state(page, NR_ANON_MAPPED, nr);
- __page_set_anon_rmap(page, vma, address, 1);
+ __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr);
+ __page_set_anon_rmap(&folio->page, vma, address, 1);
}
/**
--
2.35.1
* [PATCH 09/28] page_alloc: Use folio fields directly
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (7 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 08/28] mm: Add folio_add_new_anon_rmap() Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 10/28] mm: Use a folio in hugepage_add_anon_rmap() and hugepage_add_new_anon_rmap() Matthew Wilcox (Oracle)
` (18 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Remove the uses of compound_mapcount_ptr(), head_compound_mapcount()
and subpages_mapcount_ptr().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/page_alloc.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d1e5ec875fd0..946a4ab7b278 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -779,8 +779,8 @@ static void prep_compound_head(struct page *page, unsigned int order)
set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
set_compound_order(page, order);
- atomic_set(compound_mapcount_ptr(page), -1);
- atomic_set(subpages_mapcount_ptr(page), 0);
+ atomic_set(&folio->_entire_mapcount, -1);
+ atomic_set(&folio->_nr_pages_mapped, 0);
atomic_set(&folio->_pincount, 0);
}
@@ -1309,12 +1309,12 @@ static int free_tail_pages_check(struct page *head_page, struct page *page)
switch (page - head_page) {
case 1:
/* the first tail page: these may be in place of ->mapping */
- if (unlikely(head_compound_mapcount(head_page))) {
- bad_page(page, "nonzero compound_mapcount");
+ if (unlikely(folio_entire_mapcount(folio))) {
+ bad_page(page, "nonzero entire_mapcount");
goto out;
}
- if (unlikely(atomic_read(subpages_mapcount_ptr(head_page)))) {
- bad_page(page, "nonzero subpages_mapcount");
+ if (unlikely(atomic_read(&folio->_nr_pages_mapped))) {
+ bad_page(page, "nonzero nr_pages_mapped");
goto out;
}
if (unlikely(atomic_read(&folio->_pincount))) {
--
2.35.1
* [PATCH 10/28] mm: Use a folio in hugepage_add_anon_rmap() and hugepage_add_new_anon_rmap()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (8 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 09/28] page_alloc: Use folio fields directly Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 11/28] mm: Use entire_mapcount in __page_dup_rmap() Matthew Wilcox (Oracle)
` (17 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Remove uses of compound_mapcount_ptr().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/rmap.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 2749e1466b09..462b334d6842 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2542,13 +2542,14 @@ void rmap_walk_locked(struct folio *folio, struct rmap_walk_control *rwc)
void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
unsigned long address, rmap_t flags)
{
+ struct folio *folio = page_folio(page);
struct anon_vma *anon_vma = vma->anon_vma;
int first;
- BUG_ON(!PageLocked(page));
+ BUG_ON(!folio_test_locked(folio));
BUG_ON(!anon_vma);
/* address might be in next vma when migration races vma_adjust */
- first = atomic_inc_and_test(compound_mapcount_ptr(page));
+ first = atomic_inc_and_test(&folio->_entire_mapcount);
VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page);
VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page);
if (first)
@@ -2559,10 +2560,12 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
void hugepage_add_new_anon_rmap(struct page *page,
struct vm_area_struct *vma, unsigned long address)
{
+ struct folio *folio = page_folio(page);
+
BUG_ON(address < vma->vm_start || address >= vma->vm_end);
/* increment count (starts at -1) */
- atomic_set(compound_mapcount_ptr(page), 0);
- ClearHPageRestoreReserve(page);
+ atomic_set(&folio->_entire_mapcount, 0);
+ folio_clear_hugetlb_restore_reserve(folio);
__page_set_anon_rmap(page, vma, address, 1);
}
#endif /* CONFIG_HUGETLB_PAGE */
--
2.35.1
* [PATCH 11/28] mm: Use entire_mapcount in __page_dup_rmap()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (9 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 10/28] mm: Use a folio in hugepage_add_anon_rmap() and hugepage_add_new_anon_rmap() Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 12/28] mm/debug: Remove call to head_compound_mapcount() Matthew Wilcox (Oracle)
` (16 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Remove the use of the compound_mapcount_ptr() wrapper, and add an
assertion that we're not passing a tail page if we're duplicating a PMD.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/rmap.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index aa682a2a93ce..a6bd1f0a183d 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -208,7 +208,14 @@ void hugepage_add_new_anon_rmap(struct page *, struct vm_area_struct *,
static inline void __page_dup_rmap(struct page *page, bool compound)
{
- atomic_inc(compound ? compound_mapcount_ptr(page) : &page->_mapcount);
+ if (compound) {
+ struct folio *folio = (struct folio *)page;
+
+ VM_BUG_ON_PAGE(compound && !PageHead(page), page);
+ atomic_inc(&folio->_entire_mapcount);
+ } else {
+ atomic_inc(&page->_mapcount);
+ }
}
static inline void page_dup_file_rmap(struct page *page, bool compound)
--
2.35.1
* [PATCH 12/28] mm/debug: Remove call to head_compound_mapcount()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (10 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 11/28] mm: Use entire_mapcount in __page_dup_rmap() Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:28 ` [PATCH 13/28] hugetlb: Remove uses of folio_mapcount_ptr Matthew Wilcox (Oracle)
` (15 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Call folio_entire_mapcount() instead.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/debug.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/debug.c b/mm/debug.c
index 8e58e8dab0b2..9d3d893dc7f4 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -94,9 +94,9 @@ static void __dump_page(struct page *page)
page, page_ref_count(head), mapcount, mapping,
page_to_pgoff(page), page_to_pfn(page));
if (compound) {
- pr_warn("head:%p order:%u compound_mapcount:%d nr_pages_mapped:%d pincount:%d\n",
+ pr_warn("head:%p order:%u entire_mapcount:%d nr_pages_mapped:%d pincount:%d\n",
head, compound_order(head),
- head_compound_mapcount(head),
+ folio_entire_mapcount(folio),
folio_nr_pages_mapped(folio),
atomic_read(&folio->_pincount));
}
--
2.35.1
* [PATCH 13/28] hugetlb: Remove uses of folio_mapcount_ptr
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (11 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 12/28] mm/debug: Remove call to head_compound_mapcount() Matthew Wilcox (Oracle)
@ 2023-01-11 14:28 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 14/28] mm: Convert page_mapcount() to use folio_entire_mapcount() Matthew Wilcox (Oracle)
` (14 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Use the entire_mapcount field directly.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/hugetlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c9702224931c..a68e0e597a8f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1474,7 +1474,7 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
int nr_pages = 1 << order;
struct page *p;
- atomic_set(folio_mapcount_ptr(folio), 0);
+ atomic_set(&folio->_entire_mapcount, 0);
atomic_set(&folio->_nr_pages_mapped, 0);
atomic_set(&folio->_pincount, 0);
@@ -1996,7 +1996,7 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
if (i != 0)
set_compound_head(p, &folio->page);
}
- atomic_set(folio_mapcount_ptr(folio), -1);
+ atomic_set(&folio->_entire_mapcount, -1);
atomic_set(&folio->_nr_pages_mapped, 0);
atomic_set(&folio->_pincount, 0);
return true;
--
2.35.1
* [PATCH 14/28] mm: Convert page_mapcount() to use folio_entire_mapcount()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (12 preceding siblings ...)
2023-01-11 14:28 ` [PATCH 13/28] hugetlb: Remove uses of folio_mapcount_ptr Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 15/28] mm: Remove head_compound_mapcount() and _ptr functions Matthew Wilcox (Oracle)
` (13 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Remove a use of head_compound_mapcount().
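A worked example of the new arithmetic (illustrative):

/*
 * A page inside a PMD-mapped THP that is also mapped once by PTE:
 *
 *	atomic_read(&page->_mapcount) + 1	= 0 + 1 = 1
 *	page is compound, so add
 *	folio_entire_mapcount(page_folio(page))	= 1
 *
 * page_mapcount() returns 2: once for the PMD mapping of the whole
 * folio, once for the PTE mapping of this precise page.
 */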
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm.h | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8bddc7810f78..554b73dae188 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -853,22 +853,26 @@ static inline void page_mapcount_reset(struct page *page)
atomic_set(&(page)->_mapcount, -1);
}
-/*
- * Mapcount of 0-order page; when compound sub-page, includes
- * compound_mapcount of compound_head of page.
+/**
+ * page_mapcount() - Number of times this precise page is mapped.
+ * @page: The page.
+ *
+ * The number of times this page is mapped. If this page is part of
+ * a large folio, it includes the number of times this page is mapped
+ * as part of that folio.
*
- * Result is undefined for pages which cannot be mapped into userspace.
+ * The result is undefined for pages which cannot be mapped into userspace.
* For example SLAB or special types of pages. See function page_has_type().
- * They use this place in struct page differently.
+ * They use this field in struct page differently.
*/
static inline int page_mapcount(struct page *page)
{
int mapcount = atomic_read(&page->_mapcount) + 1;
- if (likely(!PageCompound(page)))
- return mapcount;
- page = compound_head(page);
- return head_compound_mapcount(page) + mapcount;
+ if (unlikely(PageCompound(page)))
+ mapcount += folio_entire_mapcount(page_folio(page));
+
+ return mapcount;
}
int folio_total_mapcount(struct folio *folio);
--
2.35.1
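To make the arithmetic above concrete, here is a hedged userspace model of the new page_mapcount() logic: the per-page _mapcount is stored biased by -1, and for a page inside a large folio the folio-wide entire mapcount is added on top. The model_* names are invented for illustration.

#include <stdbool.h>
#include <stdio.h>

struct model_page { int mapcount_minus_one; bool compound; };
struct model_folio { int entire_mapcount_minus_one; };

static int model_page_mapcount(struct model_page *page,
			       struct model_folio *folio)
{
	int mapcount = page->mapcount_minus_one + 1;

	/* the unlikely() branch in the real helper */
	if (page->compound)
		mapcount += folio->entire_mapcount_minus_one + 1;
	return mapcount;
}

int main(void)
{
	/* folio mapped once in its entirety ... */
	struct model_folio folio = { .entire_mapcount_minus_one = 0 };
	/* ... and this precise page also mapped twice by PTEs */
	struct model_page page = { .mapcount_minus_one = 1, .compound = true };

	printf("page_mapcount = %d\n", model_page_mapcount(&page, &folio)); /* 3 */
	return 0;
}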
* [PATCH 15/28] mm: Remove head_compound_mapcount() and _ptr functions
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (13 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 14/28] mm: Convert page_mapcount() to use folio_entire_mapcount() Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 16/28] mm: Reimplement compound_order() Matthew Wilcox (Oracle)
` (12 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
folio_mapcount_ptr(), compound_mapcount_ptr() and subpages_mapcount_ptr()
are all now unused.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm.h | 15 +++------------
include/linux/mm_types.h | 16 ----------------
2 files changed, 3 insertions(+), 28 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 554b73dae188..5002dd4db544 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -831,16 +831,7 @@ static inline int is_vmalloc_or_module_addr(const void *x)
static inline int folio_entire_mapcount(struct folio *folio)
{
VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
- return atomic_read(folio_mapcount_ptr(folio)) + 1;
-}
-
-/*
- * Mapcount of compound page as a whole, does not include mapped sub-pages.
- * Must be called only on head of compound page.
- */
-static inline int head_compound_mapcount(struct page *head)
-{
- return atomic_read(compound_mapcount_ptr(head)) + 1;
+ return atomic_read(&folio->_entire_mapcount) + 1;
}
/*
@@ -905,11 +896,11 @@ static inline int total_mapcount(struct page *page)
static inline bool folio_large_is_mapped(struct folio *folio)
{
/*
- * Reading folio_mapcount_ptr() below could be omitted if hugetlb
+ * Reading _entire_mapcount below could be omitted if hugetlb
* participated in incrementing nr_pages_mapped when compound mapped.
*/
return atomic_read(&folio->_nr_pages_mapped) > 0 ||
- atomic_read(folio_mapcount_ptr(folio)) >= 0;
+ atomic_read(&folio->_entire_mapcount) >= 0;
}
/**
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index a66054a9f0b6..1cf0fcd99d49 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -421,22 +421,6 @@ FOLIO_MATCH(hugetlb_cgroup_rsvd, _hugetlb_cgroup_rsvd);
FOLIO_MATCH(hugetlb_hwpoison, _hugetlb_hwpoison);
#undef FOLIO_MATCH
-static inline atomic_t *folio_mapcount_ptr(struct folio *folio)
-{
- struct page *tail = &folio->page + 1;
- return &tail->compound_mapcount;
-}
-
-static inline atomic_t *compound_mapcount_ptr(struct page *page)
-{
- return &page[1].compound_mapcount;
-}
-
-static inline atomic_t *subpages_mapcount_ptr(struct page *page)
-{
- return &page[1].subpages_mapcount;
-}
-
/*
* Used for sizing the vmemmap region on some architectures
*/
--
2.35.1
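The reason the _ptr() helpers can simply disappear is that each folio field is declared at exactly the offset the old page[1] access computed, which the kernel proves at build time with its FOLIO_MATCH static asserts. A toy version of that check, with made-up structs, looks like this:

#include <assert.h>
#include <stddef.h>

struct toy_page { unsigned long flags; unsigned long compound_head; };

struct toy_folio {
	unsigned long flags;
	unsigned long head;
	/* fields that used to be reached via page[1]: */
	unsigned long _flags_1;
	unsigned long _head_1;
	int _entire_mapcount;
};

static_assert(offsetof(struct toy_folio, _flags_1) == sizeof(struct toy_page),
	      "first tail page fields must start where page[1] starts");

int main(void) { return 0; }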
* [PATCH 16/28] mm: Reimplement compound_order()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (14 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 15/28] mm: Remove head_compound_mapcount() and _ptr functions Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 17/28] mm: Reimplement compound_nr() Matthew Wilcox (Oracle)
` (11 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Make compound_order() use struct folio. It can't be turned into a wrapper
around folio_order() as a page can be turned into a tail page between
a check in compound_order() and the assertion in folio_test_large().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm.h | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5002dd4db544..ddf09522c0d3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -719,11 +719,20 @@ int vma_is_stack_for_current(struct vm_area_struct *vma);
struct mmu_gather;
struct inode;
+/*
+ * compound_order() can be called without holding a reference, which means
+ * that niceties like page_folio() don't work. These callers should be
+ * prepared to handle wild return values. For example, PG_head may be
+ * set before _folio_order is initialised, or this may be a tail page.
+ * See compaction.c for some good examples.
+ */
static inline unsigned int compound_order(struct page *page)
{
- if (!PageHead(page))
+ struct folio *folio = (struct folio *)page;
+
+ if (!test_bit(PG_head, &folio->flags))
return 0;
- return page[1].compound_order;
+ return folio->_folio_order;
}
/**
--
2.35.1
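A userspace model of the shape of the new compound_order(): read the head bit and the order from the same object and tolerate the possibility that both are stale, rather than asserting. SK_PG_HEAD and the sketch_* names are invented for illustration.

#include <stdio.h>

struct sketch_folio { unsigned long flags; unsigned char order; };

#define SK_PG_HEAD	(1UL << 0)

/* Lockless callers may see a stale head bit or order; return whatever
 * is there and let the caller cope, as the comment above describes. */
static unsigned int sketch_compound_order(struct sketch_folio *folio)
{
	if (!(folio->flags & SK_PG_HEAD))
		return 0;
	return folio->order;
}

int main(void)
{
	struct sketch_folio f = { .flags = SK_PG_HEAD, .order = 9 };

	printf("order %u -> %lu pages\n", sketch_compound_order(&f),
	       1UL << sketch_compound_order(&f)); /* order 9 -> 512 pages */
	return 0;
}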
* [PATCH 17/28] mm: Reimplement compound_nr()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (15 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 16/28] mm: Reimplement compound_order() Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 18/28] mm: Convert set_compound_page_dtor() and set_compound_order() to folios Matthew Wilcox (Oracle)
` (10 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Turn compound_nr() into a wrapper around folio_nr_pages(). Similarly
to compound_order(), casting the struct page directly to struct folio
preserves the existing behaviour, while calling page_folio() would change
the behaviour. Move thp_nr_pages() down in the file so that compound_nr()
can be after folio_nr_pages().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm.h | 38 ++++++++++++++++----------------------
1 file changed, 16 insertions(+), 22 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ddf09522c0d3..0b1cdaf0fa90 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1005,18 +1005,6 @@ static inline void set_compound_order(struct page *page, unsigned int order)
#endif
}
-/* Returns the number of pages in this potentially compound page. */
-static inline unsigned long compound_nr(struct page *page)
-{
- if (!PageHead(page))
- return 1;
-#ifdef CONFIG_64BIT
- return page[1].compound_nr;
-#else
- return 1UL << compound_order(page);
-#endif
-}
-
/* Returns the number of bytes in this potentially compound page. */
static inline unsigned long page_size(struct page *page)
{
@@ -1039,16 +1027,6 @@ static inline unsigned int thp_order(struct page *page)
return compound_order(page);
}
-/**
- * thp_nr_pages - The number of regular pages in this huge page.
- * @page: The head page of a huge page.
- */
-static inline int thp_nr_pages(struct page *page)
-{
- VM_BUG_ON_PGFLAGS(PageTail(page), page);
- return compound_nr(page);
-}
-
/**
* thp_size - Size of a transparent huge page.
* @page: Head page of a transparent huge page.
@@ -1758,6 +1736,22 @@ static inline long folio_nr_pages(struct folio *folio)
#endif
}
+/* Returns the number of pages in this potentially compound page. */
+static inline unsigned long compound_nr(struct page *page)
+{
+ return folio_nr_pages((struct folio *)page);
+}
+
+/**
+ * thp_nr_pages - The number of regular pages in this huge page.
+ * @page: The head page of a huge page.
+ */
+static inline int thp_nr_pages(struct page *page)
+{
+ VM_BUG_ON_PGFLAGS(PageTail(page), page);
+ return compound_nr(page);
+}
+
/**
* folio_next - Move to the next physical folio.
* @folio: The folio we're currently operating on.
--
2.35.1
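A sketch of the folio_nr_pages() split that compound_nr() now inherits: 64-bit builds read a cached page count, 32-bit builds recompute it from the order, and the setter keeps the two in sync the way set_compound_order() does. MODEL_64BIT stands in for CONFIG_64BIT (build with -DMODEL_64BIT to take the cached path); all names are illustrative.

#include <stdio.h>

struct nr_folio {
	unsigned char order;
#ifdef MODEL_64BIT
	unsigned int nr_pages;		/* cached 1 << order */
#endif
};

static void model_set_order(struct nr_folio *folio, unsigned char order)
{
	folio->order = order;
#ifdef MODEL_64BIT
	folio->nr_pages = 1U << order;	/* kept in sync with the order */
#endif
}

static unsigned long model_nr_pages(struct nr_folio *folio)
{
#ifdef MODEL_64BIT
	return folio->nr_pages;		/* one load, no shift */
#else
	return 1UL << folio->order;	/* recompute on 32-bit */
#endif
}

int main(void)
{
	struct nr_folio f;

	model_set_order(&f, 4);
	printf("%lu pages\n", model_nr_pages(&f));	/* 16 pages */
	return 0;
}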
* [PATCH 18/28] mm: Convert set_compound_page_dtor() and set_compound_order() to folios
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (16 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 17/28] mm: Reimplement compound_nr() Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 19/28] mm: Convert is_transparent_hugepage() to use a folio Matthew Wilcox (Oracle)
` (9 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Replace uses of compound_dtor, compound_order and compound_nr by
their folio equivalents.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm.h | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0b1cdaf0fa90..57d702fc8677 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -984,8 +984,11 @@ extern compound_page_dtor * const compound_page_dtors[NR_COMPOUND_DTORS];
static inline void set_compound_page_dtor(struct page *page,
enum compound_dtor_id compound_dtor)
{
+ struct folio *folio = (struct folio *)page;
+
VM_BUG_ON_PAGE(compound_dtor >= NR_COMPOUND_DTORS, page);
- page[1].compound_dtor = compound_dtor;
+ VM_BUG_ON_PAGE(!PageHead(page), page);
+ folio->_folio_dtor = compound_dtor;
}
static inline void folio_set_compound_dtor(struct folio *folio,
@@ -999,9 +1002,11 @@ void destroy_large_folio(struct folio *folio);
static inline void set_compound_order(struct page *page, unsigned int order)
{
- page[1].compound_order = order;
+ struct folio *folio = (struct folio *)page;
+
+ folio->_folio_order = order;
#ifdef CONFIG_64BIT
- page[1].compound_nr = 1U << order;
+ folio->_folio_nr_pages = 1U << order;
#endif
}
--
2.35.1
* [PATCH 19/28] mm: Convert is_transparent_hugepage() to use a folio
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (17 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 18/28] mm: Convert set_compound_page_dtor() and set_compound_order() to folios Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 20/28] mm: Convert destroy_large_folio() to use folio_dtor Matthew Wilcox (Oracle)
` (8 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Replace a use of page->compound_dtor with its folio equivalent.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/huge_memory.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9570f03cdee4..bfa960f012fa 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -591,12 +591,14 @@ void prep_transhuge_page(struct page *page)
static inline bool is_transparent_hugepage(struct page *page)
{
+ struct folio *folio;
+
if (!PageCompound(page))
return false;
- page = compound_head(page);
- return is_huge_zero_page(page) ||
- page[1].compound_dtor == TRANSHUGE_PAGE_DTOR;
+ folio = page_folio(page);
+ return is_huge_zero_page(&folio->page) ||
+ folio->_folio_dtor == TRANSHUGE_PAGE_DTOR;
}
static unsigned long __thp_get_unmapped_area(struct file *filp,
--
2.35.1
* [PATCH 20/28] mm: Convert destroy_large_folio() to use folio_dtor
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (18 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 19/28] mm: Convert is_transparent_hugepage() to use a folio Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 21/28] hugetlb: Remove uses of compound_dtor and compound_nr Matthew Wilcox (Oracle)
` (7 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Replace a use of compound_dtor.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 946a4ab7b278..41a239ce4692 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -807,7 +807,7 @@ void prep_compound_page(struct page *page, unsigned int order)
void destroy_large_folio(struct folio *folio)
{
- enum compound_dtor_id dtor = folio_page(folio, 1)->compound_dtor;
+ enum compound_dtor_id dtor = folio->_folio_dtor;
VM_BUG_ON_FOLIO(dtor >= NR_COMPOUND_DTORS, folio);
compound_page_dtors[dtor](&folio->page);
--
2.35.1
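What destroy_large_folio() does with _folio_dtor is a classic enum-indexed dispatch table. A self-contained userspace analogue, with toy_* names invented for illustration:

#include <stdio.h>

enum toy_dtor_id { TOY_COMPOUND_DTOR, TOY_HUGETLB_DTOR, TOY_NR_DTORS };

static void toy_free_compound(void *p) { printf("free compound %p\n", p); }
static void toy_free_hugetlb(void *p)  { printf("free hugetlb %p\n", p); }

/* the analogue of compound_page_dtors[] */
static void (* const toy_dtors[TOY_NR_DTORS])(void *) = {
	[TOY_COMPOUND_DTOR] = toy_free_compound,
	[TOY_HUGETLB_DTOR]  = toy_free_hugetlb,
};

static void toy_destroy(void *obj, enum toy_dtor_id dtor)
{
	/* the kernel VM_BUG_ONs an out-of-range id; just dispatch here */
	toy_dtors[dtor](obj);
}

int main(void)
{
	int dummy;

	toy_destroy(&dummy, TOY_HUGETLB_DTOR);
	return 0;
}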
* [PATCH 21/28] hugetlb: Remove uses of compound_dtor and compound_nr
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (19 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 20/28] mm: Convert destroy_large_folio() to use folio_dtor Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 22/28] mm: Remove 'First tail page' members from struct page Matthew Wilcox (Oracle)
` (6 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Convert the entire file to use the folio equivalents.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/hugetlb.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a68e0e597a8f..ca9e177b9c54 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2038,11 +2038,12 @@ static bool prep_compound_gigantic_folio_for_demote(struct folio *folio,
*/
int PageHuge(struct page *page)
{
+ struct folio *folio;
+
if (!PageCompound(page))
return 0;
-
- page = compound_head(page);
- return page[1].compound_dtor == HUGETLB_PAGE_DTOR;
+ folio = page_folio(page);
+ return folio->_folio_dtor == HUGETLB_PAGE_DTOR;
}
EXPORT_SYMBOL_GPL(PageHuge);
@@ -2052,10 +2053,11 @@ EXPORT_SYMBOL_GPL(PageHuge);
*/
int PageHeadHuge(struct page *page_head)
{
- if (!PageHead(page_head))
+ struct folio *folio = (struct folio *)page_head;
+ if (!folio_test_large(folio))
return 0;
- return page_head[1].compound_dtor == HUGETLB_PAGE_DTOR;
+ return folio->_folio_dtor == HUGETLB_PAGE_DTOR;
}
EXPORT_SYMBOL_GPL(PageHeadHuge);
--
2.35.1
* [PATCH 22/28] mm: Remove 'First tail page' members from struct page
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (20 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 21/28] hugetlb: Remove uses of compound_dtor and compound_nr Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 23/28] doc: Correct struct folio kernel-doc Matthew Wilcox (Oracle)
` (5 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
All former users now use the folio equivalents, so remove them from
the definition of struct page.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm_types.h | 18 ------------------
kernel/crash_core.c | 4 ++--
2 files changed, 2 insertions(+), 20 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1cf0fcd99d49..61a2e6b32781 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -140,16 +140,6 @@ struct page {
};
struct { /* Tail pages of compound page */
unsigned long compound_head; /* Bit zero is set */
-
- /* First tail page only */
- unsigned char compound_dtor;
- unsigned char compound_order;
- atomic_t compound_mapcount;
- atomic_t subpages_mapcount;
- atomic_t compound_pincount;
-#ifdef CONFIG_64BIT
- unsigned int compound_nr; /* 1 << compound_order */
-#endif
};
struct { /* Second tail page of transparent huge page */
unsigned long _compound_pad_1; /* compound_head */
@@ -401,14 +391,6 @@ FOLIO_MATCH(memcg_data, memcg_data);
offsetof(struct page, pg) + sizeof(struct page))
FOLIO_MATCH(flags, _flags_1);
FOLIO_MATCH(compound_head, _head_1);
-FOLIO_MATCH(compound_dtor, _folio_dtor);
-FOLIO_MATCH(compound_order, _folio_order);
-FOLIO_MATCH(compound_mapcount, _entire_mapcount);
-FOLIO_MATCH(subpages_mapcount, _nr_pages_mapped);
-FOLIO_MATCH(compound_pincount, _pincount);
-#ifdef CONFIG_64BIT
-FOLIO_MATCH(compound_nr, _folio_nr_pages);
-#endif
#undef FOLIO_MATCH
#define FOLIO_MATCH(pg, fl) \
static_assert(offsetof(struct folio, fl) == \
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 87ef6096823f..755f5f08ab38 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -455,8 +455,8 @@ static int __init crash_save_vmcoreinfo_init(void)
VMCOREINFO_OFFSET(page, lru);
VMCOREINFO_OFFSET(page, _mapcount);
VMCOREINFO_OFFSET(page, private);
- VMCOREINFO_OFFSET(page, compound_dtor);
- VMCOREINFO_OFFSET(page, compound_order);
+ VMCOREINFO_OFFSET(folio, _folio_dtor);
+ VMCOREINFO_OFFSET(folio, _folio_order);
VMCOREINFO_OFFSET(page, compound_head);
VMCOREINFO_OFFSET(pglist_data, node_zones);
VMCOREINFO_OFFSET(pglist_data, nr_zones);
--
2.35.1
* [PATCH 23/28] doc: Correct struct folio kernel-doc
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (21 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 22/28] mm: Remove 'First tail page' members from struct page Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 24/28] mm: Move page->deferred_list to folio->_deferred_list Matthew Wilcox (Oracle)
` (4 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Insert appropriate public: and private: markers to make the generated
kernel-doc look right.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm_types.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 61a2e6b32781..4b8aa0f8f9fe 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -292,16 +292,12 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
* @_refcount: Do not access this member directly. Use folio_ref_count()
* to find how many references there are to this folio.
* @memcg_data: Memory Control Group data.
- * @_flags_1: For large folios, additional page flags.
- * @_head_1: Points to the folio. Do not use.
* @_folio_dtor: Which destructor to use for this folio.
* @_folio_order: Do not use directly, call folio_order().
* @_entire_mapcount: Do not use directly, call folio_entire_mapcount().
* @_nr_pages_mapped: Do not use directly, call folio_mapcount().
* @_pincount: Do not use directly, call folio_maybe_dma_pinned().
* @_folio_nr_pages: Do not use directly, call folio_nr_pages().
- * @_flags_2: For alignment. Do not use.
- * @_head_2: Points to the folio. Do not use.
* @_hugetlb_subpool: Do not use directly, use accessor in hugetlb.h.
* @_hugetlb_cgroup: Do not use directly, use accessor in hugetlb_cgroup.h.
* @_hugetlb_cgroup_rsvd: Do not use directly, use accessor in hugetlb_cgroup.h.
@@ -348,6 +344,7 @@ struct folio {
struct {
unsigned long _flags_1;
unsigned long _head_1;
+ /* public: */
unsigned char _folio_dtor;
unsigned char _folio_order;
atomic_t _entire_mapcount;
@@ -356,6 +353,7 @@ struct folio {
#ifdef CONFIG_64BIT
unsigned int _folio_nr_pages;
#endif
+ /* private: the union with struct page is transitional */
};
struct page __page_1;
};
@@ -363,10 +361,12 @@ struct folio {
struct {
unsigned long _flags_2;
unsigned long _head_2;
+ /* public: */
void *_hugetlb_subpool;
void *_hugetlb_cgroup;
void *_hugetlb_cgroup_rsvd;
void *_hugetlb_hwpoison;
+ /* private: the union with struct page is transitional */
};
struct page __page_2;
};
--
2.35.1
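For anyone unfamiliar with these markers: in kernel-doc, struct members after a private: comment are omitted from the generated documentation, and members after a public: comment are documented again. A made-up example of the convention (toy struct, not from the kernel):

/**
 * struct toy_folio - Example documented structure.
 * @order: The allocation order of this folio.
 */
struct toy_folio {
	/* private: implementation detail, hidden from the docs */
	unsigned long flags;
	/* public: */
	unsigned char order;
};

int main(void)
{
	struct toy_folio f = { .flags = 0, .order = 2 };

	return f.order == 2 ? 0 : 1;
}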
* [PATCH 24/28] mm: Move page->deferred_list to folio->_deferred_list
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (22 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 23/28] doc: Correct struct folio kernel-doc Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 25/28] mm/huge_memory: Remove page_deferred_list() Matthew Wilcox (Oracle)
` (3 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Remove the entire block of definitions for the second tail page,
and add the deferred list to the struct folio. This actually moves
_deferred_list to a different offset in struct folio because I don't
see a need to include the padding.
This lets us use list_for_each_entry_safe() in deferred_split_scan()
and avoid a number of calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/huge_mm.h | 9 ++++-----
include/linux/mm_types.h | 14 ++++++++------
mm/huge_memory.c | 32 +++++++++++++++-----------------
3 files changed, 27 insertions(+), 28 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index a1341fdcf666..aacfcb02606f 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -295,11 +295,10 @@ static inline bool thp_migration_supported(void)
static inline struct list_head *page_deferred_list(struct page *page)
{
- /*
- * See organization of tail pages of compound page in
- * "struct page" definition.
- */
- return &page[2].deferred_list;
+ struct folio *folio = (struct folio *)page;
+
+ VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
+ return &folio->_deferred_list;
}
#else /* CONFIG_TRANSPARENT_HUGEPAGE */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4b8aa0f8f9fe..c464205cf7ea 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -141,12 +141,6 @@ struct page {
struct { /* Tail pages of compound page */
unsigned long compound_head; /* Bit zero is set */
};
- struct { /* Second tail page of transparent huge page */
- unsigned long _compound_pad_1; /* compound_head */
- unsigned long _compound_pad_2;
- /* For both global and memcg */
- struct list_head deferred_list;
- };
struct { /* Second tail page of hugetlb page */
unsigned long _hugetlb_pad_1; /* compound_head */
void *hugetlb_subpool;
@@ -302,6 +296,7 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
* @_hugetlb_cgroup: Do not use directly, use accessor in hugetlb_cgroup.h.
* @_hugetlb_cgroup_rsvd: Do not use directly, use accessor in hugetlb_cgroup.h.
* @_hugetlb_hwpoison: Do not use directly, call raw_hwp_list_head().
+ * @_deferred_list: Folios to be split under memory pressure.
*
* A folio is a physically, virtually and logically contiguous set
* of bytes. It is a power-of-two in size, and it is aligned to that
@@ -366,6 +361,13 @@ struct folio {
void *_hugetlb_cgroup;
void *_hugetlb_cgroup_rsvd;
void *_hugetlb_hwpoison;
+ /* private: the union with struct page is transitional */
+ };
+ struct {
+ unsigned long _flags_2a;
+ unsigned long _head_2a;
+ /* public: */
+ struct list_head _deferred_list;
/* private: the union with struct page is transitional */
};
struct page __page_2;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bfa960f012fa..a4138daaa0b8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2756,9 +2756,9 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
/* Prevent deferred_split_scan() touching ->_refcount */
spin_lock(&ds_queue->split_queue_lock);
if (folio_ref_freeze(folio, 1 + extra_pins)) {
- if (!list_empty(page_deferred_list(&folio->page))) {
+ if (!list_empty(&folio->_deferred_list)) {
ds_queue->split_queue_len--;
- list_del(page_deferred_list(&folio->page));
+ list_del(&folio->_deferred_list);
}
spin_unlock(&ds_queue->split_queue_lock);
if (mapping) {
@@ -2873,8 +2873,8 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
struct pglist_data *pgdata = NODE_DATA(sc->nid);
struct deferred_split *ds_queue = &pgdata->deferred_split_queue;
unsigned long flags;
- LIST_HEAD(list), *pos, *next;
- struct page *page;
+ LIST_HEAD(list);
+ struct folio *folio, *next;
int split = 0;
#ifdef CONFIG_MEMCG
@@ -2884,14 +2884,13 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
/* Take pin on all head pages to avoid freeing them under us */
- list_for_each_safe(pos, next, &ds_queue->split_queue) {
- page = list_entry((void *)pos, struct page, deferred_list);
- page = compound_head(page);
- if (get_page_unless_zero(page)) {
- list_move(page_deferred_list(page), &list);
+ list_for_each_entry_safe(folio, next, &ds_queue->split_queue,
+ _deferred_list) {
+ if (folio_try_get(folio)) {
+ list_move(&folio->_deferred_list, &list);
} else {
- /* We lost race with put_compound_page() */
- list_del_init(page_deferred_list(page));
+ /* We lost race with folio_put() */
+ list_del_init(&folio->_deferred_list);
ds_queue->split_queue_len--;
}
if (!--sc->nr_to_scan)
@@ -2899,16 +2898,15 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
}
spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
- list_for_each_safe(pos, next, &list) {
- page = list_entry((void *)pos, struct page, deferred_list);
- if (!trylock_page(page))
+ list_for_each_entry_safe(folio, next, &list, _deferred_list) {
+ if (!folio_trylock(folio))
goto next;
/* split_huge_page() removes page from list on success */
- if (!split_huge_page(page))
+ if (!split_folio(folio))
split++;
- unlock_page(page);
+ folio_unlock(folio);
next:
- put_page(page);
+ folio_put(folio);
}
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
--
2.35.1
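The list_for_each_entry_safe() conversion works because the iterator caches the next node before the loop body runs, so the body may delete the current entry. A self-contained userspace approximation, open-coding just enough of <linux/list.h> (all toy_* names are invented):

#include <stddef.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

#define LIST_HEAD_INIT(name)	{ &(name), &(name) }
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

struct toy_folio { int id; struct list_head _deferred_list; };

int main(void)
{
	struct list_head queue = LIST_HEAD_INIT(queue);
	struct toy_folio folios[3] = { {0}, {1}, {2} };
	struct list_head *pos, *next;
	int i;

	for (i = 0; i < 3; i++)
		list_add_tail(&folios[i]._deferred_list, &queue);

	/* the "safe" walk: fetch next before possibly deleting pos */
	for (pos = queue.next, next = pos->next; pos != &queue;
	     pos = next, next = pos->next) {
		struct toy_folio *folio =
			container_of(pos, struct toy_folio, _deferred_list);

		printf("splitting folio %d\n", folio->id);
		list_del(pos);	/* safe: next was saved already */
	}
	return 0;
}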
* [PATCH 25/28] mm/huge_memory: Remove page_deferred_list()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (23 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 24/28] mm: Move page->deferred_list to folio->_deferred_list Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 26/28] mm/huge_memory: Convert get_deferred_split_queue() to take a folio Matthew Wilcox (Oracle)
` (2 subsequent siblings)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Use folio->_deferred_list directly.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/huge_mm.h | 8 --------
mm/huge_memory.c | 34 +++++++++++++++++-----------------
2 files changed, 17 insertions(+), 25 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index aacfcb02606f..b9978978a160 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -293,14 +293,6 @@ static inline bool thp_migration_supported(void)
return IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);
}
-static inline struct list_head *page_deferred_list(struct page *page)
-{
- struct folio *folio = (struct folio *)page;
-
- VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
- return &folio->_deferred_list;
-}
-
#else /* CONFIG_TRANSPARENT_HUGEPAGE */
#define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
#define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; })
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a4138daaa0b8..7aedfe7cf5df 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -580,12 +580,10 @@ static inline struct deferred_split *get_deferred_split_queue(struct page *page)
void prep_transhuge_page(struct page *page)
{
- /*
- * we use page->mapping and page->index in second tail page
- * as list_head: assuming THP order >= 2
- */
+ struct folio *folio = (struct folio *)page;
- INIT_LIST_HEAD(page_deferred_list(page));
+ VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
+ INIT_LIST_HEAD(&folio->_deferred_list);
set_compound_page_dtor(page, TRANSHUGE_PAGE_DTOR);
}
@@ -2802,13 +2800,14 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
void free_transhuge_page(struct page *page)
{
+ struct folio *folio = (struct folio *)page;
struct deferred_split *ds_queue = get_deferred_split_queue(page);
unsigned long flags;
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
- if (!list_empty(page_deferred_list(page))) {
+ if (!list_empty(&folio->_deferred_list)) {
ds_queue->split_queue_len--;
- list_del(page_deferred_list(page));
+ list_del(&folio->_deferred_list);
}
spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
free_compound_page(page);
@@ -2816,38 +2815,39 @@ void free_transhuge_page(struct page *page)
void deferred_split_huge_page(struct page *page)
{
+ struct folio *folio = page_folio(page);
struct deferred_split *ds_queue = get_deferred_split_queue(page);
#ifdef CONFIG_MEMCG
- struct mem_cgroup *memcg = page_memcg(compound_head(page));
+ struct mem_cgroup *memcg = folio_memcg(folio);
#endif
unsigned long flags;
- VM_BUG_ON_PAGE(!PageTransHuge(page), page);
+ VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
/*
* The try_to_unmap() in page reclaim path might reach here too,
* this may cause a race condition to corrupt deferred split queue.
- * And, if page reclaim is already handling the same page, it is
+ * And, if page reclaim is already handling the same folio, it is
* unnecessary to handle it again in shrinker.
*
- * Check PageSwapCache to determine if the page is being
- * handled by page reclaim since THP swap would add the page into
+ * Check the swapcache flag to determine if the folio is being
+ * handled by page reclaim since THP swap would add the folio into
* swap cache before calling try_to_unmap().
*/
- if (PageSwapCache(page))
+ if (folio_test_swapcache(folio))
return;
- if (!list_empty(page_deferred_list(page)))
+ if (!list_empty(&folio->_deferred_list))
return;
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
- if (list_empty(page_deferred_list(page))) {
+ if (list_empty(&folio->_deferred_list)) {
count_vm_event(THP_DEFERRED_SPLIT_PAGE);
- list_add_tail(page_deferred_list(page), &ds_queue->split_queue);
+ list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
ds_queue->split_queue_len++;
#ifdef CONFIG_MEMCG
if (memcg)
- set_shrinker_bit(memcg, page_to_nid(page),
+ set_shrinker_bit(memcg, folio_nid(folio),
deferred_split_shrinker.id);
#endif
}
--
2.35.1
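Worth noting for reviewers: the conversion keeps the double-checked queueing pattern intact, i.e. a cheap unlocked list_empty() bail-out followed by a recheck under split_queue_lock, since another CPU may queue the folio in between. A pthread-based sketch of that pattern, with invented names (build with -pthread):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static bool queued;	/* stands in for !list_empty(&folio->_deferred_list) */
static int queue_len;

static void toy_deferred_split(void)
{
	if (queued)		/* unlocked early exit, may be stale */
		return;

	pthread_mutex_lock(&queue_lock);
	if (!queued) {		/* authoritative recheck under the lock */
		queued = true;
		queue_len++;
	}
	pthread_mutex_unlock(&queue_lock);
}

int main(void)
{
	toy_deferred_split();
	toy_deferred_split();	/* second call is a no-op */
	printf("queue_len = %d\n", queue_len);	/* 1 */
	return 0;
}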
* [PATCH 26/28] mm/huge_memory: Convert get_deferred_split_queue() to take a folio
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (24 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 25/28] mm/huge_memory: Remove page_deferred_list() Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 27/28] mm: Convert deferred_split_huge_page() to deferred_split_folio() Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 28/28] mm: remove the hugetlb field from struct page Matthew Wilcox (Oracle)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Removes a few calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/huge_memory.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7aedfe7cf5df..c23b0e01734b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -559,10 +559,11 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
}
#ifdef CONFIG_MEMCG
-static inline struct deferred_split *get_deferred_split_queue(struct page *page)
+static inline
+struct deferred_split *get_deferred_split_queue(struct folio *folio)
{
- struct mem_cgroup *memcg = page_memcg(compound_head(page));
- struct pglist_data *pgdat = NODE_DATA(page_to_nid(page));
+ struct mem_cgroup *memcg = folio_memcg(folio);
+ struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));
if (memcg)
return &memcg->deferred_split_queue;
@@ -570,9 +571,10 @@ static inline struct deferred_split *get_deferred_split_queue(struct page *page)
return &pgdat->deferred_split_queue;
}
#else
-static inline struct deferred_split *get_deferred_split_queue(struct page *page)
+static inline
+struct deferred_split *get_deferred_split_queue(struct folio *folio)
{
- struct pglist_data *pgdat = NODE_DATA(page_to_nid(page));
+ struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));
return &pgdat->deferred_split_queue;
}
@@ -2650,7 +2652,7 @@ bool can_split_folio(struct folio *folio, int *pextra_pins)
int split_huge_page_to_list(struct page *page, struct list_head *list)
{
struct folio *folio = page_folio(page);
- struct deferred_split *ds_queue = get_deferred_split_queue(&folio->page);
+ struct deferred_split *ds_queue = get_deferred_split_queue(folio);
XA_STATE(xas, &folio->mapping->i_pages, folio->index);
struct anon_vma *anon_vma = NULL;
struct address_space *mapping = NULL;
@@ -2801,7 +2803,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
void free_transhuge_page(struct page *page)
{
struct folio *folio = (struct folio *)page;
- struct deferred_split *ds_queue = get_deferred_split_queue(page);
+ struct deferred_split *ds_queue = get_deferred_split_queue(folio);
unsigned long flags;
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
@@ -2816,7 +2818,7 @@ void free_transhuge_page(struct page *page)
void deferred_split_huge_page(struct page *page)
{
struct folio *folio = page_folio(page);
- struct deferred_split *ds_queue = get_deferred_split_queue(page);
+ struct deferred_split *ds_queue = get_deferred_split_queue(folio);
#ifdef CONFIG_MEMCG
struct mem_cgroup *memcg = folio_memcg(folio);
#endif
--
2.35.1
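The queue selection itself is a simple fallback: prefer the folio's memcg queue when one exists, otherwise use the per-node queue. A trivial userspace sketch with invented toy_* names:

#include <stdio.h>

struct toy_queue { const char *name; };

static struct toy_queue node_queue  = { "per-node" };
static struct toy_queue memcg_queue = { "per-memcg" };

static struct toy_queue *toy_get_queue(struct toy_queue *memcg)
{
	if (memcg)
		return memcg;
	return &node_queue;
}

int main(void)
{
	printf("%s\n", toy_get_queue(&memcg_queue)->name); /* per-memcg */
	printf("%s\n", toy_get_queue(NULL)->name);         /* per-node */
	return 0;
}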
* [PATCH 27/28] mm: Convert deferred_split_huge_page() to deferred_split_folio()
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (25 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 26/28] mm/huge_memory: Convert get_deferred_split_queue() to take a folio Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
2023-01-11 14:29 ` [PATCH 28/28] mm: remove the hugetlb field from struct page Matthew Wilcox (Oracle)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm, Hugh Dickins
Now that both callers use a folio, pass the folio in and save a
call to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
Documentation/mm/transhuge.rst | 6 +++---
include/linux/huge_mm.h | 4 ++--
mm/huge_memory.c | 3 +--
mm/rmap.c | 2 +-
4 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst
index 03bbd0a19041..a9608fe51649 100644
--- a/Documentation/mm/transhuge.rst
+++ b/Documentation/mm/transhuge.rst
@@ -153,8 +153,8 @@ clear where references should go after split: it will stay on the head page.
Note that split_huge_pmd() doesn't have any limitations on refcounting:
pmd can be split at any point and never fails.
-Partial unmap and deferred_split_huge_page()
-============================================
+Partial unmap and deferred_split_folio()
+========================================
Unmapping part of THP (with munmap() or other way) is not going to free
memory immediately. Instead, we detect that a subpage of THP is not in use
@@ -166,6 +166,6 @@ the place where we can detect partial unmap. It also might be
counterproductive since in many cases partial unmap happens during exit(2) if
a THP crosses a VMA boundary.
-The function deferred_split_huge_page() is used to queue a page for splitting.
+The function deferred_split_folio() is used to queue a folio for splitting.
The splitting itself will happen when we get memory pressure via shrinker
interface.
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index b9978978a160..70bd867eba94 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -187,7 +187,7 @@ static inline int split_huge_page(struct page *page)
{
return split_huge_page_to_list(page, NULL);
}
-void deferred_split_huge_page(struct page *page);
+void deferred_split_folio(struct folio *folio);
void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
unsigned long address, bool freeze, struct folio *folio);
@@ -340,7 +340,7 @@ static inline int split_huge_page(struct page *page)
{
return 0;
}
-static inline void deferred_split_huge_page(struct page *page) {}
+static inline void deferred_split_folio(struct folio *folio) {}
#define split_huge_pmd(__vma, __pmd, __address) \
do { } while (0)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c23b0e01734b..868fcccdff72 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2815,9 +2815,8 @@ void free_transhuge_page(struct page *page)
free_compound_page(page);
}
-void deferred_split_huge_page(struct page *page)
+void deferred_split_folio(struct folio *folio)
{
- struct folio *folio = page_folio(page);
struct deferred_split *ds_queue = get_deferred_split_queue(folio);
#ifdef CONFIG_MEMCG
struct mem_cgroup *memcg = folio_memcg(folio);
diff --git a/mm/rmap.c b/mm/rmap.c
index 462b334d6842..7f76fc40af9a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1427,7 +1427,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
*/
if (folio_test_pmd_mappable(folio) && folio_test_anon(folio))
if (!compound || nr < nr_pmdmapped)
- deferred_split_huge_page(&folio->page);
+ deferred_split_folio(folio);
}
/*
--
2.35.1
* [PATCH 28/28] mm: remove the hugetlb field from struct page
2023-01-11 14:28 [PATCH 00/28] Get rid of tail page fields Matthew Wilcox (Oracle)
` (26 preceding siblings ...)
2023-01-11 14:29 ` [PATCH 27/28] mm: Convert deferred_split_huge_page() to deferred_split_folio() Matthew Wilcox (Oracle)
@ 2023-01-11 14:29 ` Matthew Wilcox (Oracle)
27 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-01-11 14:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: Sidhartha Kumar, linux-mm, Hugh Dickins, Matthew Wilcox
From: Sidhartha Kumar <sidhartha.kumar@oracle.com>
commit dad6a5eb5556 ("mm,hugetlb: use folio fields in second tail page")
added a transitional hugetlb field to struct page and struct folio to make
room for another int in the first tail of a compound page. Hugetlb folio
conversions have changed all page users of this field to use the fields
within the folio, so struct page no longer needs this hugetlb-specific
field.
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
include/linux/mm_types.h | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c464205cf7ea..9932a4cd5b42 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -141,14 +141,6 @@ struct page {
struct { /* Tail pages of compound page */
unsigned long compound_head; /* Bit zero is set */
};
- struct { /* Second tail page of hugetlb page */
- unsigned long _hugetlb_pad_1; /* compound_head */
- void *hugetlb_subpool;
- void *hugetlb_cgroup;
- void *hugetlb_cgroup_rsvd;
- void *hugetlb_hwpoison;
- /* No more space on 32-bit: use third tail if more */
- };
struct { /* Page table pages */
unsigned long _pt_pad_1; /* compound_head */
pgtable_t pmd_huge_pte; /* protected by page->ptl */
@@ -399,10 +391,6 @@ FOLIO_MATCH(compound_head, _head_1);
offsetof(struct page, pg) + 2 * sizeof(struct page))
FOLIO_MATCH(flags, _flags_2);
FOLIO_MATCH(compound_head, _head_2);
-FOLIO_MATCH(hugetlb_subpool, _hugetlb_subpool);
-FOLIO_MATCH(hugetlb_cgroup, _hugetlb_cgroup);
-FOLIO_MATCH(hugetlb_cgroup_rsvd, _hugetlb_cgroup_rsvd);
-FOLIO_MATCH(hugetlb_hwpoison, _hugetlb_hwpoison);
#undef FOLIO_MATCH
/*
--
2.35.1