* [PATCH v3 0/2] mm: Don't set and reset page count in MEMINIT_EARLY
@ 2023-09-26 2:33 Yajun Deng
From: Yajun Deng @ 2023-09-26 2:33 UTC (permalink / raw)
To: akpm, rppt
Cc: mike.kravetz, muchun.song, willy, david, linux-mm, linux-kernel,
Yajun Deng
__init_single_page() sets the page count and __free_pages_core() resets
it again. Many pages don't need either operation in the MEMINIT_EARLY
context, so the extra work is unnecessary and time-consuming.

The 1st patch passes the page count and reserved flags to
__init_single_page(). It only prepares for the 2nd patch and doesn't
change any behaviour.

The 2nd patch sets the page count only for the reserved regions instead
of all regions.
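
Roughly, the interface after patch 1 looks like this (illustrative
sketch only; the flag values and the function body are abbreviated and
approximate, see the patch itself for the exact code):

/* Illustrative sketch -- see mm/internal.h in patch 1 for the real enum. */
enum init_page_flags {
	INIT_PAGE_COUNT    = (1 << 0),	/* set the initial page count */
	INIT_PAGE_RESERVED = (1 << 1),	/* mark the page PG_reserved  */
};

static void __meminit __init_single_page(struct page *page, unsigned long pfn,
					 unsigned long zone, int nid,
					 enum init_page_flags flags)
{
	mm_zero_struct_page(page);
	set_page_links(page, zone, nid, pfn);
	page_mapcount_reset(page);
	page_cpupid_reset_last(page);
	page_kasan_tag_reset(page);
	INIT_LIST_HEAD(&page->lru);

	if (flags & INIT_PAGE_COUNT)
		init_page_count(page);
	if (flags & INIT_PAGE_RESERVED)
		__SetPageReserved(page);
}

Callers that want today's behaviour pass INIT_PAGE_COUNT (and
INIT_PAGE_RESERVED where applicable); callers that don't need the page
count pass 0.
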
Yajun Deng (2):
mm: pass page count and reserved to __init_single_page
mm: Init page count in reserve_bootmem_region when MEMINIT_EARLY
mm/hugetlb.c | 2 +-
mm/internal.h | 8 +++++++-
mm/mm_init.c | 45 ++++++++++++++++++++++++++++-----------------
mm/page_alloc.c | 20 ++++++++++++--------
4 files changed, 48 insertions(+), 27 deletions(-)
--
2.25.1
* [PATCH v3 2/2] mm: Init page count in reserve_bootmem_region when MEMINIT_EARLY
@ 2023-09-26 2:33 ` Yajun Deng
From: Yajun Deng @ 2023-09-26 2:33 UTC (permalink / raw)
To: akpm, rppt
Cc: mike.kravetz, muchun.song, willy, david, linux-mm, linux-kernel,
Yajun Deng
memmap_init_range() sets the page count of all pages, but the count of
free pages is then reset in __free_pages_core(). These are opposite
operations, and doing both is unnecessary and time-consuming in the
MEMINIT_EARLY context.

Initialize the page count in reserve_bootmem_region() when in the
MEMINIT_EARLY context, and check the page count before resetting it.

At the same time, the INIT_LIST_HEAD() in reserve_bootmem_region()
isn't needed, as it is already done in __init_single_page().
The following timings were measured on an x86 machine with 190GB of RAM:
before:
free_low_memory_core_early() 341ms
after:
free_low_memory_core_early() 285ms
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
---
v3: same as v2.
v2: check page count instead of check context before reset it.
v1: https://lore.kernel.org/all/20230922070923.355656-1-yajun.deng@linux.dev/
---
mm/mm_init.c | 18 +++++++++++++-----
mm/page_alloc.c | 20 ++++++++++++--------
2 files changed, 25 insertions(+), 13 deletions(-)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 07fe7e489769..af1b3e7b0f52 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -718,7 +718,7 @@ static void __meminit init_reserved_page(unsigned long pfn, int nid)
if (zone_spans_pfn(zone, pfn))
break;
}
- __init_single_page(pfn_to_page(pfn), pfn, zid, nid, INIT_PAGE_COUNT);
+ __init_single_page(pfn_to_page(pfn), pfn, zid, nid, 0);
}
#else
static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
@@ -756,8 +756,8 @@ void __meminit reserve_bootmem_region(phys_addr_t start,
init_reserved_page(start_pfn, nid);
- /* Avoid false-positive PageTail() */
- INIT_LIST_HEAD(&page->lru);
+ /* Set page count for the reserved region */
+ init_page_count(page);
/*
* no need for atomic set_bit because the struct
@@ -888,9 +888,17 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
}
page = pfn_to_page(pfn);
- __init_single_page(page, pfn, zone, nid, INIT_PAGE_COUNT);
- if (context == MEMINIT_HOTPLUG)
+
+ /*
+ * In the MEMINIT_EARLY context, the page count is set and the page is
+ * marked reserved in reserve_bootmem_region(); free pages keep a zero
+ * count, which __free_pages_core() checks before resetting it.
+ */
+ __init_single_page(page, pfn, zone, nid, 0);
+ if (context == MEMINIT_HOTPLUG) {
+ init_page_count(page);
__SetPageReserved(page);
+ }
/*
* Usually, we want to mark the pageblock MIGRATE_MOVABLE,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 06be8821d833..b868caabe8dc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1285,18 +1285,22 @@ void __free_pages_core(struct page *page, unsigned int order)
unsigned int loop;
/*
- * When initializing the memmap, __init_single_page() sets the refcount
- * of all pages to 1 ("allocated"/"not free"). We have to set the
- * refcount of all involved pages to 0.
+ * When initializing the memmap, memmap_init_range() sets the refcount
+ * of all pages to 1 ("reserved" and "free") in the hotplug context, so
+ * we have to set the refcount of all involved pages to 0 here. In the
+ * early context, reserve_bootmem_region() sets the refcount only for
+ * reserved regions, so free pages can be skipped.
*/
- prefetchw(p);
- for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
- prefetchw(p + 1);
+ if (page_count(page)) {
+ prefetchw(p);
+ for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
+ prefetchw(p + 1);
+ __ClearPageReserved(p);
+ set_page_count(p, 0);
+ }
__ClearPageReserved(p);
set_page_count(p, 0);
}
- __ClearPageReserved(p);
- set_page_count(p, 0);
atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
--
2.25.1
* Re: [PATCH v3 1/2] mm: pass page count and reserved to __init_single_page
@ 2023-09-26 7:44 ` David Hildenbrand
From: David Hildenbrand @ 2023-09-26 7:44 UTC (permalink / raw)
To: Yajun Deng, akpm, rppt
Cc: mike.kravetz, muchun.song, willy, linux-mm, linux-kernel
On 26.09.23 04:33, Yajun Deng wrote:
> When we init a single page, we need to mark this page reserved if it
> does.
I failed to parse the last part of this sentence.
> And some pages may not need to set page count, such as compound
> pages.
Usually, the refcount of all tail pages *must* be zero. Otherwise,
get_page_unless_zero() would work on tail pages.
Can you elaborate why it should be okay here?
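
For reference, get_page_unless_zero() is essentially the following
(simplified from include/linux/mm.h):

/* Takes a reference only if the refcount is currently non-zero. */
static inline bool get_page_unless_zero(struct page *page)
{
	return page_ref_add_unless(page, 1, 0);
}

So a tail page left with a refcount of 1 could be "grabbed" here, which
must never happen.
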
--
Cheers,
David / dhildenb
* Re: [PATCH v3 1/2] mm: pass page count and reserved to __init_single_page
@ 2023-09-26 7:57 ` Yajun Deng
From: Yajun Deng @ 2023-09-26 7:57 UTC (permalink / raw)
To: David Hildenbrand, akpm, rppt
Cc: mike.kravetz, muchun.song, willy, linux-mm, linux-kernel
On 2023/9/26 15:44, David Hildenbrand wrote:
> On 26.09.23 04:33, Yajun Deng wrote:
>> When we init a single page, we need to mark this page reserved if it
>> does.
>
> I failed to parse the last part of this sentence.
>
>> And some pages may not need to set page count, such as compound
>> pages.
>
> Usually, the refcount of all tail pages *must* be zero. Otherwise,
> get_page_unless_zero() would work on tail pages.
>
> Can you elaborate why it should be okay here?
>
>
It refers to the following code in memmap_init_compound():
- __init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
+ __init_zone_device_page(page, pfn, zone_idx, nid, pgmap, 0);
prep_compound_tail(head, pfn - head_pfn);
set_page_count(page, 0);
As we can see, the page count is reset right afterwards by
set_page_count(page, 0), so there is no need to set it in
__init_zone_device_page(). I wasn't clear enough about this in the
commit message.

Maybe we could remove the set_page_count(page, 0) as well, but I didn't
do that, just to be safe.
* Re: [PATCH v3 1/2] mm: pass page count and reserved to __init_single_page
@ 2023-09-28 5:30 ` Mike Rapoport
From: Mike Rapoport @ 2023-09-28 5:30 UTC (permalink / raw)
To: Yajun Deng
Cc: akpm, mike.kravetz, muchun.song, willy, david, linux-mm, linux-kernel
On Tue, Sep 26, 2023 at 10:33:40AM +0800, Yajun Deng wrote:
> When we init a single page, we need to mark this page reserved if it
> does. And some pages may not need to set page count, such as compound
> pages.
>
> Introduce enum init_page_flags, the caller init page count and mark page
> reserved by passing INIT_PAGE_COUNT and INIT_PAGE_RESERVED.
>
> Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
> ---
> v3: Introduce enum init_page_flags.
> v2: Introduce INIT_PAGE_COUNT and INIT_PAGE_RESERVED.
> v1: https://lore.kernel.org/all/20230922070923.355656-1-yajun.deng@linux.dev/
> ---
> mm/hugetlb.c | 2 +-
> mm/internal.h | 8 +++++++-
> mm/mm_init.c | 31 +++++++++++++++++--------------
> 3 files changed, 25 insertions(+), 16 deletions(-)
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 06a72c223bce..07fe7e489769 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1041,7 +1043,7 @@ static void __ref memmap_init_compound(struct page *head,
> for (pfn = head_pfn + 1; pfn < end_pfn; pfn++) {
> struct page *page = pfn_to_page(pfn);
>
> - __init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
> + __init_zone_device_page(page, pfn, zone_idx, nid, pgmap, 0);
I think the first patch should not contain any functional changes, but only
add the flags parameter to __init_single_page().
> prep_compound_tail(head, pfn - head_pfn);
> set_page_count(page, 0);
>
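
That is, patch 1 would keep every existing caller's behaviour identical,
e.g. (illustrative sketch only, not the actual patch):

/* Pass INIT_PAGE_COUNT everywhere in patch 1; drop it only in patch 2. */
__init_zone_device_page(page, pfn, zone_idx, nid, pgmap, INIT_PAGE_COUNT);
prep_compound_tail(head, pfn - head_pfn);
set_page_count(page, 0);
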
--
Sincerely yours,
Mike.