linux-mm.kvack.org archive mirror
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Cc: Zi Yan <ziy@nvidia.com>, David Hildenbrand <david@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Yang Shi <shy828301@gmail.com>,
	David Rientjes <rientjes@google.com>,
	James Houghton <jthoughton@google.com>,
	Mike Rapoport <rppt@kernel.org>,
	Muchun Song <songmuchun@bytedance.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org
Subject: [PATCH v1 09/12] mm: Make MAX_ORDER of buddy allocator configurable via Kconfig SET_MAX_ORDER.
Date: Wed, 21 Sep 2022 21:12:49 -0400
Message-ID: <20220922011252.2266780-10-zi.yan@sent.com>
In-Reply-To: <20220922011252.2266780-1-zi.yan@sent.com>

From: Zi Yan <ziy@nvidia.com>

With SPARSEMEM_VMEMMAP, all struct pages are virtually contiguous, so
the kernel can manipulate arbitrarily large pages. Because PFN validity
is checked during buddy page merging, every free page in the buddy
allocator's free areas spans contiguous PFNs, even if the system has
several memory sections that are not physically contiguous. With these
two conditions, it is OK to remove the restriction of
MAX_ORDER + PAGE_SHIFT <= SECTION_SIZE_BITS and change MAX_ORDER freely.
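
To make the size of the lifted restriction concrete, here is a small
sketch of the arithmetic behind the #error check in
include/linux/mmzone.h, which rejects MAX_ORDER + PAGE_SHIFT >
SECTION_SIZE_BITS. The EX_* values are assumptions chosen for
illustration (x86_64 with 4 KiB pages), and the helper names are
hypothetical, not kernel symbols.

```c
#include <assert.h>

/*
 * Illustrative values only (assumed: x86_64 with 4 KiB pages).
 * Other architectures and page sizes use different numbers.
 */
#define EX_PAGE_SHIFT		12
#define EX_SECTION_SIZE_BITS	27

/* Nonzero if a 2^order-page block fits within a single memory section. */
static int order_fits_in_section(unsigned int order)
{
	return order + EX_PAGE_SHIFT <= EX_SECTION_SIZE_BITS;
}

/* Largest order that the section-size restriction would still accept. */
static unsigned int max_order_within_section(void)
{
	assert(EX_SECTION_SIZE_BITS >= EX_PAGE_SHIFT);
	return EX_SECTION_SIZE_BITS - EX_PAGE_SHIFT;
}

/* Size in bytes of a 2^order-page block. */
static unsigned long long block_bytes(unsigned int order)
{
	return 1ULL << (order + EX_PAGE_SHIFT);
}
```

With these assumed values, order 10 is a 4 MiB block, the restriction
caps the order at 15 (one 128 MiB section), and an order-18 block
(1 GiB) would be rejected.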

Add SET_MAX_ORDER to allow MAX_ORDER adjustment when the arch does not
set its own MAX_ORDER via ARCH_FORCE_MAX_ORDER. Make it depend on
SPARSEMEM_VMEMMAP, under which MAX_ORDER is not limited by
SECTION_SIZE_BITS.
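
The resulting precedence can be modeled at runtime as below. This is
only a sketch of the selection logic: the real choice is made by
Kconfig and the preprocessor, and the function names and the
SET_MAX_ORDER_UNSET marker are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define DEFAULT_MAX_ORDER	10
/* Hypothetical marker for "CONFIG_SET_MAX_ORDER is not defined". */
#define SET_MAX_ORDER_UNSET	(-1)

/* Models "depends on SPARSEMEM_VMEMMAP && (ARCH_FORCE_MAX_ORDER = 0)". */
static bool set_max_order_available(bool sparsemem_vmemmap,
				    int arch_force_max_order)
{
	return sparsemem_vmemmap && arch_force_max_order == 0;
}

/* Models the #ifdef / #elif / #else chain that defines MAX_ORDER. */
static int pick_max_order(int set_max_order, int arch_force_max_order)
{
	/* Kconfig restricts SET_MAX_ORDER to "range 10 255". */
	assert(set_max_order == SET_MAX_ORDER_UNSET ||
	       (set_max_order >= 10 && set_max_order <= 255));

	if (set_max_order != SET_MAX_ORDER_UNSET)
		return set_max_order;		/* user-selected order */
	if (arch_force_max_order != 0)
		return arch_force_max_order;	/* arch-forced order */
	return DEFAULT_MAX_ORDER;		/* historical default */
}
```

For instance, pick_max_order(SET_MAX_ORDER_UNSET, 0) yields the default
of 10, while an arch-forced value such as 11 wins whenever
SET_MAX_ORDER is unavailable.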

Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 arch/Kconfig           |  4 ++++
 include/linux/mmzone.h | 17 ++++++++++++++---
 mm/Kconfig             | 14 ++++++++++++++
 mm/internal.h          |  2 --
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 1c2599618eeb..e51c759a82ad 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -11,6 +11,10 @@ source "arch/$(SRCARCH)/Kconfig"
 
 menu "General architecture-dependent options"
 
+config ARCH_FORCE_MAX_ORDER
+	int
+	default "0"
+
 config CRASH_CORE
 	bool
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index de1548f4fc07..da5745fa15c3 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -24,11 +24,14 @@
 #include <asm/page.h>
 
 /* Free memory management - zoned buddy allocator.  */
-#ifndef CONFIG_ARCH_FORCE_MAX_ORDER
-#define MAX_ORDER 10
-#else
+#ifdef CONFIG_SET_MAX_ORDER
+#define MAX_ORDER CONFIG_SET_MAX_ORDER
+#elif CONFIG_ARCH_FORCE_MAX_ORDER != 0
 #define MAX_ORDER CONFIG_ARCH_FORCE_MAX_ORDER
+#else
+#define MAX_ORDER 10
 #endif
+
 #define MAX_ORDER_NR_PAGES (1 << MAX_ORDER)
 
 /*
@@ -1588,9 +1591,17 @@ static inline bool movable_only_nodes(nodemask_t *nodes)
 #define SECTION_BLOCKFLAGS_BITS \
 	((1UL << (PFN_SECTION_SHIFT - pageblock_order)) * NR_PAGEBLOCK_BITS)
 
+/*
+ * The MAX_ORDER check is not necessary when CONFIG_SET_MAX_ORDER is set, since
+ * it depends on CONFIG_SPARSEMEM_VMEMMAP, where all struct pages are virtually
+ * contiguous, so pages larger than a section can be allocated and manipulated
+ * without worrying about non-contiguous struct pages.
+ */
+#ifndef CONFIG_SET_MAX_ORDER
 #if (MAX_ORDER + PAGE_SHIFT) > SECTION_SIZE_BITS
 #error Allocator MAX_ORDER exceeds SECTION_SIZE
 #endif
+#endif /* CONFIG_SET_MAX_ORDER */
 
 static inline unsigned long pfn_to_section_nr(unsigned long pfn)
 {
diff --git a/mm/Kconfig b/mm/Kconfig
index ae6711d24e4a..9c7280acd528 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -435,6 +435,20 @@ config SPARSEMEM_VMEMMAP
 	  pfn_to_page and page_to_pfn operations.  This is the most
 	  efficient option when sufficient kernel resources are available.
 
+config SET_MAX_ORDER
+	int "Set maximum order of buddy allocator"
+	depends on SPARSEMEM_VMEMMAP && (ARCH_FORCE_MAX_ORDER = 0)
+	range 10 255
+	default "10"
+	help
+	  The kernel memory allocator divides physically contiguous memory
+	  blocks into "zones", where each zone is a power of two number of
+	  pages.  This option selects the largest power of two that the kernel
+	  keeps in the memory allocator.  If you need to allocate very large
+	  blocks of physically contiguous memory, then you may need to
+	  increase this value. A value of 10 means that the largest free memory
+	  block is 2^10 pages.
+
 config HAVE_MEMBLOCK_PHYS_MAP
 	bool
 
diff --git a/mm/internal.h b/mm/internal.h
index 1b1abfc2196e..1c3f260930d8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -303,8 +303,6 @@ static inline bool page_is_buddy(struct page *page, struct page *buddy,
  * 2) Any buddy B will have an order O+1 parent P which
  * satisfies the following equation:
  *     P = B & ~(1 << O)
- *
- * Assumption: *_mem_map is contiguous at least up to MAX_PHYS_CONTIG_ORDER
  */
 static inline unsigned long
 __find_buddy_pfn(unsigned long page_pfn, unsigned int order)
-- 
2.35.1
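
For readers following the mm/internal.h hunk, the buddy and parent
relations in that comment can be sketched as plain PFN arithmetic. The
parent formula is the one quoted in the hunk; the XOR for the buddy is
the comment's first rule, which lies outside the quoted context and is
stated here from my understanding of the kernel source. The function
names are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <limits.h>

/* Twin of the block starting at pfn, both of size 2^order pages. */
static unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return pfn ^ (1UL << order);
}

/* Start PFN of the order+1 block formed by merging pfn with its buddy. */
static unsigned long parent_pfn(unsigned long pfn, unsigned int order)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return pfn & ~(1UL << order);
}
```

At order 3, the block at PFN 8 has its buddy at PFN 0, and the merged
order-4 block starts at PFN 0. Nothing in this arithmetic guarantees
that the buddy PFN is backed by valid memory, which is why the series
checks PFN validity during merging.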



Thread overview: 13+ messages
2022-09-22  1:12 [PATCH v1 00/12] Make MAX_ORDER adjustable as a kernel boot time parameter Zi Yan
2022-09-22  1:12 ` [PATCH v1 01/12] mm: rectify MAX_ORDER semantics to be the largest page order from buddy allocator Zi Yan
2022-09-22  1:12 ` [PATCH v1 02/12] mm: check page validity when find a buddy page in a non-contiguous zone Zi Yan
2022-09-22  1:12 ` [PATCH v1 03/12] mm: adapt deferred struct page init to new MAX_ORDER Zi Yan
2022-09-22  1:12 ` [PATCH v1 04/12] mm: prevent pageblock size being larger than section size Zi Yan
2022-09-22  1:12 ` [PATCH v1 05/12] fs: proc: use pageblock_nr_pages for reschedule period in read_kcore() Zi Yan
2022-09-22  1:12 ` [PATCH v1 06/12] virtio: virtio_balloon: use pageblock_order instead of MAX_ORDER Zi Yan
2022-09-22  1:12 ` [PATCH v1 07/12] mm/page_reporting: set page_reporting_order to -1 to prevent it running Zi Yan
2022-09-22  1:12 ` [PATCH v1 08/12] mm: replace MAX_ORDER when it is used to indicate max physical contiguity Zi Yan
2022-09-22  1:12 ` [PATCH v1 09/12] mm: Make MAX_ORDER of buddy allocator configurable via Kconfig SET_MAX_ORDER Zi Yan [this message]
2022-09-22  1:12 ` [PATCH v1 10/12] mm: convert MAX_ORDER sized static arrays to dynamic ones Zi Yan
2022-09-22  1:12 ` [PATCH v1 11/12] mm: introduce MIN_MAX_ORDER to replace MAX_ORDER as compile time constant Zi Yan
2022-09-22  1:12 ` [PATCH v1 12/12] mm: make MAX_ORDER a kernel boot time parameter Zi Yan
