linux-mm.kvack.org archive mirror
* [RFC] Scale slub page allocations with memory size
From: Matthew Wilcox @ 2018-04-25  4:47 UTC
  To: linux-mm; +Cc: Christopher Lameter

From: Matthew Wilcox <mawilcox@microsoft.com>

With larger memory sizes, it's more important to avoid external
fragmentation than reduce memory usage.
    
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

diff --git a/mm/internal.h b/mm/internal.h
index 62d8c34e63d5..fe0e60b8db11 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -167,6 +167,7 @@ extern void prep_compound_page(struct page *page, unsigned int order);
 extern void post_alloc_hook(struct page *page, unsigned int order,
 					gfp_t gfp_flags);
 extern int user_min_free_kbytes;
+extern unsigned long __meminitdata nr_kernel_pages;
 
 extern void set_zone_contiguous(struct zone *zone);
 extern void clear_zone_contiguous(struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 905db9d7962f..7db8945bc915 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -265,7 +265,7 @@ int min_free_kbytes = 1024;
 int user_min_free_kbytes = -1;
 int watermark_scale_factor = 10;
 
-static unsigned long nr_kernel_pages __meminitdata;
+unsigned long nr_kernel_pages __meminitdata;
 static unsigned long nr_all_pages __meminitdata;
 static unsigned long dma_reserve __meminitdata;
 
diff --git a/mm/slub.c b/mm/slub.c
index 44aa7847324a..61a423e38dcf 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3195,7 +3195,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_bulk);
  * and increases the number of allocations possible without having to
  * take the list_lock.
  */
-static unsigned int slub_min_order;
+static unsigned int slub_min_order = ~0U;
 static unsigned int slub_max_order = PAGE_ALLOC_COSTLY_ORDER;
 static unsigned int slub_min_objects;
 
@@ -4221,6 +4221,23 @@ void __init kmem_cache_init(void)
 
 	if (debug_guardpage_minorder())
 		slub_max_order = 0;
+	if (slub_min_order == ~0) {
+		unsigned long numpages = nr_kernel_pages;
+
+		/*
+		 * Above a million pages, we start to care more about
+		 * fragmentation than about using the minimum amount of
+		 * memory.  Scale the slub page size at half the rate of
+		 * the memory size; at 4GB we double the page size to 8k,
+		 * 16GB to 16k, 64GB to 32k, 256GB to 64k.
+		 */
+		do {
+			slub_min_order++;
+			if (slub_min_order == slub_max_order)
+				break;
+			numpages /= 4;
+		} while (numpages > (1UL << 20));
+	}
 
 	kmem_cache_node = &boot_kmem_cache_node;
 	kmem_cache = &boot_kmem_cache;
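
To see which order the new loop in kmem_cache_init() actually selects, here is a minimal userspace sketch that copies the loop and evaluates it for a few memory sizes. It is not part of the patch. The `sketch_*` names are hypothetical, and it assumes 4 KiB base pages, a cap of 3 (the value of PAGE_ALLOC_COSTLY_ORDER), and that nr_kernel_pages equals total RAM in pages, which is only an approximation.

```c
#include <stdio.h>

/*
 * Assumptions for illustration only: 4 KiB base pages, and a cap of 3,
 * the value PAGE_ALLOC_COSTLY_ORDER has in the kernel.
 */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_MAX_ORDER  3U

/* Hypothetical helper: number of 4 KiB pages in `gib` GiB of memory. */
unsigned long sketch_pages_for_gib(unsigned long gib)
{
	return gib << (30 - SKETCH_PAGE_SHIFT);
}

/* Userspace copy of the loop added to kmem_cache_init() in the patch. */
unsigned int sketch_min_order(unsigned long numpages, unsigned int max_order)
{
	unsigned int order = ~0U;	/* same sentinel value as the patch */

	do {
		order++;		/* the first pass wraps ~0U around to 0 */
		if (order == max_order)
			break;
		numpages /= 4;
	} while (numpages > (1UL << 20));

	return order;
}

/* Print the resulting order and slab page size for a few memory sizes. */
void sketch_print_table(void)
{
	static const unsigned long gib[] = { 4, 16, 32, 64, 128, 256, 512 };
	size_t i;

	for (i = 0; i < sizeof(gib) / sizeof(gib[0]); i++) {
		unsigned int order =
			sketch_min_order(sketch_pages_for_gib(gib[i]),
					 SKETCH_MAX_ORDER);

		printf("%4lu GiB -> order %u (%lu KiB)\n", gib[i], order,
		       (1UL << (SKETCH_PAGE_SHIFT + order)) >> 10);
	}
}
```

Calling sketch_print_table() from a main() prints the table. Under these assumptions the loop stays at order 0 up to and including 16 GiB, picks order 1 (8 KiB) for 32 and 64 GiB, order 2 (16 KiB) for 128 and 256 GiB, and reaches the cap of order 3 (32 KiB) at 512 GiB. That is one doubling later than the thresholds listed in the patch's comment, and 64k (order 4) is not reachable while the cap is 3. Since nr_kernel_pages can be smaller than total RAM in pages, the real thresholds may sit somewhat higher.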


* Re: [RFC] Scale slub page allocations with memory size
From: Christopher Lameter @ 2018-04-25 19:13 UTC
  To: Matthew Wilcox; +Cc: linux-mm

On Tue, 24 Apr 2018, Matthew Wilcox wrote:

> From: Matthew Wilcox <mawilcox@microsoft.com>
>
> With larger memory sizes, it's more important to avoid external
> fragmentation than reduce memory usage.

If you do that then the higher order pages that we will then be using will
be exhausted faster. I think we need a generic fix to be able to preserve
higher order pages first.

Dave Hansen and I thought about a 2M basepage configuration?

Something between 4k and 2M would be better, but the hardware won't
support that, and given that we can have terabytes in a server this may
be feasible now.

Or make order 0 a 64k page, as on ARM64 and Power, and then handle
multiple ptes like the implementation by Hugh years ago.
