* [PATCH] mm: Try harder to allocate vmemmap blocks
From: Ben Hutchings @ 2013-02-14  4:34 UTC (permalink / raw)
  To: linux-mm


Hot-adding memory on x86_64 normally requires huge page allocation.
When this is done to a VM guest, it's usually because the system is
already tight on memory, so the request tends to fail.  Try to avoid
this by adding __GFP_REPEAT to the allocation flags.

Reported-and-tested-by: Bernhard Schmidt <Bernhard.Schmidt@lrz.de>
Reference: http://bugs.debian.org/699913
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
We could go even further and use __GFP_NOFAIL, but I'm not sure whether
that would be a good idea.
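
For comparison, the __GFP_NOFAIL variant would only differ in the flag
(a hypothetical sketch, not part of this patch): __GFP_REPEAT retries
harder but may still fail, while __GFP_NOFAIL makes the allocator loop
until it succeeds, which could hang hot-add entirely on a memory-tight
guest.

	/* Hypothetical alternative, NOT proposed here: loops forever
	 * in the allocator instead of eventually giving up. */
	page = alloc_pages_node(node,
			GFP_KERNEL | __GFP_ZERO | __GFP_NOFAIL,
			get_order(size));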

Ben.

 mm/sparse-vmemmap.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 1b7e22a..22b7e18 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -53,10 +53,12 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 		struct page *page;
 
 		if (node_state(node, N_HIGH_MEMORY))
-			page = alloc_pages_node(node,
-				GFP_KERNEL | __GFP_ZERO, get_order(size));
+			page = alloc_pages_node(
+				node, GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT,
+				get_order(size));
 		else
-			page = alloc_pages(GFP_KERNEL | __GFP_ZERO,
+			page = alloc_pages(
+				GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT,
 				get_order(size));
 		if (page)
 			return page_address(page);
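
(For scale: the huge-page path above allocates PMD_SIZE blocks, i.e.
2 MiB on x86-64, so get_order(size) is 9: a 512-page contiguous
allocation, which is exactly the kind of request a fragmented,
memory-tight guest fails first.)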



* Re: [PATCH] mm: Try harder to allocate vmemmap blocks
From: Johannes Weiner @ 2013-02-14  6:40 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: linux-mm

On Thu, Feb 14, 2013 at 04:34:28AM +0000, Ben Hutchings wrote:
> Hot-adding memory on x86_64 normally requires huge page allocation.
> When this is done to a VM guest, it's usually because the system is
> already tight on memory, so the request tends to fail.  Try to avoid
> this by adding __GFP_REPEAT to the allocation flags.
> 
> Reported-and-tested-by: Bernhard Schmidt <Bernhard.Schmidt@lrz.de>
> Reference: http://bugs.debian.org/699913
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

> We could go even further and use __GFP_NOFAIL, but I'm not sure
> whether that would be a good idea.

If __GFP_REPEAT is not enough, I'd rather fall back to regular page
backing at this point:

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2ead3c8..1f5301d 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -919,6 +919,7 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node)
 {
 	unsigned long addr = (unsigned long)start_page;
 	unsigned long end = (unsigned long)(start_page + size);
+	int use_huge = cpu_has_pse;
 	unsigned long next;
 	pgd_t *pgd;
 	pud_t *pud;
@@ -934,8 +935,8 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node)
 		pud = vmemmap_pud_populate(pgd, addr, node);
 		if (!pud)
 			return -ENOMEM;
-
-		if (!cpu_has_pse) {
+retry_pmd:
+		if (!use_huge) {
 			next = (addr + PAGE_SIZE) & PAGE_MASK;
 			pmd = vmemmap_pmd_populate(pud, addr, node);
 
@@ -957,8 +958,10 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node)
 				pte_t entry;
 
 				p = vmemmap_alloc_block_buf(PMD_SIZE, node);
-				if (!p)
-					return -ENOMEM;
+				if (!p) {
+					use_huge = 0;
+					goto retry_pmd;
+				}
 
 				entry = pfn_pte(__pa(p) >> PAGE_SHIFT,
 						PAGE_KERNEL_LARGE);


* Re: [PATCH] mm: Try harder to allocate vmemmap blocks
From: Ben Hutchings @ 2013-02-15  1:47 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: linux-mm

[-- Attachment #1: Type: text/plain, Size: 2387 bytes --]

On Thu, 2013-02-14 at 01:40 -0500, Johannes Weiner wrote:
> On Thu, Feb 14, 2013 at 04:34:28AM +0000, Ben Hutchings wrote:
> > Hot-adding memory on x86_64 normally requires huge page allocation.
> > When this is done to a VM guest, it's usually because the system is
> > already tight on memory, so the request tends to fail.  Try to avoid
> > this by adding __GFP_REPEAT to the allocation flags.
> > 
> > Reported-and-tested-by: Bernhard Schmidt <Bernhard.Schmidt@lrz.de>
> > Reference: http://bugs.debian.org/699913
> > Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> > We could go even further and use __GFP_NOFAIL, but I'm not sure
> > whether that would be a good idea.
> 
> If __GFP_REPEAT is not enough, I'd rather fall back to regular page
> backing at this point:

Oh yes, I had considered doing that before settling on __GFP_REPEAT.  It
does seem worth doing.  Perhaps you could also log a specific warning,
as the use of 4K page entries for this could have a significant
performance impact.
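
Something along these lines, perhaps (an untested sketch on top of
your patch, assuming pr_warn_once() is the right tool here):

	p = vmemmap_alloc_block_buf(PMD_SIZE, node);
	if (!p) {
		/* Falling back to 4K vmemmap pages costs TLB
		 * entries, so complain loudly, but only once. */
		pr_warn_once("vmemmap: PMD_SIZE allocation failed, using 4K pages\n");
		use_huge = 0;
		goto retry_pmd;
	}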

Ben.

> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 2ead3c8..1f5301d 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -919,6 +919,7 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node)
>  {
>  	unsigned long addr = (unsigned long)start_page;
>  	unsigned long end = (unsigned long)(start_page + size);
> +	int use_huge = cpu_has_pse;
>  	unsigned long next;
>  	pgd_t *pgd;
>  	pud_t *pud;
> @@ -934,8 +935,8 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node)
>  		pud = vmemmap_pud_populate(pgd, addr, node);
>  		if (!pud)
>  			return -ENOMEM;
> -
> -		if (!cpu_has_pse) {
> +retry_pmd:
> +		if (!use_huge) {
>  			next = (addr + PAGE_SIZE) & PAGE_MASK;
>  			pmd = vmemmap_pmd_populate(pud, addr, node);
>  
> @@ -957,8 +958,10 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node)
>  				pte_t entry;
>  
>  				p = vmemmap_alloc_block_buf(PMD_SIZE, node);
> -				if (!p)
> -					return -ENOMEM;
> +				if (!p) {
> +					use_huge = 0;
> +					goto retry_pmd;
> +				}
>  
>  				entry = pfn_pte(__pa(p) >> PAGE_SHIFT,
>  						PAGE_KERNEL_LARGE);
> 

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

