linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: linux-mm@kvack.org
Subject: Re: Guaranteed allocation of huge pages (1G) using movablecore=N doesn't seem to work at all
Date: Mon, 6 Nov 2017 20:31:45 +0100	[thread overview]
Message-ID: <20171106193145.v7s4rjyc3tfvcqkq@dhcp22.suse.cz> (raw)
In-Reply-To: <20171106183237.64b3hj25hbfw7v4l@dhcp22.suse.cz>

On Mon 06-11-17 19:32:37, Michal Hocko wrote:
> On Mon 06-11-17 20:13:36, Maxim Levitsky wrote:
> > Yes, I tested git head from mainline and few kernels from ubuntu repos
> > since I was lazy to compile them too.
> 
> OK, so this hasn't worked realiably as I've suspected.
> 
> > Do you have an idea what can I do about this issue? Do you think its
> > feasable to fix this?
> 
> Well, I think that giga pages need quite some love to be usable
> reliably. The current implementation is more towards "make it work if
> there is enough unused memory".
> 
> > And if not using moveable zone, how would it even be possible to have
> > guaranreed allocation of 1g pages
> 
> Having a guaranteed giga pages is something the kernel is not yet ready
> to offer.  Abusing zone movable might look like the right direction
> but that is not really the case until we make sure those pages are
> migratable.
> 
> There has been a simple patch which makes PUD (1GB) pages migrateable
> http://lkml.kernel.org/r/20170913101047.GA13026@gmail.com but I've had
> concerns that it really didn't consider the migration path much
> http://lkml.kernel.org/r/20171003073301.hydw7jf2wztsx2om%40dhcp22.suse.cz
> I still believe the patch is not complete but maybe it is not that far
> away from being so. E.g. the said pfn_range_valid_gigantic can be
> enhanced to make the migration much more reliable or get rid of it
> altogether because the pfn based allocator already knows how to do
> migration and other stuff.

Here is the first shot on the weird pfn_range_valid_gigantic. This is
completely (even compile) untested. It should just give an idea. I will
think about this some more later. If you have a scratch system that you
are not afraid to play with I would appreciate if you could give it a
try.
---
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5ab12fda8ed5..17ca753560b7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1071,34 +1071,6 @@ static int __alloc_gigantic_page(unsigned long start_pfn,
 				  gfp_mask);
 }
 
-static bool pfn_range_valid_gigantic(struct zone *z,
-			unsigned long start_pfn, unsigned long nr_pages)
-{
-	unsigned long i, end_pfn = start_pfn + nr_pages;
-	struct page *page;
-
-	for (i = start_pfn; i < end_pfn; i++) {
-		if (!pfn_valid(i))
-			return false;
-
-		page = pfn_to_page(i);
-
-		if (page_zone(page) != z)
-			return false;
-
-		if (PageReserved(page))
-			return false;
-
-		if (page_count(page) > 0)
-			return false;
-
-		if (PageHuge(page))
-			return false;
-	}
-
-	return true;
-}
-
 static bool zone_spans_last_pfn(const struct zone *zone,
 			unsigned long start_pfn, unsigned long nr_pages)
 {
@@ -1110,7 +1082,7 @@ static struct page *alloc_gigantic_page(int nid, struct hstate *h)
 {
 	unsigned int order = huge_page_order(h);
 	unsigned long nr_pages = 1 << order;
-	unsigned long ret, pfn, flags;
+	unsigned long ret, pfn;
 	struct zonelist *zonelist;
 	struct zone *zone;
 	struct zoneref *z;
@@ -1119,28 +1091,29 @@ static struct page *alloc_gigantic_page(int nid, struct hstate *h)
 	gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 	zonelist = node_zonelist(nid, gfp_mask);
 	for_each_zone_zonelist_nodemask(zone, z, zonelist, gfp_zone(gfp_mask), NULL) {
-		spin_lock_irqsave(&zone->lock, flags);
 
 		pfn = ALIGN(zone->zone_start_pfn, nr_pages);
 		while (zone_spans_last_pfn(zone, pfn, nr_pages)) {
-			if (pfn_range_valid_gigantic(zone, pfn, nr_pages)) {
-				/*
-				 * We release the zone lock here because
-				 * alloc_contig_range() will also lock the zone
-				 * at some point. If there's an allocation
-				 * spinning on this lock, it may win the race
-				 * and cause alloc_contig_range() to fail...
-				 */
-				spin_unlock_irqrestore(&zone->lock, flags);
-				ret = __alloc_gigantic_page(pfn, nr_pages, gfp_mask);
-				if (!ret)
-					return pfn_to_page(pfn);
-				spin_lock_irqsave(&zone->lock, flags);
+			struct page *page = pfn_to_online_page(pfn);
+
+			/*
+			 * be careful about offline pageblocks and interleaving
+			 * zones
+			 */
+			if (!page || page_zone(page) != zone) {
+				pfn += pageblock_nr_pages;
+				continue;
 			}
+			if (PageReserved(page)) {
+				pfn++;
+				continue;
+			}
+
+			ret = __alloc_gigantic_page(pfn, nr_pages, gfp_mask);
+			if (!ret)
+				return pfn_to_page(pfn);
 			pfn += nr_pages;
 		}
-
-		spin_unlock_irqrestore(&zone->lock, flags);
 	}
 
 	return NULL;
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-11-06 19:31 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-04  9:55 Maxim Levitsky
2017-11-06 13:05 ` Michal Hocko
2017-11-06 17:03   ` Maxim Levitsky
2017-11-06 17:11     ` Michal Hocko
     [not found]       ` <CACAwPwZuiT9BfunVgy73KYjGfVopgcE0dknAxSLPNeJB8rkcMQ@mail.gmail.com>
     [not found]         ` <CACAwPwZqFRyFJhb7pyyrufah+1TfCDuzQMo3qwJuMKkp6aYd_Q@mail.gmail.com>
     [not found]           ` <CACAwPwbA0NpTC9bfV7ySHkxPrbZJVvjH=Be5_c25Q3S8qNay+w@mail.gmail.com>
     [not found]             ` <CACAwPwamD4RL9O8wujK_jCKGu=x0dBBmH9O-9078cUEEk4WsMA@mail.gmail.com>
     [not found]               ` <CACAwPwYKjK5RT-ChQqqUnD7PrtpXg1WhTHGK3q60i6StvDMDRg@mail.gmail.com>
     [not found]                 ` <CACAwPwav-eY4_nt=Z7TQB8WMFg+1X5WY2Gkgxph74X7=Ovfvrw@mail.gmail.com>
     [not found]                   ` <CACAwPwaP05FgxTp=kavwgFZF+LEGO-OSspJ4jH+Y=_uRxiVZaA@mail.gmail.com>
     [not found]                     ` <CACAwPwY5ss_D9kj7XoLVVkQ9=KXDFCnyDzdoxkGxhJZBNFre3w@mail.gmail.com>
2017-11-06 17:36                       ` Maxim Levitsky
2017-11-06 18:04                         ` Michal Hocko
2017-11-06 18:13                           ` Maxim Levitsky
2017-11-06 18:32                             ` Michal Hocko
2017-11-06 19:31                               ` Michal Hocko [this message]
2017-11-06 22:32                                 ` Maxim Levitsky
2017-11-07  8:20                               ` Vlastimil Babka
2017-11-07  8:30                                 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171106193145.v7s4rjyc3tfvcqkq@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=maximlevitsky@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox