linux-mm.kvack.org archive mirror
From: Zhenguo Yao <yaozhenguo1@gmail.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Linux Memory Management List <linux-mm@kvack.org>,
	linux-kernel@vger.kernel.org
Subject: Re: Commit 'hugetlbfs: extend the definition of hugepages parameter to support node allocation' breaks old numa less syntax of reserving hugepages on boot.
Date: Mon, 29 Nov 2021 20:26:02 +0800	[thread overview]
Message-ID: <CA+WzARmTSD_S22xHSp2TinobzEXDwZzPU5vv7NX7-SqtUOtA5g@mail.gmail.com> (raw)
In-Reply-To: <ff89f867-0709-d8bb-b6f5-51b2be4cc2dd@oracle.com>

On Mon, Nov 29, 2021 at 12:31 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 11/28/21 03:18, Maxim Levitsky wrote:
> >
> > dmesg prints this:
> >
> > HugeTLB: allocating 64 of page size 1.00 GiB failed.  Only allocated 0 hugepages
> >
> > Huge pages were allocated on kernel command line (1/2 of 128GB system):
> >
> > 'default_hugepagesz=1G hugepagesz=1G hugepages=64'
> >
> > This is 3970X and no real support/need for NUMA, thus only fake NUMA node 0 is present.
> >
> > Reverting the commit helps.
> >
> > New syntax also works ( hugepages=0:64 )
> >
> > I can test any patches for this bug.
>
> Argh!  I think preallocation of gigantic pages on all systems with only
> a single node is broken.  The issue is at the beginning of
> __alloc_bootmem_huge_page:
>
> int __alloc_bootmem_huge_page(struct hstate *h, int nid)
> {
>         struct huge_bootmem_page *m = NULL; /* initialize for clang */
>         int nr_nodes, node;
>
>         if (nid >= nr_online_nodes)
>                 return 0;
>
> Without using the node specific syntax, nid == NUMA_NO_NODE == -1.  For the
> comparison, nid will be converted to an unsigned int to match nr_online_nodes
> so we will immediately return 0 instead of doing the allocations.
>
> Zhenguo Yao,
> Can you verify and perhaps put together a patch?
>
Preallocation of gigantic pages can't work in any environment with the
old syntax, not only on single-node systems.
I think the issue was caused by replacing
nodes_weight(node_states[N_MEMORY]) with nr_online_nodes in the last
version of my patch. Sorry for my carelessness. I didn't notice that
the parameter nid is an int but nr_online_nodes is an unsigned int, so
the check if (nid >= nr_online_nodes) is always true when nid is
NUMA_NO_NODE (-1). I will send a fix as soon as possible.
This is really a low-level mistake ^^
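
For illustration, here is a minimal user-space sketch of that comparison.
It is not kernel code and not the fix itself: nr_online_nodes is a local
stand-in for the kernel global (set to 1 to model a single-node system),
and both helper names are hypothetical. The guarded variant is only one
possible way to avoid the conversion, shown as an assumption.

```c
#define NUMA_NO_NODE (-1)

/* Stand-in for the kernel global; 1 models a single-node system. */
static unsigned int nr_online_nodes = 1;

/*
 * Mirrors the broken check. Because one operand is unsigned int, nid is
 * converted to unsigned int before the comparison, so -1 becomes UINT_MAX
 * and the test is true for NUMA_NO_NODE.
 */
static int rejected_by_broken_check(int nid)
{
	return nid >= nr_online_nodes;
}

/*
 * Hypothetical guarded variant (an assumption, not the actual patch):
 * let NUMA_NO_NODE through and compare the remaining values as signed.
 */
static int rejected_by_guarded_check(int nid)
{
	return nid != NUMA_NO_NODE && nid >= (int)nr_online_nodes;
}
```

With these definitions, rejected_by_broken_check(NUMA_NO_NODE) is true,
which corresponds to the early "return 0" seen with hugepages=64, while
the guarded variant still rejects an out-of-range node such as nid=4.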
> >
> > Also unrelated, is there any progress on allocating 1GB pages on demand so that I could
> > allocate them only when I run a VM?
>
> That should be possible.  Such support was added back in 2014 with commit
> 944d9fec8d7a "hugetlb: add support for gigantic page allocation at runtime".
>
> >
> > I don't mind having these pages marked as usable by userspace only,
> > since as far as I remember it's the kernel usage that makes some pages unmovable.
> >
>
> Of course, finding 1GB of contiguous space for a gigantic page is often
> difficult at runtime.  So, allocations are likely to fail the longer the
> system is up and running and fragmentation increases.
>
> > Last time (many years ago) I tried to create a zone with only userspace pages
> > (I don't remember what options I used) but it didn't work.
>
> Not too long ago, support was added to use CMA for gigantic page allocation.
> See commit cf11e85fc08c "mm: hugetlb: optionally allocate gigantic hugepages
> using cma".  This sounds like something you might want to try.
> --
> Mike Kravetz
>
> >
> > Is there a way to debug what is causing unmovable pages and keeps
> > /proc/sys/vm/nr_hugepages from working? (I tried it today and, as usual,
> > the number it can allocate steadily decreases over time.)
>
>
>
>

