linux-mm.kvack.org archive mirror
From: Muchun Song <songmuchun@bytedance.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: corbet@lwn.net, mike.kravetz@oracle.com,
	akpm@linux-foundation.org, mcgrof@kernel.org,
	keescook@chromium.org, yzaikin@google.com, david@redhat.com,
	masahiroy@kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	duanxiongchun@bytedance.com, smuchun@gmail.com
Subject: Re: [PATCH v12 7/7] mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl
Date: Tue, 17 May 2022 17:16:11 +0800
Message-ID: <YoNn2+8VG7XxQ20Y@FVFYT0MHHV2J.usts.net>
In-Reply-To: <YoNXm2c5fJq8luqf@localhost.localdomain>

On Tue, May 17, 2022 at 10:06:51AM +0200, Oscar Salvador wrote:
> On Mon, May 16, 2022 at 06:22:11PM +0800, Muchun Song wrote:
> > We must add hugetlb_free_vmemmap=on (or "off") to the boot cmdline and
> > reboot the server to enable or disable the feature of optimizing vmemmap
> > pages associated with HugeTLB pages.  However, rebooting usually takes a
> > long time.  So add a sysctl to enable or disable the feature at runtime
> > without rebooting.  Why do we need this?  There are 3 use cases.
> > 
> > 1) The feature of minimizing overhead of struct page associated with each
> > HugeTLB is disabled by default without passing "hugetlb_free_vmemmap=on"
> > to the boot cmdline. When we (ByteDance) deliver the servers to the
> > users who want to enable this feature, they have to configure the grub
> > (change boot cmdline) and reboot the servers, whereas rebooting usually
> > takes a long time (we have thousands of servers).  It's a very bad
> > experience for the users.  So we need an approach to enable this feature
> > after rebooting. This is a use case in our practical environment.
> > 
> > 2) Some use cases are that HugeTLB pages are allocated 'on the fly'
> > instead of being pulled from the HugeTLB pool, those workloads would be
> > affected with this feature enabled.  Those workloads could be identified
> > by the fact that they never explicitly allocate huge pages
> > with 'nr_hugepages' but only set 'nr_overcommit_hugepages' and then let
> > the pages be allocated from the buddy allocator at fault time.  We can
> > confirm it is a real use case from the commit 099730d67417.  For those
> > workloads, the page fault time could be ~2x slower than before. We
> > suspect those users want to disable this feature if the system has enabled
> > this before and they don't think the memory savings benefit is enough to
> > make up for the performance drop.
> > 
> > 3) Suppose a workload which wants vmemmap pages to be optimized and a
> > workload which wants to set 'nr_overcommit_hugepages' and does not want
> > the extra overhead at fault time when the overcommitted pages are
> > allocated from the buddy allocator are deployed on the same server.
> > The user could enable this feature and set 'nr_hugepages' and
> > 'nr_overcommit_hugepages', then disable the feature.  In this case,
> > the overcommitted HugeTLB pages will not encounter the extra overhead
> > at fault time.
> 
> I am having issues parsing point 3), especially the first part.
> IIUC, you are saying we have two different kinds of workloads:
> 
> - one that wants to have hugetlb vmemmap pages optimized
> - one that wants to allocate hugetlb pages at fault time rather than
>   allocating them via /proc/..., but does not want to suffer the
>   overhead of optimizing the vmemmap pages when faulting them

Let me clarify the second workload.  Its defining trait is that it does
not want to suffer the overhead of optimizing the vmemmap pages when
faulting pages in, not that it wants to allocate hugetlb pages at
fault time.  It is different from the one in case 2).  This workload
usually configures 'nr_overcommit_hugepages' as well as 'nr_hugepages'.
If it does not want to suffer the overhead of optimizing the vmemmap
pages when faulting pages (these must be overcommitted pages), then
the user could follow the steps mentioned above.
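
The steps from use case 3) can be sketched roughly as below.  This is
only an illustration, not part of the patch: the /proc/sys/vm/ location
of the new sysctl and the "1"/"0" values written to it are assumptions,
and the helper names (write_sysctl, configure) are hypothetical.  Only
the knob names come from this thread.  The root directory is a
parameter so the sequence can be tried against a scratch directory
without touching a real system.

```python
import tempfile
from pathlib import Path

# Assumed sysctl directory, relative to a root; only the knob names
# below are taken from the thread.
SYSCTL_DIR = Path("proc/sys/vm")
OPTIMIZE = "hugetlb_optimize_vmemmap"  # assumed to accept "1" / "0"
NR_HUGEPAGES = "nr_hugepages"
NR_OVERCOMMIT = "nr_overcommit_hugepages"


def write_sysctl(root: Path, name: str, value: int) -> None:
    """Write one value to <root>/proc/sys/vm/<name>."""
    (root / SYSCTL_DIR / name).write_text(f"{value}\n")


def configure(root: Path, pool_pages: int, overcommit_pages: int) -> list:
    """Enable the feature, size the pool and overcommit limit, then disable it.

    Returns the (name, value) steps in the order they were written.
    """
    steps = [
        (OPTIMIZE, 1),                      # enable before filling the pool
        (NR_HUGEPAGES, pool_pages),         # pool pages allocated while enabled
        (NR_OVERCOMMIT, overcommit_pages),  # limit for fault-time allocations
        (OPTIMIZE, 0),                      # disable before overcommitted faults
    ]
    for name, value in steps:
        write_sysctl(root, name, value)
    return steps


if __name__ == "__main__":
    # Dry run against a temporary directory instead of the real /proc.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / SYSCTL_DIR).mkdir(parents=True)
        for name, value in configure(root, 512, 128):
            print(f"vm.{name} = {value}")
```

The page counts (512 and 128) are arbitrary example values.  Passing
Path("/") as the root would write to the real files, which needs root
privileges and a kernel carrying this series.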

> 
> Then you say the user could enable the optimization and allocate
> those pages via nr_hugepages, and then disable the feature.
> So, when we fault in those pages, the pages are already in the
> pool, right? And are already optimized.
>

I mean the overcommitted pages (which could be allocated at fault
time), as explained above.

Thanks.

