From: David Hildenbrand <david@redhat.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Andrew Morton <akpm@linux-foundation.org>,
Oscar Salvador <osalvador@suse.de>,
Michal Hocko <mhocko@suse.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Matthew Wilcox <willy@infradead.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Muchun Song <songmuchun@bytedance.com>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Jonathan Corbet <corbet@lwn.net>,
Stephen Rothwell <sfr@canb.auug.org.au>,
linux-doc@vger.kernel.org
Subject: Re: [PATCH v1] memory-hotplug.rst: complete admin-guide overhaul
Date: Tue, 8 Jun 2021 15:04:19 +0200
Message-ID: <5e01bd6f-4073-1ebb-489d-2e5c529909a2@redhat.com>
In-Reply-To: <YL4Ek6AqMUyiDrxY@kernel.org>
>> +ZONE_MOVABLE
>> +============
>> +
>> +ZONE_MOVABLE is an important mechanism for more reliable memory offlining.
>> +Further, having system RAM managed by ZONE_MOVABLE instead of one of the
>> +kernel zones can increase the number of possible transparent huge pages and
>> +dynamically allocated huge pages.
>> +
>
> I'd move the first two paragraphs from "Zone Imbalances" here to provide
> some context on what movable and unmovable allocations are.
Makes sense.
[...]
>> -How to offline memory
>> ----------------------
>> +Considerations
>
> ZONE_MOVABLE Sizing Considerations ?
>
Ack
> I'd also move the contents of "Boot Memory and ZONE_MOVABLE" here (with
> some adjustments):
>
> By default, all the memory configured at boot time is managed by the kernel
> zones and ZONE_MOVABLE is not used.
>
> To enable ZONE_MOVABLE to include the memory present at boot and to
> control the ratio between movable and kernel zones there are two command
> line options: ``kernelcore=`` and ``movablecore=``. See
> Documentation/admin-guide/kernel-parameters.rst for their description.
>
Makes sense. I'll move it to the end of the "ZONE_MOVABLE Sizing
Considerations" section.
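
To check how a given ``kernelcore=`` or ``movablecore=`` setting actually
split memory between the zones, one option is to sum the "managed" page
counts per zone from /proc/zoneinfo. The following is only a minimal sketch:
the helper names and the sample text are made up for illustration, and it
assumes the usual "Node N, zone NAME" header lines followed by an indented
"managed" field.

```python
import re
from collections import defaultdict

# Assumed layout of /proc/zoneinfo: a "Node N, zone NAME" header per zone,
# followed by indented fields, one of which is "managed <pages>".
ZONE_RE = re.compile(r"^Node\s+\d+,\s+zone\s+(\S+)")
MANAGED_RE = re.compile(r"^\s+managed\s+(\d+)")


def managed_pages_per_zone(zoneinfo_text):
    """Sum the managed pages of each zone type across all nodes."""
    totals = defaultdict(int)
    zone = None
    for line in zoneinfo_text.splitlines():
        header = ZONE_RE.match(line)
        if header:
            zone = header.group(1)
            continue
        managed = MANAGED_RE.match(line)
        if managed and zone is not None:
            totals[zone] += int(managed.group(1))
    return dict(totals)


def movable_fraction(totals):
    """Fraction of all managed pages that sit in ZONE_MOVABLE."""
    total = sum(totals.values())
    return totals.get("Movable", 0) / total if total else 0.0


# Hypothetical, heavily shortened sample; real output has many more fields.
SAMPLE = """\
Node 0, zone   Normal
  pages free     1000
        managed  3000
Node 0, zone  Movable
  pages free     500
        managed  1000
"""

totals = managed_pages_per_zone(SAMPLE)
print(totals, movable_fraction(totals))
```

On a live system the same helpers can be fed the contents of /proc/zoneinfo
instead of the sample string.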
>> +--------------
>>
>> -You can offline a memory block by using the same sysfs interface that was used
>> -in memory onlining::
>> +We usually expect that a large portion of available system RAM will actually
>> +be consumed by user space, either directly or indirectly via the page cache. In
>> +the normal case, ZONE_MOVABLE can be used when allocating such pages just fine.
>>
>> - % echo offline > /sys/devices/system/memory/memoryXXX/state
>> +With that in mind, it makes sense that we can have a big portion of system RAM
>> +managed by ZONE_MOVABLE. However, there are some things to consider when
>> +using ZONE_MOVABLE, especially when fine-tuning zone ratios:
>>
>> -If offline succeeds, the state of the memory block is changed to be "offline".
>> -If it fails, some error core (like -EBUSY) will be returned by the kernel.
>> -Even if a memory block does not belong to ZONE_MOVABLE, you can try to offline
>> -it. If it doesn't contain 'unmovable' memory, you'll get success.
>> +- Having a lot of offline memory blocks. Even offline memory blocks consume
>> + memory for metadata and page tables in the direct map; having a lot of
>> + offline memory blocks is not a typical case, though.
>> +
>> +- Memory ballooning. Some memory ballooning implementations, such as
>> + the Hyper-V balloon, the XEN balloon, the vbox balloon and the VMWare
>
> So, everyone except virtio-mem? ;-)
Well, virtio-mem does not qualify as a memory balloon in that sense, as
it only operates on its own device memory ;)
virtio-balloon and pseries CMM support balloon compaction.
> I'd drop the names because if some of those implement balloon
> compaction, they will surely forget to update the docs.
I can do the opposite and mention the ones that already do. Some most
probably will never support it.
"Memory ballooning without balloon compaction is incompatible with
ZONE_MOVABLE. Only some implementations, such as virtio-balloon and
pseries CMM, fully support balloon compaction."
>
>> + balloon with huge pages don't support balloon compaction and, thereby
>> + ZONE_MOVABLE.
>> +
>> + Further, CONFIG_BALLOON_COMPACTION might be disabled. In that case, balloon
>> + inflation will only perform unmovable allocations and silently create a
>> + zone imbalance, usually triggered by inflation requests from the
>> + hypervisor.
>> +
>> +- Gigantic pages are unmovable, resulting in user space consuming a
>> + lot of unmovable memory.
>> +
>> +- Huge pages are unmovable when an architecture does not support huge
>> + page migration, resulting in a similar issue as with gigantic pages.
>> +
>> +- Page tables are unmovable. Excessive swapping, mapping extremely large
>> + files or ZONE_DEVICE memory can be problematic, although only
>> + really relevant in corner cases. When we manage a lot of user space memory
>> + that has been swapped out or is served from a file/pmem/... we still need
>
> ^ persistent memory
Agreed.
>
>> + a lot of page tables to manage that memory after user space has accessed
>> + it once.
>> +
>> +- DAX: when we have a lot of ZONE_DEVICE memory added to the system as DAX
>> + and we are not using an altmap to allocate the memmap from device memory
>> + directly, we will have to allocate the memmap for this memory from the
>> + kernel zones.
>
> I'm not sure an admin-guide reader will know when we use an altmap and when we don't.
> Maybe
>
> DAX: in certain DAX configurations the memory map for the device memory will
> be allocated from the kernel zones.
Indeed, simpler and communicates the same message.
I'll also add
"KASAN can have a significant memory overhead, for example, consuming
1/8th of the total system memory size as (unmovable) tracking metadata."
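
To give a feeling for the magnitudes behind the DAX and KASAN items, here is
a back-of-the-envelope sketch. The 1/8 ratio is the one from the sentence
above. The memmap numbers are assumptions for illustration only: 4 KiB base
pages and 64 bytes of memmap per page, which is typical on x86-64 but not
guaranteed. The function names are made up.

```python
GIB = 1 << 30


def kasan_shadow_bytes(total_ram_bytes):
    """Unmovable KASAN tracking metadata, using the 1/8 ratio quoted above."""
    return total_ram_bytes // 8


def memmap_bytes(device_bytes, page_size=4096, struct_page_size=64):
    """Memmap needed to describe device memory.

    page_size and struct_page_size are assumed values (4 KiB pages,
    64-byte struct page); other configurations will differ.
    """
    pages = device_bytes // page_size
    return pages * struct_page_size


# 64 GiB of system RAM with KASAN enabled.
print(kasan_shadow_bytes(64 * GIB) // GIB, "GiB of KASAN metadata")

# 1 TiB of DAX device memory whose memmap comes from the kernel zones.
print(memmap_bytes(1024 * GIB) // GIB, "GiB of memmap")
```

Under these assumptions, 64 GiB of RAM costs 8 GiB of KASAN metadata and
1 TiB of device memory needs 16 GiB of memmap, all of it unmovable.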
Thanks Mike!
--
Thanks,
David / dhildenb