From: Joao Martins <joao.m.martins@oracle.com>
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org
Subject: [LSF/MM TOPIC] Guest memory without struct page
Date: Fri, 14 Feb 2020 21:32:11 +0000 [thread overview]
Message-ID: <1be38ae3-d51e-2661-d0ab-6ad8baefe804@oracle.com> (raw)
All system RAM is tracked by a metadata structure called 'struct page' which
amounts to 64bytes and represents a certain page granualarity. On x86 (or
systems which PAGE_SIZE is 4K) this data structure represents a total of 1.5%
overhead of total capacity.
For hypervisors -- specially those without vhost/PV-devices, and just VFs --
persistent/volatile memory is largely assigned to userspace without kernel
taking part in any of it's I/O paths, except for VFIO. 1.5% may not seem like
much, but it is still a total of 16G per Tb just for struct page, which is a lot
considering the hypervisor won't need it and instead should be used to create
more guests (=Happy Users).
The RFC patches submitted here [0] approach this through device-dax given the
interface it provides already for VMMs and also given that this is too a source
of overhead for non-volatile memory assigned to guests. Essentially it extends
device-dax to create a PFNMAP vma with special pages (while adding support for
huge special pages). host memory would be limited through some form of mem=X,
efi_fake_mem=Y@X:0x40000 or memmap=Y@X-1+0xefffffff i.e. dedicate Y amount for
guests memory.
Should vhost-{net,scsi,etc} be used, we copy from/to guest memory (which works
today for vhost-net, and easily adjusted for vhost-scsi), or perhaps explore
dynamically creating/freeing struct pages on GUP temporary pinning.
This topic would be to brainstorm the idea/proposal and also discuss
alternatives/pitfalls/limitations/other-usecases(*).
Regards,
Joao
(*) To some extent there might be a similarity to '"Secret" memory userspace
APIs' subitem of this previously submitted topic[1] given that the guest memory
in the described topic isn't part of the direct map.
[0]
https://lore.kernel.org/linux-mm/20200110190313.17144-1-joao.m.martins@oracle.com/
[1] https://lore.kernel.org/linux-mm/20200206165900.GD17499@linux.ibm.com/
reply other threads:[~2020-02-14 21:32 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1be38ae3-d51e-2661-d0ab-6ad8baefe804@oracle.com \
--to=joao.m.martins@oracle.com \
--cc=linux-mm@kvack.org \
--cc=lsf-pc@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox