From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
To: Baoquan He <bhe@redhat.com>
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	ebiederm@xmission.com, akpm@linux-foundation.org,
	stanislav.kinsburskii@gmail.com, corbet@lwn.net,
	linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	linux-mm@kvack.org, kys@microsoft.com, jgowans@amazon.com,
	wei.liu@kernel.org, arnd@arndb.de, gregkh@linuxfoundation.org,
	graf@amazon.de, pbonzini@redhat.com
Subject: Re: [RFC PATCH v2 0/7] Introduce persistent memory pool
Date: Wed, 27 Sep 2023 09:13:19 -0700
Message-ID: <20230927161319.GA19976@skinsburskii.>
In-Reply-To: <ZRPBRkXrYvbw8+Lt@MiWiFi-R3L-srv>

On Wed, Sep 27, 2023 at 01:44:38PM +0800, Baoquan He wrote:
> Hi Stanislav,
> 
> On 09/25/23 at 02:27pm, Stanislav Kinsburskii wrote:
> > This patch introduces a memory allocator specifically tailored for
> > persistent memory within the kernel. The allocator maintains
> > kernel-specific states like DMA passthrough device states, IOMMU state, and
> > more across kexec.
> 
> Can you give more details about how this persistent memory pool will be
> utilized in an actual scenario? I mean, what problem have you met so that
> you have to introduce persistent memory pool to solve it?
> 

The main reason at the moment is that the Linux root partition running on
top of the Microsoft hypervisor needs to deposit pages to the hypervisor
at runtime, when the hypervisor runs out of memory.
"Depositing" here means that Linux passes a set of its PFNs to the
hypervisor via a hypercall, and the hypervisor then uses these pages for
its own needs.

Once deposited, these pages can't be accessed by Linux anymore and thus
must be preserved in the "used" state across kexec, as the hypervisor is
unaware of kexec. At the same time, these pages can be withdrawn when
unused. Thus, an allocator that is persistent across kexec looks
reasonable for this particular purpose.

Also, the last patch in the series is meant to demonstrate the usage
described above.
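
To make the idea a bit more concrete, here is a minimal userspace sketch
of an allocator whose bookkeeping lives inside the region it manages, so
the "used" bits survive as long as the region itself does. This is not
code from the series: every name in it (pool_hdr, pool_attach,
pool_alloc_page, pool_free_page, pool_page_used, pool_demo, POOL_MAGIC)
is made up for illustration, the magic-number check is a simplification,
and kexec is only simulated by attaching to the same buffer a second
time. The series itself builds on CMA and passes the pool metadata via
the device tree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ    4096u
#define POOL_PAGES 64u          /* hypothetical pool size, in pages */
#define POOL_MAGIC 0x504d504cu  /* hypothetical "already set up" marker */

/* Header kept at the start of the region, next to the pages it tracks. */
struct pool_hdr {
	uint32_t magic;
	uint32_t nr_pages;
	uint8_t  bitmap[POOL_PAGES / 8]; /* one bit per page, set = in use */
};

/* Attach to a region: set it up on first use, otherwise keep its state. */
static struct pool_hdr *pool_attach(void *region)
{
	struct pool_hdr *h = region;

	if (h->magic != POOL_MAGIC) {
		memset(h, 0, sizeof(*h));
		h->magic = POOL_MAGIC;
		h->nr_pages = POOL_PAGES;
		h->bitmap[0] |= 1u; /* page 0 holds this header */
	}
	return h;
}

/* Return the index of a free page and mark it used, or -1 if full. */
static int pool_alloc_page(struct pool_hdr *h)
{
	for (uint32_t i = 0; i < h->nr_pages; i++) {
		if (!(h->bitmap[i / 8] & (1u << (i % 8)))) {
			h->bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
			return (int)i;
		}
	}
	return -1;
}

/* Mark a page free again (the "withdraw" case). */
static void pool_free_page(struct pool_hdr *h, uint32_t i)
{
	h->bitmap[i / 8] &= (uint8_t)~(1u << (i % 8));
}

static int pool_page_used(const struct pool_hdr *h, uint32_t i)
{
	return (h->bitmap[i / 8] >> (i % 8)) & 1;
}

/* Returns 0 if the allocation state survives a simulated kexec. */
int pool_demo(void)
{
	void *region = calloc(POOL_PAGES, PAGE_SZ);
	if (!region)
		return -1;

	struct pool_hdr *h = pool_attach(region);
	int a = pool_alloc_page(h);      /* stays "deposited" */
	int b = pool_alloc_page(h);
	assert(a > 0 && b > 0);
	pool_free_page(h, (uint32_t)b);  /* "withdrawn" again */

	/* Simulated kexec: the next kernel only knows the region address. */
	struct pool_hdr *h2 = pool_attach(region);
	int ok = pool_page_used(h2, (uint32_t)a) &&
		 !pool_page_used(h2, (uint32_t)b) &&
		 pool_alloc_page(h2) == b;

	free(region);
	return ok ? 0 : 1;
}
```

In this sketch pool_demo() returns 0: after the second attach, page "a"
is still marked used and is never handed out again, while the withdrawn
page "b" is the next one returned by the allocator.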

Thanks,
Stanislav

> Thanks
> Baoquan
> 
> > 
> > The current implementation provides a foundation for custom solutions that
> > may be developed in the future. Although the design is kept concise and
> > straightforward to encourage discussion and feedback, it remains fully
> > functional.
> > 
> > The persistent memory pool builds upon the contiguous memory allocator
> > (CMA) and ensures CMA state persistency across kexec by incorporating the
> > CMA bitmap into the memory region instead of allocating it from kernel
> > memory.
> > 
> > Persistent memory pool metadata is passed across kexec by using Flattened
> > Device Tree, which is added as another kexec segment for x86 architecture.
> > 
> > Potential applications include:
> > 
> >   1. Enabling various in-kernel entities to allocate persistent pages from
> >      a unified memory pool, obviating the need for reserving multiple
> >      regions.
> > 
> >   2. For in-kernel components that need the allocation address to be
> >      retained on kernel kexec, this address can be exposed to user space
> >      and subsequently passed through the command line.
> > 
> >   3. Distinct subsystems or drivers can set aside their region, allocating
> >      a segment for their persistent memory pool, suitable for uses such as
> >      file systems, key-value stores, and other applications.
> > 
> > Notes:
> > 
> >   1. The last patch of the series represents a use case for the feature.
> >      However, the patch won't compile and is for illustrative purposes only
> >      as the code being patched hasn't been merged yet.
> > 
> >   2. The code being patched is currently under review by the community. The
> >      series is named "Introduce /dev/mshv drivers":
> > 
> >          https://lkml.org/lkml/2023/9/22/1117
> > 
> > 
> > Changes since v1:
> > 
> >   1. Persistent memory pool is now a wrapper on top of CMA instead of being a
> >      new allocator.
> > 
> >   2. Persistent memory pool metadata doesn't belong to the pool anymore and
> >      is now passed to the new kernel across kexec via Flattened Device
> >      Tree instead.
> > 
> > The following series implements...
> > 
> > ---
> > 
> > Stanislav Kinsburskii (7):
> >       kexec_file: Add fdt modification callback support
> >       x86: kexec: Transfer existing fdt to the new kernel
> >       x86: kexec: Enable fdt modification in callbacks
> >       pmpool: Introduce persistent memory pool
> >       pmpool: Update device tree on kexec
> >       pmpool: Restore state from device tree post-kexec
> >       Drivers: hv: Allocate persistent pages for root partition
> > 
> > 
> >  arch/x86/Kconfig                  |   16 +++
> >  arch/x86/kernel/kexec-bzimage64.c |   97 +++++++++++++++++
> >  drivers/hv/hv_common.c            |   13 ++
> >  include/linux/kexec.h             |    7 +
> >  include/linux/pmpool.h            |   22 ++++
> >  kernel/kexec_file.c               |   24 ++++
> >  mm/Kconfig                        |    9 ++
> >  mm/Makefile                       |    1 
> >  mm/pmpool.c                       |  208 +++++++++++++++++++++++++++++++++++++
> >  9 files changed, 394 insertions(+), 3 deletions(-)
> >  create mode 100644 include/linux/pmpool.h
> >  create mode 100644 mm/pmpool.c

