From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com,
rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org,
ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com,
ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org,
akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr,
mmaurer@google.com, roman.gushchin@linux.dev,
chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
dan.j.williams@intel.com, david@redhat.com,
joel.granados@kernel.org, rostedt@goodmis.org,
anna.schumaker@oracle.com, song@kernel.org,
zhangguopeng@kylinos.cn, linux@weissschuh.net,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-mm@kvack.org, gregkh@linuxfoundation.org,
tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
rafael@kernel.org, dakr@kernel.org,
bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
myungjoo.ham@samsung.com, yesanishhere@gmail.com,
Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
aleksander.lobakin@intel.com, ira.weiny@intel.com,
andriy.shevchenko@linux.intel.com, leon@kernel.org,
lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org,
djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de,
lennart@poettering.net, brauner@kernel.org,
linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com,
parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com,
hughd@google.com, skhawaja@google.com, chrisl@kernel.org
Subject: [PATCH v5 18/22] docs: add documentation for memfd preservation via LUO
Date: Fri, 7 Nov 2025 16:03:16 -0500
Message-ID: <20251107210526.257742-19-pasha.tatashin@soleen.com>
In-Reply-To: <20251107210526.257742-1-pasha.tatashin@soleen.com>
From: Pratyush Yadav <ptyadav@amazon.de>
Add the documentation under the "Preserving file descriptors" section of
LUO's documentation. The doc describes the properties preserved,
behaviour of the file under different LUO states, serialization format,
and current limitations.
Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
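To make the folio descriptor layout documented in this patch concrete, here
is a minimal C sketch of packing and unpacking one 16-byte entry. The struct
and helper names are hypothetical and used only for illustration. The sketch
also assumes that the PFN is stored shifted left by 12 bits so that it
occupies bits 63:12; only the bit ranges and the two flag names come from the
documentation itself.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits as documented; the mask/shift names below are illustrative. */
#define PRESERVED_FLAG_DIRTY     (UINT64_C(1) << 0)
#define PRESERVED_FLAG_UPTODATE  (UINT64_C(1) << 1)
#define PRESERVED_FLAGS_BITS     12
#define PRESERVED_FLAGS_MASK     ((UINT64_C(1) << PRESERVED_FLAGS_BITS) - 1)

/* Hypothetical in-memory view of one serialized folio descriptor. */
struct folio_desc {
	uint64_t pfn_flags;	/* PFN in bits 63:12, flags in bits 11:0 */
	uint64_t index;		/* folio index within the file */
};

/* The documentation states 8 + 8 bytes per descriptor. */
static_assert(sizeof(struct folio_desc) == 16, "descriptor must be 16 bytes");

/* Assumption: the PFN is shifted into the upper 52 bits of the first word. */
static inline struct folio_desc folio_desc_pack(uint64_t pfn, uint64_t flags,
						uint64_t index)
{
	struct folio_desc d = {
		.pfn_flags = (pfn << PRESERVED_FLAGS_BITS) |
			     (flags & PRESERVED_FLAGS_MASK),
		.index = index,
	};
	return d;
}

static inline uint64_t folio_desc_pfn(const struct folio_desc *d)
{
	return d->pfn_flags >> PRESERVED_FLAGS_BITS;
}

static inline uint64_t folio_desc_flags(const struct folio_desc *d)
{
	return d->pfn_flags & PRESERVED_FLAGS_MASK;
}
```

With 4K pages, the first word under this assumption is the folio's physical
address with the flags folded into the otherwise unused low 12 bits.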
Documentation/core-api/liveupdate.rst | 7 ++
Documentation/mm/index.rst | 1 +
Documentation/mm/memfd_preservation.rst | 138 ++++++++++++++++++++++++
MAINTAINERS | 1 +
4 files changed, 147 insertions(+)
create mode 100644 Documentation/mm/memfd_preservation.rst
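Since ``FD_CLOEXEC`` and file seals are documented below as not preserved,
userspace has to re-apply them after restore. A minimal sketch of how that
might look, assuming the application has already retrieved the memfd from LUO
(retrieval is not shown) and remembers which seals it used before the kexec.
The helper name is hypothetical; the ``fcntl()`` commands are the standard
Linux ones.

```c
#define _GNU_SOURCE
#include <fcntl.h>

/* Fallback for older libc headers: F_LINUX_SPECIFIC_BASE (1024) + 9. */
#ifndef F_ADD_SEALS
#define F_ADD_SEALS 1033
#endif

/*
 * Hypothetical helper: re-apply state that is reset by the live update.
 *   fd      - restored memfd
 *   cloexec - non-zero to set FD_CLOEXEC again
 *   seals   - F_SEAL_* mask to add again, or 0 for none
 * Returns 0 on success, -1 with errno set by fcntl() on failure.
 */
static int memfd_reapply_state(int fd, int cloexec, unsigned int seals)
{
	if (cloexec) {
		int flags = fcntl(fd, F_GETFD);

		if (flags < 0)
			return -1;
		if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0)
			return -1;
	}

	/* Seals can only be added, never removed, so this is idempotent. */
	if (seals && fcntl(fd, F_ADD_SEALS, seals) < 0)
		return -1;

	return 0;
}
```

A caller would typically invoke this once per restored memfd before handing
the descriptor to the rest of the application.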
diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api/liveupdate.rst
index deacc098d024..384de79a2457 100644
--- a/Documentation/core-api/liveupdate.rst
+++ b/Documentation/core-api/liveupdate.rst
@@ -28,6 +28,13 @@ Live Update Orchestrator ABI
.. kernel-doc:: include/linux/liveupdate/abi/luo.h
:doc: Live Update Orchestrator ABI
+The following types of file descriptors can be preserved:
+
+.. toctree::
+ :maxdepth: 1
+
+ ../mm/memfd_preservation
+
Public API
==========
.. kernel-doc:: include/linux/liveupdate.h
diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst
index ba6a8872849b..7aa2a8886908 100644
--- a/Documentation/mm/index.rst
+++ b/Documentation/mm/index.rst
@@ -48,6 +48,7 @@ documentation, or deleted if it has served its purpose.
hugetlbfs_reserv
ksm
memory-model
+ memfd_preservation
mmu_notifier
multigen_lru
numa
diff --git a/Documentation/mm/memfd_preservation.rst b/Documentation/mm/memfd_preservation.rst
new file mode 100644
index 000000000000..3fc612e1288c
--- /dev/null
+++ b/Documentation/mm/memfd_preservation.rst
@@ -0,0 +1,138 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+==========================
+Memfd Preservation via LUO
+==========================
+
+Overview
+========
+
+Memory file descriptors (memfd) can be preserved across a kexec using Live
+Update Orchestrator (LUO) file preservation. This allows userspace to transfer
+its memory contents to the next kernel after a kexec.
+
+The preservation is not intended to be transparent. Only select properties of
+the file are preserved. All others are reset to default. The preserved
+properties are described below.
+
+.. note::
+ The LUO API is not stabilized yet, so the preserved properties of a memfd are
+ also not stable and are subject to backwards incompatible changes.
+
+.. note::
+ Memfds backed by HugeTLB are currently not supported. Memfds created with
+ ``MFD_HUGETLB`` are rejected.
+
+Preserved Properties
+====================
+
+The following properties of the memfd are preserved across kexec:
+
+File Contents
+ All data stored in the file is preserved.
+
+File Size
+ The size of the file is preserved. Holes in the file are filled by allocating
+ pages for them during preservation.
+
+File Position
+ The current file position is preserved, allowing applications to continue
+ reading/writing from their last position.
+
+File Status Flags
+ memfds are always opened with ``O_RDWR`` and ``O_LARGEFILE``. This property is
+ maintained.
+
+Non-Preserved Properties
+========================
+
+All properties that are not preserved must be assumed to be reset to default.
+This section describes the non-preserved properties that are most noteworthy.
+
+``FD_CLOEXEC`` flag
+ A memfd can be created with the ``MFD_CLOEXEC`` flag, which sets
+ ``FD_CLOEXEC`` on the file descriptor. This flag is not preserved and must be
+ set again after restore via ``fcntl()``.
+
+Seals
+ File seals are not preserved. The file is unsealed on restore and if needed,
+ must be sealed again via ``fcntl()``.
+
+Behavior with LUO states
+========================
+
+This section describes the behavior of the memfd in the different LUO states.
+
+Normal Phase
+ During the normal phase, the memfd can be marked for preservation using the
+ ``LIVEUPDATE_SESSION_PRESERVE_FD`` ioctl. The memfd acts as a regular memfd
+ during this phase with no additional restrictions.
+
+Prepared Phase
+ After LUO enters ``LIVEUPDATE_STATE_PREPARED``, the memfd is serialized and
+ prepared for the next kernel. During this phase, the following happens:
+
+ - All the folios are pinned. If some folios reside in ``ZONE_MIGRATE``, they
+   are migrated out. This ensures none of the preserved folios land in the KHO
+   scratch area.
+ - Pages in swap are swapped in. Currently, there is no way to pass pages in
+   swap over KHO, so all swapped out pages are swapped back in and pinned.
+ - The memfd goes into "frozen mapping" mode. The file can no longer grow or
+   shrink, and holes can no longer be punched. This ensures the serialized
+   mappings stay in sync. The file can still be read, written, or mmap-ed.
+
+Freeze Phase
+ The current file position in the serialized data is updated to capture any
+ changes that occurred between the prepare and freeze phases. After this,
+ access to the FD is no longer allowed.
+
+Restoration Phase
+ After being restored, the memfd is functional as normal with the properties
+ listed above restored.
+
+Cancellation
+ If the live update is cancelled after entering the prepared phase, the memfd
+ functions as it does in the normal phase.
+
+Serialization format
+====================
+
+The state is serialized in an FDT with the following structure::
+
+  /dts-v1/;
+
+  / {
+      compatible = "memfd-v1";
+      pos = <current_file_position>;
+      size = <file_size_in_bytes>;
+      folios = <array_of_preserved_folio_descriptors>;
+  };
+
+Each folio descriptor contains:
+
+- PFN + flags (8 bytes)
+
+  - Physical frame number (PFN) of the preserved folio (bits 63:12).
+  - Folio flags (bits 11:0):
+
+    - ``PRESERVED_FLAG_DIRTY`` (bit 0)
+    - ``PRESERVED_FLAG_UPTODATE`` (bit 1)
+
+- Folio index within the file (8 bytes).
+
+Limitations
+===========
+
+The current implementation has the following limitations:
+
+Size
+ Currently the size of the file is limited by the size of the FDT. The FDT can
+ be at most of order ``MAX_PAGE_ORDER``. By default this is 4 MiB with 4K
+ pages. Each page in the file is tracked using 16 bytes. This limits the
+ maximum size of the file to 1 GiB.
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator </core-api/liveupdate>`
+- :doc:`/core-api/kho/concepts`
diff --git a/MAINTAINERS b/MAINTAINERS
index 3497354b7fbb..3ece47c552a8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14518,6 +14518,7 @@ R: Pratyush Yadav <pratyush@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
F: Documentation/core-api/liveupdate.rst
+F: Documentation/mm/memfd_preservation.rst
F: Documentation/userspace-api/liveupdate.rst
F: include/linux/liveupdate.h
F: include/linux/liveupdate/
--
2.51.2.1041.gc1ab5b90ca-goog
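The 1 GiB figure in the Limitations section of the new document follows
directly from the numbers given there. A small C sketch of the arithmetic,
assuming the whole FDT is available for folio descriptors (the FDT header and
the other properties take some space, so this is an upper bound). The
function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes used to track one page: 8 (PFN + flags) + 8 (index), as documented. */
#define DESC_BYTES 16u

static_assert(DESC_BYTES == 8 + 8, "descriptor is two 8-byte fields");

/*
 * Upper bound on the preservable file size, in bytes.
 *   max_page_order - largest allocation order available for the FDT
 *   page_size      - base page size in bytes
 */
static inline uint64_t memfd_max_bytes(unsigned int max_page_order,
				       uint64_t page_size)
{
	uint64_t fdt_bytes = page_size << max_page_order; /* largest FDT */
	uint64_t max_pages = fdt_bytes / DESC_BYTES;      /* one entry per page */

	return max_pages * page_size;
}
```

With an order of 10 and 4096-byte pages, the FDT is 4 MiB, which holds 262144
descriptors and therefore tracks 262144 * 4 KiB = 1 GiB of file data.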