* [PATCH v2] liveupdate: luo_file: remember retrieve() status
@ 2026-02-16 13:22 Pratyush Yadav
2026-02-16 21:44 ` Andrew Morton
2026-02-17 12:03 ` Mike Rapoport
0 siblings, 2 replies; 4+ messages in thread
From: Pratyush Yadav @ 2026-02-16 13:22 UTC
To: Pasha Tatashin, Mike Rapoport, Pratyush Yadav, Andrew Morton
Cc: linux-kernel, linux-mm
From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
LUO keeps track of successful retrieve attempts on a LUO file. It does
so to avoid multiple retrievals of the same file. Multiple retrievals
cause problems because once the file is retrieved, the serialized data
structures are likely freed and the file is likely in a very different
state from what the code expects.
The retrieved boolean in struct luo_file keeps track of this, and is
passed to the finish callback so it knows what work was already done and
what it has left to do.
All this works well when retrieve succeeds. When it fails,
luo_retrieve_file() returns the error immediately, without recording
that a retrieve was attempted or what its error code was. The
LIVEUPDATE_SESSION_RETRIEVE_FD ioctl returns the error to userspace,
but nothing prevents userspace from trying again.
The retry is problematic for much the same reasons as listed above. The
file is likely in a very different state than what the retrieve logic
normally expects, and it might even have freed some serialization data
structures. Attempting to access them or free them again is going to
break things.
For example, if memfd managed to restore 8 of its 10 folios, but fails
on the 9th, a subsequent retrieve attempt will try to call
kho_restore_folio() on the first folio again, and that will fail with a
warning since it is an invalid operation.
Apart from the retry, finish() also breaks. Since on failure the
retrieved bool in luo_file is never touched, the finish() call on
session close will tell the file handler that retrieve was never
attempted, and it will try to access or free the data structures that
might not exist, much in the same way as the retry attempt.
There is no sane way of attempting the retrieve again. Remember the
error retrieve returned and directly return it on a retry. Also pass
this status code to finish() so it can make the right decision on the
work it needs to do.
This is done by changing the bool to an integer. A value of 0 means
retrieve was never attempted, a positive value means it succeeded, and a
negative value means it failed and the error code is the value.
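The three-way encoding can be modelled in user space roughly as follows. This
is an illustrative sketch, not the kernel implementation: model_file,
model_retrieve(), failing_handler() and ok_handler() are hypothetical names,
and the handler_calls counter exists only to show that the handler is not
called again after a failure. Only the 0 / positive / negative convention is
taken from the patch.

```c
#include <errno.h>

/* Hypothetical stand-in for struct luo_file. */
struct model_file {
	int retrieve_status;	/* 0: not attempted, >0: success, <0: -errno */
	int handler_calls;	/* how many times the handler's retrieve ran */
};

/* Hypothetical handler callback: returns 0 or a negative errno. */
typedef int (*retrieve_fn)(struct model_file *f);

/* Follows the control flow of the patched luo_retrieve_file(). */
static int model_retrieve(struct model_file *f, retrieve_fn retrieve)
{
	int err;

	/* A remembered failure is returned as-is, with no retry. */
	if (f->retrieve_status < 0)
		return f->retrieve_status;

	/* Already retrieved: hand out the same file again. */
	if (f->retrieve_status > 0)
		return 0;

	f->handler_calls++;
	err = retrieve(f);
	if (err) {
		/* Keep the error code for later callers and for finish(). */
		f->retrieve_status = err;
		return err;
	}

	f->retrieve_status = 1;
	return 0;
}

/* Hypothetical handler that always fails. */
static int failing_handler(struct model_file *f)
{
	(void)f;
	return -EIO;
}

/* Hypothetical handler that always succeeds. */
static int ok_handler(struct model_file *f)
{
	(void)f;
	return 0;
}
```

In this model, once failing_handler() has returned -EIO, every later
model_retrieve() call on the same model_file returns -EIO without invoking any
handler, even one that would succeed.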
Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks")
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---
Notes:
Changes in v2:
- s/retrieve_sts/retrieve_status/g
- Update name in liveupdate_file_op_args docstring.
- Re-order retrieve_status checks in luo_retrieve_file().
- Do not explicitly initialize retrieve_status since we kzalloc() both
luo_file and liveupdate_file_op_args.
- Apply commit message fixups suggested by Mike.
include/linux/liveupdate.h | 9 +++++---
kernel/liveupdate/luo_file.c | 41 ++++++++++++++++++++++--------------
mm/memfd_luo.c | 7 +++++-
3 files changed, 37 insertions(+), 20 deletions(-)
diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index fe82a6c3005f..dd11fdc76a5f 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -23,8 +23,11 @@ struct file;
/**
* struct liveupdate_file_op_args - Arguments for file operation callbacks.
* @handler: The file handler being called.
- * @retrieved: The retrieve status for the 'can_finish / finish'
- * operation.
+ * @retrieve_status: The retrieve status for the 'can_finish / finish'
+ * operation. A value of 0 means the retrieve has not been
+ * attempted, a positive value means the retrieve was
+ * successful, and a negative value means the retrieve failed,
+ * and the value is the error code of the call.
* @file: The file object. For retrieve: [OUT] The callback sets
* this to the new file. For other ops: [IN] The caller sets
* this to the file being operated on.
@@ -40,7 +43,7 @@ struct file;
*/
struct liveupdate_file_op_args {
struct liveupdate_file_handler *handler;
- bool retrieved;
+ int retrieve_status;
struct file *file;
u64 serialized_data;
void *private_data;
diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c
index 4c7df52a6507..a64ae611cb30 100644
--- a/kernel/liveupdate/luo_file.c
+++ b/kernel/liveupdate/luo_file.c
@@ -134,9 +134,12 @@ static LIST_HEAD(luo_file_handler_list);
* state that is not preserved. Set by the handler's .preserve()
* callback, and must be freed in the handler's .unpreserve()
* callback.
- * @retrieved: A flag indicating whether a user/kernel in the new kernel has
+ * @retrieve_status: Status code indicating whether a user/kernel in the new kernel has
* successfully called retrieve() on this file. This prevents
- * multiple retrieval attempts.
+ * multiple retrieval attempts. A value of 0 means a retrieve()
+ * has not been attempted, a positive value means the retrieve()
+ * was successful, and a negative value means the retrieve()
+ * failed, and the value is the error code of the call.
* @mutex: A mutex that protects the fields of this specific instance
* (e.g., @retrieved, @file), ensuring that operations like
* retrieving or finishing a file are atomic.
@@ -161,7 +164,7 @@ struct luo_file {
struct file *file;
u64 serialized_data;
void *private_data;
- bool retrieved;
+ int retrieve_status;
struct mutex mutex;
struct list_head list;
u64 token;
@@ -298,7 +301,6 @@ int luo_preserve_file(struct luo_file_set *file_set, u64 token, int fd)
luo_file->file = file;
luo_file->fh = fh;
luo_file->token = token;
- luo_file->retrieved = false;
mutex_init(&luo_file->mutex);
args.handler = fh;
@@ -577,7 +579,12 @@ int luo_retrieve_file(struct luo_file_set *file_set, u64 token,
return -ENOENT;
guard(mutex)(&luo_file->mutex);
- if (luo_file->retrieved) {
+ if (luo_file->retrieve_status < 0) {
+ /* Retrieve was attempted and it failed. Return the error code. */
+ return luo_file->retrieve_status;
+ }
+
+ if (luo_file->retrieve_status > 0) {
/*
* Someone is asking for this file again, so get a reference
* for them.
@@ -590,16 +597,19 @@ int luo_retrieve_file(struct luo_file_set *file_set, u64 token,
args.handler = luo_file->fh;
args.serialized_data = luo_file->serialized_data;
err = luo_file->fh->ops->retrieve(&args);
- if (!err) {
- luo_file->file = args.file;
-
- /* Get reference so we can keep this file in LUO until finish */
- get_file(luo_file->file);
- *filep = luo_file->file;
- luo_file->retrieved = true;
+ if (err) {
+ /* Keep the error code for later use. */
+ luo_file->retrieve_status = err;
+ return err;
}
- return err;
+ luo_file->file = args.file;
+ /* Get reference so we can keep this file in LUO until finish */
+ get_file(luo_file->file);
+ *filep = luo_file->file;
+ luo_file->retrieve_status = 1;
+
+ return 0;
}
static int luo_file_can_finish_one(struct luo_file_set *file_set,
@@ -615,7 +625,7 @@ static int luo_file_can_finish_one(struct luo_file_set *file_set,
args.handler = luo_file->fh;
args.file = luo_file->file;
args.serialized_data = luo_file->serialized_data;
- args.retrieved = luo_file->retrieved;
+ args.retrieve_status = luo_file->retrieve_status;
can_finish = luo_file->fh->ops->can_finish(&args);
}
@@ -632,7 +642,7 @@ static void luo_file_finish_one(struct luo_file_set *file_set,
args.handler = luo_file->fh;
args.file = luo_file->file;
args.serialized_data = luo_file->serialized_data;
- args.retrieved = luo_file->retrieved;
+ args.retrieve_status = luo_file->retrieve_status;
luo_file->fh->ops->finish(&args);
luo_flb_file_finish(luo_file->fh);
@@ -788,7 +798,6 @@ int luo_file_deserialize(struct luo_file_set *file_set,
luo_file->file = NULL;
luo_file->serialized_data = file_ser[i].data;
luo_file->token = file_ser[i].token;
- luo_file->retrieved = false;
mutex_init(&luo_file->mutex);
list_add_tail(&luo_file->list, &file_set->files_list);
}
diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
index a34fccc23b6a..785f26aa58c0 100644
--- a/mm/memfd_luo.c
+++ b/mm/memfd_luo.c
@@ -326,7 +326,12 @@ static void memfd_luo_finish(struct liveupdate_file_op_args *args)
struct memfd_luo_folio_ser *folios_ser;
struct memfd_luo_ser *ser;
- if (args->retrieved)
+ /*
+ * If retrieve was successful, nothing to do. If it failed, retrieve()
+ * already cleaned up everything it could. So nothing to do there
+ * either. Only need to clean up when retrieve was not called.
+ */
+ if (args->retrieve_status)
return;
ser = phys_to_virt(args->serialized_data);
base-commit: 6c8dd4f02805de481c200636e567a871f25399a2
--
2.53.0.335.g19a08e0c02-goog
* Re: [PATCH v2] liveupdate: luo_file: remember retrieve() status
2026-02-16 13:22 [PATCH v2] liveupdate: luo_file: remember retrieve() status Pratyush Yadav
@ 2026-02-16 21:44 ` Andrew Morton
2026-02-17 10:38 ` Pratyush Yadav
2026-02-17 12:03 ` Mike Rapoport
1 sibling, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2026-02-16 21:44 UTC
To: Pratyush Yadav; +Cc: Pasha Tatashin, Mike Rapoport, linux-kernel, linux-mm
On Mon, 16 Feb 2026 14:22:19 +0100 Pratyush Yadav <pratyush@kernel.org> wrote:
> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
>
> LUO keeps track of successful retrieve attempts on a LUO file. It does
> so to avoid multiple retrievals of the same file. Multiple retrievals
> cause problems because once the file is retrieved, the serialized data
> structures are likely freed and the file is likely in a very different
> state from what the code expects.
>
> The retrieve boolean in struct luo_file keeps track of this, and is
> passed to the finish callback so it knows what work was already done and
> what it has left to do.
>
> All this works well when retrieve succeeds. When it fails,
> luo_retrieve_file() returns the error immediately, without ever storing
> anywhere that a retrieve was attempted or what its error code was. This
> results in an errored LIVEUPDATE_SESSION_RETRIEVE_FD ioctl to userspace,
> but nothing prevents it from trying this again.
>
> The retry is problematic for much of the same reasons listed above. The
> file is likely in a very different state than what the retrieve logic
> normally expects, and it might even have freed some serialization data
> structures. Attempting to access them or free them again is going to
> break things.
>
> For example, if memfd managed to restore 8 of its 10 folios, but fails
> on the 9th, a subsequent retrieve attempt will try to call
> kho_restore_folio() on the first folio again, and that will fail with a
> warning since it is an invalid operation.
>
> Apart from the retry, finish() also breaks. Since on failure the
> retrieved bool in luo_file is never touched, the finish() call on
> session close will tell the file handler that retrieve was never
> attempted, and it will try to access or free the data structures that
> might not exist, much in the same way as the retry attempt.
>
> There is no sane way of attempting the retrieve again. Remember the
> error retrieve returned and directly return it on a retry. Also pass
> this status code to finish() so it can make the right decision on the
> work it needs to do.
>
> This is done by changing the bool to an integer. A value of 0 means
> retrieve was never attempted, a positive value means it succeeded, and a
> negative value means it failed and the error code is the value.
>
> Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks")
Should we backport this into 6.19.1?
* Re: [PATCH v2] liveupdate: luo_file: remember retrieve() status
2026-02-16 21:44 ` Andrew Morton
@ 2026-02-17 10:38 ` Pratyush Yadav
0 siblings, 0 replies; 4+ messages in thread
From: Pratyush Yadav @ 2026-02-17 10:38 UTC
To: Andrew Morton
Cc: Pratyush Yadav, Pasha Tatashin, Mike Rapoport, linux-kernel, linux-mm
On Mon, Feb 16 2026, Andrew Morton wrote:
> On Mon, 16 Feb 2026 14:22:19 +0100 Pratyush Yadav <pratyush@kernel.org> wrote:
>
>> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
>>
>> LUO keeps track of successful retrieve attempts on a LUO file. It does
>> so to avoid multiple retrievals of the same file. Multiple retrievals
>> cause problems because once the file is retrieved, the serialized data
>> structures are likely freed and the file is likely in a very different
>> state from what the code expects.
>>
>> The retrieve boolean in struct luo_file keeps track of this, and is
>> passed to the finish callback so it knows what work was already done and
>> what it has left to do.
>>
>> All this works well when retrieve succeeds. When it fails,
>> luo_retrieve_file() returns the error immediately, without ever storing
>> anywhere that a retrieve was attempted or what its error code was. This
>> results in an errored LIVEUPDATE_SESSION_RETRIEVE_FD ioctl to userspace,
>> but nothing prevents it from trying this again.
>>
>> The retry is problematic for much of the same reasons listed above. The
>> file is likely in a very different state than what the retrieve logic
>> normally expects, and it might even have freed some serialization data
>> structures. Attempting to access them or free them again is going to
>> break things.
>>
>> For example, if memfd managed to restore 8 of its 10 folios, but fails
>> on the 9th, a subsequent retrieve attempt will try to call
>> kho_restore_folio() on the first folio again, and that will fail with a
>> warning since it is an invalid operation.
>>
>> Apart from the retry, finish() also breaks. Since on failure the
>> retrieved bool in luo_file is never touched, the finish() call on
>> session close will tell the file handler that retrieve was never
>> attempted, and it will try to access or free the data structures that
>> might not exist, much in the same way as the retry attempt.
>>
>> There is no sane way of attempting the retrieve again. Remember the
>> error retrieve returned and directly return it on a retry. Also pass
>> this status code to finish() so it can make the right decision on the
>> work it needs to do.
>>
>> This is done by changing the bool to an integer. A value of 0 means
>> retrieve was never attempted, a positive value means it succeeded, and a
>> negative value means it failed and the error code is the value.
>>
>> Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks")
>
> Should we backport this into 6.19.1?
Yes.
I keep forgetting that a Fixes tag alone isn't enough for stable
backports and I should add Cc: stable@vger.kernel.org too.
Please add it to the patch.
--
Regards,
Pratyush Yadav
* Re: [PATCH v2] liveupdate: luo_file: remember retrieve() status
2026-02-16 13:22 [PATCH v2] liveupdate: luo_file: remember retrieve() status Pratyush Yadav
2026-02-16 21:44 ` Andrew Morton
@ 2026-02-17 12:03 ` Mike Rapoport
1 sibling, 0 replies; 4+ messages in thread
From: Mike Rapoport @ 2026-02-17 12:03 UTC
To: Pratyush Yadav; +Cc: Pasha Tatashin, Andrew Morton, linux-kernel, linux-mm
On Mon, Feb 16, 2026 at 02:22:19PM +0100, Pratyush Yadav wrote:
> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
>
> LUO keeps track of successful retrieve attempts on a LUO file. It does
> so to avoid multiple retrievals of the same file. Multiple retrievals
> cause problems because once the file is retrieved, the serialized data
> structures are likely freed and the file is likely in a very different
> state from what the code expects.
>
> The retrieve boolean in struct luo_file keeps track of this, and is
> passed to the finish callback so it knows what work was already done and
> what it has left to do.
>
> All this works well when retrieve succeeds. When it fails,
> luo_retrieve_file() returns the error immediately, without ever storing
> anywhere that a retrieve was attempted or what its error code was. This
> results in an errored LIVEUPDATE_SESSION_RETRIEVE_FD ioctl to userspace,
> but nothing prevents it from trying this again.
>
> The retry is problematic for much of the same reasons listed above. The
> file is likely in a very different state than what the retrieve logic
> normally expects, and it might even have freed some serialization data
> structures. Attempting to access them or free them again is going to
> break things.
>
> For example, if memfd managed to restore 8 of its 10 folios, but fails
> on the 9th, a subsequent retrieve attempt will try to call
> kho_restore_folio() on the first folio again, and that will fail with a
> warning since it is an invalid operation.
>
> Apart from the retry, finish() also breaks. Since on failure the
> retrieved bool in luo_file is never touched, the finish() call on
> session close will tell the file handler that retrieve was never
> attempted, and it will try to access or free the data structures that
> might not exist, much in the same way as the retry attempt.
>
> There is no sane way of attempting the retrieve again. Remember the
> error retrieve returned and directly return it on a retry. Also pass
> this status code to finish() so it can make the right decision on the
> work it needs to do.
>
> This is done by changing the bool to an integer. A value of 0 means
> retrieve was never attempted, a positive value means it succeeded, and a
> negative value means it failed and the error code is the value.
>
> Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks")
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
--
Sincerely yours,
Mike.