linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: Pratyush Yadav <pratyush@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 2/2] liveupdate: luo_file: remember retrieve() status
Date: Sat, 31 Jan 2026 10:31:38 -0500	[thread overview]
Message-ID: <CA+CK2bC9WBdYoQfJXZs+K6DQjDkrGmppbevL18mJ=DhO0TkG1g@mail.gmail.com> (raw)
In-Reply-To: <20260126230302.2936817-3-pratyush@kernel.org>

On Mon, Jan 26, 2026 at 6:03 PM Pratyush Yadav <pratyush@kernel.org> wrote:
>
> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
>
> LUO keeps track of successful retrieve attempts on a LUO file. It does
> so to avoid multiple retrievals of the same file. Doing so will cause
> problems because once the file is retrieved, the serialized data
> structures are likely freed and the file is likely in a very different
> state from what the code expects.
>
> This is kept track of by the retrieved boolean in struct luo_file, and
> is passed to the finish callback so it knows what work was already done
> and what it has left to do.
>
> All this works well when retrieve succeeds. When it fails,
> luo_retrieve_file() returns the error immediately, without ever storing
> anywhere that a retrieve was attempted or what its error code was. This
> results in an errored LIVEUPDATE_SESSION_RETRIEVE_FD ioctl to userspace,
> but nothing prevents it from trying this again.
>
> The retry is problematic for much of the same reasons listed above. The
> file is likely in a very different state than what the retrieve logic
> normally expects, and it might even have freed some serialization data
> structures. Attempting to access them or free them again is going to
> break things.
>
> For example, if memfd managed to restore 8 of its 10 folios, but fails
> on the 9th, a subsequent retrieve attempt will try to call
> kho_restore_folio() on the first folio again, and that will fail with a
> warning since it is an invalid operation.
>
> Apart from the retry, finish() also breaks. Since on failure the
> retrieved bool in luo_file is never touched, the finish() call on
> session close will tell the file handler that retrieve was never
> attempted, and it will try to access or free the data structures that
> might not exist, much in the same way as the retry attempt.
>
> There is no sane way of attempting the retrieve again. Remember the
> error retrieve returned and directly return it on a retry. Also pass
> this status code to finish() so it can make the right decision on the
> work it needs to do.
>
> This is done by changing the bool to an integer. A value of 0 means
> retrieve was never attempted, a positive value means it succeeded, and a
> negative value means it failed and the error code is the value.
>
> Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks")
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
> ---
>  include/linux/liveupdate.h   |  7 ++++--
>  kernel/liveupdate/luo_file.c | 41 ++++++++++++++++++++++--------------
>  mm/memfd_luo.c               |  7 +++++-
>  3 files changed, 36 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
> index a7f6ee5b6771..a543a3a8e837 100644
> --- a/include/linux/liveupdate.h
> +++ b/include/linux/liveupdate.h
> @@ -21,7 +21,10 @@ struct file;
>   * struct liveupdate_file_op_args - Arguments for file operation callbacks.
>   * @handler:          The file handler being called.
>   * @retrieved:        The retrieve status for the 'can_finish / finish'

Rename retrieved to retrieve_sts

> - *                    operation.
> + *                    operation. A value of 0 means the retrieve has not been
> + *                    attempted, a positive value means the retrieve was
> + *                    successful, and a negative value means the retrieve failed,
> + *                    and the value is the error code of the call.
>   * @file:             The file object. For retrieve: [OUT] The callback sets
>   *                    this to the new file. For other ops: [IN] The caller sets
>   *                    this to the file being operated on.
> @@ -37,7 +40,7 @@ struct file;
>   */
>  struct liveupdate_file_op_args {
>         struct liveupdate_file_handler *handler;
> -       bool retrieved;
> +       bool retrieve_sts;
>         struct file *file;
>         u64 serialized_data;
>         void *private_data;
> diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c
> index 9f7283379ebc..82577b4cca2b 100644
> --- a/kernel/liveupdate/luo_file.c
> +++ b/kernel/liveupdate/luo_file.c
> @@ -133,9 +133,12 @@ static LIST_HEAD(luo_file_handler_list);
>   *                 state that is not preserved. Set by the handler's .preserve()
>   *                 callback, and must be freed in the handler's .unpreserve()
>   *                 callback.
> - * @retrieved:     A flag indicating whether a user/kernel in the new kernel has
> + * @retrieve_sts:  Status code indicating whether a user/kernel in the new kernel has
>   *                 successfully called retrieve() on this file. This prevents
> - *                 multiple retrieval attempts.
> + *                 multiple retrieval attempts. A value of 0 means a retrieve()
> + *                 has not been attempted, a positive value means the retrieve()
> + *                 was successful, and a negative value means the retrieve()
> + *                 failed, and the value is the error code of the call.
>   * @mutex:         A mutex that protects the fields of this specific instance
>   *                 (e.g., @retrieved, @file), ensuring that operations like
>   *                 retrieving or finishing a file are atomic.
> @@ -160,7 +163,7 @@ struct luo_file {
>         struct file *file;
>         u64 serialized_data;
>         void *private_data;
> -       bool retrieved;
> +       int retrieve_sts;
>         struct mutex mutex;
>         struct list_head list;
>         u64 token;
> @@ -293,7 +296,7 @@ int luo_preserve_file(struct luo_file_set *file_set, u64 token, int fd)
>         luo_file->file = file;
>         luo_file->fh = fh;
>         luo_file->token = token;
> -       luo_file->retrieved = false;
> +       luo_file->retrieve_sts = 0;
>         mutex_init(&luo_file->mutex);
>
>         args.handler = fh;
> @@ -569,7 +572,7 @@ int luo_retrieve_file(struct luo_file_set *file_set, u64 token,
>                 return -ENOENT;
>
>         guard(mutex)(&luo_file->mutex);
> -       if (luo_file->retrieved) {
> +       if (luo_file->retrieve_sts > 0) {
>                 /*
>                  * Someone is asking for this file again, so get a reference
>                  * for them.
> @@ -577,21 +580,27 @@ int luo_retrieve_file(struct luo_file_set *file_set, u64 token,
>                 get_file(luo_file->file);
>                 *filep = luo_file->file;
>                 return 0;
> +       } else if (luo_file->retrieve_sts < 0) {
> +               /* Retrieve was attempted and it failed. Return the error code. */
> +               return luo_file->retrieve_sts;
>         }
>
>         args.handler = luo_file->fh;
>         args.serialized_data = luo_file->serialized_data;
>         err = luo_file->fh->ops->retrieve(&args);
> -       if (!err) {
> -               luo_file->file = args.file;
> -
> -               /* Get reference so we can keep this file in LUO until finish */
> -               get_file(luo_file->file);
> -               *filep = luo_file->file;
> -               luo_file->retrieved = true;
> +       if (err) {
> +               /* Keep the error code for later use. */
> +               luo_file->retrieve_sts = err;
> +               return err;
>         }
>
> -       return err;
> +       luo_file->file = args.file;
> +       /* Get reference so we can keep this file in LUO until finish */
> +       get_file(luo_file->file);
> +       *filep = luo_file->file;
> +       luo_file->retrieve_sts = 1;
> +
> +       return 0;
>  }
>
>  static int luo_file_can_finish_one(struct luo_file_set *file_set,
> @@ -607,7 +616,7 @@ static int luo_file_can_finish_one(struct luo_file_set *file_set,
>                 args.handler = luo_file->fh;
>                 args.file = luo_file->file;
>                 args.serialized_data = luo_file->serialized_data;
> -               args.retrieved = luo_file->retrieved;
> +               args.retrieve_sts = luo_file->retrieve_sts;
>                 can_finish = luo_file->fh->ops->can_finish(&args);
>         }
>
> @@ -624,7 +633,7 @@ static void luo_file_finish_one(struct luo_file_set *file_set,
>         args.handler = luo_file->fh;
>         args.file = luo_file->file;
>         args.serialized_data = luo_file->serialized_data;
> -       args.retrieved = luo_file->retrieved;
> +       args.retrieve_sts = luo_file->retrieve_sts;
>
>         luo_file->fh->ops->finish(&args);
>  }
> @@ -779,7 +788,7 @@ int luo_file_deserialize(struct luo_file_set *file_set,
>                 luo_file->file = NULL;
>                 luo_file->serialized_data = file_ser[i].data;
>                 luo_file->token = file_ser[i].token;
> -               luo_file->retrieved = false;
> +               luo_file->retrieve_sts = 0;
>                 mutex_init(&luo_file->mutex);
>                 list_add_tail(&luo_file->list, &file_set->files_list);
>         }
> diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
> index a34fccc23b6a..ffc9f879833b 100644
> --- a/mm/memfd_luo.c
> +++ b/mm/memfd_luo.c
> @@ -326,7 +326,12 @@ static void memfd_luo_finish(struct liveupdate_file_op_args *args)
>         struct memfd_luo_folio_ser *folios_ser;
>         struct memfd_luo_ser *ser;
>
> -       if (args->retrieved)
> +       /*
> +        * If retrieve was successful, nothing to do. If it failed, retrieve()
> +        * already cleaned up everything it could. So nothing to do there
> +        * either. Only need to clean up when retrieve was not called.
> +        */
> +       if (args->retrieve_sts)
>                 return;
>
>         ser = phys_to_virt(args->serialized_data);
> --
> 2.52.0.457.g6b5491de43-goog
>


  parent reply	other threads:[~2026-01-31 15:32 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-26 23:02 [PATCH 0/2] liveupdate: fixes in error handling Pratyush Yadav
2026-01-26 23:02 ` [PATCH 1/2] liveupdate: luo_file: do not clear serialized_data on unfreeze Pratyush Yadav
2026-01-28 11:19   ` Mike Rapoport
2026-01-30 18:56     ` Pratyush Yadav
2026-01-30 19:50       ` Andrew Morton
2026-02-02 11:14         ` Pratyush Yadav
2026-01-26 23:02 ` [PATCH 2/2] liveupdate: luo_file: remember retrieve() status Pratyush Yadav
2026-01-28 11:37   ` Mike Rapoport
2026-02-10 13:30     ` Pratyush Yadav
2026-01-31 15:31   ` Pasha Tatashin [this message]
2026-02-10 13:31     ` Pratyush Yadav

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+CK2bC9WBdYoQfJXZs+K6DQjDkrGmppbevL18mJ=DhO0TkG1g@mail.gmail.com' \
    --to=pasha.tatashin@soleen.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pratyush@kernel.org \
    --cc=rppt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox