From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 668FEC35FFF for ; Thu, 20 Mar 2025 16:36:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6F2FB280003; Thu, 20 Mar 2025 12:36:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6A2F0280001; Thu, 20 Mar 2025 12:36:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 56C47280003; Thu, 20 Mar 2025 12:36:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 29093280001 for ; Thu, 20 Mar 2025 12:36:00 -0400 (EDT) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 002C51CB56B for ; Thu, 20 Mar 2025 16:36:00 +0000 (UTC) X-FDA: 83242481322.15.F0BDBE4 Received: from mail-qt1-f175.google.com (mail-qt1-f175.google.com [209.85.160.175]) by imf20.hostedemail.com (Postfix) with ESMTP id C06991C000F for ; Thu, 20 Mar 2025 16:35:58 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=soleen-com.20230601.gappssmtp.com header.s=20230601 header.b=JqplxxwV; spf=pass (imf20.hostedemail.com: domain of pasha.tatashin@soleen.com designates 209.85.160.175 as permitted sender) smtp.mailfrom=pasha.tatashin@soleen.com; dmarc=pass (policy=none) header.from=soleen.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1742488559; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=FSJhl8Ya+U9yAyNP3Wu79LvGc5zWv1+4r8Tqq+zSr98=; b=Go7TApFPV0WJqkXoo8NPq+e3KtDobh4QoKDi1Nd+Rc8ByYmj2RnBeaaFu0DhdYQK317bRm sf+GofvI2RXtla0pFyXohzHqjOKrLxb7Mbg8RJyTgeDuJTiVs0BMuUc9Wgla+J/47jui0H A292R89a6ddivYMyuihs5dQ8tCTJt7s= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=soleen-com.20230601.gappssmtp.com header.s=20230601 header.b=JqplxxwV; spf=pass (imf20.hostedemail.com: domain of pasha.tatashin@soleen.com designates 209.85.160.175 as permitted sender) smtp.mailfrom=pasha.tatashin@soleen.com; dmarc=pass (policy=none) header.from=soleen.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1742488559; a=rsa-sha256; cv=none; b=tWcDuC10kJs2rYWjMMjipZxe6dqpu8wZXDynrt9jnzrZEDJJJMebVyGjBEztxmX3tdAbgS fIaMGuxzD6V1ZW9V/kp7dcGfrkExjadkqYPa5pEA6Xn7gc0HyzhvDZCcRLMoPyGE/18jla 26GE9ixVVqfwng3AxzLm4fG9rAzcsco= Received: by mail-qt1-f175.google.com with SMTP id d75a77b69052e-47691d82bfbso20841701cf.0 for ; Thu, 20 Mar 2025 09:35:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1742488557; x=1743093357; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=FSJhl8Ya+U9yAyNP3Wu79LvGc5zWv1+4r8Tqq+zSr98=; b=JqplxxwVeqD/el19lw3iwyB8WMjGZ5FXKir+tyQvr4V+kuQldNZFVyUn0ppZDkbkwX dtuQeMiAXzLzGe2yi2kz5YYU7ibuPbu6IIMLt0U5p3YgioX/Q7kotQ7PSTyK/lYjhPSW EmMc8EqZbdcCcGyYYUmQzM3uFXag63QHSNrh2wzkHAz2sJSH4iYG3i5r3YWW2Zgdtzm7 HHZrzmkXr3wNoyV4z2SnxZPXutz6GUqTnrPm3hdM3/tW+3TaY6Ppjbo18tuDTcLg9kPi zf7pVaipfMS1LQdFbQYCGI1MgtCI6FWcCrDaFjfHjc1Bva6SswPsmfbFZbIL4+TT4koJ weRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1742488557; x=1743093357; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FSJhl8Ya+U9yAyNP3Wu79LvGc5zWv1+4r8Tqq+zSr98=; b=uSLQFNLuPuUi4udLmZlRbTp89W3XyWTZDdcoVRssWfXJIZqiAJvi1I0SmsOtTUp65A N92VJNZE8B6cD48zgcWXLs//NA6YJBKXPHNgVio/9hc2DTTgHYaSjfWGkEcVGuMXM1kx Lz3HvsD4SEiIz4RK7TR96rT/2hf2EklJ8Eo/Sj1e/+BtR6t62vBJXSwpvnwHQKM4LZQu p1n8oPLJ+wGR598oOUrFcFRZo2YTNvSOOqbr7Fm6MUh8HKDicWON2vi+k8dvqO9Ob33O WSyK0Ih9U5SgyWxX/x3j5jY2e9RBQ9wcNPHW4oFNjgKiQGiqsNERKdlcUNgM9SU/1P1t Lvxg== X-Forwarded-Encrypted: i=1; AJvYcCUDSD4LoCgrZHnLUjjaOlT/xNzU//CChcN4rg5E6mvgoXiX1DqkIr76cILpM/XBjPJLcBVv6mcDhA==@kvack.org X-Gm-Message-State: AOJu0Ywv1JZLzT6ILCZaGp5X/zmiokvILR6cCYDhealk50M8fUEMb7we vP1YZkM5/CjHqbQAOnDD3QxdxU27i9fzgx4vYA1ZClxxcZQlCfuA8rmHObjQLiYBpBsuwkd39Py SQwdmROrnRjgIvjq8IXCuu2hwmKheS4q5qLmecA== X-Gm-Gg: ASbGnctaPue1C++SpmGZdB3Dpw9Boh+a/bgQLHFtmZI0YGL2m4LaLwvNc4+V9WAWtoJ okaqXyHVRbydZnC+NY7lfthcB0glUrNeNaiINymj1ji3DiL29/PSnUlP2g3QAZ4Pj/rSc64T0Kl ryomLypyhSqhLeVzbAq7Dp7sw= X-Google-Smtp-Source: AGHT+IF0eXOQlc/TCcCLJjjH1RwfQ9Q7azG+hpO1bThoH01wQ+mnOSXeNm3Os2T9J32bHJKiD8UV6/78bbxc28Q/6yY= X-Received: by 2002:a05:622a:11c7:b0:476:b06a:716e with SMTP id d75a77b69052e-4770841fd04mr103463281cf.34.1742488557448; Thu, 20 Mar 2025 09:35:57 -0700 (PDT) MIME-Version: 1.0 References: <20250320024011.2995837-1-pasha.tatashin@soleen.com> <20250320024011.2995837-2-pasha.tatashin@soleen.com> In-Reply-To: From: Pasha Tatashin Date: Thu, 20 Mar 2025 12:35:20 -0400 X-Gm-Features: AQ5f1Jo9DEAuy95jWf1ZvdTmiidPtoTdbLtKofntA_8ycDVY8wy9zZs4pKR771U Message-ID: Subject: Re: [RFC v1 1/3] luo: Live Update Orchestrator To: Andy Shevchenko Cc: changyuanl@google.com, graf@amazon.com, rppt@kernel.org, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, jgowans@amazon.com, jgg@nvidia.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Queue-Id: C06991C000F X-Rspamd-Server: rspam08 X-Stat-Signature: 9q5bwttsesqdyaimtogbffmg163nnk38 X-HE-Tag: 1742488558-36388 X-HE-Meta: U2FsdGVkX1/Z5/VMzvwzd5k4/NE2Vk7BnD+4YeXsMUL/cyzKpgAhv1Vk2wER2SGpcc0SwVcXEhr2sOUAgeuCeorXJcdw7ICAGLIpI2ihgJu5a3jbI4OByU4yv41iIzMzkZIkk2qdfdS/dsjnwxVsJJMjIY0C18XF0xeFurUH68p6yikPIToLNTGAogqw58UMiYM4jqJX5RftQjbflw2Re8t/cW1MQ943W5ELVQ4H67wGe37ZmZOulky/+fJJtlVYb44mEhe+yrs5p+tWk3YhpwkUu/5IBSdklKBKfsvTaHS1CxxcWrcsBADRRogIYOobqcs01tL6bN3DGh+8sh1LVLKQvtsMDuP/vnQt0OfbqcLTVtL/wI18ohJbk92sTb9vc1jxVyM29Z3Vf6u+9YlbERyZ8V6IrrNV8dEeiMKTp1gf6BT++LvW1WGr/AsfNXn687lVxcBuFh8qh5wEOk7KsjmDH0TsAw7hcRn4nB8ywTAXMdHRRtfOjwV0vH+SILErLspx7KudPIzsXjwrJizeXoxZ5HPEHUuvVmR5A3jrphXfouvfW1vrnZHJpuyFnoXHaXcXPjTE54UbiKzQjC688iVbhffKkA4FRnu3T+iNdOnLXKIynYm8vmnoOcNbDEy6arlvsxi+5q/tQ0P7+ausiQPYdanePZp/HmjAFY+8Rhn/ZXbAhcEl93icD3P3qC4cebgwiev7Jr9iHqeaLCjZk5oZoQrvyH/CVyW60fFHBJOglsdnRpZbFbDwyzKHIctGx6uxi8+cIykvY8emevFTiiy2dml4slI9nWrYUEdkIFCR9o+fxiuScjWpYOrm0Jzx5cicABY2l6Hzi9iV99g5Lajy1ZqEtE8t9q1Lzb4nkq3CDbes/ha3mbpmcZ1rXpIbMeyoTg0TH1/MJ9oYtanLCN5vgr6LiGNrcUOi8BXzflC3q1PLYNxlgdZPTTVeHbIOXt9izcPJqJ11nfoSHvv EFUYA41R EWbilYfiNKZgjdWREg16mjTRbqibvFEimqXPbz70BN0QBKUDPpgvhy6nvBGUPUekWOYtC996R+d1yndHUWtVj+P4J2k0Nog73Mc6Apxvct1JCqvqzyZtc6O1zn5l0WvQs0AUvBuWeEvbWP9mfvbaFf/hQjGgeWzpaXYBaaiLlc9pRqkpsJy16vjFUdzjRYHYHJkLW8byyarWoi72oMdShjnjIJaWcZvIi/rJ8BGBzU0Tg2FzRLpOoaMoVlMag0kmENzoPZIhsZa8ckcRKzBuGA22aTd5CZ/pada0V7s/NPz8TrTmUQNmRjaEDaGHCtZy0hZR9Qr4fXHrCmsTqRArf9qtPk1/PwqwGwAnN X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Mar 20, 2025 at 9:40=E2=80=AFAM Andy Shevchenko wrote: > > On Thu, Mar 20, 2025 at 02:40:09AM +0000, Pasha Tatashin wrote: > > Introduces the Live Update Orchestrator (LUO), a new kernel subsystem > > designed to facilitate live updates. Live update is a method to reboot > > the kernel while attempting to keep selected devices alive across the > > reboot boundary, minimizing downtime. > > > > The primary use case is cloud environments, allowing hypervisor updates > > without fully disrupting running virtual machines. VMs can be suspended > > while the hypervisor kernel reboots, and devices attached to these VM > > are kept operational by the LUO. > > > > Features introduced: > > > > - Core orchestration logic for managing the live update process. > > - A state machine (NORMAL, PREPARED, UPDATED, *_FAILED) to track > > the progress of live updates. > > - Notifier chains for subsystems (device layer, interrupts, KVM, IOMMU, > > etc.) to register callbacks for different live update events: > > - LIVEUPDATE_PREPARE: Prepare for reboot (before blackout). > > - LIVEUPDATE_REBOOT: Final serialization before kexec (blackout). > > - LIVEUPDATE_FINISH: Cleanup after update (after blackout). > > - LIVEUPDATE_CANCEL: Rollback actions on failure or user request. > > - A sysfs interface (/sys/kernel/liveupdate/) for user-space control: > > - `prepare`: Initiate preparation (write 1) or reset (write 0). > > - `finish`: Finalize update in new kernel (write 1). > > - `cancel`: Abort ongoing preparation or reboot (write 1). > > - `reset`: Force state back to normal (write 1). > > - `state`: Read-only view of the current LUO state. > > - `enabled`: Read-only view of whether live update is enabled. > > - Integration with KHO to pass orchestrator state to the new kernel. > > - Version checking during startup of the new kernel to ensure > > compatibility with the previous kernel's live update state. > > > > This infrastructure allows various kernel subsystems to coordinate and > > participate in the live update process, serializing and restoring devic= e > > state across a kernel reboot. > > ... > > > +Date: March 2025 > > +KernelVersion: 6.14.0 > > This is way too optimistic, it even won't make v6.15. > And date can be chosen either v6.16-rc1 or v6.16 release > in accordance with prediction tool This is an early RFC and is not intended to be applied. I meant to replace it with the appropriate version once it becomes a candidate to land. > > ... > > > +#ifndef _LINUX_LIVEUPDATE_H > > +#define _LINUX_LIVEUPDATE_H > > > +#include > > +#include > > This is semi-random list of inclusions. Try to follow IWYU principle. > See below. I will remove > > ... > > > +bool liveupdate_state_updated(void); > > Where bool is defined? in kernel/liveupdate.c > > ... > > > +/** > > + * LIVEUPDATE_DECLARE_NOTIFIER - Declare a live update notifier with d= efault > > + * structure. > > + * @_name: A base name used to generate the names of the notifier bloc= k > > + * (e.g., ``_name##_liveupdate_notifier_block``) and the callback func= tion > > + * (e.g., ``_name##_liveupdate``). > > + * @_priority: The priority of the notifier, specified using the > > + * ``enum liveupdate_cb_priority`` values > > + * (e.g., ``LIVEUPDATE_CB_PRIO_BEFORE_DEVICES``). > > + * > > + * This macro declares a static struct notifier_block and a correspond= ing > > + * notifier callback function for use with the live update orchestrato= r. > > + * It simplifies the process by automatically handling the dispatching= of > > + * live update events to separate handler functions for prepare, reboo= t, > > + * finish, and cancel. > > + * > > + * This macro expects the following functions to be defined: > > + * > > + * ``_name##_liveupdate_prepare()``: Called on LIVEUPDATE_PREPARE. > > + * ``_name##_liveupdate_reboot()``: Called on LIVEUPDATE_REBOOT. > > + * ``_name##_liveupdate_finish()``: Called on LIVEUPDATE_FINISH. > > + * ``_name##_liveupdate_cancel()``: Called on LIVEUPDATE_CANCEL. > > + * > > + * The generated callback function handles the switch statement for th= e > > + * different live update events and calls the appropriate handler func= tion. > > + * It also includes warnings if the finish or cancel handlers return a= n error. > > + * > > + * For example, declartion can look like this: > > + * > > + * ``static int foo_liveupdate_prepare(void) { ... }`` > > + * > > + * ``static int foo_liveupdate_reboot(void) { ... }`` > > + * > > + * ``static int foo_liveupdate_finish(void) { ... }`` > > + * > > + * ``static int foo_liveupdate_cancel(void) { ... }`` > > + * > > + * ``LIVEUPDATE_DECLARE_NOTIFIER(foo, LIVEUPDATE_CB_PRIO_WITH_DEVICES)= ;`` > > + * > > Hmm... Have you run kernel-doc validator? There is missing Return section= and > it will warn about that. I have, and there are no warnings. There is no return in this macro. > > > + */ > > ... > > > + WARN_ONCE(rv, "cancel failed[%d]\n", rv); \ > > + bug.h I will include bug.h > > ... > > > + #undef pr_fmt > > + #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > > Leftover from the development? Ah, yes, it is duplicated. > > > +#undef pr_fmt > > Not needed as long as pr_fmt9) is at the top of the file. Thanks, I will remove it. > > > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > > ... > > > +#include > > What for? Can you follow IWYU, please? Here again semi-random list of > inclusions. I will remove kernel.h and cpufreq.h and add #include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > Can you keep them ordered which will be easier to read and maintain? Sure, I will order them alphabetically. > > ... > > > +static int __init early_liveupdate_param(char *buf) > > +{ > > + return kstrtobool(buf, &luo_enabled); > > +} > > > + > > Redundant blank line. OK > > +early_param("liveupdate", early_liveupdate_param); > > ... > > > +/* Show the current live update state */ > > +static ssize_t state_show(struct kobject *kobj, > > + struct kobj_attribute *attr, > > It is still enough room even for the strict 80 limit case. OK > > > + char *buf) > > +{ > > + return sysfs_emit(buf, "%s\n", LUO_STATE_STR); > > +} > > ... > > > + ret =3D blocking_notifier_call_chain(&luo_notify_list, > > + event, > > + NULL); > > There is room on the previous lines. Ditto for the similar cases all over > the code. OK > > ... > > > +{ > > + int ret; > > + > > + if (down_write_killable(&luo_state_rwsem)) { > > + pr_warn(" %s, change state canceled by user\n", __func__)= ; > > Why __func__ is so important in _this_ message? And why leading space? > Ditto for the similar cases. removed __func__, and rewarded messages to be more descriptive in each case= . > > > + return -EAGAIN; > > + } > > + > > + if (!IS_STATE(LIVEUPDATE_STATE_NORMAL)) { > > + pr_warn("Can't switch to [%s] from [%s] state\n", > > + luo_state_str[LIVEUPDATE_STATE_PREPARED], > > + LUO_STATE_STR); > > + up_write(&luo_state_rwsem); > > + > > + return -EINVAL; > > + } > > + > > + ret =3D luo_notify(LIVEUPDATE_PREPARE); > > + if (!ret) > > + luo_set_state(LIVEUPDATE_STATE_PREPARED); > > + > > + up_write(&luo_state_rwsem); > > + > > + return ret; > > +} > > ... > > > +static ssize_t prepare_store(struct kobject *kobj, > > + struct kobj_attribute *attr, > > + const char *buf, > > + size_t count) > > +{ > > + ssize_t ret; > > + long val; > > > + if (kstrtol(buf, 0, &val) < 0) > > + return -EINVAL; > > Shadower error code. In this case it is appropriate, we do not case why kstrtol() could not be converted into an appropriate integer, all we care is that the input was invalid, and that what we return back to user. > > > > + if (val !=3D 1 && val !=3D 0) > > + return -EINVAL; > > What's wrong with using kstrtobool() from the beginning? It makes the input less defined, here we only allow '1' or '0', kstrtobool() allows almost anything. > > > + > > + if (val) > > + ret =3D luo_prepare(); > > + else > > + ret =3D luo_cancel(); > > > + if (!ret) > > + ret =3D count; > > + > > + return ret; > > Can we go with the usual pattern "check for errors first"? > > if (ret) > return ret; Sure. > > ... > > > +} > > ... > > > +static ssize_t finish_store(struct kobject *kobj, > > + struct kobj_attribute *attr, > > + const char *buf, > > + size_t count) > > Same comments as per above. OK > > ... > > > +static struct attribute *luo_attrs[] =3D { > > + &state_attribute.attr, > > + &prepare_attribute.attr, > > + &finish_attribute.attr, > > > + NULL, > > No comma for the terminator entry. Sure. > > +}; > > ... > > > +static int __init luo_init(void) > > +{ > > + int ret; > > + > > + if (!luo_enabled || !kho_is_enabled()) { > > + pr_info("disabled by user\n"); > > + luo_enabled =3D false; > > + > > + return 0; > > + } > > Can be written like > > if (!kho_is_enabled()) > luo_enabled =3D false; > if (!luo_enabled) { > pr_info("disabled by user\n"); > return 0; > } Sure > > > + ret =3D sysfs_create_group(kernel_kobj, &luo_attr_group); > > + if (ret) > > + pr_err("Failed to create group\n"); > > + > > + luo_sysfs_initialized =3D true; > > + pr_info("Initialized\n"); > > + > > + return ret; > > Something is odd here between (non-)checking for errors and printed messa= ges. Thank you for pointing out, it is a bug, fixed. > > > +} > > ... > > > +EXPORT_SYMBOL_GPL(liveupdate_state_normal); > > No namespace? Namespace is 'liveupdate_', all public interfaces have this prefix, private functions are prefixed with luo_ where it makes sense. > > ... > > > --- a/kernel/reboot.c > > +++ b/kernel/reboot.c > > @@ -18,6 +18,7 @@ > > #include > > #include > > #include > > +#include > > Can oyu preserve order (with given context at least)? Yes. > > -- > With Best Regards, > Andy Shevchenko Thank you for your review. Pasha