From: Pratyush Yadav
To: Pasha Tatashin
Cc: Mike Rapoport, Pratyush Yadav, Jason Gunthorpe, jasonmiu@google.com,
 graf@amazon.com, changyuanl@google.com, dmatlack@google.com,
 rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org,
 ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org,
 aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org,
 tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com,
 roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk,
 mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org,
 hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com,
 joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com,
 song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
 stuart.w.hayes@gmail.com
Subject: Re: [RFC v2 05/16] luo: luo_core: integrate with KHO
References: <20250617152357.GB1376515@ziepe.ca>
Date: Fri, 20 Jun 2025 17:28:31 +0200

Hi Pasha,

On Thu, Jun 19 2025, Pasha Tatashin wrote:

[...]
>> And it has to be done before kexec load, at least until we resolve this.
>
> The before kexec load constraint has been fixed. The only
> "finalization" constraint we have is it should be before
> reboot(LINUX_REBOOT_CMD_KEXEC) and only because memory allocations
> during kernel shutdown are undesirable. Once KHO moves away from a
> monolithic state machine this constraint disappears.
> Kernel components could preserve their resources at appropriate times,
> not necessarily tied to a shutdown-time. For live update scenarios, LUO
> already orchestrates this timing.
>
>> Currently this is triggered either by KHO debugfs or by LUO ioctls. If we
>> completely drop KHO debugfs and notifiers, we still need something that
>> would trigger the magic.
>
> An external "magic trigger" for KHO (like the current finalize
> notifier or debugfs command) is necessary for scenarios like live
> update, where userspace resources are being preserved in a coordinated
> fashion just before kexec.
>
> For kernel-internal resources that are unrelated to such a
> userspace-driven live update flow, the respective kernel components
> should directly use KHO's primitive preservation APIs
> (kho_preserve_folio, etc.) when they need to mark their resources for
> handover. No separate state machine or external trigger should be
> required for these individual, self-contained preservation acts.

For kernel-internal components, I think this makes a lot of sense,
especially now that we don't need to get everything done by kexec load
time. I suppose the liveupdate_reboot() call at reboot time to prepare
final things can be useful, but subsystems can just as well register
reboot notifiers to get the same notification.

>
>> I'm not saying we should keep KHO debugfs and notifiers, I'm saying that if
>> we make LUO the only thing driving KHO, liveupdate is not an appropriate
>> name.
>
> LUO drives KHO specifically for the purpose of live updates. If a
> different userspace use-case emerges that needs another distinct
> purpose (e.g., not to preserve an FD or a device across kernel reboot
> (i.e. something for which LUO does not provide uAPI)), then that would
> probably need a separate from LUO uAPI instead of extending the LUO
> uAPI.

Outside of hypervisor live update, I have a very clear use case in
mind: userspace memory handover (on guest side).
Say a guest running an in-memory cache like memcached with many
gigabytes of cache wants to reboot. It can just shove the cache into a
memfd, give it to LUO, and restore it after reboot. Some services that
suffer from long reboots are looking into using this to reduce downtime.
Since it pretty much overlaps with the hypervisor work for now, I
haven't been talking about it as much.

Would you also call this use case "live update"? Does it also fit with
your vision of where LUO should go? If not, why do you think we should
have a parallel set of uAPIs that do similar work? Why can't we
accommodate other use cases under one API, especially as long as they
don't have conflicting goals?

In practice, outside of s/luo/khoctl/g, I don't think much would change
as of now. The state machine and APIs will stay the same. When those use
cases start to diverge from live update, or conflict with it, we can
then decide to have a separate interface for them. But by starting with
the broader name, we won't end up with a somewhat confusing name for a
more widely applicable technology.

I've been thinking about the naming since the start, but I didn't want
to bikeshed on it too much. But if we are also talking about the scope
of LUO, then I think this is a conversation worth having.

PS: I don't have real data, but I have a feeling that once luo/khoctl
matures, more use cases will come out of the woodwork to optimize
reboots.

-- 
Regards,
Pratyush Yadav