From: Andrey Ryabinin
Date: Fri, 24 Jan 2025 19:23:09 +0100
Subject: Re: [LSF/MM/BPF TOPIC] memory persistence over kexec
To: Mike Rapoport
Cc: lsf-pc@lists.linux-foundation.org, Alexander Graf, "Gowans, James", linux-mm@kvack.org, David Rientjes, Pasha Tatashin, Jason Gunthorpe

On Mon, Jan 20, 2025 at 8:54 AM Mike Rapoport wrote:
>
> Hi,
>
> I'd like to discuss memory persistence across kexec.
>

Hi, I'm very interested in this topic as well, and I'd like to join the club )

> Currently there is ongoing work on Kexec HandOver (KHO) [1] that allows
> serialization and deserialization of kernel data as well as preserving
> arbitrary memory ranges across kexec.
>

To perform a live update of a hypervisor kernel with running VMs that use
VFIO devices, we would need to [de]serialize a lot of different and complex
state (PCI, IOMMU, VFIO, ...).

When I looked at KHO, I found that describing data with it is complicated:
it requires writing a lot of code that has to intrude deeply into subsystem
code. So I think this might be a blocker for applying KHO to VFIO device
state, which is more complicated than the ftrace buffers.

To address this particular issue I came up with a proof of concept, which I
sent a few months ago:
  https://lkml.kernel.org/r/20241002160722.20025-1-arbn@yandex-team.com

The idea was inspired by QEMU's VMSTATE mechanism, which solves a similar
problem: describing and migrating device state across different instances
of QEMU. As an example, I chose to preserve ftrace buffers as well, so it is
easier to compare with the KHO approach.

> In addition, KHO keeps a physically contiguous memory regions that are
> guaranteed to not have any memory that KHO would preserve, but still can be
> used by the system. The kexeced kernel bootstraps itself using those
> regions and sets all handed over memory as in use. KHO users then can
> recover their state from the preserved data. This includes memory
> reservations, where the user can either discard or claim reservations.
>
> KHO can be used as the base layer for implementation of persistence-aware
> memory allocator and persistent in-memory filesystem.
>
> Aside from status update on KHO progress there are a few topics that I would
> like to discuss:
> * Is it feasible and desirable to enable KHO support in tmpfs and hugetlbfs?
> * Or is it better to implement yet another in-memory filesystem dedicated
>   for persistence?

We would definitely need a framework to [de]serialize data. With that we
should be able to preserve tmpfs/hugetlbfs (and it will probably be easier
than preserving some device state). So yet another in-memory filesystem
should come only as a solution to some potential problem, for example:
 - serialization of tmpfs/hugetlbfs requires an unreasonable amount of memory
   (or time to process)
 - the implementation ends up too complicated and fragile, so it is simply
   better to have a separate, dedicated fs
 - whatever else comes up...
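
To make the "describe the state instead of hand-writing the code" idea above
a bit more concrete, here is a minimal userspace sketch of a table-driven
[de]serializer in that spirit. It is only an illustration under assumptions:
every name in it (state_field, STATE_FIELD, demo_dev, state_save, state_load)
is hypothetical, and it is neither the API of the proof of concept linked
above, nor KHO's, nor QEMU's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: one entry per struct member worth preserving. */
struct state_field {
    const char *name;   /* for debugging only */
    size_t offset;      /* offsetof() the member within its struct */
    size_t size;        /* sizeof() the member */
};

#define STATE_FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

/* Made-up device state; not taken from VFIO or any real driver. */
struct demo_dev {
    uint32_t irq;
    uint64_t dma_base;
    void *scratch;      /* runtime-only, deliberately left undescribed */
    uint16_t flags;
};

/* The only per-device "serialization code" is this table. */
const struct state_field demo_dev_fields[] = {
    STATE_FIELD(struct demo_dev, irq),
    STATE_FIELD(struct demo_dev, dma_base),
    STATE_FIELD(struct demo_dev, flags),
};
const size_t demo_dev_nfields =
    sizeof(demo_dev_fields) / sizeof(demo_dev_fields[0]);

/*
 * Pack every described field into buf, in table order.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t state_save(const void *obj, const struct state_field *fields,
                  size_t nfields, uint8_t *buf, size_t cap)
{
    size_t pos = 0;

    for (size_t i = 0; i < nfields; i++) {
        if (fields[i].size > cap - pos)
            return 0;
        memcpy(buf + pos, (const uint8_t *)obj + fields[i].offset,
               fields[i].size);
        pos += fields[i].size;
    }
    assert(pos <= cap);
    return pos;
}

/*
 * Inverse of state_save(). Returns the number of bytes consumed, or 0 on
 * short input (obj may then be partially updated).
 */
size_t state_load(void *obj, const struct state_field *fields,
                  size_t nfields, const uint8_t *buf, size_t len)
{
    size_t pos = 0;

    for (size_t i = 0; i < nfields; i++) {
        if (fields[i].size > len - pos)
            return 0;
        memcpy((uint8_t *)obj + fields[i].offset, buf + pos, fields[i].size);
        pos += fields[i].size;
    }
    return pos;
}
```

A caller would fill a struct demo_dev, pass it with demo_dev_fields to
state_save() before the handover, and call state_load() with the same table
afterwards. The generic walker is written once, so a subsystem only adds a
table. A real implementation would also have to handle versioning, pointers,
nested structures and preserved memory ranges, all of which this sketch
leaves out.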