Date: Wed, 27 Sep 2023 15:44:45 -0700
From: Stanislav Kinsburskii
To: Baoquan He
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	ebiederm@xmission.com, akpm@linux-foundation.org,
	stanislav.kinsburskii@gmail.com, corbet@lwn.net,
	linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	linux-mm@kvack.org, kys@microsoft.com, jgowans@amazon.com,
	wei.liu@kernel.org, arnd@arndb.de, gregkh@linuxfoundation.org,
	graf@amazon.de, pbonzini@redhat.com, david@redhat.com
Subject: Re: [RFC PATCH v2 0/7] Introduce persistent memory pool
Message-ID: <20230927224445.GA20200@skinsburskii.>
References: <01828.123092517290700465@us-mta-156.us.mimecast.lan>
	<58146.123092712145601339@us-mta-73.us.mimecast.lan>

On Thu, Sep 28, 2023 at 06:25:44PM +0800, Baoquan He wrote:
> On 09/27/23 at 09:13am, Stanislav Kinsburskii wrote:
> > On Wed, Sep 27, 2023 at 01:44:38PM +0800, Baoquan He wrote:
> > > Hi Stanislav,
> > >
> > > On 09/25/23 at 02:27pm, Stanislav Kinsburskii wrote:
> > > > This patch introduces a memory allocator specifically tailored for
> > > > persistent memory within the kernel. The allocator maintains
> > > > kernel-specific states like DMA passthrough device states, IOMMU state, and
> > > > more across kexec.
> > >
> > > Can you give more details about how this persistent memory pool will be
> > > utilized in an actual scenario? I mean, what problem have you met so that
> > > you have to introduce persistent memory pool to solve it?
> > >
> >
> > The major reason we have at the moment is that the Linux root partition
> > running on top of the Microsoft hypervisor needs to deposit pages to the
> > hypervisor at runtime, when the hypervisor runs out of memory.
> > "Depositing" here means that Linux passes a set of its PFNs to the
> > hypervisor via hypercall, and the hypervisor then uses these pages for its
> > own needs.
> >
> > Once deposited, these pages can't be accessed by Linux anymore and thus
> > must be preserved in "used" state across kexec, as the hypervisor state is
> > unaware of kexec. At the same time, these pages can be withdrawn when
> > unused. Thus, an allocator persistent across kexec looks reasonable for
> > this particular matter.
>
> Thanks for these details.
>
> The deposit and withdraw remind me of the Balloon driver, David's virtio-mem,
> and DLPAR on ppc, which can hot-add or shrink physical memory on a guest
> OS. Can't the Microsoft hypervisor do a similar thing to reclaim or give
> back the memory from or to the 'Linux root partition' running on top of
> the hypervisor?
>

Although the Microsoft hypervisor is a type 1 hypervisor and runs on
the physical hardware, like Xen, it doesn't control all the memory, but
is rather granted memory either by the boot loader or by the Linux root
partition (a similar privileged VM is called "Dom0" in the Xen world).
IOW, this works in the opposite direction: Linux gives memory to the
hypervisor, and can reclaim it back.

However, doing so on kexec increases downtime, as withdrawn pages must
be deposited back again after booting to restore the guests ("DomU" in
Xen terminology).

It is worth mentioning that the "deposited pages" in this context don't
mean guest pages, but the pages required by the hypervisor to store the
Linux root partition state used to control guest partitions.

Also, page reclaim is not possible if guests are left running during
kexec, as the hypervisor needs to keep the Linux root partition-related
state intact to keep the guest state consistent.

> Thanks
> Baoquan
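
To make the trade-off above concrete, here is a toy userspace model in Python. It is only an illustrative sketch and not code from the patch series: the RootPartition class, its deposit()/withdraw() methods, and the two kexec_* helpers are all hypothetical names, and real deposits go through hypercalls rather than set operations. It compares carrying the record of deposited PFNs across kexec with withdrawing everything and depositing it again after boot. Note that the second path is only available when no guests are left running.

```python
from dataclasses import dataclass, field


@dataclass
class RootPartition:
    """Hypothetical model of the root partition's view of its page frames."""
    free_pfns: set = field(default_factory=set)
    deposited_pfns: set = field(default_factory=set)

    def deposit(self, count):
        """Hand `count` free PFNs to the simulated hypervisor."""
        if count > len(self.free_pfns):
            raise MemoryError("not enough free pages to deposit")
        pfns = set(sorted(self.free_pfns)[:count])
        self.free_pfns -= pfns
        # Deposited pages are no longer usable by Linux.
        self.deposited_pfns |= pfns
        return pfns

    def withdraw(self, pfns):
        """Take previously deposited PFNs back from the simulated hypervisor."""
        pfns = set(pfns)  # copy, so callers may pass self.deposited_pfns
        if not pfns <= self.deposited_pfns:
            raise ValueError("cannot withdraw pages that were not deposited")
        self.deposited_pfns -= pfns
        self.free_pfns |= pfns


def kexec_with_persistent_pool(old):
    """The new kernel inherits the deposited set; no pages are transferred."""
    new = RootPartition(free_pfns=set(old.free_pfns),
                        deposited_pfns=set(old.deposited_pfns))
    return new, 0


def kexec_without_persistent_pool(old):
    """Withdraw everything before kexec, then deposit it again after boot."""
    count = len(old.deposited_pfns)
    old.withdraw(old.deposited_pfns)
    new = RootPartition(free_pfns=set(old.free_pfns))
    new.deposit(count)
    # Every page crosses the Linux/hypervisor boundary twice.
    return new, 2 * count
```

In this model both paths end with the same pages deposited, but the second one moves every page twice, which is the extra downtime mentioned above.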